Beyond3D - NVIDIA G80: Architecture and GPU Analysis

NVIDIA G80: Architecture and GPU Analysis - Page 2

Published on 8th Nov 2006, written by Rys for Consumer Graphics - Last updated: 25th Apr 2007

The Chip

NVIDIA carry the GeForce brand into its 8th iteration with G80. The first SKUs to be released are the GeForce 8800 GTX and GeForce 8800 GTS, with NVIDIA resurrecting the GTS moniker for the first time since GeForce 2. G80 itself is probably the biggest and most complex piece of mass-market silicon ever created.


Built on TSMC's 90GT process, G80 is some 681M transistors big with a rough die area of 480mm², supporting Direct3D 10 (Shader Model 4.0) and implementing a heavily threaded, unified shader architecture. NVIDIA disguise the actual die with a package that includes a heatspreader module, which gets heat from the GPU to the cooling solution more effectively. G80 is natively PCI Express; NVIDIA have announced no AGP variant, and this time around we honestly don't expect one for any high-end G80 configuration either, despite the option being there with NVIDIA's own BR-series of bridge ICs designed to glue AGP to PCIe and vice versa.
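As a rough sanity check, the transistor count and die area quoted above work out to an average of about 1.4 million transistors per mm². The sketch below does only that arithmetic; the function name is ours, and because the die area is an approximate figure the result is a ballpark rather than a measured density.

```python
def transistor_density(transistors: float, die_area_mm2: float) -> float:
    """Average number of transistors per square millimetre of die."""
    if die_area_mm2 <= 0:
        raise ValueError("die area must be positive")
    return transistors / die_area_mm2


# Figures quoted in the article: 681M transistors, roughly 480 mm^2.
g80_density = transistor_density(681e6, 480.0)
print(f"{g80_density / 1e6:.2f}M transistors per mm^2")
```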

Support for Shader Model 4.0 means a hefty change in specification, which we condense here into a checkbox-style overview that only scratches the surface.

NVIDIA G80 Architecture
- Full Direct3D 10 Support
- DirectX 10 Shader Model 4.0 Support
  - Vertex Shader 4.0
  - Geometry Shader 4.0
  - Pixel Shader 4.0
  - Internal 128-bit Floating Point (FP32) Precision
- Unlimited Shader Lengths
- Up to 128 textures per pass
- Support for FP32 texture formats with filtering
- Non-power-of-two texture support
- 8 Multiple Render Targets (MRTs)
NVIDIA Lumenex Technology
- Full FP32 floating point support throughout the entire pipeline
- FP32 floating point frame buffer support
- Up to 8x, gamma adjusted, native multisampling FSAA with jittered or rotated grids
- Up to 16x coverage sampling antialiasing (CSAA)
- Transparent multisampling and supersampling
- Lossless color, texture, Z and stencil data compression
- Fast Z clear
- Up to 16x anisotropic filtering
NVIDIA SLI Support
NVIDIA PureVideo HD Technology
- Adaptable programmable video processor with GPU shader core assist and post processing
- High Definition video decode acceleration (H.264, VC-1, WMV-HD, MPEG2-HD)
- Spatial temporal de-interlacing
- Inverse 2:2 and 3:2 pull-down (inverse telecine)
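
The "128-bit" precision figure in the list above is most naturally read as four 32-bit floating point channels (RGBA or XYZW, say) per value, not a single 128-bit number. That reading is our assumption; the spec sheet does not spell it out. A minimal sketch of the arithmetic, using Python's standard struct module and a made-up color value:

```python
import struct

# Hypothetical RGBA color; each channel is one IEEE 754 single (FP32).
rgba = (1.0, 0.5, 0.25, 1.0)

# "<4f" packs four little-endian 32-bit floats with no padding.
packed = struct.pack("<4f", *rgba)

bits_per_channel = struct.calcsize("<f") * 8
bits_per_value = len(packed) * 8

print(bits_per_channel, bits_per_value)
```

Four channels at 32 bits each gives the 128 bits per value that the checkbox refers to.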

Time for a quick look at the reference board and our test setup before we dive into the architecture analysis.
