Benchmarks


Theoretical Performance

With a quoted 16 pipelines, GeForce 6800 Ultra should have a very large theoretical fill-rate. Here are a few theoretical numbers for 6800 Ultra in relation to FX 5950 Ultra:

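              Core    Pixel       Texture     Triangle    Memory   Memory
              Clock   Fill-rate   Fill-rate   Rate        Clock    Bandwidth
              (MHz)   (MPix/s)    (MTex/s)    (MTris/s)   (MHz)    (GB/s)
6800 Ultra    400     6400        6400        320         550      35.2
5950 Ultra    475     1900        3800        190         475      30.4
5900 Ultra    450     1800        3600        180         425      27.2
5800 Ultra    500     2000        4000        200         500      16.0

6800 Ultra difference relative to:
5950 Ultra    -15.8%  236.8%      68.4%       68.4%       15.8%    15.8%
5900 Ultra    -11.1%  255.6%      77.8%       77.8%       29.4%    29.4%
5800 Ultra    -20.0%  220.0%      60.0%       60.0%       10.0%    120.0%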

Looking at the key differences between 6800 Ultra and 5950 Ultra, we see that although 6800 Ultra has a significantly lower clock speed than 5950, it still has a whopping 237% pixel fill-rate advantage thanks to having 4 times the number of pipelines. The difference in texture rates is only 68% because 5950 has twice as many texture samplers per pipeline as 6800 Ultra does. The triangle rate advantage is likewise only 68%: although 6800 Ultra has twice the number of vertex shaders, the clock speed difference pulls it back somewhat.
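The arithmetic behind these percentages can be sketched in a few lines of Python. This is only an illustration under the assumption that a peak rate is simply core clock multiplied by the number of units working per clock. The pipeline and sampler counts are the ones described in the text, and the dictionary and function names are our own.

```python
# Peak rate in millions per second = core clock (MHz) x units working per clock.
# Unit counts follow the text: 6800 Ultra quoted as 16 pipes with one sampler
# each, 5950 Ultra with a quarter of the pipes and two samplers per pipe.

def peak_rate(clock_mhz, units_per_clock):
    """Peak rate in millions of pixels (or texels) per second."""
    return clock_mhz * units_per_clock

def advantage_pct(new, old):
    """Percentage by which `new` exceeds `old` (negative if it is lower)."""
    return (new / old - 1.0) * 100.0

gf6800 = {"clock": 400, "pipes": 16, "samplers_per_pipe": 1}
fx5950 = {"clock": 475, "pipes": 4, "samplers_per_pipe": 2}

def pixel_rate(card):
    return peak_rate(card["clock"], card["pipes"])

def texel_rate(card):
    return peak_rate(card["clock"], card["pipes"] * card["samplers_per_pipe"])

clock_diff = advantage_pct(gf6800["clock"], fx5950["clock"])
pixel_diff = advantage_pct(pixel_rate(gf6800), pixel_rate(fx5950))
texel_diff = advantage_pct(texel_rate(gf6800), texel_rate(fx5950))

print(f"clock: {clock_diff:+.1f}%  pixel: {pixel_diff:+.1f}%  texel: {texel_diff:+.1f}%")
```

The same clock deficit that shows up as a negative clock figure is what trims a doubling of vertex shaders down to the 68% triangle rate advantage.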

GDDR-3 memory availability is still quite low, and the parts that are available run at relatively low clock rates (for GDDR-3). As both boards feature a 256-bit bus, the difference in memory clock rate, and hence memory bandwidth, is only 16%, which is quite low considering the pixel fill-rate differences.
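As a rough check on the bandwidth figures, here is a minimal Python sketch. It assumes double data rate memory, i.e. two transfers per memory clock, which is our assumption rather than something stated above, although it does reproduce the quoted numbers. The function name is hypothetical.

```python
def bandwidth_gb_s(mem_clock_mhz, bus_bits, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s (assumes DDR: two transfers per clock)."""
    bytes_per_transfer = bus_bits / 8
    return mem_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9

bw_6800 = bandwidth_gb_s(550, 256)  # 6800 Ultra: 550 MHz memory, 256-bit bus
bw_5950 = bandwidth_gb_s(475, 256)  # 5950 Ultra: 475 MHz memory, 256-bit bus

# With identical bus widths the advantage reduces to the memory clock ratio.
bw_diff = (bw_6800 / bw_5950 - 1.0) * 100.0

print(f"6800 Ultra: {bw_6800:.1f} GB/s, 5950 Ultra: {bw_5950:.1f} GB/s, "
      f"difference: {bw_diff:.1f}%")
```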

With GeForce FX 5800 and 5900 there was some disagreement over how NVIDIA had classified their pipelines: they were described as 8x1 (8 pixel pipes with one texture per pipe), yet they transpired to be "4x2" or "8x0" (4 pixel pipes with two textures per pipe, or 8 Z / Stencil pixel pipes). NV40 is being described as a 16 pipe architecture, though there has been some reluctance to really believe this until it's proved. Let's go on to look at some theoretical performance tests to see if we can begin to put the pipeline theory to the test.

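3DMark2001SE

              Fill-rate        Fill-rate       High Polygon   High Polygon   Vertex
              Single-Texture   Multi-Texture   1 Light        8 Lights       Shader
              (MTexels/s)      (MTexels/s)     (MTris/s)      (MTris/s)      (fps)
6800 Ultra    3546.8           6031.2          130.2          33.5           242.6
5950 Ultra    1724.9           3446.5          108.4          29.2           194.4
5900 Ultra    1645.4           3276.1          104.0          27.8           185.7
5800 Ultra    1504.0           3481.7          109.7          30.7           181.9

6800 Ultra difference relative to:
5950 Ultra    105.6%           75.0%           20.1%          14.7%          24.8%
5900 Ultra    115.6%           84.1%           25.2%          20.5%          30.6%
5800 Ultra    135.8%           73.2%           18.7%          9.1%           33.4%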

All the geometry tests show smaller gains for the 6800 Ultra over the 5950 than the theoretical rates would suggest. In the case of the fixed function tests this might indicate that the fixed function elements have been completely removed from the NV40 pipeline and the fixed function geometry processing is running via vertex programs.
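To put a number on that shortfall, the measured geometry gains can be set against the 68.4% theoretical triangle rate advantage from earlier. The sketch below uses the 6800 Ultra and 5950 Ultra results from the table above. The test labels are our assumption, taken from the names of the 3DMark2001SE geometry tests.

```python
# Theoretical triangle rate advantage of 6800 Ultra over 5950 Ultra (percent).
THEORETICAL_GAIN = 68.4

# (6800 Ultra, 5950 Ultra) results; the labels are assumed test names.
results = {
    "High Polygon Count (1 light)": (130.2, 108.4),
    "High Polygon Count (8 lights)": (33.5, 29.2),
    "Vertex Shader": (242.6, 194.4),
}

def gain_pct(new, old):
    """Percentage by which `new` exceeds `old`."""
    return (new / old - 1.0) * 100.0

gains = {name: gain_pct(new, old) for name, (new, old) in results.items()}

for name, gain in gains.items():
    shortfall = THEORETICAL_GAIN - gain
    print(f"{name}: measured +{gain:.1f}%, "
          f"{shortfall:.1f} points short of the theoretical gain")
```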

When we look at the fill-rate numbers, though, we can still see a large disparity between the single texturing and multi-texturing fill-rates. The multi-texturing test clearly indicates that there are 16 texture samplers available, though the single texturing result is much lower (although greater than half the theoretical rate). The reason for this is that the 3DMark2001SE fill-rate test requests a blend to the frame buffer, which increases the level of bandwidth required as opposed to just straight colour writes, so in this case the test is very bandwidth limited on the 6800 Ultra. Instead, we'll look at the fill-rates under Marko Dolenc's Fill-rate Tester:

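              Pure Colour   Z Pixel     Single     Dual       Triple     Quad
              Fill-rate     Rate        Texture    Texture    Texture    Texture
              (MPix/s)      (MPix/s)    (MPix/s)   (MPix/s)   (MPix/s)   (MPix/s)
6800 Ultra    6096.7        11749.8     5313.0     2999.3     2014.5     1522.5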

Here we can clearly see that the pure colour fill-rate of GeForce 6800 Ultra is in the range that you would expect of a 16 pipeline chip. We can also see that the Z pixel rate is about double the colour rate, which confirms that the ROPs can output twice as many non-colour pixels (Z / Stencil) as coloured pixels, which will be useful for titles that use a Z-only initial pass and / or stencil shadows. The texture performance drop-off is also wholly consistent with an architecture with one texture unit per pipeline, as the triple texture test drops the fill-rate by about a third relative to the dual texture result, whereas if there were two texture units per pipeline the performance should drop by roughly a quarter. These results are very consistent with a 16x1 / 32x0 pipeline configuration.
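The 16x1 reading can be sanity-checked against a very simple model. The Python sketch below assumes that, with one texture unit per pipeline, each extra texture layer costs one more clock per pixel, so the ideal rate is pipes x clock / layers. That model is our simplification, not a description of how NV40 actually schedules texturing, and the names are hypothetical.

```python
CLOCK_MHZ = 400   # 6800 Ultra core clock
PIPES = 16        # quoted pipeline count

def ideal_rate(layers, pipes=PIPES, clock_mhz=CLOCK_MHZ):
    """Ideal MPix/s with one texture unit per pipe: one clock per layer."""
    return pipes * clock_mhz / layers

# Measured 6800 Ultra results from the fill-rate tester, keyed by texture layers.
measured = {1: 5313.0, 2: 2999.3, 3: 2014.5, 4: 1522.5}
pure_colour, z_only = 6096.7, 11749.8

for layers, rate in measured.items():
    ideal = ideal_rate(layers)
    print(f"{layers} texture(s): measured {rate:7.1f}, ideal {ideal:7.1f}, "
          f"{rate / ideal * 100:.1f}% of ideal")

# Drop from dual to triple texturing, and the Z-only to colour ratio.
dual_to_triple_drop = (1 - measured[3] / measured[2]) * 100
z_to_colour = z_only / pure_colour

print(f"dual -> triple drop: {dual_to_triple_drop:.1f}%")
print(f"Z rate / colour rate: {z_to_colour:.2f}")
```

Under this model the multi-texture results sit within a few percent of the ideal 16x1 figures, with the single texture result the furthest from it.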