Benchmarks - Theoretical Rates

Before going on to look at any actual benchmark scores, we'll take a look at the theoretical rates of the boards we're looking at in this article:

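Board          Core (MHz)  Pixel fill (MPix/s)  Texel fill (MTex/s)  (unlabelled)  Memory (MHz)  Bandwidth (GB/s)
6600 GT AGP    500         2000                 4000                 375           450           14.4
6600 GT PCIe   500         2000                 4000                 375           500           16.0
5700 Ultra     475         1900                 1900                 356           450           14.4
5800 Ultra     500         2000                 4000                 375           500           16.0

6600 GT AGP vs.  Core    Pixel fill  Texel fill  (unlabelled)  Memory   Bandwidth
6600 GT PCIe     0.0%    0.0%        0.0%        0.0%          -10.0%   -10.0%
5700 Ultra       5.3%    5.3%        110.5%      5.3%          0.0%     0.0%
5800 Ultra       0.0%    0.0%        0.0%        0.0%          -10.0%   -10.0%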

The theoretical difference between the AGP and PCI Express 6600 GTs lies in memory bandwidth, with the AGP version having 10% less. Although it's not reflected in the above table, they also differ in host interface bandwidth: the PCI Express version has twice the bandwidth and can transfer in both directions concurrently, whilst the AGP bus can only transfer in one direction at any one time. This should, of course, only make a difference where the bus is taxed, and in most common, current applications that happens at high resolution with FSAA, as the frame-buffer requirements push some textures across to system RAM.
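As a rough check on the bandwidth figures, the sketch below derives peak memory bandwidth from the memory clock. The 128-bit bus width and double data rate signalling are assumptions that are not stated in the table above, and the function name is purely illustrative.

```python
def memory_bandwidth_gb_s(mem_clock_mhz, bus_width_bits=128, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits and transfers_per_clock are assumed values
    (128-bit bus, DDR signalling); they are not given in the article.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mem_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


agp = memory_bandwidth_gb_s(450)   # 6600 GT AGP memory clock
pcie = memory_bandwidth_gb_s(500)  # 6600 GT PCIe memory clock

print(f"6600 GT AGP : {agp:.1f} GB/s")
print(f"6600 GT PCIe: {pcie:.1f} GB/s")
print(f"AGP relative to PCIe: {(agp / pcie - 1) * 100:.1f}%")
```

Under those assumptions the results line up with the 14.4 and 16.0 figures in the table and with the 10% deficit quoted for the AGP board.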

The 5700 Ultra was a fairly curious design in that it had 4 pixel pipelines and 4 texture units, but it appears that all of these could only be used in fixed function, single texturing cases - for any other type of operation the chip would behave as though it had two pipelines and two texture units. The 6600s also have an interesting pipeline in that they have 8 fragment pipelines but only 4 ROPs, so although 8 pixels can be operated on internally, a maximum of only 4 will ever be output per clock. The upshot is that whilst the 6600s appear to have about the same pixel performance as the 5700 Ultra and twice the texture performance, they in fact have 4 times the number of shader pipelines (and these are more flexible and capable than the 5700's). The 6600 GT AGP also has the same bandwidth as the earlier GDDR2 5700 Ultra that we are using here.
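The theoretical pixel and texel rates follow from the unit counts described above multiplied by the core clock. The following is a minimal sketch of that arithmetic; the `Board` class is hypothetical, and treating each 6600 fragment pipeline as having one texture unit is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Board:
    name: str
    core_mhz: int
    rops: int           # pixels that can be written per clock
    texture_units: int  # texels that can be sampled per clock (assumed 1 per pipeline)

    def pixel_rate(self):
        """Theoretical pixel fill-rate in MPixels/s."""
        return self.core_mhz * self.rops

    def texel_rate(self):
        """Theoretical texel fill-rate in MTexels/s."""
        return self.core_mhz * self.texture_units


boards = [
    Board("6600 GT", core_mhz=500, rops=4, texture_units=8),
    Board("5700 Ultra", core_mhz=475, rops=4, texture_units=4),
]

for b in boards:
    print(f"{b.name:<11} {b.pixel_rate():>5} MPix/s {b.texel_rate():>5} MTex/s")
```

The 5700 Ultra figures here represent its best case (fixed function, single texturing); the halved behaviour described above is not modelled.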

Finally, we see that the top-level specifications for the 5800 Ultra and the 6600 GT AGP are the same for pixel and texture performance, whilst the 6600 GT AGP has 10% less bandwidth. The 6600 shader pipeline is much more tailored to the demands of today's games, so we'll see what type of performance each of these yields.
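The percentage rows in the theoretical table appear to express the 6600 GT AGP's figure relative to each of the other boards. That reading is an inference from the numbers rather than something the table states, and the helper below is only a sketch of it.

```python
def pct_diff(reference, other):
    """Percentage by which `reference` exceeds `other` (negative if lower)."""
    return (reference / other - 1) * 100


# Rows copied from the theoretical rates table, in column order.
agp_6600 = [500, 2000, 4000, 375, 450, 14.4]
others = {
    "6600 GT PCIe": [500, 2000, 4000, 375, 500, 16.0],
    "5700 Ultra": [475, 1900, 1900, 356, 450, 14.4],
    "5800 Ultra": [500, 2000, 4000, 375, 500, 16.0],
}

diffs = {
    name: [round(pct_diff(a, o), 1) for a, o in zip(agp_6600, row)]
    for name, row in others.items()
}

for name, row in diffs.items():
    print(name, " ".join(f"{d:.1f}%" for d in row))
```

The same calculation can be applied to the measured tables later in the article.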

Fill-Rates

For the first test we'll take a look at some of the key fill-rate characteristics of the boards on test here:

Board          Colour fill  Z fill   Alpha blend  1 FP texture
6600 GT AGP    2000.7       3802.6   1477.2       1910.1
6600 GT PCIe   1985.6       3790.0   1633.3       1973.0
5700 Ultra     1907.6       1902.5   1041.9       958.8
5800 Ultra     2018.3       3815.1   1399.2       1890.0

6600 GT AGP vs.  Colour fill  Z fill   Alpha blend  1 FP texture
6600 GT PCIe     0.8%         0.3%     -9.6%        -3.2%
5700 Ultra       4.9%         99.9%    41.8%        99.2%
5800 Ultra       -0.9%        -0.3%    5.6%         1.1%

The fill-rates are in line with the theoretical specifications, with the 6600s having a Z fill-rate roughly twice their colour fill-rate. The 6600s' alpha blend fill-rate is lower than the colour fill-rate, but it is greater than half, indicating that this is due to bandwidth constraints rather than a hardware limitation - this is further highlighted by the fact that the PCIe 6600 GT's scores are higher than the AGP version's. Finally, the 1 Floating Point Texture score is about half the texture rate, indicating that it takes two cycles to sample an FP16 texture.
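The ratios behind these observations can be checked directly from the scores. In this sketch the mapping of the four score columns to colour, Z, alpha blend and FP texture is assumed from the order in which the paragraph above discusses them, and the 4000 texture rate is taken from the theoretical table.

```python
THEORETICAL_TEXTURE_RATE = 4000  # from the theoretical rates table

# Measured scores; the column names are an assumed mapping.
scores = {
    "6600 GT AGP": {"colour": 2000.7, "z": 3802.6, "alpha": 1477.2, "fp_texture": 1910.1},
    "6600 GT PCIe": {"colour": 1985.6, "z": 3790.0, "alpha": 1633.3, "fp_texture": 1973.0},
}


def ratios(s):
    """Key fill-rate ratios for one board."""
    return {
        "z_vs_colour": s["z"] / s["colour"],
        "alpha_vs_colour": s["alpha"] / s["colour"],
        "fp_vs_texture_rate": s["fp_texture"] / THEORETICAL_TEXTURE_RATE,
    }


results = {name: ratios(s) for name, s in scores.items()}

for name, r in results.items():
    print(name, {k: round(v, 2) for k, v in r.items()})
```

An alpha blend ratio that sits between 0.5 and 1.0, and that rises on the board with more memory bandwidth, is what would be expected if bandwidth rather than the hardware were the limit.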

In comparison to the other boards, we see that the 5800 Ultra has fill-rates very similar to the 6600s'. The 5700 is able to use all 4 of its pipelines in the straight colour fill test, but in the other cases the performance is about half.

Texture layers   1        2       3       4       5       6       7       8
6600 GT AGP      1162.75  826.02  515.00  385.61  283.45  234.56  198.44  171.82
6600 GT PCIe     1191.94  886.58  538.81  385.61  297.44  242.47  207.45  178.71
5700 Ultra       1106.00  611.25  316.97  290.00  206.50  193.02  152.64  145.96
5800 Ultra       1099.56  776.15  465.37  394.24  298.05  266.12  220.16  201.25

6600 GT AGP vs.  1      2      3      4      5      6       7      8
6600 GT PCIe     -2.4%  -6.8%  -4.4%  0.0%   -4.7%  -3.3%   -4.3%  -3.9%
5700 Ultra       5.1%   35.1%  62.5%  33.0%  37.3%  21.5%   30.0%  17.7%
5800 Ultra       5.7%   6.4%   10.7%  -2.2%  -4.9%  -11.9%  -9.9%  -14.6%

Here we can see that the 6600s' multi-texturing performance follows a smooth curve down as each layer is added, unlike the 5700 and, to a lesser extent (visibly on the graph), the 5800. However, the performance doesn't go above the colour fill-rate with one texture. This highlights what we discovered in the PCIe 6600 GT Preview: there are two quads of internal pipelines but only a single ROP quad - this enables the 8 texture / fragment pipelines to act as two independent quads internally, but with single texture operations only 4 pixels can be output per clock.
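One crude way to put a number on how "smooth" each curve is would be to look at the ratio between successive texture-layer scores: a stepped curve alternates between large and small drops, so its ratios vary more. The sketch below assumes the eight score columns correspond to 1 to 8 texture layers, and the spread metric is our own illustration rather than anything used in the tests themselves.

```python
# Multi-texturing scores, assumed to be for 1 to 8 texture layers.
scores = {
    "6600 GT AGP": [1162.75, 826.02, 515.00, 385.61, 283.45, 234.56, 198.44, 171.82],
    "5700 Ultra": [1106.00, 611.25, 316.97, 290.00, 206.50, 193.02, 152.64, 145.96],
}


def layer_ratios(values):
    """Score with n+1 layers divided by score with n layers."""
    return [b / a for a, b in zip(values, values[1:])]


def spread(values):
    """Gap between the gentlest and steepest layer-to-layer drop."""
    r = layer_ratios(values)
    return max(r) - min(r)


for name, values in scores.items():
    formatted = " ".join(f"{r:.2f}" for r in layer_ratios(values))
    print(f"{name:<12} ratios: {formatted}  spread: {spread(values):.2f}")
```

On these figures the 5700 Ultra shows the wider spread, which is consistent with the stepped behaviour described above.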