Further Pixel Shader Tests

Let's take a look at a few more Pixel Shader tests from RightMark 3D, but first the Pixel Shader tests from Marko Dolenc's Fill-rate Tester.

Test                              6800 Ultra  5950 Ultra  5900 Ultra  5800 Ultra  Gain vs 5950  Gain vs 5900  Gain vs 5800
PS 1.1 - Simple                       2092.0       938.3       891.8       988.8        122.9%        134.6%        111.6%
PS 1.4 - Simple                       2092.8       887.6       843.5       935.6        135.8%        148.1%        123.7%
PS 2.0 - Simple                       3115.5       448.7       427.8       475.1        594.3%        628.3%        555.7%
PS 2.0 PP - Simple                    2091.9       889.8       845.6       631.3        135.1%        147.4%        231.4%
PS 2.0 - Longer                       1573.0       225.9       214.7       381.3        596.5%        632.6%        312.5%
PS 2.0 PP - Longer                    1573.2       450.1       428.8       381.3        249.5%        266.9%        312.6%
PS 2.0 - Longer 4 Registers           1572.6       225.2       209.2       308.2        598.4%        651.7%        410.3%
PS 2.0 PP - Longer 4 Registers        1572.6       595.1       565.7       381.3        164.3%        178.0%        312.4%
PS 2.0 - Per Pixel Lighting            420.1        95.3        90.6        61.2        340.7%        363.8%        586.6%
PS 2.0 PP - Per Pixel Lighting         626.0       127.5       121.3        73.2        390.9%        416.2%        755.2%

In general terms, once again we see the GeForce 6800 Ultra making very large gains over the GeForce FX 5950 Ultra, with the PS1.x tests running over twice as fast on the 6800 Ultra, and in some cases much greater gains for the more complicated shaders.
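As a rough illustration of how the gain figures relate to the raw scores, the sketch below recomputes the 6800 Ultra's percentage gains for the "PS 1.1 - Simple" row. It assumes the percentage columns are simple ratios of the scores in the same row; the function name `percent_gain` is ours, not part of any benchmark tool.

```python
# Scores copied from the "PS 1.1 - Simple" row of the Fill-rate Tester table.
scores = {
    "6800 Ultra": 2092.0,
    "5950 Ultra": 938.3,
    "5900 Ultra": 891.8,
    "5800 Ultra": 988.8,
}


def percent_gain(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old` (assumed definition of the gain columns)."""
    return (new / old - 1.0) * 100.0


# Gain of the 6800 Ultra over each of the older boards.
gains = {
    board: percent_gain(scores["6800 Ultra"], score)
    for board, score in scores.items()
    if board != "6800 Ultra"
}

for board, gain in gains.items():
    print(f"6800 Ultra vs {board}: +{gain:.1f}%")
```

The results agree with the published percentages to within rounding of the one-decimal scores, and any gain above 100% corresponds to more than twice the performance.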

Looking at the 6800 Ultra results a little more closely, though, we see some curious behaviour: in some cases the PS1.x and partial precision tests are slower than the PS2.0 tests! With the PS1.x and partial precision tests the calculations are likely to be carried out internally at FP32 precision, so a type conversion must occur at some point. Each of the shaders in this test is actually relatively short, with the first few only using about 4 instructions. These are probably executed in two cycles, and it may be the case that the type conversion is not free, so there is an extra cycle penalty for the partial precision and PS1.x integer shaders, meaning that the FP32 shaders are actually faster in these short shader cases.
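A quick back-of-the-envelope check of that guess can be sketched as follows. The cycle counts are assumptions taken from the speculation above (two cycles for the short shader, one extra for the conversion), not measured values, and throughput is assumed to scale inversely with cycle count.

```python
# 6800 Ultra scores from the Fill-rate Tester table.
fp32_simple = 3115.5  # PS 2.0 - Simple
pp_simple = 2091.9    # PS 2.0 PP - Simple

# Hypothetical cycle model (assumption, not a measured figure).
base_cycles = 2     # short shader executes in two cycles
penalty_cycles = 1  # extra cycle for the type conversion

# If throughput is inversely proportional to cycles per pixel, the
# partial precision score should be base / (base + penalty) of the FP32 score.
predicted_pp = fp32_simple * base_cycles / (base_cycles + penalty_cycles)
error_pct = (predicted_pp - pp_simple) / pp_simple * 100.0

print(f"predicted PP score: {predicted_pp:.1f}")
print(f"measured PP score:  {pp_simple:.1f}")
print(f"model error:        {error_pct:.2f}%")
```

The prediction lands within about one percent of the measured partial precision score, which is consistent with a one-cycle penalty but does not prove that this is what the hardware is doing.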

While the very short shaders actually appear to be showing a performance drop for partial precision, as we get to the slightly larger shaders there appears to be no performance difference at all. However, there does appear to be a performance drop on the 6800 Ultra with full precision for the Per Pixel Lighting test.

Per Pixel Lighting  PS 2.0 (FP32)  PS 2.0 PP (FP16)  FP32 vs FP16
6800 Ultra                  420.1             626.0        -32.9%
5950 Ultra                   95.3             127.5        -25.3%
5900 Ultra                   90.6             121.3        -25.3%
5800 Ultra                   61.2              73.2        -16.4%

The table above highlights the FP32 and FP16 rendering performance of the boards in the Per Pixel Lighting test, and we can see that the 6800 Ultra is in fact losing relatively more performance at FP32 than the 5950 is. It appears that NV40 may have a greater register space than NV38, meaning that slightly longer programs can execute with no register limitations, but as the register space is used up on longer programs the performance drops off at a greater rate.
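The relative FP32 cost quoted in the table can be reproduced with a few lines. This is a minimal sketch that assumes the percentage is the FP32 score expressed relative to the FP16 score; the helper name `fp32_delta` is ours.

```python
# Per Pixel Lighting scores per board: (PS 2.0 full precision, PS 2.0 PP).
per_pixel_lighting = {
    "6800 Ultra": (420.1, 626.0),
    "5950 Ultra": (95.3, 127.5),
    "5900 Ultra": (90.6, 121.3),
    "5800 Ultra": (61.2, 73.2),
}


def fp32_delta(fp32: float, fp16: float) -> float:
    """Percentage change when moving from the FP16 score to the FP32 score."""
    return (fp32 / fp16 - 1.0) * 100.0


deltas = {
    board: fp32_delta(fp32, fp16)
    for board, (fp32, fp16) in per_pixel_lighting.items()
}

# The board with the most negative delta pays the most for full precision.
worst = min(deltas, key=deltas.get)

for board, delta in deltas.items():
    print(f"{board}: {delta:.1f}%")
print(f"largest relative FP32 cost: {worst}")
```

Note that the 6800 Ultra's larger relative drop sits alongside a far higher absolute score at both precisions.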

6800 Ultra            554.4   414.4   413.1   333.6   421.6   167.7   235.1
5950 Ultra            268.4   141.2   173.4    67.4   108.0    27.1    50.9
5900 Ultra            254.2   134.1   164.6    64.0   102.7    25.7    48.4
5800 Ultra            267.1    96.6    96.5    52.2    62.5    23.4    33.2
Gain vs 5950 Ultra   106.6%  193.5%  138.2%  395.0%  290.3%  519.8%  361.7%
Gain vs 5900 Ultra   118.1%  209.0%  150.9%  421.3%  310.4%  552.1%  385.6%
Gain vs 5800 Ultra   107.6%  328.8%  327.9%  538.8%  574.5%  618.0%  608.2%

Looking at the RightMark 3D tests we've selected here we get further confirmation of what we've seen previously: the PS1.x test has about twice the performance on the 6800 Ultra, with further gains for the more complicated shaders, up to over 6 times the performance for the Phong Lighting test. Again, we can see the 6800 Ultra getting significant performance boosts from partial precision rendering with the longer, more complex shaders.

6800 Ultra            399.1   265.1   167.9   105.5    73.3
5950 Ultra             67.6    43.1    27.1    16.7    11.5
5900 Ultra             64.1    41.0    25.8    15.9    11.0
5800 Ultra             56.3    36.4    23.4    14.4    10.0
Gain vs 5950 Ultra   490.4%  514.6%  520.1%  531.9%  535.0%
Gain vs 5900 Ultra   522.6%  547.0%  551.6%  565.8%  568.0%
Gain vs 5800 Ultra   608.7%  627.4%  618.5%  631.3%  634.3%

Here we can see one Phong Lighting test as it scales over the resolutions. On the 5950 the test is nigh on completely fill-rate bound even at the low resolutions. We see that the 6800 Ultra is actually very slightly system limited at the lower resolutions, but not by a great deal. The performance drop from low to high resolution is very significant, though here the 6800 Ultra is still rendering well above 60 FPS at 1600x1200.
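One way to see the slight system limitation in the numbers is to compare how far each board's frame rate falls between the lowest and highest resolution tested. The sketch below assumes the columns run from the lowest resolution to 1600x1200 and that the figures are frames per second; the intermediate resolutions are not needed for the comparison.

```python
# Phong Lighting results, ordered from the lowest resolution tested
# to 1600x1200 (column order is an assumption).
phong = {
    "6800 Ultra": [399.1, 265.1, 167.9, 105.5, 73.3],
    "5950 Ultra": [67.6, 43.1, 27.1, 16.7, 11.5],
}

# Ratio of the lowest-resolution result to the highest-resolution result.
# A purely fill-rate bound board should fall by roughly the ratio of pixel
# counts; a board held back elsewhere at low resolution falls by less.
drop = {board: fps[0] / fps[-1] for board, fps in phong.items()}

for board, ratio in drop.items():
    print(f"{board}: {ratio:.2f}x slower at the highest resolution")

print("6800 Ultra above 60 FPS at 1600x1200:", phong["6800 Ultra"][-1] > 60.0)
```

The 6800 Ultra's ratio comes out smaller than the 5950 Ultra's, which fits the reading that it is being held back a little by the rest of the system at the low end.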