Pixel Shader Performance
As noted in the architectural overview, the underlying ALU structure and the number of pipelines in R520 stay the same as in R420/R423/R480. The shader capabilities have changed, however, moving from Shader 2.x to 3.0; the instruction scheduler has been changed; and many of the elements surrounding memory handling are different. Here we'll see whether these changes have improved the utilisation of the ALUs within R520's Pixel Shader core.

Rightmark3D, 1024x768 (FPS) | X1800 XT | X1800 XL | X850 XT PE | X800 XT | X1800 XT % Faster than X850 XT PE | X1800 XL % Faster than X800 XT |
PS1.1 Procedural | 664.0 | 522.7 | 510.8 | 467.8 | 30.0% | 11.7% |
PS1.4 Procedural | 428.8 | 328.4 | 284.9 | 254.9 | 50.5% | 28.8% |
PS2.0 Procedural | 316.8 | 251.9 | 244.6 | 240.8 | 29.5% | 4.6% |
PS2.0 1 Light (FP) | 254.6 | 202.6 | 212.8 | 198.3 | 19.7% | 2.2% |
PS2.0 1 Light (PP) | 254.8 | 202.8 | 212.5 | 197.9 | 19.9% | 2.5% |
PS2.0 3 Lights (FP) | 143.0 | 113.9 | 120.5 | 112.2 | 18.6% | 1.5% |
PS2.0 3 Lights (PP) | 143.1 | 114.0 | 120.5 | 112.1 | 18.7% | 1.7% |
PS2.0a 3 Lights (FP) | 59.2 | 47.2 | 50.2 | 46.6 | 17.9% | 1.4% |
PS2.0a 3 Lights (PP) | 59.2 | 47.3 | 50.1 | 46.6 | 18.1% | 1.7% |

Comparing the X1800 XL with the X800 XT, we see that in many of the Rightmark tests the newer architecture isn't performing much higher than the old. These tests rely more or less solely on the Pixel Shaders, so there probably isn't much room for the scheduler to optimise the shader processing, as the ALUs are probably at fairly high utilisation anyway; this is why there is little difference between the two generations of architectures on the longer shaders. Perversely, where we do see the greater benefit is on the shorter shaders. The likely reason is that these are a little more texture bound, so the improvements we've already witnessed in texture handling come into effect, and the latencies involved in texture fetching give the shader dispatch processor greater leeway to schedule the batches more effectively between the available processing units.
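
The "% Faster" columns can be reproduced from the raw frame rates. The following is a minimal sketch, assuming the percentages are computed as the ratio of the two FPS figures minus one; the helper name `percent_faster` is ours, and the figures are copied from the Rightmark3D table above (X1800 XL against X800 XT).

```python
def percent_faster(new_fps: float, old_fps: float) -> float:
    """Return how much faster new_fps is than old_fps, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0


# (test name, X1800 XL FPS, X800 XT FPS), copied from the Rightmark3D table
rightmark = [
    ("PS1.1 Procedural", 522.7, 467.8),
    ("PS1.4 Procedural", 328.4, 254.9),
    ("PS2.0 Procedural", 251.9, 240.8),
    ("PS2.0 3 Lights (FP)", 113.9, 112.2),
]

# Gain of the X1800 XL over the X800 XT for each test, in percent
gains = {name: percent_faster(xl, xt) for name, xl, xt in rightmark}

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}%")
```

The shorter PS1.1 and PS1.4 shaders come out well ahead of the longer PS2.0 lighting shaders, which is the pattern described above.
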
ShaderMark v2.1 (FPS) | X1800 XT | X1800 XL | X850 XT PE | X800 XT | X1800 XT % Faster than X850 XT PE | X1800 XL % Faster than X800 XT |
shader 2 | 1339 | 1063.0 | 1074 | 981.0 | 24.7% | 8.4% |
shader 3 | 918 | 732.0 | 763 | 706.0 | 20.3% | 3.7% |
shader 4 | 985 | 783.0 | 818 | 754.0 | 20.4% | 3.8% |
shader 5 | 752 | 603.0 | 630 | 583.0 | 19.4% | 3.4% |
shader 6 | 917 | 732.0 | 763 | 702.0 | 20.2% | 4.3% |
shader 7 | 887 | 707.0 | 696 | 631.0 | 27.4% | 12.0% |
shader 8 | 671 | 534.0 | 426 | 376.0 | 57.5% | 42.0% |
shader 9 | 1812 | 1342.0 | 1308 | 1193.0 | 38.5% | 12.5% |
shader 10 | 1374 | 1022.0 | 846 | 753.0 | 62.4% | 35.7% |
shader 11 | 909 | 724.0 | 717 | 650.0 | 26.8% | 11.4% |
shader 12 | 619 | 426.0 | 277 | 245.0 | 123.5% | 73.9% |
shader 13 | 638 | 453.0 | 372 | 327.0 | 71.5% | 38.5% |
shader 14 | 808 | 599.0 | 419 | 370.0 | 92.8% | 61.9% |
shader 15 | 427 | 342.0 | 299 | 275.0 | 42.8% | 24.4% |
shader 16 | 569 | 454.0 | 350 | 309.0 | 62.6% | 46.9% |
shader 17 | 643 | 503.0 | 448 | 397.0 | 43.5% | 26.7% |
shader 18 | 68 | 54.0 | 48 | 43.0 | 41.7% | 25.6% |
shader 19 | 256 | 196.0 | 147 | 132.0 | 74.1% | 48.5% |
shader 20 | 82 | 66.0 | 48 | 45.0 | 70.8% | 46.7% |
shader 21 | 151 | 120.0 | - | - | - | - |
shader 22 | 285 | 233.0 | 214 | 199.0 | 33.2% | 17.1% |
shader 23 | 316 | 259.0 | - | - | - | - |
shader 24 | 221 | 181.0 | 149 | 139.0 | 48.3% | 30.2% |
shader 25 | 180 | 141.0 | 127 | 118.0 | 41.7% | 19.5% |
shader 26 | 190 | 149.0 | 130 | 120.0 | 46.2% | 24.2% |

The ShaderMark v2.1 results show higher gains between the two generations of boards, with the X1800 XL showing as much as a 74% performance increase over the X800 XT. Bear in mind, however, that the shader profile differences have to be factored in along with the architectural differences, as these tests can run under the Shader Model 3.0 profile on the X1800s but not on the older Radeons.
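
As a quick check on the peak figure, the sketch below picks out the largest X1800 XL gain over the X800 XT from a handful of the ShaderMark rows. It is illustrative only: the row selection and the variable names are ours, and the FPS values are copied from the table above.

```python
# (X1800 XL FPS, X800 XT FPS) for a subset of rows in the ShaderMark table
shadermark = {
    "shader 8": (534.0, 376.0),
    "shader 12": (426.0, 245.0),
    "shader 14": (599.0, 370.0),
    "shader 19": (196.0, 132.0),
}

# Percentage by which the X1800 XL leads the X800 XT on each shader
xl_gains = {
    name: (xl / xt - 1.0) * 100.0 for name, (xl, xt) in shadermark.items()
}

# Shader with the largest generational gain among the selected rows
best_shader = max(xl_gains, key=xl_gains.get)
print(f"{best_shader}: {xl_gains[best_shader]:.1f}%")
```

Among these rows, shader 12 gives the largest gain, which corresponds to the roughly 74% figure quoted above.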