Pixel Shader Performance

One of the primary changes for G70 was to the Fragment Shader ALUs: the two ALUs in each pipeline are now closer to duplicates of one another in terms of the instructions they can execute, so there should be more occasions where two instructions can be dual-issued per clock. First we'll use some of RightMarkD3D's pixel shader tests to see the effects of the changes to the pipeline.

Test                    7800 GTX   6800 Ultra   % Difference
PS1.1 Procedural           697.9        415.5          68.0%
PS1.4 Procedural           582.9        303.1          92.3%
PS2.0 Procedural           360.2        190.4          89.2%
PS2.0 1 Light (FP)         351.6        192.2          82.9%
PS2.0 1 Light (PP)         377.0        239.8          57.2%
PS2.0 3 Lights (FP)        195.5         95.7         104.4%
PS2.0 3 Lights (PP)        219.9        119.9          83.3%
PS2.0a 3 Lights (FP)        68.2         39.7          71.6%
PS2.0a 3 Lights (PP)        91.1         59.7          52.5%

Here we can see the effects of adding both extra pipelines and the ability to issue more instructions per pipeline: there are cases where the performance increase from 6800 Ultra to 7800 GTX is greater than the difference in theoretical rates between the two boards, up to the point where shader performance genuinely doubles.

Test         7800 GTX   6800 Ultra   % Difference
shader 2         1437          958          50.0%
shader 3         1196          778          53.7%
shader 4         1196          777          53.9%
shader 5         1077          698          54.3%
shader 6         1196          778          53.7%
shader 7         1077          658          63.7%
shader 8          778          419          85.7%
shader 9         1476         1075          37.3%
shader 10        1316          838          57.0%
shader 11        1196          718          66.6%
shader 12         718          479          49.9%
shader 13         703          421          67.0%
shader 14         778          479          62.4%
shader 15         539          329          63.8%
shader 16         568          359          58.2%
shader 17         777          444          75.0%
shader 18          89           56          58.9%
shader 19         297          178          66.9%
shader 20          99           64          54.7%
shader 21          95           92           3.3%
shader 22         238          120          98.3%
shader 23         279          133         109.8%
shader 24         180           82         119.5%
shader 25         174           99          75.8%
shader 26         166           94          76.6%

The various ShaderMark tests further illustrate what we saw with RightMarkD3D, except that in this case some of the gains are larger still, with the 7800 GTX delivering more than twice the performance of the 6800 Ultra in a couple of tests.
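As a quick way to pick out the outliers in the ShaderMark figures, the following sketch lists the shaders where the 7800 GTX more than doubles the 6800 Ultra's score and finds the shader with the smallest gain. The scores are copied from the results above, again reading the first column as the 7800 GTX; the helper names are illustrative only.

```python
# ShaderMark results from the table above: shader -> (7800 GTX, 6800 Ultra).
SHADERMARK = {
    2: (1437, 958), 3: (1196, 778), 4: (1196, 777), 5: (1077, 698),
    6: (1196, 778), 7: (1077, 658), 8: (778, 419), 9: (1476, 1075),
    10: (1316, 838), 11: (1196, 718), 12: (718, 479), 13: (703, 421),
    14: (778, 479), 15: (539, 329), 16: (568, 359), 17: (777, 444),
    18: (89, 56), 19: (297, 178), 20: (99, 64), 21: (95, 92),
    22: (238, 120), 23: (279, 133), 24: (180, 82), 25: (174, 99),
    26: (166, 94),
}


def speedup(scores):
    """Ratio of the 7800 GTX score to the 6800 Ultra score."""
    new, old = scores
    return new / old


def more_than_doubled(results):
    """Shader numbers where the newer board is over 2x faster, in order."""
    return sorted(n for n, scores in results.items() if speedup(scores) > 2.0)


def smallest_gain(results):
    """Shader number with the lowest speedup."""
    return min(results, key=lambda n: speedup(results[n]))


if __name__ == "__main__":
    print("more than doubled:", more_than_doubled(SHADERMARK))
    print("smallest gain: shader", smallest_gain(SHADERMARK))
```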