Benchmarks
Theoretical Rates
Before looking at any actual benchmark scores, we'll take a look at the theoretical metrics of the boards we are using in this review. Because the X1900 has the same number of texture units and ROPs as the X1800 but internally processes three times the number of pixels, we also include the fragment rate of each board here.
ATI R580 Theoretical Rates
Board | Core Clock (MHz) | Fillrate (Mp/s) | Texture fillrate (Mt/s) | Fragment rate | Triangle rate (Mtris/s) | Memory Clock (MHz) | Memory Bandwidth (GB/s) |
X1900 XTX | 650 | 10400 | 10400 | 31200 | 1300 | 775 | 49.6 |
X1900 XT | 625 | 10000 | 10000 | 30000 | 1250 | 725 | 46.4 |
X1800 XT | 625 | 10000 | 10000 | 10000 | 1250 | 750 | 48.0 |
X850 XT PE | 540 | 8640 | 8640 | 8640 | 810 | 590 | 37.8 |
% Diff | Core Clock | Fillrate | Texture fillrate | Fragment rate | Triangle rate | Memory clock | Memory bandwidth |
X1900 XTX to X1900 XT | 4.0% | 4.0% | 4.0% | 4.0% | 4.0% | 6.9% | 6.9% |
X1900 XTX to X1800 XT | 4.0% | 4.0% | 4.0% | 212.0% | 4.0% | 3.3% | 3.3% |
X1900 XTX to X850 XT PE | 20.4% | 20.4% | 20.4% | 261.1% | 60.5% | 31.4% | 31.4% |
Comparing the clock rates of the X1900 XTX and X1800 XT shows only a 4% difference on the core and 3% on the memory. These clock differences correspond to just a 4% texture and colour fillrate increase and an extra 3% of bandwidth, as the texture units, ROPs and memory bus are the same on the two chips. It is, of course, in the fragment pipeline where the major difference between the two boards occurs, with the X1900 XTX having a 212% pixel shader rate advantage over the X1800 XT.
In comparison to the previous generation X850 XT PE, the X1900 XTX has a 20% core clock advantage. This translates to the same difference in colour and texture fillrates, as the X850 XT PE also has 16 texture and ROP units. The X850 XT PE has only 1 pixel shader processor per pipeline, though, so the X1900 XTX has a 261% fragment rate advantage. The X1900 also has two more vertex shaders than the X850, and this, coupled with the clocks, equates to a 60% triangle rate advantage for the X1900 XTX. The memory bus width of the two boards is the same, at 256-bit, so the 31% bandwidth difference corresponds directly to their memory speed difference. Of course, what these fillrate specifications hide are the differences between the underlying architectures, and hence how efficient each is at actually utilising its capabilities.
Although we won't be using the X1900 Crossfire board in lieu of the standard X1900 XT board in these theoretical tests, we've included the X1900 XT's theoretical numbers just to highlight the differences between it and the XTX. The small clock rate differences translate into a 4% core difference (hence fill, fragment and vertex shader rates) and a 7% bandwidth difference.
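As a rough illustration of where the theoretical figures come from, the sketch below rebuilds the fill, fragment and bandwidth columns from the clocks and unit counts. The 16 texture/ROP units, the 256-bit bus and the threefold fragment capacity of the X1900 come from the text; the figure of 48 fragment processors, the two-transfers-per-clock memory factor and all of the names used are our own assumptions for the example.

```python
# Sketch only: unit counts and the DDR factor are assumptions for illustration.
BUS_WIDTH_BITS = 256   # bus width quoted in the text
DDR_FACTOR = 2         # assumed: two transfers per memory clock

# name: (core MHz, ROPs/texture units, fragment processors, memory MHz)
BOARDS = {
    "X1900 XTX": (650, 16, 48, 775),
    "X1900 XT": (625, 16, 48, 725),
    "X1800 XT": (625, 16, 16, 750),
    "X850 XT PE": (540, 16, 16, 590),
}


def theoretical_rates(core_mhz, rops, fragment_units, mem_mhz):
    """Return (fill in Mp/s, fragment rate, bandwidth in GB/s)."""
    fill = core_mhz * rops                    # one pixel per ROP per clock
    fragment = core_mhz * fragment_units      # one fragment per processor per clock
    bandwidth = round(mem_mhz * DDR_FACTOR * BUS_WIDTH_BITS / 8 / 1000, 1)
    return fill, fragment, bandwidth


def pct_advantage(a, b):
    """Percentage by which a exceeds b, rounded to one decimal place."""
    return round((a / b - 1) * 100, 1)


if __name__ == "__main__":
    for name, spec in BOARDS.items():
        print(name, theoretical_rates(*spec))
    # Fragment rate advantage of the X1900 XTX over the X1800 XT
    print(pct_advantage(650 * 48, 625 * 16))
```

The triangle rate is left out because the vertex shader counts and setup rates are not given here.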
ATI R580 Measured Theoretical Rates
Board | Color Fill | Z Fill | Single Texture | Single Texture Alpha Blend | 1 Floating Point Texture |
X1900 XTX | 9719.0 | 10209.8 | 7303.1 | 4391.4 | 4859.5 |
X1800 XT | 9447.3 | 9812.2 | 7116.9 | 4285.7 | 4670.8 |
X850 XT PE | 6195.8 | 8138.6 | 5617.0 | 3309.3 | 3628.9 |
% Diff from Theoretical | Color Fill | Z Fill | Single Texture | Single Texture Alpha Blend | 1 Floating Point Texture |
X1900 XTX | -6.5% | -1.8% | -29.8% | -57.8% | -53.3% |
X1800 XT | -5.5% | -1.9% | -28.8% | -57.1% | -53.3% |
X850 XT PE | -28.3% | -5.8% | -35.0% | -61.7% | -58.0% |
% Diff | Color Fill | Z Fill | Single Texture | Single Texture Alpha Blend | 1 Floating Point Texture |
X1900 XTX to X1800 XT | 2.9% | 4.1% | 2.6% | 2.5% | 4.0% |
X1900 XTX to X850 XT PE | 56.9% | 25.4% | 30.0% | 32.7% | 33.9% |
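The percentage rows in the measured tables above can be reproduced from the raw scores with a few lines of arithmetic. The sketch below assumes every test is compared against each board's theoretical pixel fillrate (core clock times 16), which is what the published percentages appear to use; the function and variable names are ours.

```python
# Theoretical pixel fillrate (Mp/s) per board, taken from the first table.
THEORETICAL_FILL = {"X1900 XTX": 10400, "X1800 XT": 10000, "X850 XT PE": 8640}

TESTS = ["Color Fill", "Z Fill", "Single Texture",
         "Single Texture Alpha Blend", "1 Floating Point Texture"]

# Measured scores, in the same order as TESTS.
MEASURED = {
    "X1900 XTX": [9719.0, 10209.8, 7303.1, 4391.4, 4859.5],
    "X1800 XT": [9447.3, 9812.2, 7116.9, 4285.7, 4670.8],
    "X850 XT PE": [6195.8, 8138.6, 5617.0, 3309.3, 3628.9],
}


def diff_from_theoretical(board):
    """Percentage shortfall of each measured score against the board's peak."""
    peak = THEORETICAL_FILL[board]
    return [round((score / peak - 1) * 100, 1) for score in MEASURED[board]]


def diff_between(board_a, board_b):
    """Percentage advantage of board_a over board_b in each test."""
    return [round((a / b - 1) * 100, 1)
            for a, b in zip(MEASURED[board_a], MEASURED[board_b])]


if __name__ == "__main__":
    for board in MEASURED:
        print(board, dict(zip(TESTS, diff_from_theoretical(board))))
    print(diff_between("X1900 XTX", "X1800 XT"))
```

Run against these scores, the first comparison row matches the X1900 XTX set against the measured X1800 XT figures.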
When putting the theoretical rates to the test we can see that, generally speaking, the newer X1000 class of boards gets closer to its peaks than the X850 XT PE does. Beyond that, the X1900 behaves very much as we would expect. The Z fillrate confirms that only one Z sample/test is performed per cycle with R580 (unlike RV530), and FP16 texture sampling takes two cycles, which probably means FP32 sampling takes four. The test also highlights that applying a single texture and outputting a colour value is bandwidth limited with 16 texture units and ROPs and nearly 50GB/s of bandwidth. That limit is further exaggerated when the pixels are blended, since blending also has to sample the value already in the frame buffer and hence requires more bandwidth.
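To put a rough number on the blending argument, the sketch below estimates the frame buffer traffic the X1900 XTX would need to sustain its peak fillrate. It assumes a 32-bit (4 byte) colour value per pixel and ignores texture fetches, Z traffic, compression and caching, so treat it as a back-of-envelope illustration rather than a model of the hardware.

```python
# Back-of-envelope sketch; the 4-byte colour format is an assumption.
PEAK_FILL_MPS = 10400        # X1900 XTX theoretical fillrate, Mp/s
AVAILABLE_GBS = 49.6         # X1900 XTX theoretical memory bandwidth, GB/s
BYTES_PER_COLOUR = 4         # assumed 32-bit colour buffer


def bandwidth_demand_gbs(fill_mps, bytes_per_pixel):
    """Frame buffer traffic (GB/s) needed to sustain a given fillrate."""
    return fill_mps * bytes_per_pixel / 1000


# Plain colour output: one write per pixel.
write_only = bandwidth_demand_gbs(PEAK_FILL_MPS, BYTES_PER_COLOUR)

# Alpha blending: read the existing value, then write the result.
blended = bandwidth_demand_gbs(PEAK_FILL_MPS, 2 * BYTES_PER_COLOUR)

if __name__ == "__main__":
    print(f"write only: {write_only} GB/s, fits: {write_only <= AVAILABLE_GBS}")
    print(f"blended:    {blended} GB/s, fits: {blended <= AVAILABLE_GBS}")
```

Under these assumptions colour writes alone fit within the available bandwidth at peak rate, while the extra read needed for blending pushes the demand well past it.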