Are the Tests Stacked in ATI's Favour?

Some concerns have been raised over the performance of some of the newer ATI boards relative to NVIDIA's current line of GeForce4 Ti boards. While benchmarks of current games show the Ti 4600 outperforming the 9500 PRO, Game Tests 2 & 3 under 3DMark03 show the opposite, and by a large margin, so you might wonder whether this is fair.

What you have to remember when looking at current game benchmarks is that most of these titles are not reliant on shaders and more often use fixed-function pixel processing with multiple texture layers. When we compare the pure texture rates of the two boards we can see that the Ti 4600 in fact has a higher texture rate than the 9500 PRO which, coupled with its higher memory bandwidth, explains the higher performance under these heavily multi-textured titles. However, when we look at Game Tests 2 & 3 in 3DMark03, the quantity of Pixel Shader processing occurring means that pure texturing isn't the major factor in performance.
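The texture rate and bandwidth comparison can be made concrete with a little arithmetic. The sketch below assumes the commonly published reference specifications for each board (core clock, memory clock and a 128-bit DDR memory bus); none of these figures are stated in this article, so treat the results as theoretical peaks for illustration rather than measurements. The `Board` class and its method names are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Board:
    name: str
    core_mhz: int        # assumed reference core clock
    pipelines: int       # pixel pipelines
    tmus_per_pipe: int   # texture units per pipeline
    mem_mhz: int         # assumed reference memory clock, before DDR doubling
    bus_bits: int = 128  # assumed memory bus width

    def texel_rate_mtexels(self) -> int:
        # Peak texture rate: every texture unit samples once per clock.
        return self.core_mhz * self.pipelines * self.tmus_per_pipe

    def bandwidth_gb_s(self) -> float:
        # DDR moves two transfers per clock across bus_bits / 8 bytes.
        return self.mem_mhz * 2 * (self.bus_bits // 8) / 1000


# Assumed reference specifications, not taken from the article.
TI_4600 = Board("GeForce4 Ti 4600", 300, 4, 2, 325)
R9500_PRO = Board("Radeon 9500 PRO", 275, 8, 1, 270)

for board in (TI_4600, R9500_PRO):
    print(f"{board.name}: {board.texel_rate_mtexels()} Mtexels/s, "
          f"{board.bandwidth_gb_s():.2f} GB/s")
```

Under these assumptions the Ti 4600 comes out ahead on both counts, which is consistent with its lead in multi-textured, fixed-function titles.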

Radeon 9500 PRO's pipeline arrangement actually makes it a very powerful board in terms of pure pixel shading operations. Whereas the GeForce4 Ti features four DX8 Pixel Shader pipelines, Radeon 9500 PRO utilises a total of eight DX9 Pixel Shader pipelines, which makes its pixel shading performance much higher than that of the Ti 4600. So it's easy to see that the 9500 PRO would have an advantage with Pixel Shaders even before taking into account the different number of passes required, with the GeForce4 Ti running these tests with PS1.1 and the 9500 PRO running them with PS1.4. (Note: As mentioned in our Introduction to 3DMark03 article, Futuremark had initially targeted this one pass method at DX9's PS2.0 Shader model, but as a fully DX9 compliant board the 9500 PRO would still have taken advantage of doing the lighting in one pass.)

In preparation for writing our 3DMark03 article I quizzed Futuremark over support for a specific element of the stencil buffering used in GT 2 & 3. DirectX 9 exposes a hardware feature known as double sided stencilling, which speeds up stencil operations by halving the number of passes during stencil calculation. As GeForce4 does not support this feature it obviously can't take advantage of it, though Radeon 9500 PRO can. You might question whether its inclusion is fair given that, for almost everything else, GT 2 & 3 are DX8 tests. There are two things to consider here. First, these two tests were supposed to have DX9 support in them through a reduced number of lighting passes (this got pushed down to PS1.4, though, meaning that both PS1.4 and DX9 PS2.0 boards take advantage of it). Second, there is the question of whether games may do something similar. Well, we already know that Doom III uses double sided stencilling, and the upcoming DirectX title 'Eve Online' will as well. Supporting this type of feature doesn't require a considerable amount of effort, even late in a title's development, so it is a good and easy optimisation for developers that are using stencils in their titles.
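To illustrate why double sided stencilling halves the stencil work, here is a minimal sketch of stencil shadow volume counting for a single pixel. It is a simplified model rather than Futuremark's implementation: the function names are hypothetical, the stencil value is an unbounded integer (real stencil buffers saturate or wrap), and depth testing is ignored. Without the feature, the shadow volume geometry is submitted twice, once to increment on front faces and once to decrement on back faces; with it, both operations are applied in a single submission.

```python
def stencil_two_pass(faces):
    """Single sided stencil: the shadow volume is drawn twice."""
    stencil, passes = 0, 0
    # Pass 1: back faces culled, front faces increment the stencil value.
    passes += 1
    for facing in faces:
        if facing == "front":
            stencil += 1
    # Pass 2: front faces culled, back faces decrement the stencil value.
    passes += 1
    for facing in faces:
        if facing == "back":
            stencil -= 1
    return stencil, passes


def stencil_one_pass(faces):
    """Double sided stencil: one draw, the operation is chosen per facing."""
    stencil, passes = 0, 1
    for facing in faces:
        stencil += 1 if facing == "front" else -1
    return stencil, passes


# Hypothetical pixel covered by two front faces and one back face of
# shadow volumes, i.e. it sits inside exactly one volume.
faces = ["front", "front", "back"]
print(stencil_two_pass(faces))  # (1, 2)
print(stencil_one_pass(faces))  # (1, 1)
```

Both routines arrive at the same stencil value, but the double sided version sends the volume geometry through the pipeline only once, which also halves the vertex work spent on the volumes.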

Sticking with the relative performance of GeForce4 Ti and Radeon 9500 PRO under GT 2 & 3 and stencils, we must once again consider the differences between the two architectures. First off, the stencil calculations require lots of vertex processing, and with 4 vertex processors the 9500 PRO has an advantage here. Also, filling the stencil buffer requires lots of non-textured fillrate, meaning that the 4x2 configuration (pixel pipelines x texturing units per pipeline) of GeForce4 Ti is at a distinct disadvantage to the 8x1 configuration of Radeon 9500 PRO.
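The fillrate point can be checked with the same kind of back-of-the-envelope arithmetic. Non-textured fill depends only on the number of pixel pipelines and the core clock, so the second texture unit in each GeForce4 Ti pipeline contributes nothing to stencil filling. The core clocks below (300 MHz for the Ti 4600, 275 MHz for the 9500 PRO) are assumed reference values that are not stated in this article, and the result is a theoretical peak that ignores bandwidth limits and any stencil-specific hardware behaviour.

```python
def pixel_fill_mpixels(core_mhz, pipelines):
    """Peak non-textured fill rate: one pixel per pipeline per clock."""
    return core_mhz * pipelines


# Assumed reference core clocks; pipeline counts are from the article.
ti_4600_fill = pixel_fill_mpixels(core_mhz=300, pipelines=4)    # 4x2 layout
r9500_pro_fill = pixel_fill_mpixels(core_mhz=275, pipelines=8)  # 8x1 layout

print(f"Ti 4600:  {ti_4600_fill} Mpixels/s")
print(f"9500 PRO: {r9500_pro_fill} Mpixels/s")
print(f"Ratio:    {r9500_pro_fill / ti_4600_fill:.2f}x")
```

Under these assumptions the 8x1 layout gives the 9500 PRO close to double the peak non-textured fill of the Ti 4600, despite its lower peak texture rate.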

Seemingly these tests are stacked in ATI's favour at the moment; however, is that by choice, coincidence or something else?

Both manufacturers know the directions that their hardware is taking and what principles they are evangelising to developers. We can see from the configuration of ATI's current line, and from the direction NVIDIA is headed as signalled by the configuration of GeForce FX, that both of the major manufacturers are going along similar paths, so it's probably no fluke that these game tests are coded in this fashion. Given that both manufacturers appear to be following similar design principles, it could be said that this isn't bias for or against any single manufacturer. The problem for NVIDIA is that they are just a little behind the curve in DX9 hardware.

One thing that is for sure is that the rendering principles behind these two tests are exactly those that both of the major IHVs are evangelising to developers. During the Dawn-till-Dusk developer event, NVIDIA presented developers with optimisation routes for DX9, which included a Z-Fill pass rendering operation, and gave an entire presentation dedicated to stencil volumes. So it's probably no fluke that Futuremark decided on these types of rendering characteristics and, indeed, we know that popular titles such as Doom III will also feature similar rendering properties in some areas. As more DX9 capable hardware arrives with rendering abilities and configurations similar to those of ATI's line of DX9 boards, it's very likely that it will perform much closer to, or perhaps better than, ATI's current line. Indeed, some preliminary tests on GeForce FX are already showing this to be the case, once NVIDIA offered a driver update.