Yes, NVIDIA says that they don't agree, and there is a good reason for not agreeing. Want to know what the reason is? Well, as the numbers at the start indicate, software T&L on both the Pentium III and the AMD Athlon is faster than NVIDIA's special hardware T&L. If software is beating your hardware you are in big trouble, and you'll do anything to convince the public that something is not right. It's called damage control, people!

So, what are they using in their defense to explain why their hardware seems to have a low peak number?

Point 1

(Taken from a text written by NVIDIA's PR manager, Derek Perez):
3DMark2000 uses a custom Transform and Lighting (T&L) engine. Since it's a custom and proprietary engine to 3DMark2000, it can only give an accurate estimate of how games would perform if they use this custom engine. NVIDIA is not aware of any games based on this engine. If the goal of the test is to estimate how real games would perform with a particular graphics card, the more accurate approach would be to choose a common and widely available T&L engine. Unfortunately, 3DMark2000 chooses its custom T&L engine as the default for all cards that rely on software for T&L. We consider this a major flaw in 3DMark 2000.

Ok, so basically NVIDIA says that using an optimized software T&L implementation is illegal because not all games might have such an optimized implementation. Unfortunately, I would have to disagree. The optimization code for the various CPUs is freely available, so if a game doesn't use an optimized engine, its developers were either a) lazy or b) lacking funds. The fact of the matter is that the large majority of games we have today use a custom optimized T and/or L engine. Every major software house has designed such an engine, simply because until two to three months ago we had no hardware implementation. If they wanted a game that rocked they needed fast software T&L, and most companies have spent a lot of time and money to make that a reality. The reason NVIDIA wants us to use the Microsoft implementation is that it is the least optimized one available. Top games like Quake3: Arena use their own custom optimized engines (lighting is done in software), and Messiah uses software T&L for most major parts of the geometry (the characters). There are many more that fall under this category.

Another point mentioned here is "how real games would perform." This shows that NVIDIA doesn't even know what this benchmark tests. The High Polygon Bench tests PEAK performance, not GAME performance! The first two "game" scenes (which the benchmark itself calls games) are the ones intended to test in-game performance.

In my humble opinion, there are many games out there with optimized software implementations. Developers that don't offer one are lazy, and their product will never rock as hard as it could have if they had used one. The difference in peak performance shows that there is more to be gained, so developers should take advantage of it. And there are games out there that will use the software engine from 3DMark - Max Payne, anyone?

The main point is simple: we are testing maximum throughput - a synthetic test! 
So, what can we really conclude from this test?

GeForce has a lower peak performance than some high-end general-purpose CPUs running an optimized software T&L engine. But does this have any meaning for a game? Not really. In a game, your CPU is doing much more than just T&L; the whole game is running on your CPU. What hardware T&L does is remove load from the CPU. So the simple fact is this: a CPU running nothing but T&L is faster than the GeForce, but the essential thing to remember is parallelism. In a real game, where the CPU has to do "so" much more than just T&L, the GeForce ends up winning. Say that 25% of the time is spent on T&L in a software situation; during that 25% you can only reach 25% of the peak number we measure here. Our conclusion is this: in a game situation, GeForce will be faster. The CPU-plus-software-T&L combination is only faster when 100% of the CPU's time is devoted to T&L, which is never the case in a real game anyway. So, although the GeForce is lacking in this peak test, it will win in all true games.
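To illustrate the parallelism argument with some back-of-the-envelope math (the figures below are made up purely for illustration, not measured results from the benchmark):

    // Hypothetical numbers, only to illustrate the parallelism argument.
    #include <cstdio>

    int main() {
        const double cpu_peak  = 9.0e6;  // assumed software T&L peak (triangles/s) with the CPU doing nothing else
        const double gpu_peak  = 7.0e6;  // assumed GeForce hardware T&L peak (triangles/s)
        const double cpu_share = 0.25;   // assumed fraction of frame time the CPU can spare for T&L in a real game

        // The CPU only transforms and lights while it is not busy with AI, physics, sound, etc.
        const double cpu_in_game = cpu_peak * cpu_share;   // 2.25 million triangles/s
        // Hardware T&L runs in parallel with the CPU, so (ideally) its full peak is still available.
        const double gpu_in_game = gpu_peak;               // 7 million triangles/s

        printf("CPU software T&L, in-game : %.2f Mtris/s\n", cpu_in_game / 1e6);
        printf("GeForce hardware T&L      : %.2f Mtris/s\n", gpu_in_game / 1e6);
        return 0;
    }

So even though the CPU wins the synthetic peak test on paper, it loses as soon as it has a whole game to run besides T&L.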

I personally do believe that NVIDIA is trying too hard to win every battle. I mean, instead of explaining that slower is not necessarily worse, they chose to twist the situation in such a way that they come out faster anyway. There was no reason for this. All they had to say was that it's a peak test, and that in no game will your CPU ever have 100% of its time geared toward T&L. That would have solved the whole issue.

Point 2

The official testing guidelines for 3DMark2000 recommends that all tests be run with VSYNC turned off. This is the correct way to run the benchmark. However, there is an additional recommendation to run the test in "Triple Buffer" mode. In fact, the default configuration is to use Triple Buffering. This is the third major flaw in 3DMark 2000.

Yes, you noticed that I skipped his second point, but essentially, if Point 1 is not valid, Point 2 becomes invalid as well.

Again, this has nothing at all to do with the test. Using triple buffering instead of double buffering with Vsync off only has an impact on the amount of memory available. Available memory mainly has an impact on fill-rate, not so much on the actual T&L speed. While it does have some impact, it is not really a big deal today. Switching to double buffering is not going to double their T&L performance.
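A quick, purely hypothetical calculation makes the point (the resolution and colour depth are assumptions of mine, not settings taken from the benchmark):

    // Rough sketch of the memory cost of double vs. triple buffering.
    #include <cstdio>

    int main() {
        const int width = 1024, height = 768, bytes_per_pixel = 4;  // assumed 32-bit colour mode
        const double buffer_mb = (double)width * height * bytes_per_pixel / (1024.0 * 1024.0);

        printf("One colour buffer : %.1f MB\n", buffer_mb);          // ~3 MB
        printf("Double buffering  : %.1f MB\n", 2.0 * buffer_mb);    // front + back buffer
        printf("Triple buffering  : %.1f MB\n", 3.0 * buffer_mb);    // one extra back buffer
        // The extra ~3 MB mostly eats into texture space, which hurts fill-rate,
        // not the geometry throughput that the High Polygon test measures.
        return 0;
    }

Three megabytes more or less in the frame buffer is a fill-rate and texture-space question, not a transform-and-lighting one.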

Point 3

The simplicity of the lighting conditions allows for additional optimizations in a custom T&L engine. Without specular lighting, it is possible to significantly reduce the amount of work required of the lighting engine. It is unfortunate that specular lighting was not used. Games that use multiple dynamic lights would typically incorporate this kind of lighting. For example, it would be common to see the specular highlights on the walls when you fire a rocket in a first person shooter. Similarly, games that would use a flashlight or car headlights to light up a portion of a scene are not modeled by these tests, since no spotlights are included. None of these superior and computationally complex lighting situations are represented in the lighting for the "High Polygon Count" tests. Again, tuning for the simple cases will result in consumer confusion.

So, NVIDIA says to use more complex lights, like spotlights. Read on to see what happens when you use more and more complex lights on the GeForce chip. He does, however, have a point about specular lighting. This is also a mathematically intense process and is probably not done in the software T&L implementation they used, but it could have been added. Still, there will always be a "better" test.
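To show why specular is the expensive part, here is a minimal per-vertex lighting sketch of my own (a simplified Blinn-style model for illustration, not the actual engine 3DMark or any game uses):

    #include <cmath>

    struct Vec3 { float x, y, z; };

    static float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    // Diffuse only: one dot product and a clamp per light, per vertex.
    float diffuse(const Vec3& n, const Vec3& l) {
        return std::fmax(dot(n, l), 0.0f);
    }

    // Diffuse plus specular: an extra vector add, a normalize (square root and divide),
    // another dot product and a pow() per light, per vertex - considerably more math.
    float diffuse_plus_specular(const Vec3& n, const Vec3& l, const Vec3& v, float shininess) {
        float d = std::fmax(dot(n, l), 0.0f);
        Vec3 h = { l.x + v.x, l.y + v.y, l.z + v.z };   // half-vector between light and view direction
        float len = std::sqrt(dot(h, h));
        if (len > 0.0f) { h.x /= len; h.y /= len; h.z /= len; }
        float s = std::pow(std::fmax(dot(n, h), 0.0f), shininess);
        return d + s;
    }

Multiply that extra work by every vertex and every light in the scene and you can see why leaving specular out makes a software engine look faster than it would be under heavier lighting.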

Also, I think that 3DMark created a software T&L implementation that gives acceptable image quality - it doesn't seem to be a hack, a trick or a shortcut. It's always easy to say "do this and that and it will be slower." My point is, as long as the game scenes look good, it's OK by me… and the game scenes do look OK. Have a look at some software T&L images. They look fine.

The same thing can be said for the following statement:
The custom T&L engine in 3DMark2000 is proprietary. It is not possible to verify the precision used in the lighting calculations of this engine.

The game scenes look fine. Game-optimized engines have existed for years, and now NVIDIA wants to change that and call it illegal? So what if GeForce uses, for example, 64-bit internal precision? What if Matrox, or any other company, comes out with a board that uses 96-bit internal precision? Should Matrox then say that you shouldn't compare it with GeForce since its accuracy is lacking!?! The game scenes look OK, just as the game scenes look OK in all games that use a custom software T&L engine, and that is the bottom line.