Solution For NVIDIA
When asked when Valve started using the FX boards for development, Gary McTaggart replied that they first started using the FX series a little before their demonstration at E3. At first they had trouble getting the game to render in DX9 at all, and had to work with NVIDIA to bring it up. Once the game was running it became clear that there was a surprising performance discrepancy, and because of the installed user base of NVIDIA boards they felt they had to do something. Valve then proceeded to optimise performance for NVIDIA's FX series with a rendering path they term "Mixed Mode".
Mixed Mode, so titled because it utilises a mix of precisions, is a special path created purely for the FX series. The mode uses integer precision (PS1.3 and 1.4 shaders), partial precision (PS2.0 FP16) and full precision, as well as a number of other changes. One such change is that vector normalisation is done via cubemaps rather than with math in the Pixel Shader, which spends extra texture reads and memory bandwidth in order to save ALU work. Under Mixed Mode Valve recorded the following performance:

Clearly this improved the performance of the FX series; however, Gabe states that they have spent up to five times as much time optimising specifically for NVIDIA's FX series as they have on the generic DX9 path, under which ATI's hardware required no optimisation at all. The other issue is that these performance improvements will erode as newer DX9 functionality requires full DX9 precision, reducing the scope for partial precision shaders. High Dynamic Range is one such feature that calls for this.
Gabe went on to suggest that the best optimisation may have been to treat the FX series as DX8 boards, since this would have saved Valve a lot of time, and users could still have been given the option of using the full precision DX9 mode if they wished. In fact, the default path for the 5200 and 5600 will be DX8. While development houses such as Valve may have the resources to expend a lot of time and effort on creating special paths for NVIDIA's boards, smaller houses might not.
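To make the cubemap normalisation used in Mixed Mode more concrete, here is a minimal Python sketch of the idea: a table of unit vectors is built once, and a vector is then "normalised" by looking up the texel it points through instead of computing a square root and a divide. This is an illustration only, not Valve's shader code. The names `FACE_SIZE`, `build_cubemap` and `normalise_lookup` are hypothetical, the face layout does not follow Direct3D's cubemap convention, and the lookup is point sampled, whereas real hardware would typically filter between texels and store them at limited precision.

```python
import math

FACE_SIZE = 32  # hypothetical texels per cube face edge


def normalise_math(v):
    """Reference path: normalise with ALU math (square root and divide)."""
    x, y, z = v
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, z / length)


def build_cubemap(size=FACE_SIZE):
    """Precompute six faces; each texel stores the unit vector through its centre."""
    cube = {}
    for axis in range(3):
        for sign in (1.0, -1.0):
            face = []
            for row in range(size):
                t = 2.0 * (row + 0.5) / size - 1.0
                line = []
                for col in range(size):
                    s = 2.0 * (col + 0.5) / size - 1.0
                    v = [0.0, 0.0, 0.0]
                    v[axis] = sign            # major axis selects the face
                    v[(axis + 1) % 3] = s     # first in-face coordinate
                    v[(axis + 2) % 3] = t     # second in-face coordinate
                    line.append(normalise_math(tuple(v)))
                face.append(line)
            cube[(axis, sign)] = face
    return cube


def normalise_lookup(cube, v, size=FACE_SIZE):
    """Lookup path: pick the face by largest component, then fetch one texel.

    Assumes v is not the zero vector.
    """
    axis = max(range(3), key=lambda i: abs(v[i]))
    major = abs(v[axis])
    sign = 1.0 if v[axis] > 0 else -1.0
    s = v[(axis + 1) % 3] / major
    t = v[(axis + 2) % 3] / major
    col = min(size - 1, int((s + 1.0) * 0.5 * size))
    row = min(size - 1, int((t + 1.0) * 0.5 * size))
    return cube[(axis, sign)][row][col]


if __name__ == "__main__":
    cube = build_cubemap()
    v = (3.0, 4.0, 0.0)
    print("math:  ", normalise_math(v))
    print("lookup:", normalise_lookup(cube, v))
```

The lookup result is only as accurate as the face resolution allows, which is the nature of the trade: the per-pixel arithmetic is replaced by a memory fetch, at some cost in accuracy and bandwidth.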
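The default path choices described here could be sketched roughly as follows. This is a hypothetical illustration rather than Valve's detection logic: the function name, the board name strings and the `supports_dx9` flag are assumptions, as is the choice to give every other FX board Mixed Mode by default. Only the DX8 default for the 5200 and 5600, and the existence of the Mixed Mode and generic DX9 paths, come from the text above.

```python
def default_render_path(board_name, supports_dx9=True):
    """Return a default path label: "dx8", "mixed" or "dx9" (hypothetical labels)."""
    name = board_name.lower()
    if not supports_dx9:
        # DX8-class hardware can only run the DX8 path.
        return "dx8"
    if "geforce fx" in name:
        # Per the article, the 5200 and 5600 default to DX8.
        if "5200" in name or "5600" in name:
            return "dx8"
        # Assumption: remaining FX boards default to the FX-specific Mixed Mode.
        return "mixed"
    # Everything else that supports DX9 takes the generic DX9 path.
    return "dx9"


if __name__ == "__main__":
    for board in ("GeForce FX 5900 Ultra", "GeForce FX 5600", "Radeon 9800 Pro"):
        print(board, "->", default_render_path(board))
```

A user-facing override, such as the option of forcing the full precision DX9 mode that Gabe mentions, would sit on top of a default like this.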

Indeed, looking at the performance under DirectX8, the GeForce4 Ti achieves quite a high frame rate in comparison to its newer siblings.