Texture Filter Performance
The following table shows a comparison of Bilinear and Trilinear texture filtering rates, using Quake 3's Demo001 and Demo002:
As we can see, the performance hit for enabling Trilinear filtering under normal circumstances is quite high in both 16Bit and 32Bit, reaching a penalty of as much as 35%. However, with S3TC enabled in conjunction with Trilinear filtering that drop is much smaller in 16Bit mode, and at 32Bit it even turns into a gain.
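For anyone wanting to work out these percentages from their own runs, here is a minimal sketch of the arithmetic. The function name `percent_penalty` is my own, and the frame rates in the comment are placeholder values chosen purely for illustration, not figures from the table.

```python
def percent_penalty(baseline_fps, filtered_fps):
    """Return the performance hit as a percentage of the baseline.

    A positive result is a penalty; a negative result means the
    filtered run was actually faster (a gain).
    """
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return 100.0 * (baseline_fps - filtered_fps) / baseline_fps


# Placeholder numbers only: a run dropping from 100 fps to 65 fps
# is a 35% penalty, while 100 fps rising to 105 fps is a -5% "penalty".
```

A negative value from this helper corresponds to the kind of gain described for Trilinear with S3TC at 32Bit.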
In essence, Trilinear filtering with S3TC enabled is free on KYRO, assuming the data is there. The hardware itself is able to produce 2 Trilinear pixels per clock, but in normal operation there is no guarantee that the texture data will be ready. With S3TC it is more likely that the data is already in the cache, and hence the hardware is able to make full use of its Trilinear abilities. In scenes with large polygons cache efficiency is at its best, so you will get Trilinear very nearly for free with S3TC; this benefit will decrease as the geometric complexity increases.
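To see why Trilinear leans so heavily on texture data being ready, it helps to look at what the filter actually reads. The following is a rough software sketch of generic Trilinear filtering, not a description of KYRO's hardware: the function names, the single-channel texture layout (a list of mip levels, each a 2D list of floats) and the clamping behaviour are all my own assumptions. Each Bilinear lookup reads 4 texels, and Trilinear blends two such lookups from adjacent mip levels, for 8 texel reads per pixel.

```python
import math


def bilinear(level, u, v):
    """Bilinear sample of one mip level; u and v are in [0, 1]. Reads 4 texels."""
    h, w = len(level), len(level[0])
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)  # clamp at the edge
    fx, fy = x - x0, y - y0
    top = level[y0][x0] * (1 - fx) + level[y0][x1] * fx
    bottom = level[y1][x0] * (1 - fx) + level[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def trilinear(mips, u, v, lod):
    """Blend Bilinear samples from the two mip levels around lod: 2 x 4 = 8 texels."""
    lod = max(0.0, min(lod, len(mips) - 1))
    lo = int(math.floor(lod))
    hi = min(lo + 1, len(mips) - 1)
    t = lod - lo
    return bilinear(mips[lo], u, v) * (1 - t) + bilinear(mips[hi], u, v) * t
```

With compressed textures each of those reads is more likely to hit data already in the cache, which is consistent with the behaviour described above.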
UPDATE (17 Feb 2001): Anisotropic Filtering (Mercedes Benz Truck Racing)
Unlike NVIDIA-based cards, the 'Anisotropic' filter option in the OpenGL 3D control panel of the KYRO drivers only enables Anisotropic filtering; it doesn't 'force' it. This means games without an Anisotropic option of their own cannot use it in OpenGL, which is why I was unable to take any Quake 3 Anisotropic benchmarks. However, the Mercedes Benz Truck Racing demo has an Anisotropic filter option which can be enabled, so I decided to use this for some comparisons.
All options were the same as previously tested; the only difference is that Anisotropic filtering was enabled for the Anisotropic figures.
Well, as we can see, the drop for enabling Anisotropic filtering is pretty huge! At best we lose 35% of performance, and at worst over half; a very large hit indeed.
To the best of our knowledge (although this is currently unconfirmed), KYRO's Anisotropic filtering takes 16 texture samples, compared to GeForce's 8 and Radeon's 12. It would appear to have one of the highest quality (in terms of samples taken) Anisotropic filters available in the consumer space; this is also the reason for the high cost.
As said earlier, KYRO is able to produce 2 Trilinear textured pixels per clock, and Trilinear texturing requires 8 texture samples. To achieve Anisotropic filtering at 16 samples KYRO has to take two clock cycles, which explains the large performance hit.
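The arithmetic behind that can be sketched as follows. This is a back-of-the-envelope model only: it assumes 2 pixel pipelines each able to fetch 8 texture samples per clock (inferred from the figures above, not a confirmed hardware specification), the function names are my own, and it ignores caching, memory bandwidth and everything else that affects real frame rates.

```python
import math


def clocks_per_pixel(samples_needed, samples_per_clock=8):
    """Clock cycles one pipeline needs to gather all samples for a pixel."""
    return math.ceil(samples_needed / samples_per_clock)


def peak_pixels_per_clock(samples_needed, pipes=2, samples_per_clock=8):
    """Theoretical peak textured pixels per clock across all pipelines."""
    return pipes / clocks_per_pixel(samples_needed, samples_per_clock)


trilinear_rate = peak_pixels_per_clock(8)     # 8 samples: 1 clock per pixel
anisotropic_rate = peak_pixels_per_clock(16)  # 16 samples: 2 clocks per pixel
theoretical_drop = 100.0 * (trilinear_rate - anisotropic_rate) / trilinear_rate
```

Under these assumptions the peak texturing rate halves when moving from 8 to 16 samples, which would be in line with the worst-case drops measured above; where a scene is not purely texture-rate limited the real hit would be smaller.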
Below are screenshots taken of MBTR with Anisotropic filtering disabled and enabled.
All benchmarks used in this review are available for download in MS Excel format here.