Anisotropic Filtering
As I did with the KYRO Vivid! review, I'll use the MBTR demo for a little analysis of Anisotropic filtering on KYROII Vivid!XS.
Again, as we saw with KYRO, enabling Anisotropic filtering equates to a huge performance drop. As was mentioned in the KYRO review, Anisotropic filtering on KYRO/KYROII utilises 16 samples, which is reasonably impressive for a card in this market segment; it's just a shame the performance isn't.
On the KYRO architecture, Anisotropic filtering is achieved by taking the samples over two clock cycles per pixel pipe. Given that the card is already bandwidth constrained in many cases, doubling the texture requirements is only going to exacerbate the issue. Utilising texture compression will not circumvent the use of two cycles per pixel, but it may lessen the performance penalty.
S3TC Performance
I've used Quake3's Demo001 to show the performance increase that can be gained by using S3TC texture compression. Quaver is also used to show the effects of compression with high quantities of textures.
Given that we've seen this card to be fairly texture bandwidth limited, we would expect to see some performance gains by enabling S3TC, thanks to the bandwidth relief that DXT1's 6:1 compression ratio brings; as you can see, we do. On Demo001 in 32bit mode we can gain as much as 18% in performance. Quaver shows even larger gains in 32bit because the compressed textures are now small enough to all fit in the card's local RAM, and hence AGP texture transfers are no longer required.
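The compression ratio quoted for DXT1 can be checked with a little arithmetic. DXT1 stores each 4x4 block of texels in 8 bytes, so the ratio depends on the bit depth of the uncompressed source. The sketch below is only illustrative; the 256x256 texture size is a made-up example and is not taken from either benchmark.

```python
def texture_bytes(width, height, bits_per_texel):
    """Size in bytes of one uncompressed texture level."""
    return width * height * bits_per_texel // 8


def dxt1_bytes(width, height):
    """Size in bytes of one DXT1 texture level (8 bytes per 4x4 texel block)."""
    blocks_x = (width + 3) // 4  # round up to whole blocks
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * 8


if __name__ == "__main__":
    w, h = 256, 256  # hypothetical texture size, for illustration only
    compressed = dxt1_bytes(w, h)
    for bpp in (16, 24, 32):
        raw = texture_bytes(w, h, bpp)
        print(f"{bpp}-bit source: {raw} bytes -> {compressed} bytes "
              f"({raw / compressed:.0f}:1)")
```

Against a 24-bit source this works out to 6:1, and against a 32-bit source to 8:1, which goes some way to explaining why the gains are largest in 32bit mode.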
Multipass Vs Multitexturing
KYRO based chipsets have the facility for 8 layer multitexturing, which, if utilised, should cut down the work required over multipass rendering. Given the nature of KYRO's texturing abilities (just texturing over multiple clock cycles) it may be interesting to see if KYRO actually has any performance gains from true multitexturing rather than multipass rendering. Fortunately Serious Sam provides us with an opportunity to test this.
As I mentioned before, Serious Sam can utilise up to 5 texture layers. The game engine also has the facility for altering the number of multitexturing units available per pass, from none (one texture unit per pass) to Quad (4 texture units). The following table shows the performance of Vivid!XS KYROII in the Dunes demo with each of the multitexturing settings available; the most fillrate/bandwidth and geometry intensive situations are tested:
Going from a full 5 passes with only one texture cycle in use to only 2 passes with 4 texture cycles, there is about an 11% performance difference. It's not much of a surprise that there is little difference between Quad and Tri multitexturing, because if Serious Sam is using 5 texture layers both of these cases require two passes (4 + 1, or 3 + 2 textures). I find it slightly odd that there is very little difference in the predominantly CPU limited case of 640x480x16 though.
Multipass rendering on a traditional architecture is usually achieved by sending the geometry once, texturing that geometry pass (usually with 2 texture layers, as this is the most common number of texture layers employed by 3D chips right now) and passing it to the frame buffer, then resending the geometry, processing the next texture layers and blending the result with the previous pass from the frame buffer. This process is repeated depending on the number of texture layers required (in the case of Serious Sam a card with 2 texture units per pipe will require 3 passes). If we think about it, this can't be occurring on KYRO in this fashion; the frame buffer bandwidth is minimal and the tile is there to cut down frame buffer calls, and it would also 'break' KYRO's FSAA technique of averaging the results at the tile level before passing the tile data to the frame buffer.
In fact, KYRO achieves its multipass rendering by buffering all layers of geometry before rendering the data. So, if three passes are required, the geometry will be sent to the card three times and buffered before the rendering occurs. As each tile is calculated the triangles for each pass are processed sequentially, and so the blending occurs at a tile level, and external frame buffer calls are still made only once.
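The pass counts discussed here are simply the number of texture layers divided by the texture units available per pass, rounded up. A minimal sketch follows; the function name and the "Dual" label are my own, chosen for illustration.

```python
import math


def passes_needed(texture_layers, units_per_pass):
    """Number of rendering passes needed to apply all texture layers."""
    return math.ceil(texture_layers / units_per_pass)


if __name__ == "__main__":
    layers = 5  # Serious Sam's maximum, as stated in the text
    # "Dual" is an assumed label; the text names None, Tri and Quad.
    for name, units in (("None", 1), ("Dual", 2), ("Tri", 3), ("Quad", 4)):
        print(f"{name}: {passes_needed(layers, units)} passes")
```

Tri and Quad both come out at two passes for five layers, which is why so little separates them.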
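To put rough numbers on the difference, here is a deliberately simplified model of external frame buffer accesses per pixel. It assumes that a traditional renderer writes the pixel on every pass and reads the previous result back on every pass after the first, and that a tile based renderer blends on chip and writes each pixel out once. It ignores overdraw, Z traffic and texture fetches, so treat it as a sketch of the idea rather than a description of any real chip.

```python
def traditional_fb_accesses(passes):
    """Per-pixel external frame buffer accesses for classic multipass.

    Assumed model: one write per pass, plus one read-back for blending
    on every pass after the first.
    """
    writes = passes
    reads = passes - 1
    return reads + writes


def tile_based_fb_accesses(passes):
    """Per-pixel external accesses when blending happens in the tile.

    Assumed model: all passes are blended on chip, then written out once.
    """
    return 1


if __name__ == "__main__":
    for passes in (2, 3, 5):
        print(f"{passes} passes: traditional {traditional_fb_accesses(passes)}, "
              f"tile based {tile_based_fb_accesses(passes)}")
```

Under these assumptions the cost of extra passes on a tile based design falls on geometry storage and state changes rather than on frame buffer bandwidth.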
The reason multitexturing is desired on KYRO is the fact that by utilising it the storage requirements for the geometry data are reduced considerably. Also, there are far fewer state changes occurring because there are fewer passes.
Another puzzler is why, when the KYRO architecture supports up to 8 layers of textures and Serious Sam uses 5, the Serious Sam engine wasn't coded to utilise the full 8 layers. It turns out that, for some reason, the PowerVR drivers only expose 4 layers through the OpenGL ICD. Presumably this is purely because few games currently utilise more than four layers (Serious Sam being the exception), and at some point in the future a newer set of drivers will expose all 8.
FSAA Performance Analysis
As I've done previously I'll use Quake 3 Demo001 for a bit of extra FSAA analysis.
The following table shows how the FSAA performance relates to its normal resolution. For instance, in Demo001 at 1024x768x32 the Vivid!XS scores 71.7FPS normally and 23.5FPS with 4XFSAA; however, in the FSAA settings it is internally drawing 4 times the number of pixels, and so the relative performance table takes that into account. To achieve the results I divided the normal resolution scores by the relative FSAA depth, and gave that as a percentage increase (or decrease) over the normal resolution - i.e. for the above scores: 71.7/4 = 17.93; (23.5-17.93)/17.93 = 0.3106 = 31% performance increase in comparison to the normal resolution.
As with KYRO Vivid!, the KYROII Vivid!XS always gives better FSAA performance relative to the nominal resolution. This can be attributed to the fact that KYRO's FSAA mechanism (as explained here) actually uses less bandwidth than its high resolution equivalent.
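For anyone wanting to repeat the calculation with their own scores, it can be written as a small function. The function and parameter names are my own; the figures are the ones quoted above.

```python
def relative_fsaa_performance(normal_fps, fsaa_fps, fsaa_depth):
    """Fractional gain (or loss) of the FSAA score over the normal score
    scaled down by the FSAA depth."""
    expected_fps = normal_fps / fsaa_depth
    return (fsaa_fps - expected_fps) / expected_fps


if __name__ == "__main__":
    # Demo001 at 1024x768x32: 71.7FPS normally, 23.5FPS with 4XFSAA
    gain = relative_fsaa_performance(71.7, 23.5, 4)
    print(f"{gain:+.0%}")
```

Without rounding the intermediate 17.93 the result is a hair higher (about 31.1% rather than 31.06%), but it rounds to the same 31%.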
Conclusion
This being a preview, I haven't really had adequate time to give much of an impression of how this board "feels" during game play. Hopefully, I'll be able to revisit this in the future.
But what of KYROII Vivid!XS itself? Honestly, I would have to say that on the face of it this is a thoroughly unremarkable card; its feature set would have looked much more favourable had it been released in late 1999, or even early 2000. The performance is only knocking on the door of some of last year's mid/high end boards (assuming the game doesn't require T&L). You might even suggest that KYROII is what KYRO should have been in the first place.
The KYROII Vivid!XS is a card that can't be taken at face value though. Features such as its Tile Based Deferred rendering are already giving this card the opportunity to get close to cards with double the raw pixel fill rate (even quadruple the texture fill rate) and it will continue to prove beneficial in future games. As the texturing complexity of newer titles increases (as has already been seen with Serious Sam) KYRO's 8 layer texturing will come into play (assuming the OpenGL ICD exposes it). EMBM has already been utilised in a number of titles, and Dot3 is bound to be utilised more frequently in newer ones. KYROII also has unparalleled FSAA performance for the market segment it's in.
However, possibly the most important thing Vivid!XS KYROII has going for it is the price, and this is what suddenly makes this card go from mundane to remarkable. At the UK RRP of £84 it brings all its features to a low end price, and yet can still achieve performance far above that of considerably more expensive boards.
Img Tech, now that you've proven what can be achieved at the low end, how about shooting for the high end?
Oh, and one final thing…