Other Benefits Of KYRO's Architecture

8 Layer Texturing

Although the KYRO core only has 2 texture mapping units it is still able to achieve up to 8 layer multi-texturing, in a single geometry pass, by virtue of its tiling architecture.

Many 3D cards these days are able to achieve 2 or 3 texture layers per pixel by using multiple texture mapping units per pixel pipe; however, when the number of textures required exceeds the number of texture units, these cards have to fall back to 'multipass rendering', which is very costly in terms of performance (see here in my VisionTek GTS review for a further explanation). KYRO removes the need for multipass rendering because it can use its on-chip 'tile' cache as an extremely fast mini frame buffer. If a scene calls for more than two texture layers, KYRO can render the first two texture results to the tile; in the next clock it can recall that result from the tile, feed the colour in through one texture unit, combine it with the next texture, and write it back to the tile. This produces multiple texture layers over multiple cycles, but needs only one geometry pass.

This process can be repeated for up to 8 texture layers. Presumably it could continue indefinitely, given enough cycles, if Videologic desired; there has to be a cut-off at some point for performance's sake though, and 8 is as good as any seeing as this is currently the maximum number of textures DirectX supports. The advantages of this technique are reduced AGP traffic, no repeated geometry calculation (the geometry is sent and processed once rather than once per pass), and none of the bandwidth hit associated with multipass rendering.
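To put rough numbers on the trade-off, here is a minimal sketch of the cycle and pass counts implied by the description above. It assumes a simplified model that is not taken from any PowerVR documentation: the first cycle combines two textures, and each later cycle uses one texture unit to feed the tile colour back in and the other to add one new texture. The function names are made up for illustration.

```python
import math

TEXTURE_UNITS = 2  # KYRO has two texture mapping units (from the article)
MAX_LAYERS = 8     # the layer limit quoted in the article

def tile_loopback_cycles(layers: int) -> int:
    """Cycles per pixel when intermediate results are recirculated via the tile.

    Assumed model: cycle one applies up to two textures; every later cycle
    re-reads the tile colour through one unit and adds one new texture.
    """
    if not 1 <= layers <= MAX_LAYERS:
        raise ValueError("layers must be between 1 and 8")
    if layers <= TEXTURE_UNITS:
        return 1
    return 1 + (layers - TEXTURE_UNITS)

def multipass_geometry_passes(layers: int, units: int = TEXTURE_UNITS) -> int:
    """Geometry passes a conventional card needs: one per batch of `units` textures."""
    return math.ceil(layers / units)

for n in (2, 3, 4, 8):
    print(f"{n} layers: {tile_loopback_cycles(n)} tile cycles in 1 geometry pass, "
          f"versus {multipass_geometry_passes(n)} geometry passes with multipass")
```

Under these assumptions the tiled approach spends more texturing cycles as layers are added, but the geometry is only ever submitted once.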

Something to note here: to get the benefits of this feature, developers are required to code for it. It would appear that many applications are taking the 'lowest common denominator' route and forcing multipass rendering if more than two texture operations are required. Imagination Technologies' developer support team is going to have to evangelise this feature to game developers if we, the end users, are going to see any benefit from it.

For more information see the PowerVR Multi-texturing White Paper.

True Colour Rendering

It has long been known that most 3D cores these days render internally at 'True Colour' (24/32bits) even when the frame buffer (and hence the final output) is only 16bits. However, although the core may be rendering in 24/32bits, the result is dithered to 16bit quality as soon as it goes to the frame buffer. If any subsequent colour operation occurs, such as a blend caused by an 'explosion' in the scene, the information has to be retrieved from the frame buffer before the new layer is added, and accuracy is lost.

The process path for a traditional architecture is:
Generate 24bit colour value --> Dither to 16bit and pass to Frame Buffer --> Retrieve 16bit colour value from frame buffer, blend with 24bit colour --> Dither new value back to 16bit and pass to frame buffer.

The scope for colour inaccuracies in this process becomes obvious!

By virtue of its on-chip tile, KYRO can circumvent these potential blending inaccuracies. The tile acts as a mini frame buffer and always remains at full colour (24+ bits), regardless of the colour depth selected by the user or application. Because, as described earlier, all the geometry data due to be rendered has already been binned, the chip is able to perform such blending operations on the tile, and hence the 24bit accuracy is maintained throughout; the result is only dithered to 16bit once the tile is fully rendered and ready to be passed to the actual frame buffer.

The Process path in KYRO's case is:
Generate 24bit colour value --> Pass 24bit value to Tile Buffer --> Retrieve 24bit colour value from tile buffer, blend with 24bit colour --> Dither new value to 16bit and pass to frame buffer (when full tile is complete).

So, you can see that there is much less scope for errors in the colours that are being displayed. Even though the final output may still only be 16bit, because of the way rendering is designed, the final displayed colour should be much closer to what the original artist/developer would have envisioned when running in full colour.
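The two process paths can be compared with a small simulation. This is only a sketch under stated assumptions: plain truncation to RGB565 stands in for the dithering step, the blend is a simple integer alpha blend, and the colours and layer count are arbitrary test values rather than anything measured on real hardware.

```python
def to_565(rgb):
    """Quantise an 8-bit-per-channel colour to 16bit RGB565 precision.

    Assumption: truncation is used in place of dithering to keep this short.
    """
    r, g, b = rgb
    return ((r >> 3) << 3, (g >> 2) << 2, (b >> 3) << 3)

def blend(dst, src, alpha):
    """Alpha blend src over dst with integer maths; alpha ranges 0..255."""
    return tuple((s * alpha + d * (255 - alpha)) // 255 for d, s in zip(dst, src))

def reference(base, layers):
    """Blend everything at full precision and never quantise."""
    colour = base
    for src, alpha in layers:
        colour = blend(colour, src, alpha)
    return colour

def traditional(base, layers):
    """16bit frame buffer: every blend reads back an already quantised value."""
    fb = to_565(base)
    for src, alpha in layers:
        fb = to_565(blend(fb, src, alpha))
    return fb

def tiled(base, layers):
    """Full-precision tile: blend first, quantise once when the tile is written out."""
    return to_565(reference(base, layers))

base = (200, 120, 40)                    # arbitrary background colour
layers = [((255, 160, 0), 96)] * 4       # four translucent 'explosion' layers

print("full precision:", reference(base, layers))
print("traditional   :", traditional(base, layers))
print("tiled         :", tiled(base, layers))
```

In this model the tiled result is never more than one quantisation step away from the full-precision colour, whereas the traditional path can pick up a fresh rounding error on every read-modify-write.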

For more information read the PowerVR FSAA White Paper.

Full Scene Anti-Aliasing

KYRO supports Ordered Grid Super-sample Anti-Aliasing, in either 4X (2x2) or 2X modes (selectable between 2x1 or 1x2!). However, FSAA is another area where KYRO can benefit from innovative use of its Tiling abilities.

On most currently available cards that support super-sampling (GeForce, Voodoo5 and Radeon, for example) we know that they render 4 (or 2) times the number of pixel samples and, as a result, require 4 (or 2) times the bandwidth when FSAA is used. PowerVR goes through basically the same process; however, the use of its on-chip tile enables it to make significant savings in terms of bandwidth.

During rendering with FSAA enabled, the tile is rendered at the 'Super-sample' resolution. Because of the deferred rendering process this data need not be revisited, so the data from the tile is scaled down to the normal resolution before it is sent to the external frame buffer. If 4X FSAA is enabled, then 32x16 pixels are rendered to the tile, scaled down (2x2 blocks are averaged), and a 16x8 segment is sent to the frame buffer.
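The scale-down step can be sketched as a simple 2x2 box filter. This is an illustration only: it works on one colour channel stored as plain integers, and the function name and test pattern are invented for the example rather than taken from the hardware.

```python
def downsample_2x2(tile):
    """Average each 2x2 block of a super-sampled tile.

    `tile` is a list of rows of integer samples (one colour channel);
    its width and height are assumed to be even.
    """
    height, width = len(tile), len(tile[0])
    return [
        [
            (tile[y][x] + tile[y][x + 1] + tile[y + 1][x] + tile[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]

# A 32x16 super-sampled tile filled with an arbitrary gradient test pattern.
supersampled = [[(x + y) % 256 for x in range(32)] for y in range(16)]

resolved = downsample_2x2(supersampled)
print(len(resolved[0]), "x", len(resolved))  # size of the segment sent to the frame buffer
```

Only the resolved segment ever leaves the chip in this scheme, which is why the external frame buffer sees the same traffic as it would without FSAA.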

The upshot of this process is that there are no extra bandwidth overheads in comparison to rendering normally; it only requires the extra fillrate. Another benefit is that there are no extra memory storage overheads, as only the displayed resolution is stored in the frame buffer; other cards need to store the super-sample resolution (e.g. for 4X FSAA at 800x600, a 1600x1200 frame buffer is required). This potential benefit is not realised though, as testing shows that FSAA is disabled in resolutions above 1024x768; the reason for this limitation has not been established, but it is probably a driver setting.
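The storage difference is easy to put into figures. The sketch below assumes a 16bit (2 bytes per pixel) colour buffer and counts a single buffer only, ignoring Z-buffers, double buffering and textures; the helper name is hypothetical.

```python
def buffer_bytes(width, height, bytes_per_pixel, samples_x=1, samples_y=1):
    """Bytes needed for a colour buffer holding samples_x * samples_y samples per pixel."""
    return (width * samples_x) * (height * samples_y) * bytes_per_pixel

WIDTH, HEIGHT, BYTES_PER_PIXEL = 800, 600, 2  # the article's example, assumed 16bit

# A conventional super-sampling card stores the full 1600x1200 image.
conventional = buffer_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL, samples_x=2, samples_y=2)

# A tile-based card only stores the resolved 800x600 image.
tile_based = buffer_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL)

print(conventional, tile_based, conventional // tile_based)
```

Under these assumptions the conventional card needs four times the colour buffer storage for 4X FSAA, matching the 1600x1200 figure above.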

FSAA image quality appears to be fairly good; however, it is limited by its Ordered Grid nature. Below are example in-game screenshots of KYRO's FSAA, including shots with both types of 2X FSAA (2x1 and 1x2).