Pixel Output (ROPs)

The basic pixel output engines remain unchanged from R300. In normal operation each pixel pipeline can output one colour value and one Z/Stencil value, but there is no optimised mode for Z/Stencil-only rendering.

The multi-sample FSAA engine is capable of writing two subsamples per pixel in a single pass, with up to two loops back through the pixel engine to create up to 6 samples per pipe for 6X FSAA. Colour compression doesn't operate without FSAA, but it does when FSAA is enabled, and the compression ratio scales with the level of FSAA. As with R300, R420 features gamma-corrected FSAA edge blending, resulting in what look like smoother intermediate steps upon output.
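To illustrate why gamma-corrected edge blending produces different intermediate steps, here is a minimal sketch of the two ways a set of subsamples could be resolved to one pixel. It is not ATI's hardware logic: the gamma value of 2.2 and the function names are assumptions made purely for the example.

```python
GAMMA = 2.2  # assumed display gamma; not a figure taken from the article


def resolve_naive(samples):
    """Average the stored (gamma-encoded) sample values directly."""
    return sum(samples) / len(samples)


def resolve_gamma_corrected(samples, gamma=GAMMA):
    """Decode each sample to linear light, average, then re-encode."""
    linear = [s ** gamma for s in samples]
    mean_linear = sum(linear) / len(linear)
    return mean_linear ** (1.0 / gamma)


# An edge pixel at 6X FSAA: three samples hit a white polygon (1.0),
# three hit a black background (0.0).
edge = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
naive = resolve_naive(edge)
corrected = resolve_gamma_corrected(edge)
```

Under these assumptions the naive average stores a mid-grey of 0.5, which a display would show darker than half brightness, while the gamma-corrected resolve stores a noticeably brighter value so that the displayed step sits closer to the true halfway point.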

The FSAA mechanism ATI have adopted is actually a sparse sampling system. The sub-sample positions lie on what we believe to be a 12x12 grid, and samples are chosen from these points to achieve what should be the maximum level of edge coverage for the number of samples chosen. The sparse sampling is also programmable, such that the sampling positions can be altered as required; this also gives rise to a feature that ATI are terming Temporal Anti-Aliasing, which we'll take a look at in a little more detail later.
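The benefit of sparse sampling can be sketched with a small comparison of two 4-sample patterns on a 12x12 grid. Both sets of positions below are hypothetical, chosen only to show the principle; they are not ATI's actual sample positions.

```python
GRID = 12  # grid resolution the article believes is used


def distinct_offsets(pattern):
    """Count the distinct horizontal and vertical sample offsets."""
    xs = {x for x, _ in pattern}
    ys = {y for _, y in pattern}
    return len(xs), len(ys)


def coverage(pattern, edge_x):
    """Fraction of samples lying left of a vertical edge at column edge_x."""
    return sum(1 for x, _ in pattern if x < edge_x) / len(pattern)


# Hypothetical patterns: an ordered 2x2 grid and a sparse layout in which
# every sample has its own row and column.
ordered_4x = [(3, 3), (9, 3), (3, 9), (9, 9)]
sparse_4x = [(1, 4), (4, 10), (7, 1), (10, 7)]

# Sweep a vertical edge across the pixel and collect the coverage levels
# each pattern is able to report.
ordered_levels = sorted({coverage(ordered_4x, e) for e in range(GRID + 1)})
sparse_levels = sorted({coverage(sparse_4x, e) for e in range(GRID + 1)})
```

With these example positions the ordered pattern can only report three coverage levels for a near-vertical edge, whereas the sparse pattern reports five from the same number of samples, which is the sense in which sparse sampling improves edge coverage per sample.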

Memory Bus

Finally, the memory interface is one of the most important elements in the pipeline design and can dictate a lot of the overall performance.

 

The memory interface of R420 retains the same structure as R300's, with four 64-bit channels arranged as a crossbar; however, it has been refined in two specific areas. First, whilst R300's memory bus was fairly efficient, its maximum operating frequency wasn't particularly high, so R420's bus has been overhauled for a much higher maximum operating frequency. This appears to have paid dividends as, in its first implementation, it's running at 560MHz, nearly 200MHz higher than any R300-based implementation. The other area of refinement was the efficiency of the bus: with 16 pixel pipelines and memory speeds not keeping pace, an efficient bus is required to ensure that effective utilisation is maintained, as it's likely that the graphics board is going to be bandwidth constrained in many situations.
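As a rough guide to what these figures mean, the theoretical peak bandwidth can be worked out from the bus width and clock. The sketch below assumes the 560MHz figure is the memory clock and that the memory transfers data twice per clock, as double data rate memory does; the function name is purely illustrative.

```python
def peak_bandwidth_bytes_per_s(channels, channel_bits, clock_hz,
                               transfers_per_clock=2):
    """Theoretical peak bandwidth of a multi-channel memory bus."""
    bus_bytes = channels * channel_bits // 8  # total bus width in bytes
    return bus_bytes * clock_hz * transfers_per_clock


# Four 64-bit channels (256 bits in total) at 560MHz, double data rate.
r420_peak = peak_bandwidth_bytes_per_s(4, 64, 560_000_000)
r420_peak_gb = r420_peak / 1e9  # decimal gigabytes per second
```

This works out to roughly 35.8GB/s. It is an upper bound only: real throughput depends on how efficiently the bus is used, which is why the efficiency refinements matter.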

Lastly, the memory bus is now compatible with GDDR-3 memory, whilst still maintaining compatibility with DDR and GDDR-2.