R420 Pipeline Overview
R300 set out many of ATI's current design principles, chief among them plenty of parallelism, thanks to its 8 pixel pipeline (2 quad) operation. R420 draws on those design principles and takes them further.
The overview of the chip highlights that the R420 features a total of 6 Vertex Engines and 16 rendering pipelines (organised in 4 quad rendering pipelines), similar to R300's arrangement.
Because the pixel pipelines take up the largest single area of the die, any defect that occurs in a silicon wafer has a higher probability of falling in one of the rendering pipelines. Rather than wasting the chip, the quad in which the defect turns up can be turned off and the chip sold as a 12 pipeline part, or two quads can be turned off for an 8 pipeline part. It doesn't affect the operation of the chip which of the 4 quads are turned off; a board will perform similarly to any other board with the same number of pipelines left enabled.
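The arithmetic behind those 16, 12 and 8 pipeline configurations can be sketched in a few lines. This is only an illustration: the list-of-booleans representation and the names `PIPES_PER_QUAD` and `enabled_pipelines` are our own assumptions, not anything ATI has described.

```python
# Illustrative only: the per-quad boolean list is a hypothetical representation.
PIPES_PER_QUAD = 4  # each quad groups 4 pixel pipelines


def enabled_pipelines(quad_enabled):
    """Return the pipeline count for a chip given which quads are left on.

    quad_enabled: one boolean per quad, True if that quad is left enabled.
    """
    return PIPES_PER_QUAD * sum(1 for on in quad_enabled if on)


full_chip = enabled_pipelines([True, True, True, True])        # all 4 quads on
one_quad_off = enabled_pipelines([True, False, True, True])    # one quad disabled
two_quads_off = enabled_pipelines([False, True, True, False])  # two quads disabled
```

Note that only the number of enabled quads enters the calculation, not their position, which mirrors the point that it does not matter which quads are turned off.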
ATI previously sold the Radeon 9500 under a similar "silicon redundancy" scheme; however, demand rapidly outstripped the number of R300 chips that had a defect in one of the quads, resulting in ATI selling fully operational R300 parts with a quad turned off for the 9500. The 9500 soon became a favourite among modders, as it was found that the full 8 pipelines could be enabled with a software modification; the lucky ones who had a fully operational R300 could get something tantamount to a Radeon 9700 (or at least a Radeon 9500 PRO) for the cost of a mid-range board. This time ATI have changed how they disable quad pipelines, and they suggest that this is now done in hardware as well as software, although we expect that someone will find a way of circumventing it eventually! However, with chips this size we would expect to see far more chips that have a defect in a quad, hence fewer opportunities to find chips sold with fewer than 16 pipelines that actually have all of them operational.
With R300 ATI had a regionalised quad dispatch system, in which screen space is tiled and the tiles are assigned to different pipelines. This is how R300 can be scalable both internally and externally: tiles can be allocated to quad rendering pipelines within a chip, or across multiple chips. R420 adopts the same type of quad dispatch system, which is how it was easily extended to 4 quads; however, it has been slightly altered to allow for programmable tile sizes, so the load balancing between the pipes can be controlled in a much finer way and potentially altered according to resolution. Reducing the tile size allows for higher efficiency with smaller triangles, while larger triangles favour texturing efficiency. With this dispatch system, different quad pipelines can end up operating on the same triangle if the triangle's coverage is larger than the currently set tile size; otherwise the quad rendering pipelines will be operating on completely different triangles at any one time.
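To make the relationship between tile size and triangle coverage concrete, here is a minimal sketch of a tiled dispatch scheme. It rests on several assumptions of our own: the diagonal interleave used to map tiles to quads is hypothetical (ATI has not detailed the real mapping), a triangle's coverage is approximated by its screen-space bounding box, and the function names are purely illustrative.

```python
def owning_quad(x, y, tile_size, num_quads=4):
    """Return the index of the quad pipeline that owns pixel (x, y).

    The (tile_x + tile_y) % num_quads interleave is a hypothetical
    mapping chosen for illustration, not the hardware's actual scheme.
    """
    tile_x = x // tile_size
    tile_y = y // tile_size
    return (tile_x + tile_y) % num_quads


def quads_touched(bbox, tile_size, num_quads=4):
    """Return the set of quads whose tiles overlap a bounding box.

    bbox is (x0, y0, x1, y1) in pixels, inclusive, standing in for the
    screen coverage of one triangle (a simplification).
    """
    x0, y0, x1, y1 = bbox
    quads = set()
    for tile_y in range(y0 // tile_size, y1 // tile_size + 1):
        for tile_x in range(x0 // tile_size, x1 // tile_size + 1):
            quads.add((tile_x + tile_y) % num_quads)
    return quads


small_triangle = (0, 0, 7, 7)    # fits inside one 16x16 tile
large_triangle = (0, 0, 63, 63)  # spans many 16x16 tiles

small_at_16 = quads_touched(small_triangle, tile_size=16)
large_at_16 = quads_touched(large_triangle, tile_size=16)
large_at_64 = quads_touched(large_triangle, tile_size=64)
```

Under these assumptions the small triangle stays on a single quad, the large triangle is shared by all four quads at a 16 pixel tile size, and the same large triangle falls back onto one quad when the tile size is raised to 64 pixels.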
We'll take a look at the various elements of the pipeline in a little more detail.