Memory Interface, UVD2

As most ladies will tell you, rings aren't always the best thing to get or use: on one hand, they simplify certain aspects of life; on the other, they can generate much chaos and lead to missed opportunities if they're too visible at the wrong time. Whilst GPUs aren't exactly ladies, the ring bus that ATI implemented long ago in the R520, and expanded upon in the R6xx generation of cards, behaved pretty much like other rings that are less mouthwateringly geeky on paper.

It simplified routing to a certain extent and created a more even distribution of heat around the die, but it also opened the door to entropy. A ring bus is elegant in theory, and when you have a high number of clients and aim for a wide interface, it becomes a more natural solution than laying out a billion muxes for a crossbar (OK, not a billion). However, you pay a worst-case latency price, since a request may have to travel past several ring stops before it reaches its destination, and arbitration of memory accesses is more difficult: bad things happen when different clients want to place data onto the bus at the same time, or when packets queue up in the reservation stations. Overall, for GPUs at least, an optimised crossbar produces better performance than a ring bus if you've got the wire density to achieve it.
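
To put a rough number on that worst-case latency price, here is a minimal sketch of a toy hop-count model. It is not a description of ATI's actual ring: the stop counts, the function names and the idea of counting a crossbar traversal as a single hop are all our own simplifying assumptions, and arbitration and queueing are ignored entirely.

```python
# Toy model only: stop counts and function names are hypothetical,
# and real arbitration/queueing delays are not modelled.

def ring_hops(src, dst, stops, bidirectional=True):
    """Number of ring stops a request passes between src and dst."""
    forward = (dst - src) % stops
    if not bidirectional:
        return forward
    # On a bidirectional ring the request takes the shorter way round.
    return min(forward, stops - forward)


def worst_case_ring_hops(stops, bidirectional=True):
    """Largest hop count seen from stop 0 to any other stop."""
    return max(ring_hops(0, dst, stops, bidirectional) for dst in range(stops))


def crossbar_hops(src, dst):
    """Assumption: a crossbar connects any client to any port in one stage."""
    return 1


if __name__ == "__main__":
    for stops in (4, 8, 16):
        print(
            f"{stops} stops: ring worst case "
            f"{worst_case_ring_hops(stops)} hops (bidirectional), "
            f"{worst_case_ring_hops(stops, bidirectional=False)} hops (one-way), "
            f"crossbar {crossbar_hops(0, stops - 1)} hop"
        )
```

In this model the ring's worst case grows with the number of stops while the crossbar stays flat, which is the trade-off described above; the crossbar's cost shows up in wiring rather than in hops.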

Apparently ATI shared this opinion, since they threw the ring away for the R7xx generation and implemented a throughput-oriented crossbar. In the case of RV740, this translates into two 64-bit memory controllers neatly placed near the bandwidth burners of doom -- the ROPs -- with the L2 cache also nearby. The less gourmand clients, such as the UVD2 engine and the PCI Express interface, hook into a centralised hub that interfaces with the memory controllers. Memory tiling has also been tweaked versus prior efforts, and both GDDR3 and GDDR5 are supported, although we've not yet seen a GDDR3 RV740-based SKU. ATI quotes an improvement of around 10% in peak bandwidth utilisation versus RV670 (and we'll have a chance to partially see if this is the case in some of our tests, albeit in a somewhat third-hand way).
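
For a feel of what two 64-bit controllers add up to, the usual peak-bandwidth arithmetic can be sketched as below. The controller count and width come from the text above; the per-pin data rate is a placeholder we made up for illustration, not a figure quoted here, and the function name is our own.

```python
# Sketch of the standard peak-bandwidth arithmetic.
# ASSUMPTION: the per-pin data rate below is a hypothetical placeholder.

def peak_bandwidth_gb_s(controllers, bits_per_controller, data_rate_gbps):
    """Theoretical peak bandwidth in GB/s.

    controllers          number of memory controllers
    bits_per_controller  interface width of each controller, in bits
    data_rate_gbps       effective data rate per pin, in Gbit/s
    """
    bus_bits = controllers * bits_per_controller
    return bus_bits / 8 * data_rate_gbps


if __name__ == "__main__":
    HYPOTHETICAL_DATA_RATE_GBPS = 3.2  # placeholder, swap in the real memory clock
    total = peak_bandwidth_gb_s(2, 64, HYPOTHETICAL_DATA_RATE_GBPS)
    print(f"2 x 64-bit at {HYPOTHETICAL_DATA_RATE_GBPS} Gbps/pin: {total:.1f} GB/s peak")
```

Bear in mind that this is a theoretical ceiling; the utilisation improvement ATI quotes is about how close real traffic gets to it, which this arithmetic says nothing about.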

Since we mentioned UVD, we'd better flesh that out a bit, as HD video is an important topic nowadays. RV740 features a slightly reworked UVD2 engine, although the rework hasn't been marketed all that much, if at all. There are some minor tweaks in place, mostly dealing with how the UVD silicon (which is pretty much an on-die integrated Xilleon processor) interacts with the hub and thus memory, as well as aspects of decode assist. To be honest, we've not been able to quantify the impact of these changes, but hey, we may lack the proper sensitivities. Other than that, it offers the whole slew of features that we've become accustomed to: H.264/VC-1/MPEG-2 decoding, post-processing (de-noise, de-interlace, scale/resize, contrast/edge enhancement) and dual-video stream/PiP support.

With that on the table, we've finished going through the architecture in theory, so now it's time to see if we can make it reach its theoretical limits via the medium of colourful bar graphs.