Architecture Summary

Well well, graphics fans, it's finally here! Years in the making for AMD, via the hands of 300 or so engineers, hundreds of millions of dollars in expenditure, and unfathomable engineering experience from the contributing design teams at AMD, R600 finally officially breaks cover. We've been thinking about the architecture and GPU implementations for nearly a year now in a serious fashion, piecing together the first batches of information sieved from yon GPU information stream. As graphics enthusiasts, it's been a great experience to finally get our hands on it and put it through the mill of an arch analysis, after all those brain cycles spent thinking about it before samples were plugged in and drivers installed.

So what do we think, after our initial fumblings with the shader core, texture filter hardware and ROPs? Well, we've not yet been able to look at what are arguably the most interesting bits and pieces that the GPU and the boards that hold it provide, whether for time reasons, resource reasons, or because they simply fall outside this article's remit! That's not to say things like the HDMI implementation and the tessellator overshadow the rest of the chip and architecture, but they're significant possible selling points that'll have to await our judgement a little while longer.

What remains is a pretty slick engineering effort from the guys and gals at AMD's Graphics Products Group, via its birth at the former ATI. What you have is evolution rather than revolution in the shader core, AMD taking the last steps to a fully superscalar design with independent 5-way ALU blocks and a register file with seemingly no real-world penalty for scalar access. That's backed up by sampler hardware with new abilities and supported formats to chew on, with good throughput for common multi-channel formats. Both the threaded sampler and shader blocks are fed and watered by an evolution of AMD's ring-bus memory controller. We've sadly not been able to go into too much detail on the MC, but mad props to AMD for building a 1024-bit bi-directional bus internally, fed by a 16-piece DRAM setup on the 512-bit external bus.

Who said the main IHVs would never go to 512? AMD have built that controller in the same area as the old one, too (whoa, although that's helped by the process change). Using stacked pads and an increase in wire density, and with the wide bus affording them the use of slower memory (which is more efficient due to clock delays when running at higher speeds), R600 in HD 2900 XT form gets to suck over 100GB/sec of peak theoretical bandwidth from the memories. That's worth a tip of an engineer's hat any day of the week.
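
For a rough feel for where a peak figure like that comes from, here's a back-of-the-envelope sketch in Python. The 512-bit external bus and the 16 DRAM devices come from the text above. The 828MHz GDDR3 memory clock and the two-transfers-per-clock multiplier are our assumptions for HD 2900 XT rather than figures stated in this summary, and the function name is made up purely for illustration.

```python
def peak_bandwidth_gb_per_sec(bus_width_bits, memory_clock_mhz, transfers_per_clock=2):
    """Peak theoretical memory bandwidth in GB/sec (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_sec = memory_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_sec / 1e9


EXTERNAL_BUS_BITS = 512         # from the article
DRAM_DEVICES = 16               # from the article
ASSUMED_MEMORY_CLOCK_MHZ = 828  # assumption: not stated in this summary

# Each DRAM device gets an equal slice of the external bus.
bits_per_device = EXTERNAL_BUS_BITS // DRAM_DEVICES

# Double data rate memory moves data twice per clock (the default multiplier).
bandwidth = peak_bandwidth_gb_per_sec(EXTERNAL_BUS_BITS, ASSUMED_MEMORY_CLOCK_MHZ)

print(f"{bits_per_device} bits per DRAM device")
print(f"{bandwidth:.1f} GB/sec peak theoretical bandwidth")
```

Under those assumptions each DRAM sits on a 32-bit slice of the bus and the peak lands a little over 100GB/sec. It's a theoretical ceiling only, and says nothing about what the ring bus actually sustains in a real workload.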

Then we come to the ROP hardware, designed for high performance AA with high precision surface formats, at high resolution, with an increase in the basic MSAA ability to 8x. It's here that we see the lustre start to fade slightly in terms of IQ and performance, with no fast hardware resolve for tiles that aren't fully compressed, and a first line of custom filters that have a propensity to blur more often than not. Edge detect is honestly sweet, but the CFAA package feels like something tacked on recently to paper over the cracks, rather than something forward-looking (we'll end up at the point of fully-programmable MSAA one day in all GPUs) to pair with speedy hardware resolve and the usual base filters. AMD didn't move the game on in terms of absolute image quality when texture filtering, either. They're no longer leaders in the field of IQ, having been overtaken by NVIDIA's GeForce 8-series hardware.

Coming back to the front of the chip, the setup stage is where we find the tessellator. Not due to be part of a formal DirectX spec until next time around with DX11, it exists outside of the main 3D graphics API of our time. We hope the ability to program it reliably comes sooner rather than later, since it's a key part of the architecture and didn't cost AMD much area. We'll have a good look at the tessellator pretty soon, working with AMD to delve deep into what the unit's capable of.

With a harder-to-compile-for shader core (although one with monstrous floating point peak figures), less per-clock sampler ability for almost all formats and channel widths, and a potential performance bottleneck with the current ROP setup, R600 has heavy competition in HD 2900 XT form. AMD pitch the SKU not at (or higher than) the GeForce 8800 GTX as many would have hoped, but at the $399 (and that's being generous at the time of writing) GeForce 8800 GTS 640MiB. And that wasn't on purpose, we reckon. If you'd asked ATI a year ago what they were aiming for with R600, the answer would have been simple domination over NVIDIA at the high end, as always.

While we take it slow with our analysis (and it's one where we've yet to heavily visit real world game scenarios, DX10 and GPGPU performance, video acceleration performance and quality, and the cooler side facets like the HDMI solution), the Beyond3D crystal ball doesn't predict the domination that ATI would have predicted a year or more ago. Early word from colleagues at HEXUS, The Tech Report and Hardware.fr is of mixed performance that's 8800 GTS-esque or thereabouts overall, but also sometimes less than Radeon X1950 XTX in places. Our own early figures there show promise for AMD's new graphics baby, but not everywhere.

It's been a long time since anyone's been able to say that about a leading ATI, now AMD, graphics part. We'll know a fuller story as we move on to looking at IQ and performance a bit closer, with satellite pieces to take in the UVD and HDMI solution and the tessellator to come as well. However, after our look at the base architecture, we know that R600 has to work hard for its high-quality, high-resolution frames per second, but we also know AMD are going to work hard to make sure it gets there. We really look forward to the continued analysis of a sweet and sour graphics architecture in the face of stiff competition, and we'll have image quality analysis for you in a day or two to keep things rolling. RV610 and RV630 details will follow later today.

What do you make of R600? Let us know in the forums.