Beyond3D - AMD RV610 and RV630 Introduction

AMD RV610 and RV630 Introduction - Page 2

Published on 28th Jun 2007, written by Rys for Consumer Graphics - Last updated: 28th Jun 2007

Given the building blocks available in R600, in terms of unit configuration, redundancy and the availability of UVD, we can build up a decent picture of the two other GPUs in the current family, at least from our somewhat high-level perspective.

Unit config-wise, AMD are able to scale their latest-generation products not just in the number of clusters (or SIMDs, if your name is Dave Baumann) a chip has for general shading, but also in the number of ALUs per cluster. They can do that while maintaining a redundancy scheme that's broadly "working units you want plus one", too, for yields and other reasons related to making the chips a manufacturing goodness.

Add to that scaling in terms of on-chip memories (some are per cluster, some aren't) and ROPs, and we can imagine RV610 and RV630 shaking out as they did as AMD sought to fit a configuration into an area and power budget (affecting clocks and price and all the other variables in The Grand Dice Equation). The process used to create RV610 and RV630 is TSMC's 65G+ offering, which isn't their most balls-to-the-wall application of 65nm transistor technology, but it does offer some really appealing power characteristics and clock/voltage options, making it the best choice for production.

What AMD have finally produced is as follows. We'll do diagrams soon, not least for those that love to give their printer ink supply/eyes a workout, so keep your lids peeled back and firmly on beyond3d.com in order to spot those when they appear.

RV610

Starting with the baby of the family, RV610 features two clusters/SIMDs, but each one is a quarter the size of those you'll find inside R600. That means four ALUs per SIMD, each with the 5-way scalar sub-unit setup that R600 sports: 20 scalar ALUs per SIMD, then, giving 40 in total (one eighth of R600 in terms of unit count for general shading). After that, there's one of R600's four data samplers, one of its four quad ROPs and some tweaking of the on-chip memories (the texture and vertex transform caches are shared now, where they're discrete elsewhere in the family). The external memory bus is 64 bits wide (one of R600's eight channels), and the setup rate is 0.5 tri/clk.
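For those who like to check the sums, here's a minimal Python sketch of the unit-count arithmetic above. The RV610 figures come straight from the text. The R600 figures (four SIMDs of sixteen ALUs, a 512-bit bus) are not stated outright here but inferred from the "one quarter", "one eighth" and "one of eight channels" ratios, so treat them as assumptions. The `ShaderConfig` class and its field names are purely illustrative.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ShaderConfig:
    """Illustrative container for the shader and bus figures discussed above."""
    simds: int               # clusters/SIMDs on the chip
    alus_per_simd: int       # 5-way ALUs in each SIMD
    bus_width_bits: int      # external memory bus width
    lanes_per_alu: int = 5   # scalar sub-units per ALU

    @property
    def scalar_alus_per_simd(self) -> int:
        return self.alus_per_simd * self.lanes_per_alu

    @property
    def scalar_alus(self) -> int:
        return self.simds * self.scalar_alus_per_simd


# RV610: figures as given in the article.
RV610 = ShaderConfig(simds=2, alus_per_simd=4, bus_width_bits=64)

# R600: assumed values, inferred from the ratios quoted in the article.
R600 = ShaderConfig(simds=4, alus_per_simd=16, bus_width_bits=512)

shader_ratio = Fraction(RV610.scalar_alus, R600.scalar_alus)
bus_ratio = Fraction(RV610.bus_width_bits, R600.bus_width_bits)

print(f"RV610 scalar ALUs per SIMD: {RV610.scalar_alus_per_simd}")
print(f"RV610 scalar ALUs total:    {RV610.scalar_alus}")
print(f"Shader units vs R600:       {shader_ratio}")
print(f"Bus width vs R600:          {bus_ratio}")
```

At equal memory clocks the bus-width ratio is also the bandwidth ratio, which is where the "an eighth of R600" shorthand comes from.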

All of that runs to 180M transistors on the aforementioned 65G+ process variant at TSMC, and a 7.7x10.7mm die. Two on-chip output engines account for part of those 180 million. The first mates to a dual-link DVI TMDS, the second can drive one link, and all links are HDCP protected with on-chip crypto keys. Analogue outputs are presented by the built-in Xilleon HDTV encoder, which takes care of scaling. The HDMI output supports 1080p, the analogue outputs are driven by a pair of 400MHz DACs and support 2Kx1.5K, and the video output stages support up to 10 bits of colour resolution for each and every output pixel.
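As a rough back-of-the-envelope check on those die figures, the sketch below works out area and average transistor density. It assumes the quoted 7.7x10.7mm dimensions describe a simple rectangle covering the whole die, and that the 180M count is exact, neither of which is guaranteed. The variable names are ours.

```python
# Die figures as quoted in the article.
DIE_WIDTH_MM = 7.7
DIE_HEIGHT_MM = 10.7
TRANSISTORS = 180_000_000

# Assumption: the die is a plain rectangle with these outer dimensions.
die_area_mm2 = DIE_WIDTH_MM * DIE_HEIGHT_MM

# Average density across the whole die, in millions of transistors per mm^2.
density_m_per_mm2 = TRANSISTORS / die_area_mm2 / 1e6

print(f"Die area: {die_area_mm2:.2f} mm^2")
print(f"Density:  {density_m_per_mm2:.2f} M transistors/mm^2")
```

That's an average only: logic, caches and I/O pads pack very differently, so it says nothing about any one block on the chip.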

The motion video block supports the latest popular formats in terms of decode, and while AMD's documentation mentions transcoding, as far as we're aware nothing has been exposed there and there are no plans to do so, either. So that's VC-1, H.264, H.263 and related variants all supported by the hardware decoder, which on RV610 is still capable of supporting full-rate H.264 AVC (the hardest format to decode that the hardware supports) from a Blu-ray movie disc at some 50Mib/sec. The shader core can assist for performance and quality reasons too, if need be.

UVD and Avivo could take up an entire paper or two's worth of discussion, but the salient points AMD would like mentioned are the entropy decode for VC-1 which NVIDIA don't do in their latest video silicon, and that the post processor logic and software support are very high quality. We'll take a look at that in due course.

So what you have in front of you with RV610 is a tiny little chip with a video focus that outstrips its gaming aspirations, or so you could reasonably argue anyway. An eighth of R600's shader core and memory bandwidth at the same clocks was never going to set the performance world alight, with AMD's two-way unit scaling allowing them some scaling in front-end logic in order to create the perf/area product they think the market wants.

We'll have to leave it there in terms of architecture discussion, so we'll pick it back up on the forums if you're keen. Remember that the chip front-end still features the same tessellator and that the rendering features offered by the family are static on all chips, so every one supports D3D10 and related goodies. Oh, and the native output is PCI Express, before we forget.
