Many people may consider that 16 texture units will actually be the bottleneck for R580 in many current applications, and that had there been more texture units performance might have scaled up further. Obviously, in relation to R520, R580's texture units are going to see higher utilisation for a greater percentage of the time - is a 256-bit bus, at memory speeds only a little greater than the core clock speed, an inhibitor to sampling more textures?

[Eric Demers] Perhaps in some applications. We generally want to be balanced, where there is no single bottleneck and all units are operating all the time. By increasing the ALU ratio we certainly pushed more of the bottleneck onto the texture unit, and we do expect to be more texture limited than on R520. When we look at R520 performance on newer apps, we find that it's most often ALU limited. This shows up in new apps such as FEAR and others with strong shader content (even our own toy shop demo), where R580 performance is doubled over R520. That tells me that for newer apps we did the right thing and are more balanced now. Having more memory BW would certainly help us hit the next bottleneck, but memory technology doesn't evolve at the same rate as our core gfx does.
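
To put rough numbers on the bandwidth question, here's a back-of-the-envelope sketch in Python. The clocks are the X1900 XTX's published 650MHz core and 775MHz (1.55GHz effective) memory; the bytes-per-fetch figure is our own simplifying assumption, ignoring texture caching, compression and filtering, so treat the result as illustrative only.

    # Back-of-the-envelope: memory bandwidth vs. raw texture-fetch demand on R580.
    # The per-fetch byte cost is a simplification that ignores texture caching,
    # compression and filtering ratios, so this is a worst-case illustration.
    BUS_WIDTH_BITS = 256
    MEM_CLOCK_HZ = 775e6        # GDDR3; DDR signalling gives two transfers per clock
    CORE_CLOCK_HZ = 650e6
    TEXTURE_UNITS = 16
    BYTES_PER_FETCH = 4         # one 32-bit texel, assuming no cache reuse

    bandwidth = (BUS_WIDTH_BITS / 8) * MEM_CLOCK_HZ * 2              # ~49.6 GB/s
    tex_demand = TEXTURE_UNITS * CORE_CLOCK_HZ * BYTES_PER_FETCH     # ~41.6 GB/s

    print(f"Available memory bandwidth: {bandwidth / 1e9:.1f} GB/s")
    print(f"Worst-case texture demand:  {tex_demand / 1e9:.1f} GB/s")

Even this crude model shows uncached texture traffic alone approaching the bus limit, which is the sense in which extra texture units would simply queue behind the same memory bottleneck.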

[Richard Huddy] If you're bandwidth limited, having more texture units there doesn't help at all - you're still bandwidth limited - and that's the judgement that we've made. We'll all want to really revise our current architectures when we go to Vista, not just now, but it is interesting, this idea of whether you should have more than 16 texture units, and it's something we think we're right on, because we're not saying that we've got to double that next generation. We can think about different ratios rather than 16:48, but there probably isn't another sensible integer ratio, other than 2:1, and then we'd be under-spec.
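
That 16:48 ratio argument can be made concrete with a simple limiter model: a shader's pixel throughput is capped by whichever unit type it saturates first. A minimal sketch follows (the shader instruction counts are hypothetical, and the model ignores latency hiding, threading and co-issue):

    # Simple limiter model for R580's 3:1 ALU:TEX ratio (illustrative only;
    # ignores latency hiding, threading and co-issue).
    ALU_UNITS, TEX_UNITS = 48, 16

    def limiting_unit(alu_instructions, tex_fetches):
        """Which unit caps pixel throughput for a shader mix (assumes >= 1 fetch)?"""
        alu_rate = ALU_UNITS / alu_instructions   # pixels/clock if ALU-bound
        tex_rate = TEX_UNITS / tex_fetches        # pixels/clock if TEX-bound
        return ("ALU", alu_rate) if alu_rate < tex_rate else ("TEX", tex_rate)

    # A shader doing 12 ALU ops per texture fetch is ALU-limited on a 3:1 part:
    print(limiting_unit(alu_instructions=12, tex_fetches=1))   # ('ALU', 4.0)
    # A legacy 1:1 shader leaves two-thirds of the ALUs idle:
    print(limiting_unit(alu_instructions=1, tex_fetches=1))    # ('TEX', 16.0)

On this model, adding texture units only pays off for shaders running fewer than three ALU ops per fetch - exactly the workloads both interviewees argue are fading.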

The ATI white paper on X1900 / R580 makes the statement that a part needs to be designed to be best balanced towards the applications it will be running when it's available. Obviously this is key in the design phase, but with development timescales such as they are it would suggest that you are trying to predict application usage 1-3 years down the line, which, even with the best research, developer contact and developer steering, must still end up with an element of "crapshoot" to it?

[Eric Demers] We talk to ISVs to get an idea of what they think will happen in the next few years. As well, we look at shaders for titles coming out in the next year or two. With both of those pieces of data, we can get a pretty good idea of what the newest titles will be at introduction time. Those are definitely the types of titles we shoot for, much more so than the older titles (which people will play less of anyway by then). Having said that, there is some guesswork involved as well. But it's also a chicken-and-egg thing, in that ISVs will tell us what they are doing, but they will also be influenced to design games with our new technology in mind. If we come out with 3:1 ALU:TEX ratio HW, then designers will tend to put more ALU work into their next games, and so it's a mutually influenced evolution.

[Richard Huddy] We tell games developers where we're going and they're happy with the ratio. Most of them don't do that kind of detailed analysis though; it's up to us to make sure we've got their early shaders.

In the case of R520 to R580, though, in less than four months (OK, perhaps it should have been a little longer, given the issues R520 suffered) you've tripled the Pixel Shader ALU processing capability, yet applications will scale their ALU utilisation upwards far more gradually. Such a large jump in such a short period tends to lead one to the conclusion that either R520 or R580 wasn't best balanced towards the applications available at its release, at least in terms of ALU utilisation - can you give us some insight into the development decisions of the R5xx series which led to R520 being released with the same ALU-to-texture ratio as all of ATI's desktop parts since R300, then suddenly jumping with R580?

[Eric Demers] When we designed the R520, we were already starting the design of the R580. The R580 (X1900) was designed both to be the next high end after R520 and to be more balanced overall for the newest applications. We designed the R520 first with a Spring 05 introduction target, and it was a reasonable design for that time that introduced many new technologies. We might have done a 2x ALU design, but there was already so much new technology in R520 that we decided against it. But it ended up being delayed, and so lost some of its steam due to that. I feel that for the current newest applications today, R580 is the better balanced part (given that it often doubles the performance of R520, I think that's reasonable). But in Spring 05, R520 would have rocked :-)

[Richard Huddy] It depends on when you measure things. If you compare R580 against R520 right now you might find they are very similar on some applications, because those applications perfectly suit the R520 balance, but six months down the line games will have moved more towards R580. Take the same hardware now and measure it six months or a year from now, and the percentage difference between them will increase, because this is a forward-looking chip rather than one just trying to hit today and yesterday. However, already we see that clock-for-clock the R580 wins on every application; it never loses and, indeed, it doesn't even draw. So the evidence is that more math power is the right way to go. It might draw on something like Quake 3 or GL Quake, but then who cares? We already have the frame-rates there; the stuff to focus on is future games.