With 512MB and 1GB boards out there, memory footprint issues must be considered, but they are generally secondary to the performance of the functional units. I remember when we decided to focus on FP16 as “the basic” unit for most things on the R600. At the time, HDR rendering was just starting, but it certainly seemed the way of the future. Not only for floating point “image” textures, but also for all types of floating point textures, such as normal maps and other more complex floating point data sets. In retrospect, the industry is probably moving a little slower than we expected, though I would say that games like Oblivion in 2006, and a few of the upcoming titles, are really turning the tide, at least at the avant-garde of games. I'd like to think that our chips are very forward looking.
The top two things that come to mind are a stencil buffer that is separate from Z, and hierarchical stencil (HiS). The first one prevents Z decompression from occurring as the stencil buffer gets used and updated. This was a prime cause of slowdowns in Doom3 on the R5xx, for example. That's gone now. As well, HiS, just like HiZ in the past, allows for skipping whole tiles of pixels when appropriate. This has led to some significant clock-for-clock improvements on stencil-heavy operations, compared to previous generations. In-game improvements of 30%, or even up to 100%, have been seen for games that rely heavily on stencil shadows.
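To illustrate the idea of skipping whole tiles, here is a minimal Python sketch of a hierarchical stencil reject. It is not how the R600 implements HiS; the tile size, the per-tile min/max summary, the EQUAL-only test, and every name in it (`TILE`, `build_summaries`, `stencil_equal_pass`) are assumptions made for the example. Each tile keeps the lowest and highest stencil value it contains, and a tile whose range cannot contain the reference value is rejected without touching any of its pixels.

```python
TILE = 8  # hypothetical tile edge in pixels; real hardware tile sizes may differ


def tile_pixels(tx, ty, width, height, tile=TILE):
    """Yield the (x, y) coordinates covered by tile (tx, ty), clipped to the buffer."""
    for y in range(ty * tile, min((ty + 1) * tile, height)):
        for x in range(tx * tile, min((tx + 1) * tile, width)):
            yield x, y


def build_summaries(stencil, tile=TILE):
    """Return {(tx, ty): (lo, hi)}: the min and max stencil value inside each tile."""
    height, width = len(stencil), len(stencil[0])
    summaries = {}
    for ty in range((height + tile - 1) // tile):
        for tx in range((width + tile - 1) // tile):
            values = [stencil[y][x] for x, y in tile_pixels(tx, ty, width, height, tile)]
            summaries[(tx, ty)] = (min(values), max(values))
    return summaries


def stencil_equal_pass(stencil, ref, tile=TILE):
    """Run a 'stencil == ref' test over the whole buffer.

    Returns (pixels_tested, pixels_passed). Tiles whose [lo, hi] range
    excludes ref are skipped entirely, which is the hierarchical saving.
    """
    height, width = len(stencil), len(stencil[0])
    tested = passed = 0
    for (tx, ty), (lo, hi) in build_summaries(stencil, tile).items():
        if ref < lo or ref > hi:
            continue  # no pixel in this tile can equal ref: reject the tile
        for x, y in tile_pixels(tx, ty, width, height, tile):
            tested += 1
            if stencil[y][x] == ref:
                passed += 1
    return tested, passed


# A 16x16 buffer of zeros with a single pixel set to 1 in the top-left tile.
buffer_16 = [[0] * 16 for _ in range(16)]
buffer_16[3][3] = 1
result_ref1 = stencil_equal_pass(buffer_16, ref=1)
```

In this toy buffer, a test against reference value 1 only has to examine the one tile that contains it, so three of the four tiles are skipped. In real hardware the summary would be kept up to date as the stencil buffer is written rather than rebuilt for every test; rebuilding it here just keeps the sketch short.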
All the derivatives announced are TSMC 65G+ process chips. I won't comment on future chips, but we are continuously re-evaluating processes, and we work a lot with our manufacturing friends, such as TSMC, to get the very best process we can.
I can't help but be a little disappointed that we did not have enough time to get more optimizations into our drivers for the launch. I still cringe when I see poor performance (especially if it's compared to, say, our previous generation products), and I send daily emails to the performance and driver teams, begging for this or that. In fact, I do believe that they all hate me now. They should join the club.
Also, on the feature side, we weren't able to expose the adaptive edge detect CFAA modes to the public, or even to the press, until very late in the review process. This means that most reviewers did not have a chance to look at the amazing quality this mode offers. There is nothing comparable to it out there.
We also had some last-minute performance possibilities, which are always at odds with stability, and we did not have enough time to get those tested and integrated for launch. In the latest driver we see some 2x to 3x improvement in adaptive AA performance, for example, which is great but came later than I would have liked. But, I'll admit, there's still so much to do that I haven't really spent that much time on reviews and such. The reality is that I expect things to continue to improve and be much better in a few months.