Conclusion


So, is GF100 a new NV30? Absolutely not: in our opinion it's great hardware that performs well in general and implements some interesting architectural choices. Is the architecture an orgasmic changer of the computing landscape? Absolutely not, irrespective of what you might've been told: it's not a monumental departure on any front (even the GEOD, which we like quite a bit, doesn't qualify for that).

What it is is a good GPU, with competitive performance, some advantages over its rival and some weaknesses against it. Our experience with the GTX 470 has been quite lovely, to be honest, although we'd not have run out to buy one were it not for our job as unsampled, cast-out internet hacks, considering we were already equipped with a 5870. If you're in the market for a card in that price bracket, though, we consider both equally valid options, and no, we don't know anything about Cayman or NVIDIA's next parts that we want to share just yet.

Fermi's biggest problem was its time to market, and its biggest strength is NVIDIA's software prowess. Many people, intentionally or not, muddle this distinction by using things like CUDA as an argument for hardware superiority -- make no mistake, CUDA is an outstanding software product, one over which many talented software engineers have sweated blood.

Great performance in the newest games is owed in large part to superior developer relations, rather than to some intrinsic, mystical hardware unit that exists only in NVIDIA's products -- we're sure we'll get some hate-mail over this statement, but we're at a point where we can call things as they are rather than cater to anyone's sensibilities.

And all of this put together underlines a truth that some still choose to ignore: great hardware is nothing without software. We dare say that NVIDIA's great software stack made Fermi look a bit better competitively than its hardware alone would have allowed (just look at sheer measured throughput!) -- luckily for them, it appears that this particular situation won't be changing soon, with no competitive pressure on the horizon on the software front.

We think that, going forward, the major challenge for NVIDIA's architects will be to trim down all of the nice ideas they've implemented so that they fit into the more palatable die-size and thermal envelopes required by smaller chips, and to smooth off some of the rough edges. More L2 bandwidth, better GDDR5 memory controllers and better atomics performance regardless of where the target data sits in the memory hierarchy all spring to mind as viable avenues to pursue, and all would benefit both graphics and compute workloads.
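To make the atomics point a bit more concrete, here's a minimal CUDA sketch of our own (not taken from NVIDIA or from our benchmarks) of the classic histogram trade-off: the first kernel fires one global-memory atomic per input element, which Fermi-class chips resolve down at the L2, while the second keeps its atomics in shared memory, where the SM handles them on-chip, and only flushes per-block totals to global memory at the end. The kernel names, the 256-bin size and the grid-stride loop are illustrative choices on our part.

    // Two ways to build a 256-bin histogram. The first hammers global
    // memory with atomics (serviced at the L2 on Fermi-class parts);
    // the second stages counts in shared memory, where atomics stay
    // inside the SM, and flushes once per block.
    #include <cuda_runtime.h>

    #define NUM_BINS 256

    __global__ void histogram_global_atomics(const unsigned char *data,
                                             unsigned int *bins, int n)
    {
        int idx = blockIdx.x * blockDim.x + threadIdx.x;
        int stride = gridDim.x * blockDim.x;
        for (int i = idx; i < n; i += stride)
            atomicAdd(&bins[data[i]], 1u);   // one global atomic per element
    }

    __global__ void histogram_shared_atomics(const unsigned char *data,
                                             unsigned int *bins, int n)
    {
        __shared__ unsigned int local[NUM_BINS];
        for (int b = threadIdx.x; b < NUM_BINS; b += blockDim.x)
            local[b] = 0;
        __syncthreads();

        int idx = blockIdx.x * blockDim.x + threadIdx.x;
        int stride = gridDim.x * blockDim.x;
        for (int i = idx; i < n; i += stride)
            atomicAdd(&local[data[i]], 1u);  // atomic resolved on-chip, in the SM
        __syncthreads();

        // One global atomic per bin per block instead of one per element.
        for (int b = threadIdx.x; b < NUM_BINS; b += blockDim.x)
            atomicAdd(&bins[b], local[b]);
    }

The performance gap between those two kernels is, broadly speaking, the cost of sending every single increment out to the L2, which is exactly why making atomics less sensitive to where their target sits in the memory hierarchy would be such a welcome refinement.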

In closing, Beyond3D would like to thank Splash Damage's Dean Calver, Intel's Andrew Lauritzen, The Internets' Farhan Ali and our own Alex Goh, Richard Connery and William Stumpf for their support and understanding. Thanks guys!
