NVIDIA Tesla: GPU computing gets its own brand
Wednesday 20th June 2007, 08:30:00 PM, written by Rys
Being able to witness the earnest birth of a new computing industry is
a special thing. In the last few years, efforts to use programmable
commodity graphics hardware for things other than graphics have gained
pace, and now there's a legit multi-million dollar industry surrounding
it.
Today sees NVIDIA further legitimise what they call GPU computing as they introduce Tesla, their third brand based around discrete PC GPU production. Using the famous G80 graphics processor, the first round of Tesla products joins GeForce and Quadro in their product ranks. CUDA is the conduit of course, and we have an article that lets you know what Tesla is all about.
We've also got interviews with Andy Keane, General Manager of the GPU Computing Group at NVIDIA, and Dave Kirk, Chief Scientist, regarding Tesla, CUDA, the future of GPU computing, and better incorporating parallel programming into education, among myriad other things.
Following CUDA's 1.0 release in a week or so, we'll cover that and interview Ian Buck, Software Manager for CUDA, to talk more about the software side.
Tagging
nvidia, tesla, gpu computing, gpgpu, cuda, kirk, keane
Related nvidia News
CUDA 4.0 and Parallel Nsight 2.0 released
NVIDIA Fermi GPU and Architecture Analysis
NVIDIA's Parallel Nsight finally released
NVIDIA GeForce GTX 460 - GF104 breaks cover
PhysX87, ancient tragedy in 5 acts by RWT
So long, Chris, and thanks for all the fish
NVIDIA GF100 graphics architecture details
NVIDIA Fermi: new GPU architecture, starting with GF100
NVIDIA release OpenCL GPU drivers for Linux and Windows
NVIDIA GeForce GTX 275 at $250 to fight HD 4890


Depending on how it is implemented, if part of the extra FP64 cost could be neatly split off as a separate logic block (I'm really not sure how that'd work, but heh), then they could just make it redundant. That way, even if it increased die size by 5%, it wouldn't affect yields, only the number of chips on the wafer.
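To put rough numbers on that trade-off, here is a minimal sketch using a simple Poisson yield model and a common first-order dies-per-wafer estimate. Every figure in it (wafer diameter, base die area, defect density) is a hypothetical placeholder rather than data for G92 or any real chip, and the helper names are made up for illustration.

```python
import math

def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """First-order estimate: wafer area over die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2.0
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(gross - edge_loss)

def poisson_yield(critical_area_mm2: float, defects_per_mm2: float) -> float:
    """Poisson model: probability that no killer defect lands in the critical area."""
    return math.exp(-critical_area_mm2 * defects_per_mm2)

# Hypothetical numbers, chosen only to illustrate the argument.
WAFER_MM = 300.0
BASE_AREA = 300.0             # mm^2, die without the extra FP64 logic
FP64_AREA = 0.05 * BASE_AREA  # the 5% growth mentioned above
DEFECTS = 0.002               # defects per mm^2 (assumed)

candidates = dies_per_wafer(WAFER_MM, BASE_AREA + FP64_AREA)

# Case A: a defect in the FP64 block kills the chip, so the whole die is critical.
good_if_critical = candidates * poisson_yield(BASE_AREA + FP64_AREA, DEFECTS)

# Case B: the FP64 block is redundant (a defective block is simply disabled),
# so only the base area counts towards yield.
good_if_redundant = candidates * poisson_yield(BASE_AREA, DEFECTS)

print(f"candidate dies per wafer: {candidates}")
print(f"good dies, FP64 critical:  {good_if_critical:.1f}")
print(f"good dies, FP64 redundant: {good_if_redundant:.1f}")
```

In case B the yield fraction matches that of the smaller die, and the only loss is the lower count of candidate dies per wafer, which is the point being made above.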
As I said, I completely fail to see how that kind of division could be implemented, but perhaps an EE would have a better idea of whether it is possible. Another thing to take into consideration is that only G92 will be sold as a GPGPU part. G94 and G98 will not. So this could just mean they're not including FP64 on the die of G94 and G98.
In theory, you could create test vectors that don't cover the lower bits of multipliers, adders, etc., and a strap that ties the outputs of the LSBs to zero. So the unnecessary logic wouldn't lower yield.
In practice, it may not be worth the effort.
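As a toy illustration of the strap idea, the sketch below models a multiplier whose low output bits are tied to zero and injects a stuck-at-1 fault on a single output bit. It is a deliberately simplified software model, not how production test or DFT tooling works; the function name, bit widths, and the output-level fault model are all assumptions.

```python
from typing import Optional

def strapped_multiply(a: int, b: int, width: int = 16, masked_lsbs: int = 4,
                      stuck_bit: Optional[int] = None) -> int:
    """Multiply, optionally inject a stuck-at-1 output fault, then apply the strap."""
    full_mask = (1 << width) - 1
    product = (a * b) & full_mask
    if stuck_bit is not None:
        product |= 1 << stuck_bit             # hypothetical stuck-at-1 on one output bit
    strap = full_mask & ~((1 << masked_lsbs) - 1)
    return product & strap                    # strap ties the low bits to zero

good = strapped_multiply(3, 5)
fault_in_masked_bits = strapped_multiply(3, 5, stuck_bit=2)
fault_in_used_bits = strapped_multiply(3, 5, stuck_bit=8)

# A fault confined to the strapped-off LSBs is invisible at the output,
# so a die with that defect could still be sold with the bits disabled.
print(good == fault_in_masked_bits)
# A fault in the bits that are still in use changes the result and must be caught.
print(good == fault_in_used_bits)
```

The test vectors for such a part would then only need to check the bits above the strap.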
Also, stuck-at DFT coverage is typically in the high 99% range, say 99.6%. That's high, but it still leaves millions of connections uncovered. However, one counts on the fact that, statistically, a fault will also impact surrounding locations and that this increases effective coverage.
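The arithmetic behind that can be sketched quickly. The 99.6% figure is the one quoted above; the total number of fault sites and the assumption that a defect hits several nodes independently are illustrative guesses, not measured values.

```python
def uncovered_faults(total_fault_sites: int, coverage: float) -> int:
    """Number of fault sites the test vectors never exercise."""
    return round(total_fault_sites * (1.0 - coverage))

def escape_probability(coverage: float, nodes_hit: int) -> float:
    """Chance that a defect touching `nodes_hit` nodes misses every covered one.

    Assumes each affected node is covered independently, which is a simplification.
    """
    return (1.0 - coverage) ** nodes_hit

COVERAGE = 0.996           # figure quoted above
FAULT_SITES = 500_000_000  # hypothetical count for a large GPU

print(uncovered_faults(FAULT_SITES, COVERAGE))
for nodes in (1, 2, 3):
    print(nodes, escape_probability(COVERAGE, nodes))
```

With these assumed numbers, 0.4% of the fault sites still amounts to a couple of million, yet the chance that a defect spanning even two or three neighbouring nodes goes undetected drops off very quickly.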
For an ALU, the unnecessary logic is probably placed very close to necessary logic. Masking out unused logic may indirectly also mask real problems in useful logic.