The Need for Education


What people are going to discover, though, is that CUDA is hard. Writing the code isn't hard--CUDA really is just C with a few added keywords--but designing algorithms that truly exploit the architecture can be fantastically difficult. One concern NVIDIA has is that computer science students won't get enough training in parallel algorithms and massively parallel architectures to make the best use of CUDA. That concern certainly isn't unjustified: a year ago, mentioning a massively parallel architecture meant talking about a supercomputing cluster. Now, some of the same difficulties in designing software for a cluster apply to every G8x chip.
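
For readers who haven't seen CUDA, here's a minimal sketch (our own illustration, not NVIDIA's code) of what "just C with a few added keywords" looks like in practice: a data-parallel vector addition. The __global__ qualifier, the built-in thread and block indices, and the <<<blocks, threads>>> launch syntax are essentially all that separates it from plain C. The hard part, as noted above, is structuring real algorithms so that thousands of threads like these stay busy.

    #include <stdio.h>
    #include <cuda_runtime.h>

    // Kernel: each thread adds one pair of elements.
    __global__ void vecAdd(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            c[i] = a[i] + b[i];
    }

    int main(void)
    {
        const int n = 1024;
        const size_t bytes = n * sizeof(float);
        float h_a[1024], h_b[1024], h_c[1024];
        for (int i = 0; i < n; i++) { h_a[i] = i; h_b[i] = 2.0f * i; }

        // Allocate device memory and copy the inputs over.
        float *d_a, *d_b, *d_c;
        cudaMalloc((void **)&d_a, bytes);
        cudaMalloc((void **)&d_b, bytes);
        cudaMalloc((void **)&d_c, bytes);
        cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

        // Launch 4 blocks of 256 threads each to cover all 1024 elements.
        vecAdd<<<4, 256>>>(d_a, d_b, d_c, n);

        // Copy the result back (this also waits for the kernel to finish).
        cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
        printf("c[42] = %f\n", h_c[42]); // expect 126.0

        cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
        return 0;
    }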

To try to improve the situation, David Kirk, NVIDIA's chief scientist, taught a class at UIUC on data-parallel programming using CUDA, the previously mentioned ECE 498AL. All of the materials for the class, including lecture slides, audio recordings of every lecture, and assignments, are freely available. We've gone through all of them to get a better understanding of CUDA ourselves, and we highly recommend the course to anyone interested in data-parallel programming or CUDA. NVIDIA, and Kirk in particular, is hoping that ECE 498AL can serve as a template for data-parallel programming classes at other universities.

Of course, many will claim that NVIDIA is promoting classes in data-parallel programming as a way to push CUDA, and there's certainly some element of truth to that. The problem with that view, though, is the lack of other widely available massively parallel machines. As Kirk told us, it's not possible to be entirely platform agnostic when teaching a low-level language like CUDA: teaching C usually involves some explanation of what's happening on an x86, and teaching data-parallel programming with CUDA isn't much different. Most importantly, though, students need to be exposed to these types of architectures as early as possible. Massively parallel architectures won't be some fad that goes away in five years, and any exposure, even if it's centered around CUDA, is better than leaving students totally clueless once other vendors introduce similar products.

The Future


First, it's important to keep in mind Tesla's position in the marketplace. CUDA on consumer products isn't going away. CUDA development is perfectly possible on GeForce cards, and we expect that the importance of CUDA in mainstream applications as well as gaming will quickly grow with the number of G8x chips in the market. Tesla is instead aimed at users of high-end GPU computing applications that will be certified for use with specific Tesla products.

However, that does not mean that there won't be Tesla-specific features. One of the problems preventing the adoption of GPU computing in some markets is the need for double precision. Looking ahead to G92, which NVIDIA has stated will approach 1 TFLOPS of single-precision throughput and will also be capable of double-precision processing, we can say that double precision on G92 will be limited to the Tesla line. While this will surely disappoint some, the need for double-precision processing on a consumer GPU is questionable at best, and NVIDIA sees this as an excellent way to differentiate the two product lines. We expect that there will be other Tesla-specific features, such as ECC RAM, that simply don't make sense in the consumer market.

As far as CUDA goes, version 1.0 should launch next week, and we'll have in-depth coverage of just what that means for developers. Among the new features are asynchronous kernel calls (freeing up the CPU while the GPU runs), improved FFT and BLAS libraries, and a Matlab library to offload whatever processing it can to the GPU. Most importantly, though, 64-bit versions of CUDA will be available, addressing a major complaint about the earlier betas. In addition, the separation between CUDA drivers and normal device drivers will soon end, making CUDA realistically available to end users. Finally, the specification for PTX, the intermediate assembly language used by CUDA, will be opened, allowing other languages the same access to the chip that CUDA has. Opening PTX also means that backends for other architectures could be developed, potentially changing the GPU computing game considerably. Keep an eye out for that.
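
To make the asynchronous kernel call feature concrete, here's a sketch of what it looks like from the host's point of view (again our own illustration, assuming the CUDA runtime API of this era, where cudaThreadSynchronize was the explicit synchronization call). The launch returns to the CPU immediately, so the host can do useful work while the GPU grinds away:

    #include <stdio.h>
    #include <cuda_runtime.h>

    // Trivial kernel used here just to keep the GPU busy.
    __global__ void scale(float *x, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            x[i] *= 2.0f;
    }

    int main(void)
    {
        const int n = 1 << 20;
        float *d_x;
        cudaMalloc((void **)&d_x, n * sizeof(float));

        // The launch is asynchronous: control returns to the CPU
        // immediately while the kernel runs on the GPU.
        scale<<<(n + 255) / 256, 256>>>(d_x, n);

        // The CPU is free to do other work in the meantime.
        double acc = 0.0;
        for (int i = 0; i < 1000000; i++)
            acc += i * 0.5;
        printf("CPU work finished while the GPU ran: %f\n", acc);

        // Block until the GPU is done before touching its results.
        cudaThreadSynchronize();

        cudaFree(d_x);
        return 0;
    }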

All in all, 2007 looks to be the year when GPU computing starts to make inroads into the market. With Tesla, NVIDIA is making a serious bet on the future of the company (and certainly looking at waging war with the CPU guys). Considering the disruptive effects that GPU computing could have on some markets, it's a very exciting time for the industry, and with GPU computing capabilities stealthily introduced to more and more computers, we're definitely looking forward to seeing just what happens.

Also, be sure to check out our interviews with David Kirk, NVIDIA Chief Scientist, and Andy Keane, General Manager of the GPU Computing group, regarding Tesla, the future of GPU computing, and how to better incorporate parallel programming into education.