Competitive Dynamics Analysis

One thing should be clear: adoption and software development take a lot of time, especially across such radically different hardware architectures, and so do deployment and building momentum in general. While it might seem like NVIDIA doesn’t have much of a head start in the field because they’re still not generating substantial revenue, that impression is simply wrong.

While it may be easier to, say, port CUDA programs to Ct or Brook+ than to start from scratch, it doesn’t happen overnight, and neither does generating interest or mass deployment. It’s good that AMD keeps supporting Brook+ and releasing new FireStream cards, and that will certainly help them down the road by reminding everyone that they still care about GPGPU and the HPC market, but sadly they don’t have much to show for it beyond some interest from academics and universities.

In practice, nobody’s going to buy a hundred of your boards just because you can run binomial option pricing on them. The real applications most people want are much more complex than that; alternatively, in science and research they are often highly customized and developed by the very scientist who needs to run the algorithm, so a simpler, more intuitive API helps, as does a large install base to help you solve problems. AMD’s disadvantage with Brook+ is far from overwhelming, but it is there.

In the longer term, however, the dynamics are much more complex to analyze. First of all, the ‘fundamentals’ of the hardware will start having a greater impact. Performance per mm² won’t matter that much because ASPs and margins are inherently much higher in this market, but performance per watt very much will, as will its indirect impact on performance per unit of rack space in clusters. Higher ALU:TEX/ROP/… ratios would therefore provide an advantage, because even when the rest of the chip is idling, it still consumes power due to leakage. However, as long as the differences aren’t too extreme, the impact of this should not be overstated.
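To make the leakage argument concrete, here is a back-of-the-envelope sketch in Python. Every constant in it is hypothetical and chosen purely for illustration: it assumes compute throughput scales with ALU area, that the whole die leaks whether busy or not, and that only active ALUs add dynamic power.

def compute_perf_per_watt(alu_fraction, dyn_per_area=1.0, leak_per_area=0.3):
    # Throughput on a pure-compute workload scales with ALU area (arbitrary units).
    throughput = alu_fraction
    # Active ALUs burn dynamic power; the whole die, idle TEX/ROP blocks
    # included, burns leakage power regardless.
    power = alu_fraction * dyn_per_area + leak_per_area
    return throughput / power

for alu_fraction in (0.4, 0.7):
    print(f"ALU fraction {alu_fraction:.0%}: perf/W = {compute_perf_per_watt(alu_fraction):.2f}")

# Prints ~0.57 vs ~0.70: under these made-up numbers the ALU-heavy design
# wins by roughly 20% on a pure-compute load, a real but not overwhelming
# edge, matching the caveat above.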

Therefore, the most important factor there is simply the overall efficiency of the ALUs/shader core. This combines with the software API to determine how attractive the overall solution is to a potential application developer. It is important to remember, however, that the differences between various GPU architectures (including Larrabee) are likely to be much smaller than those between GPUs and CPUs; therefore, while it may be worth porting a solution from x86 to CUDA, it may no longer be worth the trouble to port it to Brook+ or Ct even if AMD or Intel delivers perf/watt or perf/$ twice as high as NVIDIA’s. This is only true, of course, for existing applications – not new ones.

The discussion above assumes that every hardware vendor has its own API. But of course, if one API supported multiple vendors’ hardware efficiently, the dynamics would change substantially. In theory, everyone would just buy the fastest hardware at a given price. In practice, it’s still a bit more complex: the optimization process for one GPU may be very different from another’s, the supported features may differ slightly, and in some cases there may be a better API for programming a given piece of hardware.

It may be fair to say that software adoption momentum is linked to the API (assuming the feature-set and optimization differences aren’t too massive), while deployment momentum outside of individual users and universities (i.e. for Oil & Gas, medical systems à la TechniScan, etc.) is much more closely linked to the hardware. Therefore, the financial consequences of a currently vendor-specific API opening up to more hardware would not show up much in the short term. Furthermore, while it may help overall adoption, being vendor-independent is unlikely to be the primary concern of much of anyone in the HPC world, so the primary short-term benefits would be for consumer applications.

Furthermore, in the consumer world, it is important to understand that the number of applications that would most benefit from acceleration is actually very limited, and therefore there is little need for third-party assistance. In fact, the latter may even be undesirable, because it is much harder to generate momentum for hardware on the back of commercial software than of freeware. Ideally, you want a balanced mix of both, and it is not obvious that a vendor-agnostic solution is truly necessary for that.

Either way, there are compelling reasons why opening up an API to hardware with sufficient functionality makes sense; however, from a financial perspective, it is difficult to justify (especially for an existing API with significant penetration) unless you are absurdly confident that your solution is and will remain by far the best in the market. For HPC alone (which is of course not the proper way to look at it), the benefit just isn’t worth the cost and things like the highly optimized x86 path are much more likely to turn heads if they deliver in terms of performance.

Overall, while the short term isn’t all that hard to predict, the long term is impossible to forecast with any accuracy, both for specific vendors and for the overall GPGPU market. It will be interesting to look into the impact of specific dynamics as they develop, but right now little can be said without a serious risk of getting everything wrong. Anyone claiming the contrary is likely just being overconfident.

Financial Prospects

So, just how much money is at stake here? In early 2007, NVIDIA predicted that GPGPU would have an addressable market of several billion dollars (at well above corporate gross margins) as early as 2011. Was that a realistic prediction? Yes and no. They arrived at that figure by considering which markets they could serve in theory (i.e. bring a substantial performance advantage to) and assuming every CPU-based solution would be replaced by a GPU-based solution of similar value. They also argued that by boosting performance so much, they could actually grow the market.
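For illustration, here is the kind of top-down calculation being described, sketched in Python. Every segment name and dollar figure below is hypothetical; the point is only the method: sum each HPC segment the GPU could in theory serve and assume full replacement of CPU spend at similar value.

# Hypothetical HPC segments and the CPU spend (in $B) a GPU could in theory serve.
hypothetical_segments_busd = {
    "oil_and_gas": 1.2,
    "financial_analytics": 0.8,
    "medical_imaging": 0.6,
    "scientific_computing": 1.4,
}

# Assume every CPU-based solution is replaced by a GPU-based one of similar value.
addressable_market_busd = sum(hypothetical_segments_busd.values())
print(f"Addressable market: ${addressable_market_busd:.1f}B")  # 'several billion dollars'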

There is definitely some truth to this, but there are two problems. First of all, it is unrealistic to assume that every potential application in the major HPC fields could be offloaded to the GPU in that timeframe; so while the addressable market of the hardware might be that large, the overall ecosystem’s addressable market won’t be. Secondly, in the short term, such a huge improvement in performance/$ means that the amount of money that needs to be spent on many applications will go down, and the overall market (assuming GPGPU does take off) will likely shrink a bit.

In the long term, however, there are definitely plenty of new and very computationally intensive applications that could represent very large revenue opportunities. If ‘nano assembly’ replaced lithography for semiconductor (or semiconductor-like) manufacturing, for example, the amount of performance required would be truly stunning. From a more incremental point of view, medical and research equipment is still rapidly increasing in precision, which may require performance improvements beyond what CPU scaling can deliver.

So while the claims of (HPC-centric) GPGPU potentially becoming an industry worth several billion dollars are credible, the timeframe feels too aggressive to us; it’s not strictly impossible, but it would definitely surprise and impress us. The market opportunity in the short to mid term remains very large; we just suspect it might take a bit more time than that (2013+?) to reach such colossal numbers. There’s another side to this that wasn’t really talked about much in 2007, though: consumer applications and how they might affect the CPU-GPU mix.

We already described that a bit on one of the previous pages, so we won’t do it again, but it’s important to understand the scale we are talking about here. The CPU business in 2007 was worth more than $30B. If we estimate that 30% of that money could shift to GPUs and consider that only 30-50% of the price of a GPU board goes to the chip manufacturer, that’s still about $3.5B in new GPU revenue. In 2007, that would have nearly doubled the size of the discrete graphics market!
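Spelling out the arithmetic behind that figure in a short Python sketch (the 30% shift and the 30-50% chip-maker share are the assumptions stated above, not data):

cpu_market_busd = 30.0                         # 2007 CPU business: > $30B
shift_to_gpu = 0.30                            # assumed share of CPU value moving to GPUs
chip_share_low, chip_share_high = 0.30, 0.50   # chip maker's cut of the board price

board_revenue = cpu_market_busd * shift_to_gpu   # $9.0B in GPU board revenue
low = board_revenue * chip_share_low             # $2.7B to the chip maker
high = board_revenue * chip_share_high           # $4.5B to the chip maker
print(f"New GPU chip revenue: ${low:.1f}B-${high:.1f}B, midpoint ~${(low + high) / 2:.1f}B")

# The midpoint comes out around $3.6B, i.e. roughly the 'about $3.5B' above,
# which on its own would have nearly doubled the 2007 discrete graphics market.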

Is 30% realistic? Yes, it is, but not on the back of just a few applications. For value to start shifting really fast, one solution must hold noticeably more advantages over the other. Right now, this is far from being the case outside of the gaming market, and significant investment in consumer applications of GPGPU, ideally released as freeware, is required. There is definitely a big opportunity here, especially as OEMs sometimes appear eager to differentiate their products on more than just CPU speed, but it remains to be seen whether things move fast enough to make a truly significant impact.