GPGPU and 3D luminaries join 3D graphics heavyweights

Monday 07th January 2008, 06:06:00 PM, written by Rys

A couple of superstars in their own fields have made newsworthy professional leaps. GPGPU pioneer Mike Houston, an attractive hire for both companies we're about to mention, and long-time (Direct)3D heavyweight Tom Forsyth have both taken up key positions at two 3D graphics giants.

Mike, key adviser to Beyond3D as we explored GPGPU in its infancy, is joining the AMD Graphics Products Group under the stewardship of Raja Koduri, one of AMD's most talented graphics engineers. He takes up an architecture position, helping AMD build future generations of graphics products, while also helping to steer their software direction when it comes to general compute on the GPU. AMD have to engineer themselves out of a hole right now, and Mike will be key to that effort.

TomF is well known to anyone with an interest in 3D graphics programming, via his output on 3D and game programming mailing lists web-wide. He's just about to leave RAD Game Tools for the Larrabee team at Intel, and we reckon that signals Intel's intent to do good things with Direct3D on its upcoming architecture, since TomF's expertise in that field is crystal clear.

We wouldn't ordinarily mention hires as news, but given who the guys are, where they're going and what they're going to do, it's well worth a mention for anyone with a keen eye on the future direction of programmable 3D graphics hardware. Both are key to their respective employers' success in the field.

No idea who Mike and Tom are? Shame on you. How about a visit to Mike's Stanford Uni homepage, or TomF's blogwiki, to get up to speed? We've also interviewed Mike on Folding@Home on the GPU. Pick apart the ramifications of the hires in the comments thread below.



Latest Thread Comments (24 total)
Posted by Killer-Kris on Wednesday, 09-Jan-08 05:20:57 UTC
Quoting 3dilettante
I still expect growing pains for Larrabee I, though its successor should do better.
Wouldn't a successor have to begin development long before the precursor was released, and would they have been able to identify weaknesses in time to be fixed?

Posted by NocturnDragon on Wednesday, 09-Jan-08 11:51:22 UTC
Quote
The SuperSecretProject is of course Larrabee, and while it's been amusing seeing people on the intertubes discuss how sucky we'll be at conventional rendering, *I'm happy to report that this is not even remotely accurate.*
(Bold mine)
Now the question is: what are they comparing Larrabee to? G80? G92? Future theoretical performance extrapolated from the current parts, taking into account the ETA for Larrabee?

Posted by 3dilettante on Wednesday, 09-Jan-08 15:35:42 UTC
Quoting Geo
That's always been a question for long-time observers of Intel... will they have the sticktoitiveness to hang around and make rev 2 and rev 3 better, and in timely fashion?

There's a mountain to climb there, no question, and they can't climb it all at once. Everything we know about graphics history of the last 12 years or so tells us that conclusively.
Larrabee has a few things going for it that I feel should be sufficient to carry it through a revision or two.

CPU designers admit that the utility of symmetric multicore drops to near zero after 4 standard cores, outside of certain lucrative but still limited market segments.

Intel admits there is a need for different methods to further the usability of massively multicore architectures.
In that respect, the work put into Larrabee is work Intel needs to do anyway.

Larrabee's core design may also share the same design philosophy as Intel's Silverthorne core, an in-order 2-wide x86 (besides the vector unit). The core itself is simple enough that design costs are small, with the big difference being in the cache and uncore: areas Intel needs to improve anyway.

I think that means that a fair amount of Larrabee's expense is incremental to things Intel is doing anyway. Better, it's an attempt to get revenue on work that would otherwise not bear fruit for many more years.


The possible upside?
Larrabee's very likely to be a very strong competitor for GPGPU. In addition, it will compete in HPC in a number of areas that AMD's Bulldozer should have gone in (according to early slides showing its prowess in HPC).

With one design effort, Intel furthers massive multicore design, hurts the revenues of GPGPU, makes some money in HPC and possibly graphics, and sucker-punches AMD's next-gen design effort on two fronts (three, if we discover Silverthorne is distantly related to it and hurts Bobcat. VIA gets nailed too).

Even if Larrabee fails in consumer graphics, it should be of interest to HPC. Even if it fails there, the other benefits would be enough that Intel could swallow a few weak iterations and still do well as a whole.

This might be an Itanium-type situation. As poorly as it did early on, the product is now profitable on an operations basis and it helped kill off several wobbly RISC lines in the high-end. POWER and SPARC are primarily the remaining RISCs that still exist in non-embedded or telecom. SPARC isn't growing and its new products are in a niche (one that Larrabee could also target...).
IBM is working seriously hard to maintain a leap-frog relationship with a design that is far more intensive on all levels of the system than Intel's.

Quoting Killer-Kris
Wouldn't a successor have to begin development long before the precursor was released, and would they have been able to identify weaknesses in time to be fixed?
The product should be taped out long before the successor's design is frozen.
They'll have some good ideas from running engineering silicon where some improvements could be made.
Larrabee II should also benefit from the software and driver snafus that might pop up for the design that first wades into the real world.

Posted by nAo on Wednesday, 09-Jan-08 16:33:46 UTC
Quoting Geo
What are you thinking, nAo? That they might do DX10 entirely in software other than texture samplers?
IMHO it wouldn't be such a good idea, rasterization per se doesn't map well to CPUs. Though I wouldn't be surprised if Larrabee has a hw rasterizer but not a dedicated setup unit, as it can be implemented in software (does Larrabee support double precision math? ...)

Posted by 3dilettante on Wednesday, 09-Jan-08 16:58:29 UTC
Larrabee should support x86, and some slides show it as a system processor, which hints at full support.

Slides on Larrabee indicate it should be capable of 8-16 DP operations per clock using SSE. I'm not sure why there's a range; perhaps it hadn't been decided when the slides were created, or there's a different throughput depending on whether the code uses standard SSE or the expanded vector set.
Aside from that, it was stated to support 2 DP non-SSE ops.

This extra support does point to the greatest internal threat to Larrabee: the everything-and-the-kitchen-sink syndrome that hit the IA64 Merced.

If McKinley is an indicator however, Larrabee II should come about after the "we can do anything" phase ends and designers can focus on what it can do well. This is where things go from pie-in-the-sky to interesting.

Posted by dkanter on Sunday, 20-Jan-08 11:36:40 UTC
Quoting 3dilettante
Larrabee has a few things going for it that I feel should be sufficient to carry it through a revision or two.

CPU designers admit that the utility of symmetric multicore drops to near zero after 4 standard cores, outside of certain lucrative but still limited market segments.
I'd strongly disagree with this point of view. Perhaps some (maybe even many) do, but there are some who do not. In general, I think the most aggressive architects have not given up on higher performance. Actually, the most aggressive architects have not given up on higher single threaded performance.

Quote
Intel admits there is a need for different methods to further the usability of massively multicore architectures.
In that respect, the work put into Larrabee is work Intel needs to do anyway.

Larrabee's core design may also share the same design philosophy as Intel's Silverthorne core, an in-order 2-wide x86 (besides the vector unit). The core itself is simple enough that design costs are small, with the big difference being in the cache and uncore: areas Intel needs to improve anyway.

I think that means that a fair amount of Larrabee's expense is incremental to things Intel is doing anyway. Better, it's an attempt to get revenue on work that would otherwise not bear fruit for many more years.
Yes, Intel needs to figure out how to get 16 cores to work on a die from a SW perspective. However, that doesn't require a product.

There is a huge difference in cost between an internal research project and a product that you sell to end-users.

Additionally, doing something too early can be very costly if the ROI isn't high enough.

Quote
The possible upside?
Larrabee's very likely to be a very strong competitor for GPGPU. In addition, it will compete in HPC in a number of areas that AMD's Bulldozer should have gone in (according to early slides showing its prowess in HPC).
Right now AMD's primary concern should be Nehalem and Sandy Bridge, not Larrabee.

Quote
With one design effort, Intel furthers massive multicore design, hurts the revenues of GPGPU, makes some money in HPC and possibly graphics, and sucker-punches AMD's next-gen design effort on two fronts (three, if we discover Silverthorne is distantly related to it and hurts Bobcat. VIA gets nailed too).

Even if Larrabee fails in consumer graphics, it should be of interest to HPC. Even if it fails there, the other benefits would be enough that Intel could swallow a few weak iterations and still do well as a whole.
That would be a disaster, the good news is that Intel can survive disasters.

Quote
This might be an Itanium-type situation. As poorly as it did early on, the product is now profitable on an operations basis and it helped kill off several wobbly RISC lines in the high-end. POWER and SPARC are primarily the remaining RISCs that still exist in non-embedded or telecom. SPARC isn't growing and its new products are in a niche (one that Larrabee could also target...).
IBM is working seriously hard to maintain a leap-frog relationship with a design that is far more intensive on all levels of the system than Intel's.

The product should be taped out long before the successor's design is frozen.
They'll have some good ideas from running engineering silicon where some improvements could be made.
Larrabee II should also benefit from the software and driver snafus that might pop up for the design that first wades into the real world.
I certainly hope that Larrabee turns out better than Itanium. The last time Intel took their eyes off the ball, they ended up letting x86-64 slip through the door.

DK

Posted by 3dilettante on Tuesday, 22-Jan-08 20:17:24 UTC
Quoting dkanter
I'd strongly disagree with this point of view. Perhaps some (maybe even many) do, but there are some who do not. In general, I think the most aggressive architects have not given up on higher performance. Actually, the most aggressive architects have not given up on higher single threaded performance.
I'm just going by various admittedly dated presentations from a while back, where both AMD and Intel drew the line for homogeneous multicore of big OoO cores at 4-8 for the desktop, with the proviso that server and other segments could do with more.

Quote
Yes, Intel needs to figure out how to get 16 cores to work on a die from a SW perspective. However, that doesn't require a product.
But it does mean Larrabee's design cost is incremental with research that would be ongoing anyway.
Intel has to put the pedal to the metal sometime.
Itanium already showed what can happen if you only rely on internally formulated means of evaluation for a design.

There's a decent chance that it will gain a foothold in a few areas, and it can serve as a bulwark against competitors that are rapidly running out of choices when it comes to competing.
AMD and Nvidia have a high flops part that can establish a niche if not contested, and ceding a potentially lucrative market is handing them free money.

Quote
There is a huge difference in cost between an internal research project and a product that you sell to end-users.
There's more gradation on the scale from internal project to a massive cross-market product ramp.
The plans are tentative enough and far enough ahead that Intel can scale back production and expectations as needed (unless its ray-tracing uber alles marketers go unchecked for the next year or so, that is).

Quote
Additionally, doing something too early can be very costly if the ROI isn't high enough.
I suspect this is the likely outcome, but if that's the case they may only be off by a year or so.

Quote
Right now AMD's primary concern should be Nehalem and Sandy Bridge, not Larrabee.
Sure, but it's not in Intel's best interest to refrain from adding more worries.
It's not Intel's fault that Bulldozer's been delayed into that timeframe.

For AMD's GPGPU line, Nehalem and Sandy Bridge are obstacles, but they are not directly targeted at similar workloads. The order of magnitude performance gap for customers that have tasks amenable to GPU boards won't be removed by the next CPU generation or two.

Larrabee, if targeted correctly, can undercut both Firestream and AMD's attempt to appeal to HPC with SSE5 and its new instructions. This of course assumes that AMD's delay of Bulldozer isn't because AMD chickened out and is yanking its second set of FP extensions Intel refused to recognize.

Quote
That would be a disaster, the good news is that Intel can survive disasters.
It would be a disaster if all of that fails, but it's more of a form of graceful degradation from "not so good" to "pretty bad" to "utter disaster".
The failure would be mitigated in part by the fact that a fraction of its expense is overlaid with otherwise unavoidable expenditures.

Intel has to fail to gain a foothold in multiple market segments for Larrabee to fail.
For a complete disaster, it has to fail to garner any devrel or mindshare among developers that serves as groundwork for future software that will most likely tread the same path again.

Quote
I certainly hope that Larrabee turns out better than Itanium. The last time Intel took their eyes off the ball, they ended up letting x86-64 slip through the door.

DK
Depending on the competitive environment, it may only need to keep its eye on the ball part of the time.

I don't see any yawning gaps Larrabee creates in Intel's product lines, and it goes a long way toward plugging up a few avenues of attack.

Posted by Geo on Wednesday, 23-Jan-08 17:37:55 UTC
The available evidence suggests to me that Larrabee was aimed more at Nvidia and ATI than AMD, and that the AMD acquisition of ATI and launching of the Fusion project was, in part, AMD's reaction to figuring out where Intel was going.

We just knew more, sooner, about AMD's efforts because they involved a multi-billion dollar acquisition that forced AMD to talk about them at length in public, almost certainly much sooner than it would have done otherwise.

Posted by heliosphere on Wednesday, 23-Jan-08 23:57:53 UTC
Quoting nAo
I thought Pixomatic was Abrash's solo effort
I heard from a reliable source that Abrash is working on Larrabee as well.

Posted by Frank on Friday, 25-Jan-08 19:22:47 UTC
Well, we have one "proven" design (Cell) and one that goes the same direction, but forgoes the things that make Cell work: large local stores instead of a unified (cached) memory structure, and a new ISA. Without a close look at a first sample in action, there isn't much we can say about the validity of having all those cores crunch away in a meaningful manner.

