Intel Larrabee @ SIGGRAPH 2008

Monday 02nd June 2008, 09:25:00 AM, written by Arun

Starting in August, part of the shroud of mystery around Larrabee is going to dissipate: A paper called 'Larrabee: A Many-Core x86 Architecture for Visual Computing' will be presented at SIGGRAPH by its authors, which include Doug Carmean, Tom Forsyth, Michael Abrash, Pat Hanrahan and many others.

The paper's abstract describes Larrabee as using 'multiple in-order x86 CPU cores that are augmented by a wide vector processor unit, as well as fixed-function co-processors. This provides dramatically higher performance per watt and per unit of area than out-of-order CPUs on highly parallel workloads and greatly increases the flexibility and programmability of the architecture as compared to standard GPUs.'

Nothing revolutionary there, and nothing we didn't already know, but we'll definitely be looking forward to this. There's no promise that we'll be at SIGGRAPH this year, but it's still relatively likely; besides, this likely won't be the only event where Intel presents Larrabee this year. It's worth pointing out that Larrabee will be competing head-on against NVIDIA's and AMD's DX11 GPUs, not their current ones; sadly, it seems unlikely that either company will be willing to disclose anything substantial about their next-generation architectures until well into 2009.

[Thanks to nAo for the tip!]




Latest Thread Comments (480 total)
Posted by MfA on Sunday, 17-Aug-08 19:28:16 UTC
Well, they said so :)
Quote
All instructions can issue on the primary pipeline, which minimizes the combinatorial problems for a compiler. The secondary pipeline can execute a large subset of the scalar x86 instruction set, including loads, stores, simple ALU operations, branches, cache manipulation instructions, and vector stores.
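
To make the quoted issue rule concrete, here is a minimal toy model in Python. It only encodes what the quote states: any instruction may go down the primary pipeline, and a listed subset may go down the secondary one. The pairing rule itself (two adjacent instructions pair whenever the second is secondary-capable), the instruction kind names, and the function `count_issue_cycles` are assumptions made for illustration; data dependencies and whatever other restrictions the real hardware applies are ignored.

```python
# Instruction kinds the quoted passage lists as executable on the secondary pipeline.
# The short names are hypothetical labels chosen for this sketch.
SECONDARY_OK = {"load", "store", "alu", "branch", "cache", "vstore"}


def count_issue_cycles(kinds):
    """Count issue cycles for an in-order instruction stream.

    Assumption (not from the paper): two adjacent instructions issue together
    whenever the second one is of a kind the secondary pipeline can run.
    Data dependencies and any other pairing restrictions are ignored.
    """
    cycles = 0
    i = 0
    while i < len(kinds):
        # The instruction at position i always issues on the primary pipeline.
        if i + 1 < len(kinds) and kinds[i + 1] in SECONDARY_OK:
            i += 2  # the following instruction rides along on the secondary pipeline
        else:
            i += 1  # e.g. a vector load or vector ALU op must wait for the primary pipeline
        cycles += 1
    return cycles


# A vector load followed by a vector store can pair; two vector loads cannot.
paired = count_issue_cycles(["vload", "vstore", "vmul", "vload"])
unpaired = count_issue_cycles(["vload", "vload"])
```

Under this toy rule a stream only loses issue slots where a primary-only instruction (such as a vector load) directly follows another instruction, which is the case the question below is about.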

Posted by nAo on Sunday, 17-Aug-08 19:33:12 UTC
Quoting MfA
Well, they said so :)
Ok, fair enough, I didn't remember that part of the paper.

Posted by Jawed on Sunday, 17-Aug-08 20:15:22 UTC
Quoting MfA
I wonder why only the primary pipeline can do vector loads ...
Anything to do with the ability of the VPU to read one operand directly from L1?

Posted by PeterT on Tuesday, 19-Aug-08 13:44:44 UTC
Quoting armchair_architect
DX11 compute shaders and OpenCL both appear (from what we know) to essentially be the CUDA programming model. Not CUDA-the-language exactly, but very similar decomposition of parallelism and combination thread/memory hierarchy. To be more clear, what I like is the CUDA programming model (which includes DX11 compute and OpenCL), not just NVIDIA's current implementation of that programming model.
In the terminology you use here, are the manual shared memory management and coalescing requirements of CUDA part of the programming model or just of NV's current implementation? Because, as someone who's used both traditional GPGPU and CUDA on HPC problems, that's the part of CUDA that I can't see as part of any future more-or-less "mainstream" parallel programming language/model. I also think those should be mentioned before talking about how well CUDA scales, because a CUDA program that's actually optimized (and when it comes to coalescing and shared memory we're not talking about single-percent optimizations but potentially orders-of-magnitude differences) won't come close to porting to another architecture optimally.

It's also interesting (and I haven't seen anyone explicitly mention it in this thread) to see the different programming trade-offs between Larrabee and NV GPUs. As I currently understand it, on the former you get automatic/hardware cache management but no hardware thread scheduling. On current NV hardware it's just the other way round. I'm not yet sure which is preferable; ideally you'd have (the option of using) both.

Posted by Kaotik on Wednesday, 20-Aug-08 02:23:27 UTC
Slightly related: the Larrabee (A many core Intel Architecture for Visual Computing) presentation was apparently *too* popular at IDF Fall 2008; a big portion of the press got left outside due to lack of space. Because of this, the presentation will get another round on Thursday.

Posted by nAo on Wednesday, 20-Aug-08 02:35:36 UTC
Quoting Kaotik
Slightly related: the Larrabee (A many core Intel Architecture for Visual Computing) presentation was apparently *too* popular at IDF Fall 2008; a big portion of the press got left outside due to lack of space. Because of this, the presentation will get another round on Thursday.
Hopefully they released some more details..

Posted by kyetech on Wednesday, 20-Aug-08 02:51:33 UTC
Tech demo vid for us laymen to enjoy would be good!

Posted by nAo on Wednesday, 20-Aug-08 02:57:00 UTC
Quoting kyetech
Tech demo vid for us laymen to enjoy would be good!
Tech video on emulated hardware? Not exactly exciting :)

Posted by kyetech on Wednesday, 20-Aug-08 03:45:15 UTC
Just because samples aren't out till November doesn't mean they haven't got anything now, does it?

Besides, I like seeing pretty graphics, real time or otherwise.

Posted by armchair_architect on Wednesday, 20-Aug-08 06:55:57 UTC
Quoting PeterT
In the terminology you use here, are the manual shared memory management and coalescing requirements of CUDA part of the programming model or just of NV's current implementation?
Coalescing would be implementation. You could relax those rules significantly and (a) existing code would still run without change, and (b) apps that do uncoalesced access would automatically go faster. Whether global memory is cached or not is also in this category.

The on-chip shared memory is part of the programming model. You can implement it with normal cache and good cache eviction policy controls (like Larrabee apparently has). But your code has to be written to deal with it explicitly or you get no benefit, and it affects your algorithms and data structures. So I'd call it part of the programming model.

I wouldn't be upset if shared memory evolved in the Larrabee direction. The important thing is to be able to guarantee very low latency and very high bandwidth to a chunk of data used by a group of cooperating threads. Generic caching isn't good enough, IMHO.
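
The coalescing point discussed above can be illustrated with a small back-of-the-envelope model in Python. It counts how many aligned memory segments a group of threads touches for a given access stride. The 16-thread group, 4-byte elements and 64-byte segment size are illustrative assumptions, not a description of any specific chip, and the helper names `thread_addresses` and `segments_touched` are hypothetical.

```python
def thread_addresses(n_threads, stride_elems, elem_bytes=4, base=0):
    """Byte address read by each thread when thread t reads element t * stride_elems."""
    return [base + t * stride_elems * elem_bytes for t in range(n_threads)]


def segments_touched(addresses, segment_bytes=64):
    """Number of distinct aligned segments covered by the given addresses.

    Assumption for this sketch: the memory system fetches whole aligned
    segments, so this count stands in for the number of transactions needed.
    """
    return len({addr // segment_bytes for addr in addresses})


# Unit stride: 16 threads x 4 bytes fit in a single 64-byte segment.
unit_stride = segments_touched(thread_addresses(16, stride_elems=1))

# Stride of 16 elements: every thread lands in its own segment.
wide_stride = segments_touched(thread_addresses(16, stride_elems=16))
```

In this model the strided pattern needs 16 times as many segment fetches as the unit-stride one for the same amount of useful data. How much of that gap a given implementation hides is exactly the "implementation, not programming model" question raised in the post.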


