PeakStream acquired by Google

Tuesday 05th June 2007, 04:04:00 PM, written by Tim

GPGPU middleware developer PeakStream has been acquired by Google, according to The Register. Where this leaves the GPGPU development platform is unclear, although PeakStream has stopped selling the product for the moment.

This could certainly be a blow for AMD, since PeakStream was a launch partner and one of the premier GPGPU packages to use the R580. If Google turns the product into an internal-only platform, AMD will be left with only RapidMind and BrookGPU for high-level GPGPU support. This is certainly not an ideal situation for AMD. RapidMind uses GLSL, making it susceptible to the driver issues that have traditionally plagued GPGPU applications. BrookGPU, even though it has a CTM backend, supports no features beyond those available in DX9/OGL2.0; any GPGPU-specific features such as scatter must be performed via assembly tuning. It's possible that AMD had advance knowledge of this acquisition; as of a few months ago, they had developers contributing to the CTM backend of BrookGPU.

However, it is certainly possible that PeakStream will reemerge as an open source, cross-vendor GPGPU platform. With our sources indicating that the specifications for NVIDIA's PTX ISA are due to be released in the next few weeks, it shouldn't be too long before a PeakStream-to-PTX compiler is written, which would make it the only language to support native GPGPU features on both AMD and NVIDIA cards. With CUDA already causing a significant stir in the oil and finance industries as well as academia, a free, stable platform could definitely trigger an explosion in the number of consumer-level GPGPU applications on the market.




Latest Thread Comments (27 total)
Posted by Rufus on Wednesday, 06-Jun-07 09:39:21 UTC
Quoting Geo
Last estimate I saw had Google at 450k servers. Tho my understanding is that is unofficial. Presumably they wouldn't turn them all over in two months or somesuch, but still. If that all went to one specific high-end GPU in a one year period. . . sweet Jaysus.
There's no way they can drop in GPUs to their current servers. Their current servers are just solid CPUs + RAM, no expansion slots of any sort. Plus they're 1U so there's no space to fit a GPU.

If they're going to do any GPGPU, they're building a cluster from the ground up. The problem is with density. A standard 2U server will let you have 2 GPUs, but that's fairly lame (lots of wasted space). With 2 Quadro Plexs (3U together) and a 1U server controlling the pair you can have 4 GPUs in 4U, or with a GX2 style card 8 GPUs in 4U. 1GPU/U really doesn't seem very compelling. 2GPU/U does, but there is no GX2 G80 (yet?).

Google must have done this as a fairly forward-looking purchase. The market simply is not mature enough at this point to go out and build a large (100+) GPGPU cluster. It'll get there fairly soon (especially if lots of money starts being thrown at problems), but we're talking on the order of a year or two out.
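
The density figures in the post above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the GPU and rack-unit counts come from the post, while the names (`configs`, `gpus_per_u`) are made up for illustration, and the 8-GPU case assumes a dual-GPU GX2-style card that, as the post notes, did not exist for G80.

```python
from fractions import Fraction

# (description, GPU count, rack units), using the numbers quoted in the post.
configs = [
    ("standard 2U server with 2 GPUs", 2, 2),
    ("2 Quadro Plex units (3U) plus a 1U host", 4, 4),
    ("same 4U with GX2-style dual-GPU cards (hypothetical)", 8, 4),
]

def gpus_per_u(gpus, rack_units):
    """Return GPU density as an exact ratio of GPUs per rack unit."""
    return Fraction(gpus, rack_units)

for name, gpus, rack_units in configs:
    density = gpus_per_u(gpus, rack_units)
    print(f"{name}: {float(density):.1f} GPU/U")
```

Only the hypothetical GX2-style configuration reaches the 2 GPU/U that the post calls compelling; both configurations that could actually be built at the time come out at 1 GPU/U.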

Posted by Bouncing Zabaglione Bros. on Wednesday, 06-Jun-07 10:11:10 UTC
An interesting question is whether Peakstream is now "off the table" and will exclusively be working on proprietary tech for Google's internal use, or if the cash being poured in will see Peakstream products available for the whole market at some point in the future.

Posted by Arwin on Wednesday, 06-Jun-07 10:44:56 UTC
Quoting Bouncing Zabaglione Bros.
An interesting question is whether Peakstream is now "off the table" and will exclusively be working on proprietary tech for Google's internal use, or if the cash being poured in will see Peakstream products available for the whole market at some point in the future.
I think that actually doesn't even matter that much. If Google uses it only internally, but it is perceived by the external market to be successful, then more money will pour into R&D with rivalling companies all over the place. But for now there is little reason for Google to keep it all to themselves. There is always a right time and price. ;)

Posted by NocturnDragon on Wednesday, 06-Jun-07 13:18:53 UTC
Just to add more ideas to the thread: http://glinden.blogspot.com/2006/06/four-petabytes-in-memory.html
Quote
Google reportedly (http://glinden.blogspot.com/2006/04/100k-new-servers-per-quarter-at-google.html) had an estimated 450k machines two months ago and adds machines at roughly 100k per quarter. In 2004, each of these machines had 2-4G (http://en.wikipedia.org/wiki/Google_platform) of memory, and, two years later, likely are up to 8G standard.
While it's true that they cannot just add GPUs to their current servers, they could start using servers with GPUs in them, or just whole racks full of GPUs. (Isn't lasso supposed to be just that?)

I think the 100k also counts upgraded machines, not only new ones. So 400K - 800K GPUs a year wouldn't be too exaggerated from this POV.

Creating some new GPGPU-friendly algorithms for indexing and data mining certainly is going to be difficult, but if they bought Peakstream they sure know how to use it. But some other workloads could start to take advantage of it right away.

http://arstechnica.com/news.ars/post/20070530-facial-recognition-slipped-into-google-image-search.html

Today it's facial recognition; tomorrow? Object identification?

Posted by Geo on Wednesday, 06-Jun-07 13:25:47 UTC
Quoting Rufus
There's no way they can drop in GPUs to their current servers. Their current servers are just solid CPUs + RAM, no expansion slots of any sort. Plus they're 1U so there's no space to fit a GPU.

If they're going to do any GPGPU, they're building a cluster from the ground up. The problem is with density. A standard 2U server will let you have 2 GPUs, but that's fairly lame (lots of wasted space). With 2 Quadro Plexs (3U together) and a 1U server controlling the pair you can have 4 GPUs in 4U, or with a GX2 style card 8 GPUs in 4U. 1GPU/U really doesn't seem very compelling. 2GPU/U does, but there is no GX2 G80 (yet?).
They were an AMD launch partner for professional gpgpu, and surely AMD has given this some thought and made progress on the matter. Rumor has it that Nvidia will be launching a productized gpgpu solution in the near future, which would also indicate they must have been giving some thought to how to package the physicals for industrial kind of use. Surely it would come as a surprise to neither that a company interested in gpgpu applications would likely be looking at implementing more than 2 or 3 gpgpus to support it.

Posted by Voltron on Thursday, 07-Jun-07 05:41:01 UTC
This article claims that the reason for the acquisition is x86 multi-threading rather than GPGPU. Not that The Reg or anyone else has a perfect track record on speculation concerning Google.

http://www.theregister.com/2007/06/06/google_peakstream_server/

Posted by Rufus on Thursday, 07-Jun-07 07:00:38 UTC
Quoting Voltron
This article claims that the reason for the acquisition is x86 multi-threading rather than GPGPU. Not that The Reg or anyone else has a perfect track record on speculation concerning Google.

http://www.theregister.com/2007/06/06/google_peakstream_server/
Wow, that was amazingly melodramatic, but I simply don't see it. The point of PeakStream is to make parallel programming targeted at multi-core and GPGPUs easy. Google has absolutely no need (that's worth buying a company) for the compiler or multi-core work the article talks about. Sure, if the x86 compiler guys they got could optimize stuff to make it run 5% faster it'd be nice, but that really doesn't matter for Google. What matters is getting it to run on 2x as many machines with 2x as much data, which is the point of their MapReduce I linked above. Also, any sort of multi-core work that PeakStream has done is a complete joke compared to MapReduce: running something on a single quad core vs across a 10,000 node cluster.

Now I'm not saying PeakStream didn't have smart x86 guys that'll be put to good use at Google, but they're just a nice thing on the side. The only thing that made PeakStream unique from any other compiler company was the GPGPU work, and that's the only sensible reason Google would buy them.

Edit: I was looking through some google papers and found a nice quote (page 7) in this mapreduce (http://www.cs.virginia.edu/~pact2006/program/mapreduce-pact06-keynote.pdf) presentation that shows my point:
Quote
Single-thread performance doesn't matter
We have large problems and total throughput/$ more important than peak performance.
The only reason Google cares about PeakStream is if their work can give them higher throughput/$ than their current MapReduce. The only way it can do that is GPGPU.

Posted by DemoCoder on Thursday, 07-Jun-07 07:58:06 UTC
Actually, Google cares about throughput per watt more than anything else. They are facing serious power density problems, and Google is not interested in being an environmental power hog just because they've got enough money to build their own power plant if they wish.

MapReduce is great at distributing functional bits over a large cluster (actually, it's MapReduce + GFS + a component not known by the public yet) but it does nothing to help boost the throughput per watt per cluster node.

It is highly wasteful to keep scaling by adding more cases, more PSUs, more mainboard chipsets, more RAM, more local disk, more everything because you are not leveraging Moore's law in your favor.

Google knows the future is in throughput computing. They've looked seriously at Sun Niagara, at Azul Vega, at BlueGene, in trying to pack more throughput per box. Eventually, Google will need to move to nodes with throughput-oriented CPUs or a high number of non-SMP cores.

There's no way they're going to install GPUs on 450k machines (their current machines have no GPU acceleration) and there's a fat chance they're gonna install a G80 or R600 power-sucking GPU on any of their nodes.

My guess is the acquisition boils down to the following:
1) Google is always looking to hire very smart teams; they buy companies sometimes just to get their engineers
2) PeakStream may be useful to them for the purpose of distributing MapReduce jobs *written in a platform independent* mostly-functional kernel-oriented language that compiles regardless of the target (x86, power, sparc, etc). Remember, MapReduce programs still need to run on a node, and there's no reason that they have to be written in C.
3) prevent anyone else from using PeakStream (e.g. MS)

They may not have a GPU plan in mind at the moment, and frankly, given the kinds of work that Google does today with M/R, I don't see those jobs running in GPUs. There's tons of scatter/gather going on within a MR job. MapReduce != Stream Computing. The "kernels" of work that a given node deals with in MR can involve dozens of operations on gigabytes of data (for example, graph traversal jobs). Remember, Google is tossing around *Petabytes* of data on MapReduce.

Posted by Tim Murray on Thursday, 07-Jun-07 16:09:47 UTC
Quoting DemoCoder
Actually, Google cares about throughput per watt more than anything else. They are facing serious power density problems, and Google is not interested in being an environmental power hog just because they've got enough money to build their own power plant if they wish.
I agree with you there, which is why I don't agree with this:
Quote
There's no way they're going to install GPUs on 450k machines (their current machines have no GPU acceleration) and there's a fat chance they're gonna install a G80 or R600 power-sucking GPU on any of their nodes.
Google is constantly expanding its search capabilities, and if they want to get into things like searching inside pictures or video they could certainly use GPUs for some of that.

I think what's most likely is that we'll see a GPGPU-esque MapReduce. GPGPU workloads are generally data-parallel to begin with, and the PeakStream execution model already created kernels on the fly. Translating this to something that could run across, say, 1000 RV630s doesn't seem like much of a leap to me. (There's no reason to use something like R600/G80, since you're looking at 4x the power consumption for 2-3x the performance.) And if you have 1000 RV630s, you're looking at a theoretical peak of 192 TFLOPS for 45 kW.

Sure, you aren't going to use RV630s, but I think you see my point.
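
The throughput-per-watt argument in this post reduces to a couple of divisions, sketched below. The inputs (1000 cards, 192 TFLOPS, 45 kW, and the "4x power for 2-3x performance" ratio) are the post's own figures, not measured data; the variable and function names are invented for illustration.

```python
# Figures quoted in the post for a hypothetical 1000-card RV630 cluster.
cards = 1000
peak_tflops = 192.0
power_kw = 45.0

# Per-card numbers implied by the cluster totals.
per_card_gflops = peak_tflops * 1000 / cards
per_card_watts = power_kw * 1000 / cards
gflops_per_watt = per_card_gflops / per_card_watts

def relative_perf_per_watt(perf_ratio, power_ratio):
    """Efficiency of a bigger chip relative to the baseline chip."""
    return perf_ratio / power_ratio

print(f"per card: {per_card_gflops:.0f} GFLOPS at {per_card_watts:.0f} W")
print(f"efficiency: {gflops_per_watt:.2f} GFLOPS/W")

# High-end part at 4x the power for 2x to 3x the performance.
low = relative_perf_per_watt(2, 4)
high = relative_perf_per_watt(3, 4)
print(f"high-end part: {low:.2f}x to {high:.2f}x the baseline perf/W")
```

Taking the post's ratios at face value, a high-end part delivers only half to three quarters of the mid-range part's performance per watt, which is the reason given for not using R600/G80.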

Posted by NocturnDragon on Thursday, 07-Jun-07 16:51:37 UTC
I've found a nice presentation about MapReduce to understand it a bit better: Experiences with MapReduce, an abstraction for large-scale computation (http://www.google.com/search?lr=&ie=UTF-8&oe=UTF-8&q=Experiences+with+MapReduce%2C+an+abstraction+for+large-scale+computation+Dean), Jeffrey Dean (http://labs.google.com/people/jeff/), Proc. 15th International Conference on Parallel Architectures and Compilation Techniques, 2006
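
For readers who have not seen the programming model being debated in this thread, here is a minimal single-process sketch of the map/group/reduce pattern. It is not Google's implementation and ignores distribution, GFS, and fault tolerance; all names (`map_reduce`, `word_count_mapper`, `sum_reducer`) are hypothetical.

```python
from collections import defaultdict

def map_reduce(inputs, mapper, reducer):
    """Run a toy map/group/reduce pass in a single process."""
    # Map phase: each input record emits zero or more (key, value) pairs,
    # which are grouped by key (the "shuffle" step in a real cluster).
    groups = defaultdict(list)
    for record in inputs:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: each key's values are combined independently.
    return {key: reducer(key, values) for key, values in groups.items()}

def word_count_mapper(line):
    """Emit (word, 1) for every word in a line of text."""
    for word in line.lower().split():
        yield word, 1

def sum_reducer(key, values):
    """Add up the counts emitted for one word."""
    return sum(values)

counts = map_reduce(["gpu cpu gpu", "cpu gpu"], word_count_mapper, sum_reducer)
print(counts)
```

The mapper is applied to each record independently, which is the data-parallel part that could in principle map onto GPU kernels. The grouping step between map and reduce is the scatter/gather traffic that DemoCoder argues makes MapReduce differ from stream computing.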

