NASA Evaluates Cell for Climate Modeling

Thursday 31st July 2008, 05:18:00 PM, written by Carl Bender

At last month's International Supercomputing Conference in Dresden, NASA presented the results of an internal study exploring the suitability of the Cell BE architecture for accelerating key aspects of climate modeling.  Centered on the computationally intensive solar radiation component of the Goddard Earth Observing System Model v5 (GEOS-5), the study required that roughly 2,000 lines of Fortran be ported to C in order to run on the Cell's SPEs.  Because the data columns are computationally independent, the code was vectorized as well, with four columns of data mapped to each of the chip's eight SPEs.  Cell was in turn able to work through over 3,000 columns of data per second across all test cases.

When run on a QS20 Cell blade provided by the UMBC Multicore Computational Center, significant single-processor performance gains were recorded against the Intel architectures presently in use at the NASA Center for Computational Sciences.  Running the maximum test case of 1,024 data columns, Cell showed a 6.76x improvement over a 2.66GHz Woodcrest.  Against a 3.2GHz Dempsey the improvement was 8.91x, and against an Itanium 2 running at 1.5GHz, Cell demonstrated a 9.85x improvement.  The comparisons were kept single-core due to the non-linear performance scaling of the Intel architectures, which ran the baseline Fortran as opposed to the ported C.

All comparisons were of single-precision performance, as the study predates the introduction of the PowerXCell 8i.  The study was also conducted before the introduction of a Cell Fortran compiler and auto-SIMD tools, both of which IBM has since released in calendar 2008.  It was with that knowledge in mind, however, that a relatively light ~2,000-line portion of code was chosen to serve as a testbed.

Having shown positive results on solar radiation, the team is now exploring further code porting for the computationally similar moisture, chemistry, and turbulence physics models; collectively, these types of physics problems account for ~50% of GEOS-5 compute time.  In addition, project lead Shujia Zhou has been working on a hybrid solution that would allow Fortran to run directly on the Cell processor.

A session at the 2008 American Geophysical Union Fall Meeting this December entitled "Emerging Multicore Computing Technology in Earth and Space Sciences" should serve as the next window into their progress.



Latest Thread Comments (6 total)
Posted by patsu on Thursday, 31-Jul-08 18:22:34 UTC
Is it only speed-up? How about power consumption, density and such?

Posted by Carl B on Thursday, 31-Jul-08 18:37:40 UTC
Since they only evaluated a single Cell processor against single-core configurations of the competing 'control' processors, those other metrics simply didn't come into play in terms of the study results. Wattage and density would obviously be derived from the form the respective solutions took, so it shouldn't be difficult to come up with estimates based on shipping solutions.

Posted by wingless on Thursday, 31-Jul-08 22:44:02 UTC
NASA should evaluate GPGPU from ATI or Nvidia to handle these tasks. In GPGPU tasks a GPU can justify its power draw with its high performance.

Posted by ShaidarHaran on Friday, 01-Aug-08 00:32:01 UTC
Quoting wingless
NASA should evaluate GPGPU from ATI or Nvidia to handle these tasks. In GPGPU tasks a GPU can justify its power draw with its high performance.
Assuming they can even handle the workload... DP performance has only achieved parity with or surpassed single-socket CPUs this generation, and the flexibility (i.e. range of supported instructions) still isn't on the same level as CPUs.

Posted by Carl B on Friday, 01-Aug-08 13:20:28 UTC
Quoting wingless
NASA should evaluate GPGPU from ATI or Nvidia to handle these tasks. In GPGPU tasks a GPU can justify its power draw with its high performance.
A couple of things. Firstly, your statement above makes it seem as if the Cell doesn't offer a compelling power/performance ratio; it does. Note that upon its launch Roadrunner was not only the fastest supercomputer in the world, but also the most efficient: http://www.beyond3d.com/content/news/651

In addition, for DP performance, the new HPC Cell revision offers roughly double the DP performance of NVidia's new offerings for much less power draw... so on a DP Flop/watt basis, it's significant. ATI's Firestream is actually the DP leader, but suffers from having a lesser ecosystem surrounding it at the moment. Which is really one of Cell's greatest advantages in the high-end space: it has IBM supporting it, along with IBM's tools. That means something for folk willing to spend hundreds of thousands to millions on an HPC solution.

The above is to clarify that "justifying power draw with high performance" isn't something limited to GPUs.

That said, as I was browsing around the forums earlier today I stumbled across this thread by Jawed in the GPGPU sub-forum: http://forum.beyond3d.com/showthread.php?t=49266

Uncanny timing, to be sure. I guess porting weather models is the thing to do these days. Those test cases were run on a larger scale of hardware than the tests with Cell, so it's hard to come up with an apples-to-apples comparison (and plus it's a different model), but obviously GPUs do a great job in this realm as well.

Posted by patsu on Friday, 01-Aug-08 16:59:43 UTC
Quoting Carl B
Since they only evaluated a single Cell processor against single-core configurations of the competing 'control' processors, those other metrics simply didn't come into play in terms of the study results. Wattage and density would obviously be derived from the form the respective solutions took, so it shouldn't be difficult to come up with estimates based on shipping solutions.
That's right. Since it was a NASA evaluation, I was hoping for more thorough reports. :-P

Their results seem to focus on speed-up, with a line or two on computational efficiency/scalability and a little on porting. I was hoping they'd do another round of updates comparing the latest competition in more areas, such as per-core power efficiency and the general applicability of Cell vs regular CPUs (or even GPGPU).
Quoting wingless
NASA should evaluate GPGPU from ATI or Nvidia to handle these tasks. In GPGPU tasks a GPU can justify its power draw with its high performance.
Higher power also means more expense (due to the cumulative power bill, additional cooling and perhaps space). If they are willing to spend more, they could also redirect these "overheads" into more Cell units to bring up the overall performance even further.

For easily parallelizable applications, I think GPGPU stands a good chance to proliferate. For others, more work needs to be done.


