IBM presents 45nm Cell B.E. at ISSCC

Thursday 07th February 2008, 10:10:00 PM, written by Carl Bender

IBM introduced the 45nm version of the Cell B.E. processor at ISSCC earlier this week, fabbed on IBM's 45nm SOI line at East Fishkill and targeted primarily at future revisions of the PlayStation 3 gaming console and Cell blade servers. Notable among the achievements in the process transition is a greatly improved power profile: TDP has come down 38% relative to the 65nm shrink, and by over half compared to the original 90nm version of the chip. If previously revealed figures for the 90nm Cell are an indication, this would put the 45nm Cell at under 50 watts peak power draw at an operational frequency of 3.2GHz. Key in achieving these gains was the introduction of an independent SRAM voltage supply in the previous 65nm shrink, which has allowed core voltage to be lowered while maintaining stability and high operational frequencies in the L2 cache and SPE local stores. With the 45nm shrink, core voltage has been reduced to 0.8V, from 0.9V for the 65nm processor.

The increased efficiency has allowed for significant gains in clockspeed as well, with stable operation at 4GHz drawing less power than the 65nm Cell at 3.2GHz, and speeds of 6GHz reached using a voltage of 1.15V. Although such chips will clearly not see use within Sony's console, the option would be open to IBM and/or Mercury to ship updated hardware taking advantage of the much improved FLOPS/watt ratio. The PS3 should benefit, however, from the associated reduction of cooling and power supply overhead, areas that have already seen initial cost reductions in the 40GB model with the introduction of the 65nm Cell.
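
As a rough sanity check on the voltage and clock figures quoted above, the sketch below applies the textbook first-order relation for CMOS dynamic power, P ~ C * V^2 * f. This is not IBM's methodology. It ignores leakage and any change in switched capacitance between nodes, and it assumes the 0.8V figure corresponds to 3.2GHz operation. The function name is purely illustrative.

```python
# First-order CMOS dynamic power model: P ~ C * V^2 * f.
# Assumptions (not from the ISSCC paper): leakage is ignored, switched
# capacitance is held constant, and 0.8 V is taken as the 3.2 GHz voltage.

def relative_dynamic_power(v_new, v_old, f_new=1.0, f_old=1.0):
    """Ratio of dynamic power at (v_new, f_new) to that at (v_old, f_old)."""
    return (v_new / v_old) ** 2 * (f_new / f_old)

# Core voltage drop from 0.9 V (65nm) to 0.8 V (45nm) at an unchanged clock.
voltage_only = relative_dynamic_power(0.8, 0.9)

# 45nm part pushed to 6 GHz at 1.15 V, versus 3.2 GHz at 0.8 V.
overclocked = relative_dynamic_power(1.15, 0.8, f_new=6.0, f_old=3.2)

print(f"0.9 V -> 0.8 V at equal clock: {voltage_only:.2f}x dynamic power")
print(f"6 GHz @ 1.15 V vs 3.2 GHz @ 0.8 V: {overclocked:.2f}x dynamic power")
```

Under these assumptions the voltage drop alone accounts for roughly a fifth of the dynamic power, so the remainder of the reported 38% TDP reduction would have to come from factors this simple model leaves out, such as reduced capacitance at the smaller node.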

Less exciting than the thermal and frequency gains has been the pace of die size reduction, which after the second full node shrink has only now come down to an area of ~115mm², a size that would not have seemed out of place had it been reached on the previous 65nm node (the original 90nm chip had a die area of 235mm², and the 65nm chip an area of ~174mm²). Plaguing the rate of reduction are the analog circuits and I/O logic associated with the Rambus interfaces, which place a higher cap on the dimensional reduction of the die than could otherwise be achieved. Having gone through an automated scaling process in order to reduce costs associated with heavy optimization, the 45nm Cell now features small bands of 'dead' silicon to the north/south of the chip topography, bordering the SPE arrays - a reflection of the constraints faced in reducing the dimensions of the FlexIO and XDR interfaces located on either end of the processor.
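
To put the die areas above in perspective, the following sketch compares the actual area ratios with an idealized shrink in which every feature scales linearly with the node name, so that area scales with the square of the ratio. That ideal is an assumption for illustration only, since real processes rarely scale this cleanly. The dictionary and function names are hypothetical.

```python
# Die areas in mm^2 as quoted in the article (65nm and 45nm are approximate).
DIE_AREA_MM2 = {90: 235.0, 65: 174.0, 45: 115.0}

def ideal_area_ratio(node_from, node_to):
    """Area ratio if all dimensions scaled linearly with the node name."""
    return (node_to / node_from) ** 2

def actual_area_ratio(node_from, node_to):
    """Area ratio computed from the quoted die sizes."""
    return DIE_AREA_MM2[node_to] / DIE_AREA_MM2[node_from]

for src, dst in [(90, 65), (65, 45), (90, 45)]:
    actual = actual_area_ratio(src, dst)
    ideal = ideal_area_ratio(src, dst)
    print(f"{src}nm -> {dst}nm: actual {actual:.0%} of prior area, "
          f"ideal {ideal:.0%}")

# Area the 90nm design would occupy at 45nm under ideal scaling.
ideal_45nm_area = DIE_AREA_MM2[90] * ideal_area_ratio(90, 45)
print(f"Ideal 45nm area: {ideal_45nm_area:.2f} mm^2 vs ~115 mm^2 actual")
```

On these numbers the 45nm die is roughly twice the size an ideal two-node shrink would give, which is consistent with the I/O and analog blocks limiting the reduction.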



Latest Thread Comments (12 total)
Posted by AlStrong on Friday, 08-Feb-08 03:39:32 UTC
What can be done about those analog parts :?:

Posted by Carl B on Friday, 08-Feb-08 05:01:31 UTC
Quoting AlStrong
What can be done about those analog parts :?:
Well... not much. Actually, it's interesting to note that the shrink from 65nm to 45nm went better than the shrink from 90nm to 65nm: the die came down to 66% of its prior area for 65nm-->45nm vs 75% for 90nm-->65nm. The I/O is just more difficult to scale, and to get the chip to where it would presumably be capable of being area-wise at that transistor count, there would need to be a full redesign/optimization.

It'll be interesting to see what happens at 32nm... whether they shrink the present chip again, or whether they introduce Cell 2, or whether they stick with the present architecture but do a full revision. But if they do just shrink it down again, I think those spaces of dead silicon are going to start getting comically large.

Interesting to note that the 65nm SPE on Toshiba's bulk CMOS comes in at 7.07mm^2; only 0.6mm^2 larger than the SOI/Cell SPE on 45nm(!), and 4mm^2 smaller than the equivalent SPE on the 65nm SOI/Cell process (11.07mm^2). So... I think it can be assumed that, process differences aside, there's a lot of constrained potential at present due to the IO/analog circuits for the IBM/PS3 version. Obviously, it's worth mentioning that the SpursEngine has a different interface layout.

Posted by Panajev2001a on Friday, 08-Feb-08 10:27:32 UTC
Quoting iwod
So Flex IO and XDR isn't so good after all the hype.
Without FlexIO and XDR you would have needed either many more pins or much higher base clock to make up the same amount of effective data rate. Both solutions would have increased main PCB manufacturing costs as well as the chip's complexity...

In the case of more data pins it would have been even more difficult to shrink the chip down considerably because you would be limited by the distance between I/O (including VDD and GND ones) pins on the chip's die.

Posted by Carl B on Friday, 08-Feb-08 16:06:30 UTC
Quoting Panajev2001a
Without FlexIO and XDR you would have needed either many more pins or much higher base clock to make up the same amount of effective data rate. Both solutions would have increased main PCB manufacturing costs as well as the chip's complexity...

In the case of more data pins it would have been even more difficult to shrink the chip down considerably because you would be limited by the distance between I/O (including VDD and GND ones) pins on the chip's die.
And I think that's exactly why the 65nm HPC/DDR2 version of Cell is so large relative to the 90nm baseline, achieving something like a net 10% shrink when all was said and done.

Posted by Crossbar on Thursday, 14-Feb-08 09:38:29 UTC
Quoting Carl B
Interesting to note that the 65nm SPE on Toshiba's bulk CMOS comes in at 7.07mm^2; only 0.6mm^2 larger than the SOI/Cell SPE on 45nm(!), and 4mm^2 smaller than the equivalent 65nm SOI/Cell process (11.07mm^2). So... I think it can be assumed that the process differences aside, there's a lot of constrained potential at present due to the IO/analog circuits for the IBM/PS3 version. *Obviously, it's worth mentioning that the SpursEngine has a different interface layout.*
And it runs in a different frequency range, and we don't know how much the IPC of the two types of SPUs might differ either.

Posted by FutureCTO on Saturday, 23-Feb-08 07:29:04 UTC
The 45nm Cell enters mass production in April 2008.
And the RSX enters 45nm production in December 2008.

So logically a single-unit Cell+RSX should appear at the next ISSCC.

I say logically because the Rambus FlexI/O interface will by December 2008 account for nearly 20% of each chip.
By combining them into one chip the FlexI/O and costs associated with PCB traces and IC will be cut.

Wafer production and savings would increase quite a bit.
I posted elsewhere that a single Cell+RSX would arrive by June 2009.
But I would not be surprised if it were by the end of the first quarter.

Posted by Carl B on Wednesday, 27-Feb-08 00:50:19 UTC
Quoting FutureCTO
I say logically because the Rambus FlexI/O interface will by December 2008 account for nearly 20% of each chip. By combining them into one chip the FlexI/O and costs associated with PCB traces and IC will be cut. Wafer production and savings would increase quite a bit. I posted elsewhere that a single Cell+RSX would arrive by June 2009. But I would not be surprised if it were by the end of the first quarter.
I don't expect to see a unified die for these chips in all honesty, at least not for the foreseeable future. SOI vs bulk, GDDR3 vs XDR, the packaging/pin-out complexities... Frankly I don't see how it's not just better for the RSX and the Cell to continue to each go through a relatively inexpensive shrink process.

Posted by ADEX on Thursday, 28-Feb-08 01:54:14 UTC
Quoting Carl B
Interesting to note that the 65nm SPE on Toshiba's bulk CMOS comes in at 7.07mm^2; only 0.6mm^2 larger than the SOI/Cell SPE on 45nm(!), and 4mm^2 smaller than the equivalent 65nm SOI/Cell process (11.07mm^2).
There are a number of reasons they could be so different. The main one, I suspect, is that it's not designed to run at 3.2GHz - clock has an impact on size.

Posted by FutureCTO on Thursday, 28-Feb-08 05:46:19 UTC
Quoting Carl B
Interesting to note that the 65nm SPE on Toshiba's bulk CMOS comes in at 7.07mm^2; only 0.6mm^2 larger than the SOI/Cell SPE on 45nm(!), and 4mm^2 smaller than the equivalent 65nm SOI/Cell process (11.07mm^2). So... I think it can be assumed that the process differences aside, there's a lot of constrained potential at present due to the IO/analog circuits for the IBM/PS3 version. Obviously, it's worth mentioning that the SpursEngine has a different interface layout.
I am a little unclear on your point.
The SpursEngine is a four-SPE and four-ASIC design, whereas Cell is an eight-SPE and one-PPE design.

http://www.engadget.com/2007/09/20/toshibas-cell-derived-spursengine-chip-to-process-video/

A lot of people are confused by the idea of "Cell", like "fish", being both singular and plural.
Cell is an abstract architecture whose name became synonymous with the PS3.
And for marketing reasons they chose not to rebrand/rename the version designed for the PS3.

So both applications use a Cell chip.
But only one includes PowerPC ISA I-III and uses the SPE ISA like a DTS audio extension.
Note: SPE ISA is the real and official name.
Yet another example of a lack of marketing or branding leading to poor product recognition.

As for the Cell+RSX idea.
I don't see a problem with designing for two different memories and a southbridge.
But your thinking makes me wonder about the independent parties.
How difficult would it be to get Nvidia and Cell teams to co-develop a single chip?
Technically it is not at all out of the question.
But company or technical clique issues might limit the possibility.
My only feeling on this is that real engineers find solutions to challenges.
Only the people that are not science or math driven can stop them from achieving a singular chip.

Posted by Carl B on Thursday, 28-Feb-08 08:39:59 UTC
Quoting FutureCTO
I am a little unclear on your point.

The SpursEngine is a four-SPE and four-ASIC design, whereas Cell is an eight-SPE and one-PPE design.

http://www.engadget.com/2007/09/20/toshibas-cell-derived-spursengine-chip-to-process-video/

A lot of people are confused by the idea of "Cell", like "fish", being both singular and plural.
Cell is an abstract architecture whose name became synonymous with the PS3.
And for marketing reasons they chose not to rebrand/rename the version designed for the PS3.

So both applications use a Cell chip.
But only one includes PowerPC ISA I-III and uses the SPE ISA like a DTS audio extension.
Note: SPE ISA is the real and official name.
Yet another example of a lack of marketing or branding leading to poor product recognition.
Yes, you were a little confused by my point.

My point was that the SpursEngine shows how a different process and lack of I/O constraints allowed for a greater degree of scaling on the CMOS5 SPEs vs the SOI SPEs. (Adex made the valid point that lowered clockspeed requirements may have aided as well.)

Nothing to do with ISAs... and as the SPEs on Cell don't support POWER either (they are functionally identical SPEs, after all), I'm not even sure how that angle entered your post to begin with.

Maybe people outside of a sub-forum entitled 'CellPerformance' are confused as to what Cell "is"... but I doubt you'll find anyone here - or even at B3D in general - that is not plainly aware of the obvious. I'll mention also that there is an article on Spurs here on the site that offers a more complete take on the chip than the one you linked to via Engadget.

Quote
As for the Cell+RSX idea.
I don't see a problem with designing for two different memories and a southbridge.
But your thinking makes me wonder about the independent parties.
How difficult would it be to get Nvidia and Cell teams to co-develop a single chip?
Technically it is not at all out of the question.
But company or technical clique issues might limit the possibility.
My only feeling on this is that real engineers find solutions to challenges.
Only the people that are not science or math driven can stop them from achieving a singular chip.
The problem again is as I mentioned: packaging/pin-counts, process disparity, and questionable cost/benefit in forcing the migration in light of the above. All Cells going into the PS3 (or anything) at this point are fabbed on SOI. The RSX is on CMOS4, then moves to CMOS5. In addition we already know that IBM is going to be fabbing Cell at 45nm on SOI for Sony at Fishkill, and Toshiba will be pursuing the RSX at 45nm at Oita. So - all the theory-based hurdles I mentioned aside, common sense alone indicates that at 45nm, these chips will remain separate. Several of us expect RSX/Southbridge integration in the future, as well as a Cell redesign of sorts providing a 'catch-up' die shrink as a spin-off of Cell2 development. Perhaps somewhere in there, the cost/benefit engineering case will be made for a PS3-specific redesign of the chips to enable such a unification. But not before 32nm, as is evident from the present fabrication plans of these companies.

PS - Discussion in this sub-forum is to focus on Cell and Cell programming models rather than Playstation and the 'usual' console-centric discussions, which is why duplicate threads presently exist in the console section on the topic. If these posts remain focused on PS3/RSX, then they will be moved to one of those other threads and the conversation will continue there.

