Barcelona overview article at RWT

Friday 18th May 2007, 12:12:00 PM, written by Arun

Real World Tech has posted a very nice architectural overview of AMD's upcoming Barcelona architecture (previously known as K8L or K10, with the apparently final nomenclature being 'AMD Family 10h processors'), comparing it both to its predecessor and to Intel's competing Conroe and Penryn.

This comes at the same time as DigiTimes is reporting that the chip might be slightly delayed and is unlikely to arrive until the end of Q3, along with an interview with AMD's Patrick Patla (Part 1 & 2).

David Kanter of RWT also mentions his own educated performance expectations for Barcelona, which we would tend to fully agree with based on everything else we've read ourselves. He mentions, among other things, that it is likely Barcelona will have an advantage in multi-threaded workloads but it is also very likely that Intel will keep its lead in terms of single-core IPC. Either way, if you're interested in Barcelona, we think it's well worth the read.

Latest Thread Comments (27 total)
Posted by nutball on Thursday, 24-May-07 22:48:51 UTC
Quoting 3dilettante
What overriding business interest does Microsoft have with Fusion?
Who knows. Seems to me that MS is interested in more things than just the bitted-ness of the Windows kernel these days.

Posted by Albuquerque on Friday, 25-May-07 22:27:16 UTC
Quoting nutball
Who knows. Seems to me that MS is interested in more things than just the bitted-ness of the Windows kernel these days.
They have an interest in getting their software to as many people (and thus platforms) as possible. Picking favorites in the PC hardware space isn't going to further that goal, so I don't think Microsoft will have *any* vested interest in whatever Intel, AMD, or NVIDIA have up their sleeves. At least not in the already-entrenched PC space.

Posted by dess on Sunday, 27-May-07 13:37:33 UTC
Quoting swaaye
http://arstechnica.com/articles/paedia/cpu/will-barcelona-cure-what-ails-amd.ars

Jon Stokes's vision of AMD seems accurate to me. Especially after reading the Realworldtech article.
Jon's article is more or less based on David Kanter's performance summary and conclusions.

This is where the micro-op/macro-op question comes into the picture. David uses Intel's terminology throughout, in which macro-ops are the x86 instructions and micro-ops are the low-level RISC ones. For AMD, in contrast, macro-ops are already lower-level ops, each of which holds up to two micro-ops.

So, for example, the "µops" on the diagrams should be read as macro-ops in AMD's terms, which means up to twice as many micro-ops. That gives 3-6 micro-ops for Barcelona, versus 4 on C2D's side. Taking Intel's micro-op fusion into account, C2D is really "up to 5" (two of them fused), according to Agner (http://www.agner.org/optimize/microarchitecture.pdf). All of this holds for the retirement phase, too. There are other things to consider here as well: for example, the "6 instructions" below C2D's predecode is really "up to 6 instructions" (because of the size of the buffer), and 3 of the 4 decoders in C2D are rather simple ones that can emit only one micro-op at a time, and so on. So there are several limiting factors here and there where Barcelona is simply better.

All in all, the RWT article's remark that C2D is 33% wider because of the "4 vs. 3 µops" comparison is not really accurate, I think.
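
The width arithmetic in this post can be checked with a few lines of Python. This is only a sketch of the numbers quoted above (3 macro-ops of one or two micro-ops each for Barcelona, 4 µops or "up to 5" with fusion for C2D); the variable and function names are made up for illustration and do not come from the RWT article or from Agner's manual.

```python
# Per-cycle width figures as quoted in the post; names are illustrative only.
BARCELONA_MACRO_OPS = 3           # the "µops" drawn on the RWT diagrams
MICRO_OPS_PER_MACRO_OP = (1, 2)   # an AMD macro-op holds one or two micro-ops

CORE2_UOPS = 4                    # µops per cycle as drawn on the diagrams
CORE2_UOPS_WITH_FUSION = 5        # "up to 5" when two of them are fused


def micro_op_range(macro_ops, per_macro_op):
    """Minimum and maximum micro-ops per cycle for a given macro-op width."""
    low, high = per_macro_op
    return macro_ops * low, macro_ops * high


def percent_wider(a, b):
    """How much wider a is than b, in percent."""
    return (a / b - 1) * 100


barcelona_min, barcelona_max = micro_op_range(BARCELONA_MACRO_OPS, MICRO_OPS_PER_MACRO_OP)
naive_gap = percent_wider(CORE2_UOPS, BARCELONA_MACRO_OPS)
peak_gap = percent_wider(barcelona_max, CORE2_UOPS_WITH_FUSION)

print(f"Barcelona: {barcelona_min}-{barcelona_max} micro-ops per cycle")
print(f"4 vs. 3 as drawn: C2D looks {naive_gap:.0f}% wider")
print(f"6 vs. 5 at peak: Barcelona looks {peak_gap:.0f}% wider")
```

The "33%" figure is simply 4/3, and it flips sign once both chips are counted in the same unit, which is the point being argued here. These are peak numbers only and say nothing about sustained throughput.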

And so the conclusion that Barcelona will probably be ahead only in multithreaded server applications, and not in single-threaded (usually desktop) applications, is IMHO questionable. Okay, David probably counted on the 2.5 GHz figure from earlier roadmaps. But roadmaps that have slipped out more recently show debut clocks 400-500 MHz higher...
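
To put the clock figures in proportion, here is a minimal sketch that takes the 2.5 GHz roadmap number and the 400-500 MHz uplift mentioned above at face value. The helper name is hypothetical, and the result is a clock ratio only, not a performance prediction.

```python
def clock_uplift(base_ghz, extra_mhz):
    """Return the new clock in GHz and the relative gain in percent."""
    new_ghz = base_ghz + extra_mhz / 1000
    gain_percent = (new_ghz / base_ghz - 1) * 100
    return new_ghz, gain_percent


EARLIER_ROADMAP_GHZ = 2.5   # figure from the earlier roadmaps, per the post

for extra_mhz in (400, 500):
    new_ghz, gain = clock_uplift(EARLIER_ROADMAP_GHZ, extra_mhz)
    print(f"+{extra_mhz} MHz: {new_ghz:.1f} GHz, about {gain:.0f}% higher clock")
```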

Posted by 3dilettante on Tuesday, 29-May-07 12:24:32 UTC
I doubt Kanter decided solely from the wider peak retirement rate that Conroe would lead in single-threaded performance. Both AMD and Intel have admitted that their chips are doing fantastically if they break an IPC of 1 for most code.

Conroe is more aggressive at pursuing single-threaded performance than Barcelona. For desktop loads, its cache is better (offset in various scenarios by AMD's IMC), its memory reordering is more aggressive, and its clock speeds are higher.

A possibly more telling source of information on the Barcelona derivatives is AMD's silence when it comes to benchmarks of single-threaded performance.

They wouldn't only be showing Specrate numbers if the single-threaded numbers were also good.

Posted by swaaye on Wednesday, 30-May-07 00:24:00 UTC
Does anyone else wonder if Conroe was actually intended as primarily a mobile chip? After they figured out Tejas wasn't going to work out, they quickly changed a few things (higher FSB and voltage perhaps for more clock speed), but it's still an awful lot like Yonah when you consider things. Obviously Conroe had been in development for probably years before Tejas was known to be a flop....

Posted by Skrying on Wednesday, 30-May-07 00:57:46 UTC
Quoting swaaye
Does anyone else wonder if Conroe was actually intended as primarily a mobile chip?

After they figured out Tejas wasn't going to work out, they quickly changed a few things (higher FSB and voltage perhaps for more clock speed), but it's still an awful lot like Yonah when you consider things. Obviously Conroe had been in development for probably years before Tejas was known to be a flop....
Pentium M was heavily based on the P6 (Pentium Pro to P3) architecture, and Core Duo/Yonah was just an evolution of that. Conroe is a "more of everything" approach on top of that, with some nifty little changes here and there. So it is certainly derived from a mobile family. I have my doubts that Conroe has been in development for longer than Tejas. I think Conroe was a mixture of needed enhancements that were already in the works and a really beefed-up Yonah. I get all weird inside saying this, but I think Conroe has some advancements that would have gone into what was originally going to be Nehalem, which itself was originally the successor to Tejas. That is all just a lot of crazy stuff though; super chips and robot overlords come to mind saying that...

Conroe is a representation of what I believe will become the norm for CPU production in the mainstream markets. Chips designed for use in the mobile market and then derivatives moved to the desktop market.

Posted by dess on Wednesday, 30-May-07 01:25:26 UTC
Quoting 3dilettante
I doubt Kanter decided solely from the wider peak retirement rate that Conroe would lead in single-threaded performance. Both AMD and Intel have admitted that their chips are doing fantastically if they break an IPC of 1 for most code.
He didn't. Indeed, he wrote the same as you:
"While it does appear that the Core 2 is 33% wider than Barcelona, in reality, neither processor comes close to peak capabilities on real code, so the performance will be much closer than the block diagrams imply. Barcelona's 3-wide issue, execute and retire capabilities are not a performance problem."
It seems I remembered this part wrong, sorry. He also mentioned the clock rate as an important factor here, that is, how high AMD could scale it, and I think he counted on 2.5 GHz, while according to more recent information it will reach 2.8 GHz and beyond.

The thing is, Conroe's really not wider. Indeed, it's up to 5 micro-ops wide (two of them fused), according to Agner, while Barcelona is up to 6 micro-ops wide (in AMD's terms). Granted, that alone rarely counts for much. But there are other differences that help Barcelona raise its average IPC, especially by taking advantage of the 128-bit FPUs, where it is certainly better than Conroe.

Quote
Conroe is more aggressive at pursuing single-threaded performance than Barcelona. For desktop loads, its cache is better (offset in various scenarios by AMD's IMC), its memory reordering is more aggressive, and its clock speeds are higher.
This is perhaps true for integer performance, but I wouldn't say the same of SIMD performance. So Barcelona could still be better in, for example, media processing, which is indeed a desktop application.

Quote
A possibly more telling source of information on the Barcelona derivatives is AMD's silence when it comes to benchmarks of single-threaded performance.

They wouldn't only be showing Specrate numbers if the single-threaded numbers were also good.
Possibly.

Posted by 3dilettante on Thursday, 31-May-07 12:46:54 UTC
Quoting dess
This is perhaps true for integer performance, but I wouldn't say the same of SIMD performance. So Barcelona could still be better in, for example, media processing, which is indeed a desktop application.
This seems possible, though Conroe and Barcelona have different advantages and disadvantages.
Conroe's peak SSE capability is twice that of Barcelona's, though this requires a pretty specific mix of instructions.
Barcelona has a number of measures that allow for theoretically higher sustained performance over a wider set of circumstances.

Per-clock, it may very well be that Barcelona is capable of a higher sustained level of performance, but it's not just per-clock that matters.
Vector performance has done well on highly-clocked architectures.
The P4 was very good at SSE.
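
The trade-off between per-clock performance and clock speed can be sketched with the usual first-order model, throughput = IPC x clock. The IPC and clock values below are hypothetical placeholders chosen only to show the shape of the calculation; they are not measurements or estimates for Barcelona, Conroe, or the P4.

```python
def throughput(ipc, clock_ghz):
    """First-order model: billions of instructions per second."""
    return ipc * clock_ghz


def break_even_clock(ipc_a, clock_a_ghz, ipc_b):
    """Clock (GHz) chip B needs to match chip A in this simple model."""
    return throughput(ipc_a, clock_a_ghz) / ipc_b


# Hypothetical example: chip A sustains 10% more work per clock than chip B.
chip_a_ipc, chip_a_clock_ghz = 1.1, 2.5
chip_b_ipc = 1.0

needed_ghz = break_even_clock(chip_a_ipc, chip_a_clock_ghz, chip_b_ipc)
print(f"Chip B needs {needed_ghz:.2f} GHz to match chip A at {chip_a_clock_ghz} GHz")
```

In this model a per-clock advantage is cancelled by an equal relative clock advantage on the other side, which is why the speed grades each vendor can actually ship matter as much as the block diagrams.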

Intel is likely sandbagging on one or two speed grades for Core2.

Posted by doob on Thursday, 31-May-07 17:10:41 UTC
The Tech Report managed to get a Phenom X4 die shot (http://techreport.com/onearticle.x/12583) straight from AMD, but no benchmarks.

Posted by Albuquerque on Thursday, 31-May-07 17:14:48 UTC
Quoting 3dilettante
Intel is likely sandbagging on one or two speed grades for Core2.
I think we can be *sure* that Intel is sandbagging on more than a few speed grades on the Yonah platform. Even the original Core Duo (non-"2" models) would overclock pretty handily; the C2Ds are seeing overclocks into the stratosphere, relatively speaking. Even on stock voltage you can see C2Ds doing more than 25% overclocks, even on the top-of-the-line Quad EE models.

As much as I'd love to have the competition to keep prices in check, I still don't think Barcelona is going to be performance-competitive with Intel's products in its release timeframe.

