Opinion: Silverthorne fails but PowerVR impresses (+Montalvo trouble)

Wednesday 02nd April 2008, 03:24:00 PM, written by Arun

All the Silverthorne information you'll ever want is now available in articles from The Tech Report and AnandTech - but while the coverage is decent in terms of architecture, they both miss the mark in terms of market dynamics. And in other news, Montalvo looks like it's in big trouble...

First, the Silverthorne core. If Intel's management and marketing personnel had bothered to properly analyze the handheld market rather than trying to figure out how to impress the press with empty rhetoric, they'd have concluded they shouldn't even bother. Silverthorne is a good design for UMPCs/MIDs/Ultraportables, but that's about it really. Any mobile phone manufacturer or carrier which is seriously considering making designs based on this architecture should seriously reconsider its strategic planning process.

Intel claims that the 1.6GHz Silverthorne is 4.1-6.5x faster than an ARM11 400MHz core at Internet Browsing. Great - too bad that doesn't seem to be much faster, if it's even faster at all, than 40nm Cortex-A9 implementations coming out in the same timeframe as Moorestown (which is the first chip based on the Silverthorne architecture that Intel will try to sell in the smartphone market). And with a TDP of 2W, it would draw 4-6 times more power at peak, and a similar multiple on average. If you compare against slower versions of Silverthorne (and presumably Moorestown), Intel's chip will actually be substantially slower and still take more power.

Let's put it this way: I'm not impressed. The fact they went through so much effort on the architectural side of things (going back to CISC etc.) to improve power efficiency is even more telling of the x86 architecture in general. Even the 800MHz variant is very underwhelming in terms of power efficiency. This kind of problem won't stop the Intel marketing department from hyping this to infinity and back, though, by using the most ridiculous of metrics; their 'average' power consumption is measured with the chip sitting in its deepest sleep state (C6) 80-90% of the time. Errr, yeah, sure, why not - but that's not very comparable to anything else, now is it?

Shifting gear completely, Intel also revealed much more information on the Poulsbo chipset. There's good news, and then there's bad news. The latter first: it's on 130nm. Yes, you read that right, the memory controller, 3D and video cores are on 130nm. Ugh. But then there's the good news: Intel is using PowerVR IP for the 3D & Video cores, and the video decoding capabilities and power efficiency are incredibly impressive and beat everything else in the market despite the process node. 120mW for H.264 Main Profile @ HD! One day, one day, I'll figure out why so much of the best semiconductor technology is designed in the UK (CSR, PowerVR, Icera, PicoChip...)

This roughly matches PowerVR's claims of 30-50mW @ 90nm (presumably pre-layout) for H.264 High Profile @ HD. Now, given that Moorestown will have the video core in 45nm, it does look like it will, ironically, have one of the lowest-power video decoders in the industry despite not having a power-competitive processor. So much for benefiting from Intel's traditional strengths! We look forward to seeing how competing SoCs on the 40nm process node will compare. And obviously, we'd like an independent party to be able to verify Intel's claims on this subject.

On a very slightly related note, it looks like x86 start-up Montalvo is in big trouble and may be acquired by Sun - if so, this will make it substantially less likely for Asymmetric x86 to become mainstream. This is positive for GPUs, as it may make GPGPU more attractive for consumer workloads (as opposed to many-core x86) - however, in NVIDIA's specific case, it possibly removes a safety net and puts their x86 integration strategy at the mercy of VIA and a small number of (much leaner) stealth-mode startups.

I am honestly surprised NVIDIA doesn't seem to be bailing out Montalvo (cash infusion, not acquisition); I certainly hope they've thought that through and are sufficiently confident this course of action is in their best interests. It is difficult for us to say whether it is without more real data, but it is a very complex strategic discussion of which I obviously am not part. And in either case, we'll see how this all works out in the next few years - in the meantime, I wish the best of luck to everyone at Montalvo during this difficult period.

UPDATE: Made it clear that I'm referring to the Silverthorne architecture in general and all of its derivatives, not just the specific chip announced today which I know is not aimed at the handheld market. However, future derivatives will be and those are the chips which I claim will not be competitive at all there.



Tagging

intel, silverthorne, powervr, montalvo


Latest Thread Comments (29 total)
Posted by Arun on Thursday, 03-Apr-08 14:49:03 UTC
Quoting DavidC
This is where the Cortex presentation pulled the Dhrystone 2.1 figures from:
http://homepage.virgin.net/roy.longbottom/dhrystone%20results.htm
Cheers, that's interesting! :)

Quote
I would take the benchmarks with grain of salt, unless some are willing to believe the 4 A9 cores will be faster than the Core 2 Duo.
And why not? Remember those scores are for a SINGLE core in the Core 2 Duo case and FOUR cores in the Cortex-A9 case. Also remember that the Cortex would benefit from having what's basically an Integrated Memory Controller and much lower relative clock speeds, so memory latency is less of a problem. This also makes more advanced forms of OoOE less necessary IMO...

Now, I do think that Dhrystone likely isn't incredibly representative of real-world performance, but if the question is whether a 4x1GHz Cortex-A9 should beat a 1x2.4GHz Conroe, my answer would be that it should. Remember that ILP extraction is a game of diminishing returns, especially in terms of integers, and that the x86 ISA certainly doesn't make things any easier for Intel...
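For a rough feel of what that comparison implies, here is a minimal sketch using the 2.3 Dhrystone/MHz Cortex-A9 estimate quoted elsewhere in this thread. It assumes the four cores scale perfectly (Dhrystone run as independent copies), which is an idealization, and it deliberately does not plug in any figure for Conroe; it only computes the per-MHz score a single 2.4GHz core would need to break even. The constant and function names are made up for illustration.

```python
# Estimate from the thread (read off ARM's graph), not a measured figure.
A9_DHRYSTONE_PER_MHZ = 2.3
A9_CORES = 4
A9_CLOCK_MHZ = 1000

CONROE_CLOCK_MHZ = 2400  # single core, as in the comparison above


def aggregate_dhrystone(per_mhz, clock_mhz, cores):
    """Assumption: throughput scales perfectly with clock and core count."""
    return per_mhz * clock_mhz * cores


a9_total = aggregate_dhrystone(A9_DHRYSTONE_PER_MHZ, A9_CLOCK_MHZ, A9_CORES)

# Per-MHz score one 2.4GHz core would need to match the four A9 cores.
conroe_break_even = a9_total / CONROE_CLOCK_MHZ

print(f"4x1GHz Cortex-A9 aggregate: {a9_total:.0f}")
print(f"Break-even for 1x2.4GHz:    {conroe_break_even:.2f} per MHz")
```

Under these assumptions, the single core would need a little over 3.8 Dhrystone/MHz just to tie, which is the sense in which the four-core result is not implausible.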
Quote
Here's Intel's numbers for Silverthorne against A8:
http://pc.watch.impress.co.jp/docs/2008/0402/kaigai432.htm
EEMBC Suite v1.1(compared to ARM 11 400MHz)
Cortex A8 600MHz: 3.3x
Cortex A8 1GHz: 5.4x
Intel Atom Z510 1.1GHz: 6.8x
Intel Atom Z530 1.6GHz/w HT: 13x
Once again that's very interesting, thanks. Hmm, let's scale the Z510 result to 1GHz, so it becomes 6.2 - now, if the A8 has a Dhrystone score of 2/MHz and the A9 has a score of 2.3/MHz, that would give us a performance result of... 6.2 again! So the ILP between Silverthorne (without IMC) and the Cortex A9 (with IMC) seems comparable.

I would actually expect the A9 to have a 10-20% ILP advantage, so that's slightly more positive for Intel than I thought. Of course I wouldn't be surprised if, in real-world scenarios, the A9 was more than 15% faster than the A8... Once again though, perf/watt for Intel's core will be massively lower than that of A9 cores on 40nm SoCs.
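The back-of-the-envelope scaling in this reply can be reproduced as a short sketch. It uses only the EEMBC and Dhrystone/MHz figures quoted in the thread, and rests on two loud assumptions: that scores scale linearly with clock speed, and that the A9/A8 Dhrystone ratio carries over to EEMBC. The function names are hypothetical.

```python
# EEMBC Suite v1.1 scores relative to a 400MHz ARM11, as quoted above.
Z510_SCORE = 6.8        # Intel Atom Z510
Z510_CLOCK_GHZ = 1.1
A8_SCORE_AT_1GHZ = 5.4  # Cortex-A8 at 1GHz

# Dhrystone/MHz estimates from the thread, used only as a scaling ratio.
A8_DHRYSTONE_PER_MHZ = 2.0
A9_DHRYSTONE_PER_MHZ = 2.3


def scale_to_clock(score, clock_ghz, target_ghz=1.0):
    """Assumption: the score scales linearly with clock speed."""
    return score * target_ghz / clock_ghz


def project_a9(a8_score):
    """Assumption: the A9/A8 Dhrystone ratio also applies to EEMBC."""
    return a8_score * A9_DHRYSTONE_PER_MHZ / A8_DHRYSTONE_PER_MHZ


z510_at_1ghz = scale_to_clock(Z510_SCORE, Z510_CLOCK_GHZ)
a9_at_1ghz = project_a9(A8_SCORE_AT_1GHZ)

print(f"Z510 scaled to 1GHz:         {z510_at_1ghz:.1f}x")
print(f"Projected Cortex-A9 at 1GHz: {a9_at_1ghz:.1f}x")
```

Both values round to 6.2x, which is where the "comparable ILP" conclusion comes from; neither assumption is likely to hold exactly on real workloads.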

Quote
You know how IGP in Poulsbo performs?? It also says there: "Intel told us to expect a 3DMark '05 score around the 150 point mark."
I will refrain from judging the SGX IP based on that number since it's on 130nm and yet still quite small, so it wouldn't exactly be fair... :)
_*EDIT*_: According to my calculations based on public pictures, Poulsbo is 146mm² and the '3D' part of it is 42.5mm² (once again, on 130nm).
_*EDIT2*_: Oh and this also isn't very impressive: http://pc.watch.impress.co.jp/docs/2008/0402/kaigai01_10.gif

Posted by crystall on Thursday, 03-Apr-08 16:40:27 UTC
Quoting Arun
Yes, that was also my conclusion after studying the documentation a bit... :) This presentation is interesting, especially Page 22: http://www.jp.arm.com/event/pdf/forum2007/t1-2.pdf

Based on that graph, I estimated the Cortex-A8 to have 2.0 Dhrystone/MHz while the Cortex-A9 has 2.3 Dhrystone/MHz. ARM11 MPCore delivers 'only' 1.2 Dhrystone/MHz. Heck, Page 23 is also really impressive assuming those numbers are real and not too creatively selected. I'd be very very interested in the same programs being run on, say, Yorkfield or Barcelona... (btw, I'm not saying Dhrystone is the most representative benchmark around, it likely isn't, but it's the only real datapoint we have sadly!)
Dhrystone is really not representative at all of real world performance, it's basically unaffected by branches and memory latencies. The difference in IPC between the A9 and the A8 in non well-behaved codes will be much larger than what you see there.

Quote
_EDIT_: Oh, and I'm also not convinced Cortex-A9 would be clocked substantially lower than the A8 for a given process; maybe 10% or so though, I wouldn't exclude that. Not that we'd know anyway since I don't think anyone will synthesize it for 65nm, TBH...
Well, let's put it this way, the A9 is likely to clock significantly lower than what the A8 was supposed to. But only slightly lower (or not at all) compared to real A8 silicon ;)

Posted by INKster on Thursday, 03-Apr-08 17:27:31 UTC
Early "Moorestown" reference motherboard surfaces: http://www.engadget.com/2008/04/03/video-intel-reveals-moorestown-pc-motherboard-possibly-worlds/
I especially like this photo, as it gives us a sense of scale of a hypothetical smartphone built on it:
Image: http://img232.imageshack.us/img232/2337/moorestowninhand7c1269qc5.jpg (http://imageshack.us)

Posted by Arun on Thursday, 03-Apr-08 18:25:34 UTC
Quoting crystall
Dhrystone is really not representative at all of real world performance, it's basically unaffected by branches and memory latencies. The difference in IPC between the A9 and the A8 in non well-behaved codes will be much larger than what you see there.
Thanks for the tip - I didn't know it didn't even have any branching, ouch.
INKster: Hmmm, that *is* quite small, nice. Although once again footprint isn't what I'm worried about, what I am worried about is power.

Posted by INKster on Thursday, 03-Apr-08 18:43:08 UTC
Quoting Arun
INKster: Hmmm, that *is* quite small, nice. Although once again footprint isn't what I'm worried about, what I am worried about is power.

Seeing what Intel can do with mature 65nm process Core 2's (Macbook Air, and Core 2 ULV's come to mind), I wouldn't dismiss a mature 45nm "Moorestown" just yet.
More competition can only be good for consumers, even if handheld x86 is still just an "interesting" proposition for now.

Of course, I also agree with you that ARM-based products "are" very deeply entrenched in the handheld market right now, so I assume that this battle will be similar to the one Intel is facing with Itanium in the big-iron server market.
Being able to run Linux is just not a huge differentiator anymore, and Windows is practically out of the question due to interface and power/performance issues.

Speaking of ARM, because Nvidia stated at the APX 2500 launch that they intend to be very aggressive with product launches, how would they handle the Cortex A9 architecture and integrate it (if at all) with their in-house designs based on ARM11?
This would put them yet again head-to-head against several "Intel's" all at once. ;)

Posted by Arun on Thursday, 03-Apr-08 18:53:15 UTC
Quoting INKster
Speaking of ARM, because Nvidia stated at the APX 2500 launch that they intend to be very aggressive with product launches, how would they handle the Cortex A9 architecture and integrate it (if at all) with their in-house designs based on ARM11?
Hmm? My understanding is that this is NVIDIA's handheld roadmap. Please note that this is NOT based on ANY information from NVIDIA except old presentations and thus should NOT be considered of ANY competitive value or of sufficient reliability to be taken as anything else than my own speculation:
- 3GSM 2008: 65nm ARM11 High-End
- 2H08: 65nm ARM11 Derivatives
- 3GSM 2009: 40nm Cortex A9 High-End
- 2H09: 40nm Cortex A9 Derivatives
- 3GSM 2010: 32nm High-End
- 2H10: 32nm Derivatives
- 3GSM 2011: 32nm Half-Node High-End
- 2H11: 32nm Half-Node Derivatives
Quote
This would put them yet again head-to-head against several "Intel's" all at once. ;)
Indeed, they are part of the couple of companies competing directly against Moorestown and its future derivatives.

Posted by crystall on Thursday, 03-Apr-08 20:25:00 UTC
Quoting Arun
Thanks for the tip - I didn't know it didn't even have any branching, ouch.
Actually that's not true, I should have put it another way: it has branches and other things which were supposed to be representative at the time (late '80s) but a modern compiler makes mincemeat of it. Besides, it's got a minuscule footprint (thus no memory effects), it has lots of compile-time constants (which lead the compiler to turn many parts into specialized code) and finally ARM supports conditional execution which - coupled with the fact that the benchmark is also very small - can still remove quite a few of the branches that survive the optimizations.

Posted by randomhack Web page "benchmark" details? on Friday, 04-Apr-08 04:06:19 UTC
Well, if Intel wants it to be a benchmark, then it should be treated like a benchmark.
Does anybody know the browsers involved in the webpage render benchmarks?
And the amount of RAM on the devices used for the comparison?
Does anybody have a link to more info on it?

Posted by Ailuros on Saturday, 12-Apr-08 21:42:56 UTC
OT:
Quote
One day, one day, I'll figure out why so much of the best semiconductor technology is designed in the UK (CSR, PowerVR, Icera, PicoChip...)
Probably because they drink a lot of tea :D

Posted by Arun on Sunday, 13-Apr-08 10:41:06 UTC
Quoting Ailuros
Probably because they drink a lot of tea :D
Ohhh! I just drank my first cup of tea in a while this morning, guess I should make that more of a habit then! :D

