The ARM, the PPC, the x86, and the iPad…

Last updated August 26, 2013 by Mahmoud Al-Qudsi

Hot on the heels of the iPad release comes news that Apple has just (very likely) purchased another processor design firm (via EDN). Intrinsity, the chip design company in question, is a designer of RISC-based CPUs and is rumored to have had something to do with the design of Apple’s new A4 processor. The A4 is Apple’s key ingredient for a smooth user experience in the much-hyped iPad.

Those keeping track of Apple’s purchases will remember that, almost exactly 2 years ago to the day, Apple bought California-based CPU designer P.A. Semi. However, P.A. Semi specializes in PowerPC-based designs – a platform that Apple abandoned almost 5 years ago now. But Apple’s most recent acquisition is directly applicable to its current needs in the hardware market, and in particular, its forays into the ARM market. In the official iPad video, Apple engineers and executives discuss their need for a custom CPU in order to let them dictate where the ooomph and power will go, and to what purposes the transistors will be biased.

With all these buyouts and different chipsets in question, it’s easy to get confused. So what is the difference between the ARM, the PPC, and the x86, and where does it matter?

The world of CPUs is a dark, deadly, and dangerous place. After all, the CPU is said to be the literal “heart” of the PC – and as such, it’s the single most heavily engineered component. Billions of dollars and man-hours have gone into the design of these various chipsets and they’ve all been researched, optimized, fabricated, and sold in order to make your computer… better.

The biggest difference between these platforms is the design dogma they follow. The x86 is a CISC architecture: Complex Instruction Set Computer. The other two (PPC and ARM) are RISC-based designs: Reduced Instruction Set Computers. What does that mean? Well, to the end user, mostly nothing. But to the CPU designers and developers, it makes a world of difference.

CISC architectures can have up to thousands of individual commands supported by the processor that can be used in machine code. Each of these assembly commands can range from a single operation to several hundred or more in length. On the other hand, RISC-based CPUs understand only a handful of different instructions, the minimum necessary to get the job done.

However, this in no way means that CISC is more powerful or that RISC is limited. The difference in the number of supported instructions is easily explained away by two factors: supported modes, and wrapper operations. All the data dealt with in any computer program is stored in the memory. But in order for the CPU to actually use any of it, it needs to place variables in super-fast (but small and limited) memory locations built into the CPU itself, called registers.

Imagine trying to run the following line of code:

z = x + y

Each of the three variables in the above example is located in the memory. But in order to carry out the operation, x and y will need to be copied from the memory to the CPU, the addition instruction carried out, and the result then copied from the CPU to the location of z in the memory.

A CISC-based CPU like the x86 would have a single instruction that – when given the address of X, Y, and Z in the memory – would do just that. But in a RISC-based CPU, the assembly code would have to explicitly spell out the individual steps: first copy x to a register, then copy y to a register, then add them together, and finally copy the result back into the memory.
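To make the contrast concrete, here is a minimal Python sketch of a toy machine. Everything in it is an illustrative assumption: the `ToyMachine` class, its `load`, `add`, `store`, and `add_mem` methods, and the register names are invented for this example and do not correspond to any real instruction set. In particular, `add_mem` stands in for the hypothetical single memory-to-memory instruction described above; real x86 arithmetic instructions generally accept at most one memory operand.

```python
class ToyMachine:
    """A toy CPU with named memory cells and a register file (hypothetical)."""

    def __init__(self, memory):
        self.memory = dict(memory)  # variable name -> value, standing in for RAM
        self.registers = {}         # register name -> value
        self.executed = 0           # how many instructions have been executed

    # RISC-style (load/store) instructions: only load and store touch memory.
    def load(self, reg, addr):
        self.registers[reg] = self.memory[addr]
        self.executed += 1

    def add(self, dst, src1, src2):
        self.registers[dst] = self.registers[src1] + self.registers[src2]
        self.executed += 1

    def store(self, reg, addr):
        self.memory[addr] = self.registers[reg]
        self.executed += 1

    # CISC-style instruction (hypothetical): one instruction, same work inside.
    def add_mem(self, dst_addr, addr1, addr2):
        self.memory[dst_addr] = self.memory[addr1] + self.memory[addr2]
        self.executed += 1


# z = x + y, spelled out as four load/store-style instructions.
risc = ToyMachine({"x": 2, "y": 3, "z": 0})
risc.load("r1", "x")
risc.load("r2", "y")
risc.add("r3", "r1", "r2")
risc.store("r3", "z")

# z = x + y, as a single memory-to-memory instruction.
cisc = ToyMachine({"x": 2, "y": 3, "z": 0})
cisc.add_mem("z", "x", "y")

print(risc.memory["z"], risc.executed)  # 5 4
print(cisc.memory["z"], cisc.executed)  # 5 1
```

Both versions leave the same value in `z`; the only difference the sketch captures is how many instructions the program has to spell out, which says nothing by itself about how long either one takes.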

At first blush, it would seem that CISC is a much better option. After all, one instruction and the entire line of code is done. But it’s not about operations, it’s about time. Sure, a RISC-based program will need to carry out four distinct operations in order to do the same, but that doesn’t mean it’ll take any longer. In fact, RISC CPUs are consistently faster than their CISC counterparts.

If CPUs were day laborers, it would make sense that CISC is more efficient. After all, a single instruction gets the job done. But, thankfully, CPUs aren’t underpaid interns, they’re over-engineered miracles. The simpler design of the RISC CPU allows it to more efficiently optimize and carry out long sequences of code. The way things are broken down into short, simple, and clear instructions lets it carry out multiple operations at the same time (pipelining) and with less effort.
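A back-of-the-envelope Python sketch shows why pipelining matters. It assumes an idealized pipeline in which every instruction passes through the same number of one-cycle stages and nothing ever stalls; the function names and the five-stage figure are assumptions chosen for illustration, not a description of any particular CPU.

```python
def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction must finish all its stages before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages):
    """Ideal pipeline: once full, one instruction completes every cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


# The four-instruction sequence for z = x + y on an assumed 5-stage design.
print(cycles_unpipelined(4, 5))  # 20
print(cycles_pipelined(4, 5))    # 8
```

Real pipelines fall short of this ideal because of stalls, branches, and memory latency, so the numbers are an upper bound on the benefit rather than a prediction.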

In fact, it’s now a universally accepted truth that RISC is better than CISC! Actually, because of how much more efficient RISC machines are than their CISC counterparts, most CISC CPUs convert their CISC instructions into RISC instructions internally, then run them!
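The internal conversion can be pictured with a small Python sketch. This is only an illustration under loud assumptions: the `decode` function, the `add_mem` opcode, and the temporary register names `t0` to `t2` are all made up here, and the decoders in real processors are far more elaborate than a lookup like this.

```python
def decode(instruction):
    """Translate one toy instruction into a list of simple micro-operations."""
    opcode, *operands = instruction
    if opcode == "add_mem":  # hypothetical complex instruction: dst = src1 + src2
        dst, src1, src2 = operands
        return [
            ("load", "t0", src1),       # copy first operand into a temporary
            ("load", "t1", src2),       # copy second operand into a temporary
            ("add", "t2", "t0", "t1"),  # register-to-register addition
            ("store", "t2", dst),       # write the result back to memory
        ]
    # Simple instructions pass through unchanged as a single micro-op.
    return [instruction]


micro_ops = decode(("add_mem", "z", "x", "y"))
for op in micro_ops:
    print(op)
```

The program still sees one complex instruction, while the execution core only ever deals with the simple steps.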

So why are we still using x86? That’s mainly because of business matters. Intel had x86, Intel had money, and CISC won out. Today, with the optimizations and internal RISC conversions that take place, CISC vs RISC isn’t really about the performance any more. It’s about the business, the politics… and the power consumption.

The complexity of the CISC datapath and pipeline means that it takes more power to get things done. Intel has worked some incredible miracles and accomplished some amazing things to get the power consumption down, ranging from dynamic scaling of the CPU clock to shutting down parts of the CPU core when they’re not in use. But x86 remains a power hog. Intel’s Atom platform was an attempt at re-engineering the x86 to make it fit for mobile devices, but at the cost of performance.
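As a rough illustration of why scaling the clock helps, a commonly used approximation for the dynamic (switching) power of CMOS logic is activity × capacitance × voltage² × frequency. The Python sketch below plugs in normalized, arbitrary values; the `dynamic_power` function name and the chosen numbers are assumptions for illustration, not measurements of any real chip.

```python
def dynamic_power(activity, capacitance, voltage, frequency):
    """Approximate switching power: activity * C * V^2 * f (normalized units)."""
    return activity * capacitance * voltage ** 2 * frequency


# Baseline: everything normalized to 1.0.
full = dynamic_power(1.0, 1.0, 1.0, 1.0)

# Assumed scaled-down state: half the clock, 80% of the supply voltage.
scaled = dynamic_power(1.0, 1.0, 0.8, 0.5)

print(round(scaled / full, 2))  # 0.32
```

Because voltage enters squared, lowering it alongside the clock saves considerably more than lowering the clock alone. The approximation ignores leakage power, which clock scaling does not reduce.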

There’s no reason that RISC can’t be used for the desktop. ARM or no ARM, there’s a plethora of RISC-based CPUs out there that can be easily adapted for desktop use. But the problem isn’t with the hardware: it’s with the software. Programs written for x86 aren’t compatible with anything else, even other CISC CPUs. That has prevented just about the entire world from trying any other platforms, mainly because Windows only supports x86 on the desktop. The last copy of Windows to support different CPU architectures was Windows NT, which shipped with versions for Intel’s x86, MIPS, Alpha, and PowerPC.

For anyone not on the Windows platform though, there’s nothing really binding them to the x86 platform. Apple chose x86 because, with PowerPC out of the running, x86 was the only viable option back in 2005. Keep in mind, just because ARM can run on the desktop, that doesn’t mean that ARM will run on the desktop: optimizations in the CPU world are always a compromise between performance and power consumption. And the current generation of ARM and other RISC-based CPUs is meant for portable equipment.

It would take some work to create a high-performance ARM CPU meant for the desktop, but that doesn’t mean it won’t happen. With Apple’s just-declared purchase of Intrinsity, it’s clear that it’s a possibility. With the tight grip Apple has over its platform and the strong hardware-software bond, it wouldn’t be too difficult to make the switch to yet another platform – after all, they did it 5 years ago and things worked out. But will they? Most likely not: it’s not exactly in their customers’ best interest, and x86 really is a decent platform. But for the myriad of mobile devices that Apple is getting itself into, x86 isn’t the key. So look forward to more ARM goodiness for your iPad and iPhone in the years to come, but your MacBook is safe in Intel’s loving hands.

31 thoughts on “The ARM, the PPC, the x86, and the iPad…”

Peter on April 4, 2010 at 12:33 am said:

Ummm… no. This was correct in 1995. Computers have evolved a long, long way since. x86 compilers do not use any complex, long-running instructions, since modern x86 processors run them very, very slowly. x86 processors have backwards compatibility with the old string operations, but this is implemented under the assumption it will almost never be used. The processor slows down tremendously when it hits one of these instructions, but the backwards compatibility uses almost zero die space (remember, a complete 80386 can fit in under 1/1000th of a Core 2 Duo), negligible power, and gives zero performance hit to the instructions that are now used.

Now, the bigger difference between instruction sets is that CISC instructions are variable-length, while RISC instructions are fixed-length. This makes the fetch/decode units for RISC processors smaller and more efficient than those for CISC processors. In practice, however, this is not where the die space and power are spent anymore, at least in desktop systems. In low-power embedded systems, this makes a bigger difference, but still not a big difference (hence the influx of x86 into high-end embedded).

Why do people who have no clue about computer architecture write articles like these?
:-( on April 4, 2010 at 1:23 am said:

Peter is now my hero, but I can’t help feeling a bit sorry for the people he disembowels.
JB on April 4, 2010 at 2:46 am said:

great piece. Thanks!
Mahmoud Al-Qudsi on April 4, 2010 at 5:25 am said:

Hi Peter,

Thanks for your comment. You’re right, of course, there’s a lot that’s gone unsaid in the article. But keep in mind, it’s only a primer on basic computer architecture. It takes a lot to put as much info as simply as has been done, and I was not about to get into fixed- vs variable-length instructions, register renaming, or any other advanced topic.

You’ll note I did make it clear that leaving x86 isn’t worth it for Mac. It is theoretically possible to do so, but not worth the effort.

But for mobile devices, I have to respectfully disagree. Atom is (very slowly) getting there, but even now after several years of R&D and several iterations, the performance/energy rating of Atoms pretty much sucks. We’re basically just seeing the power savings associated with a much simpler pipeline and datapath built around the same ISA, a smaller fabrication scale than the RISC competition (which makes a huge difference to power consumption coupled with the lower minimum voltage), and other compromises on optimizations found in desktop x86 hardware.
Pekka on April 4, 2010 at 10:49 am said:

Peter, one upside of variable-length instruction encoding is smaller code footprint. I don’t have any real-world data, but my gut feeling is that the x86 code that GCC, for example, generates is usually smaller than the ppc or arm counterparts. This means that x86 is able to utilize its instruction caches better, which results in better performance.

Mahmoud, your “z = x + y” example is misleading. The x86 instruction set doesn’t have memory-to-memory instructions because instruction encoding only allows one memory operand. Furthermore, modern x86 CPUs have been RISC at the core for a long time now so much of your arguments for ppc and arm just don’t apply.

And let’s face the facts, x86 isn’t the dominant platform because of politics but simply because AMD and Intel have been able to invest in the architecture more than the competitors, making it better than the alternatives on the mass market over the long term.
Goffers on April 4, 2010 at 11:58 am said:

With the arrival of Qt and (I think) Android Native, software writers can increasingly use platforms whose code can run efficiently on different hardware architectures.

The advantage of existing software running only on x86 cpus may quickly be eroded.

Recently there was a posting (quickly removed) on Hexus predicting the early announcement of ARM 64bit processors. Presumably they would not be aimed at mobile devices.

Interesting times.
Peter on April 4, 2010 at 4:57 pm said:

Mahmoud, I’m not opposed to simplification or oversimplification. That’s not what you did. Your argument was analogous to claiming the US is currently the world’s leading economy because of a superior rail system, slavery and cotton in the South, unharnessed land in the West, budding manufacturing in the North, and the prevalence of Christianity. Your argument was in some parts obsolete, in others misleading, and in others wrong. Virtually nothing in what you said, aside from the conclusion (ARM processors are currently more power efficient than IA32), is correct. More than half of the sentences in this article are either obsolete, misleading, or just plain wrong (I can go through line-by-line).

RISC processors led the high-end market in the 80’s and early 90’s because at that technology point, RISC had huge architectural advantages over CISC in general and IA32 in particular. Alpha, Sparc, PowerPC, MIPS, PA-RISC were more expensive but also much, much faster. There has always been and always will be a market for fast and expensive. In the late 90’s, however, Moore’s Law scaling effectively eliminated the architectural gap between RISC and CISC. By economies of scale, Intel could throw more resources at design. Whereas most design houses designed primarily in a high level language like Verilog or VHDL, Intel could have engineers hand-tweak most of the performance-critical parts of the IC. IA32 was not only a lower price point, but crushed the competitors in performance, and destroyed the high-end RISC industry.

High-end embedded is now around the same place as high-end RISC in the late 90’s. An Alpha EV68 has 15 million transistors. An Alpha PCA56 had 3.5 million transistors. An Atom has 47 million.

The major advantage of ARM, both for Apple and other embedded, is that it is an open, licensable architecture. This is huge. If I’m making a cell phone, I get much lower cost if I integrate the whole thing into one IC. I can do that with ARM. I can stick an ARM core in my FPGA (or buy an FPGA with an ARM core built in). I can modify the ARM to the task at hand (e.g. add h264 decoding). If I went up to Intel to license IA32, they would laugh in my face. If Intel did agree, it’d be obscenely expensive to engineer; remember, Intel hand-optimizes the designs. Process ports are expensive. In contrast, I can get ARM cores in nice, synthesizable HDL, and semi-automatically have it work in any process.

In terms of power/performance/cost/size, the battle will be waged largely based on engineering prowess. ARM Holdings has half a billion dollars in revenue to develop ARM with, as well as a huge community of users to help them. Intel has much more resources overall, but only throws a tiny portion of that at Atom and embedded. RISC vs. CISC won’t matter. Some of the other IA32 instruction set cruft you didn’t mention will — e.g. limited number of registers. Most of this is related to IA32 being a 25+ year old architecture with full backwards compatibility (being used as a desktop CPU), and ARM being an embedded processor with very limited need for backwards compatibility (being used in non-computer products).

On a side note, RISC is indeed taking over very low-end embedded. See Atmel (market cap: 2.3 billion) and Microchip (market cap: 5 billion) crush Zilog (market cap: 56 million), maker of the z80, once the world’s most popular 8-bit microcontroller.
Mahmoud Al-Qudsi on April 4, 2010 at 5:28 pm said:

You’re right, Peter. Perhaps I didn’t make that clear, but RISC is not inherently any better than CISC. It does have cleaner roots and originally a better implementation, but with the internal microcodes and other optimizations it doesn’t really matter.

Oh, and the engineer within me apologizes. I knew when writing this article that there would be repercussions for over-simplifying such advanced topics, but someone needs to bite the bullet and try to explain things for those that care to learn 🙂

I just re-read my article and it does come off (a bit strongly) with the suggestion that RISC is better in the here and now, but that definitely isn’t the case, or at least, not in such a black-and-white fashion.

The fact of the matter is, it’s the current CISC/RISC-independent optimizations being done in comp. architecture R&D that count. And there are a lot of ideas going around. I’m sure that if ARM threw the kind of money and manpower at its embedded devices that Intel did, we could perhaps more fairly compare the two. But I don’t doubt that Intel has optimized the hell out of the desktop x86 series, and I know for a fact that there’s real magic happening with its current lineup.

Previously with the Core (2 Duo) line and now with the i7, Intel is doing some really incredible stuff. They’ve finally got their deep pipeline issues sorted out, shrunk the manufacturing, and implemented some incredible power-saving and performance-boosting designs into their chips.

Obviously, this is the stuff that really matters, and, yes, you are right, the CISC vs RISC issue is really no longer an issue at all… just a matter of belief/policy, sort of like Emacs versus vi.

But there’s one more thing that I felt was out of the scope of this article: neither is Intel’s current x86 lineup a pure CISC system, nor is ARM’s current processor lineup a pure RISC operation. Perhaps similar to how there are no “pure” micro/monolithic kernels out in the market these days. In fact, there’s a lot of semblance between the two (CISC/RISC and Micro/Monolithic).

But one thing that is true: Intel’s x86 is hobbled by a lot of unnecessary cruft it has carried through from ages past. As you mention, a lot of this is because of their insistence on backwards compatibility (and with good reason – though it does remind me of Microsoft’s policy on backwards compatibility with Windows in a way), but a lot of it has to do with some really bad decisions back in the day, as well. But I guess that has a lot to do with the fact that CISC is a huge system with many opportunities for bad design and bad decisions.

But the culmination of these years of cruft and the early mistakes that stem from Intel’s original CISC design is that the Atom, despite all the engineering magic that Intel employs and a smaller fabrication process (neither of which the majority of the competition has), is still outperformed by, and less efficient than, its RISC cousins. We can only imagine (and perhaps, even then, not too accurately) what sort of amazing product we could have if the same research and tech went into the design of the Tegra (which is probably the 2nd-closest in terms of R&D money that went into the design, but without the years of experience in general processing unit architectures) or ARM.

I think it would be in Intel’s best interest to “deprecate” certain x86 features and introduce a mobile “x86-lite” version for the Atom. Most existing software would run without any visible impact, only certain legacy software or weird compiler output would suffer. But the resulting Atom design would be simpler and less-complicated, and perhaps could stand its own against other RISC designs.

Again, not comparing Atom against RISC because I’m saying that RISC is inherently better – just comparing Atom against RISC because that’s what the current competition is.
Peter on April 4, 2010 at 7:38 pm said:

Dude. With each response, you’re digging yourself deeper and deeper in the hole. Did you even read what I wrote? x86 has been “x86-lite” for at least a decade now. Most of the inefficient CISC instructions were deprecated a long time ago. Compilers don’t use them. They’re never seen in modern code. In contrast to the slightly dumb scheme you described, they weren’t taken out — they were implemented in a special way for backwards-compatibility where they’re very slow (under the assumption they’d never be used), but have effectively zero cost to processor power, performance, cost, or complexity.

The only places where the cruft matters are in fundamental decisions like number of registers or addressing schemes, which can only change with the introduction of a new mode (16 bit->32 bit->64 bit). While this does hurt Atom, it doesn’t hurt it very much.

Again, you’re not oversimplifying. I don’t mind simplifying or even oversimplifying. What you’re doing is posting cow droppings. It sounds like someone oversimplified things when they explained them to you, and now you’re paraphrasing that oversimplification in a way that varies from confused to incorrect to obsolete.

By the way, why do you believe more R&D goes into Atom than ARM? Intel is about 25 times bigger than ARM in terms of both revenues and market cap. ARM invests darned near close to 100% of its resources into ARM, and ARM’s clients invest tons more resources, at least some of which get piped back to ARM. Intel invests the vast majority of its resources into desktop/server/laptop processors, and aside from that has dozens of other businesses from motherboards to ethernet cards to solid state drives to wireless cards to consumer electronics. Atom is a very tiny portion of Intel. More than 4%? I don’t know, but I would doubt it.

I’d highly recommend reading a good book on processor architectures and compilers. This stuff really isn’t that hard. Computer Architecture: A Quantitative Approach is fairly reasonable for the computer architectures side.
Mahmoud Al-Qudsi on April 4, 2010 at 8:39 pm said:

With all due respect, that’s the book I learned from 🙂 It’s the book I used as a reference when I built a C++ cycle-accurate MIPS simulator for a custom multi-cycle datapath, configurable variable-depth pipeline, and a programmable cache, and the book I used to learn about modern CPU optimizations. I confess that that was as far as it went – Computer Architecture: A Quantitative Approach 4th Edition by Hennessy and Patterson. I didn’t pursue a graduate degree in Computer Architecture, so bear with me here.

May I ask what you studied/did in computer architecture?

I also confess to not knowing the inside-out of the Atom, so help me out here. I’m only trying to logic out the reasons why Atom still isn’t as efficient as the ARM machines out there.

As for R&D: let me turn the tables around and ask you a question. Why do you think it’s only money spent directly on the Atom that counts? The reason behind my claim of superior R&D by Intel in the Atom is that I think a lot of the concepts, algorithms, and refinements used by Intel in their x86 lineup from way back when until now with the i7 have been applied in some form or the other to the Atom. It’s Intel’s power play into the embedded market, which has long been a RISC stronghold (primarily by ARM (TM) products): I’m certain they went in with full force, porting a lot of ideas, concepts, and technologies from their decades of experience and experimentation to the Atom.

And I also think you’re underestimating the effect of the fabrication process. Intel’s Atom has a minimum feature size of 45nm, whereas the ARM Cortex-A9 (I *think* that’s the most advanced ARM at the moment) has a minimum feature size of 65nm. We’re comparing cats and tigers here… yet the cat is somehow out-showing the tiger.

So you can see the conundrum I’m in. My experience has been with RISC systems, and I don’t know x86 inside-out like you seem to. But I’m trying to *logically* analyze and address these shortcomings of the Atom, but your insults aren’t helping.
steve on April 4, 2010 at 9:05 pm said:

I agree with Mahmoud’s comments on x86 and Windows backward compatibility. In fact I just wrote a blog post on why Apple crushes companies in thinking through compatibility from generation to generation. Intel failed here – they had a monopoly on the desktop with x86 and made it way too complex over the years, starting circa MMX.

http://stevecheney.posterous.com/backwards-compatibility-vs-innovation-why-app

Regarding the overall argument you guys are having here, it really can be summed up with performance. Atom is a joke compared to ARM where it matters – performance per watt. I have written about this subject as well. You pointed it out – why is Intel’s 45nm technology inferior in performance per watt to ARM in 65nm?

Intel, by selling off the X-Scale biz to Marvell (right before the iPhone) missed the boat here. They gave up. x86 is not going to win in mobile. End of story. No one wants it, and Intel will dominate high end computing, but lose mobile entirely.
jerry on April 5, 2010 at 12:23 am said:

Hi, Guys

please let’s have a friendly talk about the x86 and arm stuff.

I am wondering about the performance of x86 and ARM ISA in terms of compiler optimization.
Which architecture will be better? I know the code size for x86 will be smaller and there are more choices for x86 in terms of instruction sequence. But ARM has conditional execution and Thumb mode.

Power-wise, it seems the overall system power is similar between a 1 GHz ARM and an Atom. So what do you guys think?
superdug on April 5, 2010 at 5:17 am said:

While Peter and Mahmoud argue history, I will say the following:

The ARM processor combined with a very well designed SoC can be used as a desktop machine. Case-in-point TI’s OMAP3/4 SoC.

I would direct you to the following site/project: http://beagleboard.org/

The BeagleBoard is a true “computer” built around TI’s OMAP3 SoC. I have run Ubuntu on the BeagleBoard with the same applications that I run on a netbook. Both systems are constrained by limited processing power and limited amounts of memory, but both can handle quite a few standard desktop applications and tasks.

What Apple wanted to do with the iPad was bridge the gap between a “computer” and a netbook. In this application I don’t think anyone will argue with the decision to use an ARM CPU.

There will be multicore ARM CPUs, they will begin to handle 64-bit instructions, and the gap between ARM and Intel will slowly close.

The one thing the netbook taught the market is that computers right now are powerful enough for 99% of what people think they use computers for. Support from Google through both Chrome OS and Android, along with Linux and to some degree Apple’s iPhone OS, will push the application world of ARM further.

The ARM may not replace the desktop, but there’s more and more evidence that the desktop computer is slowly losing ground. As we move towards more portable and more powerful portable systems, the chip that can give the best battery longevity will be king.
Roberto on April 5, 2010 at 5:20 am said:

I love that this topic was brought up. I work with all three CPUs and it is interesting to me which one has the brighter future.
I have to agree with both Mahmoud and Peter.
1. ARM started as an embedded, MIPS-per-Watt focused CPU, and continues in this path.
2. Intel started with MIPS-per-dollar CPUs, and delivers a lot more MIPS than ARM.
3. Intel tries to get into ARM’s market.
4. ARM banks on its experience and an architecture that was efficient from the get-go
5. Intel has the engineering might and money to optimize a retarded koala into Einstein, so x86 would be less of a problem
6. Intel has the best fabrication processes in the world
7. ARM chip sales earnings are effectively split between ARM and the licensee, but at least we don’t see TV commercials of Texas Instruments OMAP chips. Advertising can get very expensive. Only Intel has TV ads, are they still doing the Blue Man group?
8. PowerPC still lives in high end professional switches, routers, cars, and servers, mainly produced by Freescale, IBM and AMCC. It is a big mystery why those markets prefer PPC over ARM, especially since most PPCs haven’t progressed from the 603e core. The exception to that state is IBM Power7 servers.
Doctorrobert on April 5, 2010 at 5:44 am said:

Power usage will become a moot point if battery technology actually evolves. It’s been like 14 years since the last major Lithium ion battery innovation that was commercially acceptable. Now I’m not a computer scientist so you folks are a lot more qualified to answer this for me, but what happens in a theoretical situation where batteries quadruple in power due to some new tech? Does this change the movement away from ARM chips?
Roberto on April 5, 2010 at 6:21 am said:

Computers that run as long as cellphones are highly desired, and will require both higher capacity batteries and even more efficient CPUs and screens. This market wants OLED screens, but we haven’t seen those in anything other than Zunes.
Andreas Louca on April 5, 2010 at 7:23 am said:

Excellent article and intriguing discussion! Thanks
Mike on April 5, 2010 at 7:48 am said:

@peter @Mahmoud Al-Qudsi Who gives a Shit, Can a mobile ARM or Intel processor run Crysis? No.

For the record, to be clear, this whole ARM desktop thing would never happen on a large consumer commercial scale. With that statement out of the way, even if ARM desktop computers were made, commercially they won’t get anywhere until Micro$oft designs an OS around it, period. And Windows CE isn’t what I am talking about. I am talking about a full featured Win 7 desktop OS.

As far as Atom’s shitty power usage compared to ARM, Intel and AMD aren’t going to stop until they match ARM on the power usage front. It will happen, it just might take another few years, but they are serious about it and they will get there, one day. And when they do, the machines will rise.

On a side note: I wonder if the Terminator/Skynet is RISC or CISC based?
Chris on April 5, 2010 at 7:59 am said:

@Steve-

MMX was bunk but the later SWAR implementations have been FAR cleaner and much more powerful. I think you fail to realize that without SWAR we would have NO YouTube or better yet no open video standards! The primary reason RISC CPUs can handle video is because they have dedicated hardware decoders. I cannot, and likely never will be able to, use an ARM processor to encode / decode video at reasonable speed without a hardware accelerator. This means that if someone comes up with a better codec I’m stuck with what I’ve got until it’s implemented in silicon and I upgrade. Admittedly I can use most SWAR techniques on normal registers; however, I am limited to the data width of the architecture.
Further, unlike ARM a large hunk of an x86/_64 CPU is dedicated to cache management across multiple cores. In the end the entire “is ARM better than IA32” argument is total BUNK. ARM chips often run synchronous core clocks at or below memory clock! You CAN’T scale an ARM chip up to the speeds most IA32/64 chips operate at and have it retain ANY efficiency. If you did you would find that you can no longer have a chip wide synchronous clock due to distribution issues. You need inter module hardware to ensure that data clocks match up. And no direct access to memory due to latency, which results in longer pipelines and BIG issues with branches. Basically you lose ALL your performance. This is why you don’t see GPUs trying to do general purpose computing (IE Interrupts)!
I estimate that once ARM tries to reach beyond the 1.4 GHz range it will stop being RISC and start looking A LOT more like IA32 from 4 years ago (NO SSE2/3). The only reason we can get ARM chips as fast as we have is because of Dynamic Domino Logic (See Hummingbird). In reality your 1 GHz processors are NOT running that fast, just a tiny bit of logic at their core.

Thank you Dr. Dietz and David Shippy!
Mahmoud Al-Qudsi on April 5, 2010 at 9:34 am said:

@Chris: Very true. But who says that we should be doing video processing in the CPU anyway?

So long as you have a good connection between the databus and the encoding/decoding chip, there’s really no reason to stick it in the processor. It’s the same reason desktop PCs don’t have the GPU and CPU in one bundle (and I don’t even mean on-die SoC-style but actually part of the same core).

If you can have a really good and efficient CPU and another really good and efficient video processing unit, why would you want to merge them into one bloated beast?
Wouter on April 5, 2010 at 2:31 pm said:

Thank you for the interesting article and discussion!

Let me start by saying I’m not hindered by any significant amount of prior engineering knowledge on processors. Which is also exactly why I’m here to ask you guys a question.

The thing is that I’m interested in the possibility of a shift taking place in the market for basic consumer computers, involving interrelated changes in usage, software and hardware.

The changes in usage are primarily linked to the increased interest in being mobile and the focus on the internet. The change related to software is that with smartphones, and now also the iPad, the average consumer becomes less reliant on Windows. The hardware change is mainly built around the concept that ‘low performance’ hardware of today is good enough for every day tasks. This is an argument brought up by consumers who bought netbooks or nettops.

To expand on the hardware subject: Other factors that play a role are the increased interest in power efficiency (due to the interest in mobility and environment), the credit crisis (no money to buy a high end PC (which they don’t need anyway)) and internet speeds being the bottleneck.

Considering these developments, I thought ARM processors might become an interesting option for basic computers.

What do you guys think about this scenario from an engineer’s viewpoint? Is it a possible scenario that ARM processors will come to play a significant role from a technological perspective? If so, what are your expectations?

Thank you in advance!
Wouter

BTW.
The reason I’m interested in this is that I intend to do research on how developments on one level of a product are influenced by expectations on other levels of a product. For that I’m currently looking into the viability of basic consumer computers to be the product under study, with usage, software, OS and hardware as the levels of the product.
Andy on April 5, 2010 at 4:08 pm said:

The discussion about ARM desktop machines as being hypothetical is quite amusing; ARM was originally developed for the desktop machines of the now-defunct British company Acorn Computers. I remember playing a port of Quake on an Acorn RISC PC which had been upgraded with a StrongARM processor.

They saw quite a lot of use in British schools in the 1990s.
Chris on April 5, 2010 at 5:36 pm said:

@Mahmoud Al-Qudsi-

As I previously stated, the external encoder chip IS the problem ITSELF, not the lack of a good connection to it. If we moved all encoding / decoding to hardware, it would limit the people capable of implementing new codecs to hardware manufacturers.

Further, AMD IS trying to “Fuse” the CPU and GPU onto one chip. The two big reasons this has not happened are failure to meet the power budget and the lack of an open instruction set for the GPU. Even in the face of these two huge problems, AMD continues. Why? Because at the end of the day, having the GPU and CPU on the same chip uses LESS POWER than separate chips.

This raises the question: what is the total power consumption of the ARM CPU and all of the associated support processors required to make it as functional as an IA32 CPU, assuming such a thing is even possible?

@Wouter –
The ARM will likely take over the very low end GP computing market. (See Nokia N810). However, as I said, you are going to have a hard time increasing the clock rate much further on the ARM architecture as it stands. The lack of advanced multi core support isn’t helping its cause any either. It is unlikely that ARM will enter the desktop market beyond the nettop level if even there. ARM is an embedded processor, it does very well there, I don’t see that changing any time soon.

Just to be completely clear, ARM does support multiple cores, a lot like Intel supported them years ago… They essentially use a shared “north bridge”, i.e. no shared cache and limited cross-cache validation. This results in limited inter-process communication.
Mahmoud Al-Qudsi on April 5, 2010 at 6:40 pm said:

@Chris: Just a quick question. When you say “on the same chip” are we talking on the same die or in the same core?
NickF on April 5, 2010 at 10:32 pm said:

I am not a CPU expert, so please bear with me. I would like to ask: what is the main technical reason (so not economical, or political) behind the decision of Microsoft not making a RISC compatible version of either Windows or Office? How much of the code would need to be changed to simply allow compilation on a RISC chip? Or is it only a matter of ARM processor not being powerful enough?

Any insight is appreciated.
Nick
Chris on April 6, 2010 at 3:39 am said:

@Mahmoud Al-Qudsi –

I asked David Shippy basically the same question and he was unable to give me much of an answer.

My take on the whole thing is that it depends on where you draw the line between same core and separate core. There is a difference between multi-core and die sharing. Multi-core systems commonly have some level of dedicated interconnect; die-shared systems often need to communicate through an external chip (north bridge) and act as if you had two physically separate chips.

As far as I understand, CPU+GPGPU combo chips are a somewhat fuzzy area. Is the GPU its own self-contained system? NO. It will need to be fed by a core processor. Will it be as integrated as, say, the FPU? NO. It’s far too big and will likely be clocked at a slower clock rate. From the documentation, I get the feeling it will be a separate core-like thing. Think of it as a very very big FPU / SSE3 unit. You should be able to access registers and issue commands directly from code on a CPU core, i.e. new opcodes. However, this is speculation as they have yet to get Fusion anywhere close to the door, let alone out of it.

Hope this helps, It didn’t do much for me either.
Mahmoud Al-Qudsi on April 6, 2010 at 10:30 am said:

Thanks for sharing that, Chris.

Seeing as we are having problems drawing the line between the physical CPU and its on-chip components, I’ll be referring to them by their logical components.

So…

I think we’re both trying to get to the same point. There is no reason for the main CPU to be doing things it’s not meant to do. CPUs are made for general purpose computations and operations, and GPUs are made for exactly not that. See, you don’t need to limit the creation of new video/audio codecs to hardware designers: give them a programmable special purpose processor to do the trick. The GPU does just that, and all the latest generation of GPUs from both ATi and nVidia can be coded directly from the software platform. You could write a hardware-accelerated algorithm to do just about anything.

The developer chooses whether to use the general-purpose CPU or the specific-purpose GPU in the machine when writing new code. There is no reason to jam in extra operations in the CPU at the cost of lowering its performance.

Obviously though, any additional processing units (regardless of where they’re physically placed), will incur a power usage tax. And it may be more power than a behemoth like the Atom would use when both the CPU and the GPGPU are used at once, but I think it would stick to the principles of keeping the common case simple and efficient.
Mahmoud Al-Qudsi on April 6, 2010 at 3:58 pm said:

Oh, and you guys may be interested in the ARM Sparrow (Cortex A9) which has been demoed at 2GHz.

The Cortex A8 at 500MHz performed on-par with the Intel Atom at 1.6GHz; the Apple A4 is almost definitely better at 1GHz – I’m really interested in what the A9 at 2GHz will have to offer.
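
As a back-of-the-envelope sketch, the comparison above can be turned into a per-clock figure. The code takes the “on-par” claim at face value and does not verify it; the function name is made up for illustration, and the only inputs are the two clock speeds quoted above.

```python
def per_clock_advantage(clock_a_mhz, clock_b_mhz):
    """If chip A at clock_a_mhz matches chip B at clock_b_mhz overall,
    return how many times more work A must be doing per clock cycle."""
    return clock_b_mhz / clock_a_mhz


# Cortex A8 at 500 MHz versus Atom at 1600 MHz, assuming equal overall performance.
ratio = per_clock_advantage(500, 1600)
print(f"implied per-clock advantage: {ratio:.1f}x")
```

Under that assumption the A8 would be doing a little over three times as much work per cycle. Whether the advantage survives a higher clock depends on memory and pipeline behaviour, which this arithmetic ignores.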
WiseMax on April 22, 2010 at 1:48 pm said:

This article is very well written.

The core of the things is never simple, but the way it works sometimes is.

Apple, with PowerPC was perhaps happy – but all Windows users were simply put away – no wonder Windows users wouldn’t even consider a Mac!… They would have to fully switch. Same with Mac users – using a PC was the same – a full switch.
Or else you’d have to simply duplicate computers, risking Murphy’s law when you need one you’ve taken the other with you… annoying and expensive.

With the adoption of the Intel, Apple played a great move, that I am sure it is working very sweetly for them.

Take my case: Before Intel, I would never ever consider a Mac. Now I have THREE (well, two, I offered the first macBook Air to my son) MacBooks. I am happily using Bootcamp to play Windows games, synchronizing my Symbian phones with Outlook (somehow the synchronization of these devices only works with Windows and Outlook) and a few other things that the Mac does not have. On the other end, I am appreciating and enjoying how fast OSX Snow Leopard fires up, how easy it is to use and I have my fun stuff in OSX’s hands (except games).

In other words, thanks to Steve Jobs’s insight… I now have the best of both worlds in one single piece of hardware (not to mention the other gizmos I purchased from Apple after that…)

As for the phones… well, ARM or unarmed, I was disappointed with the iPhone (I have 2, 1st and 3G). Battery life and monotasking put me off.

The iPad – I can’t figure out what THAT is for… if anyone can, I’d appreciate some input. My little PC (ULV small Vaio) and my MacBook Air still work better for me – I am not (much) limited by them!

Oh, and for books, nothing will ever compare to the Kindle!

The point of my comment?… Easy: FUNCTION beats all the rest all the time (oh, and beauty is also a function ;-)) Getting 2 OS’s in full-fledged mode on the same equipment is better than having to choose. I’m happy with the Intel Mac.
Kamran Abdul Jabbar on September 20, 2016 at 10:49 am said:

Whatever the CPU, a person can only memorize and write all the assembly instructions for the 8051.

East or West, the 8051 is the best for beginners and hobbyists.

By the way, carry on with the argument; this is simply a break.
Chinarut on June 29, 2020 at 3:19 am said:

came here to help some YouTubers understand the difference between x86 and PPC – great debate going on here – you make me want to pick up my college Hennessy and actually read it 🙄😅

> Most likely not, it’s not exactly in their customers best interest and x86 really is a decent platform.

fast forward 10 years and Apple pulled it off and probably spent 10+ years crafting and executing today’s transition!

Good call on Apple focusing on a high-performance ARM – my first mac was a PPC mac mini, stretched an x86 MacBook for 12+ years and typing on an iPadPro I use as my daily driver as we speak 🙂
