x86 finds its way into the iPhone (lcq2.github.io)
377 points by isp on Sept 14, 2018 | 158 comments



The author may have let his preconceptions get in the way of reasoning a bit too much --- x86, or at least x86 compiler output, is easy to recognise in hex/ASCII mostly because you'll see things like function prologue/epilogue sequences (55 8B EC, 8B E5 5D C3) and NOP (90) or INT3 (CC) padding everywhere. ARM, MIPS, and Z80 (the other 3 I can recognise by sight) all have their distinct "textures" too.
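To make the "texture" point concrete, a minimal sketch (the filename is a placeholder, not from the article) that just counts those telltale byte sequences is often enough to flag x86 in an unknown blob:

    # Minimal sketch: count common x86 "texture" patterns in a blob.
    # The filename below is a placeholder.
    from collections import Counter

    SIGNATURES = {
        b"\x55\x8b\xec":     "push ebp / mov ebp, esp (prologue)",
        b"\x8b\xe5\x5d\xc3": "mov esp, ebp / pop ebp / ret (epilogue)",
        b"\x90\x90\x90\x90": "NOP padding",
        b"\xcc\xcc\xcc\xcc": "INT3 padding",
    }

    def x86_texture(blob: bytes) -> Counter:
        hits = Counter()
        for sig, name in SIGNATURES.items():
            count = blob.count(sig)
            if count:
                hits[name] = count
        return hits

    if __name__ == "__main__":
        with open("firmware.bin", "rb") as f:   # placeholder path
            for name, count in x86_texture(f.read()).most_common():
                print(f"{count:6d}  {name}")

Lots of hits on patterns like these is exactly the kind of "texture" that jumps out when you eyeball a hex dump of compiler output.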

this is awesome…

I'll be the first to comment on the apparently misplaced bounds check(!) in the fragment of code above it; it reads a parameter from the stack, compares it with 8A, then makes it an index into some array of 8-byte elements and reads the two values from memory before deciding whether it's valid or not --- and seems to put -1 into eax if it's not.

Not really a problem if this is running in realmode (or "unreal mode") with no memory protection (it will just read 8 bytes from somewhere in the address space, and probably ignore them), but it could crash if it was in protmode (which the lgdt in the preceding fragment suggests) set up with restrictive segment limits, and the memory address was not valid.

Then again, the check could be completely superfluous if that function would never be called with an out-of-bounds value...


Can someone explain why it's a big deal they are using an x86 chip? It seems ARM is the standard in the mobile world, but I'm not sure what the motivations might have been for the change or if this has drawbacks that make it so surprising.


There isn't a big deal. Hell, AArch32 has something like 1400 instructions, it's just as complicated as an embedded x86. And as someone who's ported a kernel to both, it has just as many weird parts of the architecture built up over decades (ARM is about as old as 32 bit x86).

The motivation for the change is that Intel's making the baseband instead of Qualcomm now, so it's not an ARM/Hexagon like it once was.

The author is just hung up on a decade-plus-out-of-date view of chip architecture.


Yeah, this is at most just kind of geeky interesting. But otherwise it has no particular meaning whatsoever, particularly in an iPhone. Apple is one of the few/only (I think Pixel too maybe? tptacek or other security folks would be up to date) phone makers to totally isolate the baseband, essentially treating it like a USB peripheral. Whether it uses USB or SDIO or PCIe, it has no DMA to the application processor (Apple uses an IOMMU, all discussed in their iOS Security Guide), so while a hack of it I guess could still be irritating purely in terms of messing with cellular access, simple location privacy, or the wider network perhaps, it's not going to inherently leverage into access to the system. The baseband has its own secure bootchain of course as well.

Additionally, given that this is a very low-power, low-level embedded device for a specific function, it seems most likely that it's also, like Intel's old Atom chips, a refined derivative of older, simpler x86 chips with in-order execution, no speculative execution, no µop transforms or other stuff like that. Would be kind of interesting to learn some of the details, but at any rate that means whole classes of security issues people are already bringing up in this thread (like Spectre/Meltdown) are entirely irrelevant. If an arch simply lacks speculative hardware, period, then that's that. Simplicity in general can have major performance costs, sure, but it also means lower energy use and better security.


Is using an IOMMU with a USB peripheral really enough? If my PC satisfies those and I plug in a random USB device am I safe?


You don't need an IOMMU with USB. USB isn't an RDMA protocol like FireWire and PCI-E, so an IOMMU doesn't really even make sense in that context.

But no, you're not safe because crappy USB drivers can probably be exploited if they're expecting the USB device to be playing nice.


You're not necessarily safe, but obviously no matter how isolated you make the cell baseband, you can't get past "crappy glue code can still bone you". The reality is that the HSIC interface between the AP and the baseband is --- conceptually, at least --- about the best you can possibly do.


Sure, I was mainly addressing "If my PC satisfies those and I plug in a random USB device am I safe?" rather than the constrained case of a known baseband and single driver.


The basic USB protocol does not allow for direct peripheral mastering of DMA (unlike IEEE 1394 or PCIe for example). So, barring a protocol exploit in the USB host controller, which is certainly a theoretical possibility given the enormous size of the specification, fencing off the USB host controller using the IOMMU is just an additional layer of protection, rather than a necessary boundary like it is for a PCIe or 1394 device which has access to memory natively.

A USB-C/USB3 physical port makes things a bit more complicated, as it also could be attached to a controller supporting Thunderbolt, which is PCIe.

The unsafe part of plugging in a USB device is probably at the OS level rather than the protocol level - in my opinion you're much more likely to be owned by something like connecting a device with a buggy or compromised driver, mounting a filesystem containing an FS exploit, or reading a file containing an application exploit than by a USB-level protocol exploit.


Right, that's why I said "but no, you're not safe because crappy USB drivers can probably be exploited if they're expecting the USB device to be playing nice."

Finding a USB protocol level exploit would be like finding an exploitable bug in how a NIC does scatter/gather of packets. Like, nothing's impossible, but damn.

You still shouldn't stick random unknown USB devices into your computer though.


Also perhaps it should be noted that if it is a USB-C connector it may be a Thunderbolt 3 port in addition to being a USB port, which may support PCI-E and allow direct access to memory by a peripheral even if there is no driver installed for that peripheral.


The drivers on your PC are the issue. The USB controller isn't going to let a USB device directly access memory, at least from my understanding of the spec.


I believe USB presents a much smaller attack surface since it has no direct memory access, apparently in contrast to SATA.


SATA doesn't allow for DMA from the target device. SATA controllers can DMA, but that's not different from how NICs and USB controllers typically DMA. The target addresses are ultimately controlled by the device on the host side. Like USB, it feels more like a traditional network protocol than a traditional bus protocol.

PCI-E does DMA, and AFAIU was inspired by Infiniband (another RDMA protocol).


My understanding is that it was easier for SATA controllers to do DMA than it was for USB controllers, but I have a fairly limited understanding.


You're not safe plugging a random USB into your computer. It can pretend to be a keyboard and install malware.


Intel making the baseband instead of Qualcomm?? That seems like a huge deal to me.


That’s mostly old news. Intel has been making iPhone basebands for several generations; the new news is that it’s exclusively Intel. Given Qualcomm and Apple’s very public fights, not surprising, but I’m curious how Intel overcame Qualcomm’s patents on CDMA. Did they expire? Did Intel somehow get a license? I’m definitely curious and will post back with what I find, assuming someone else doesn’t first.

Edit:

Guess Intel got the CDMA assets via its purchase of VIA Telecom assets in 2015. Wonder why it’s taken this long to come to market?


This ^^. The "OMG... HORROR" in this article about the mere fact of x86 usage is deeply silly. The stuff about C64 and so on at the end... there's just no redeeming this mess. The mods should nuke it.


I found it to be interesting. Don't be a gatekeeper.


So, I just want to throw out there that your arstechnica uarch articles are one of the things that pushed me into computer engineering rather than straight CS. And Inside The Machine I consider to be up there with Patterson & Hennessy. Thanks for all of that! : )


Thanks!


Hey, same here! Your articles are what inspired me to go into chip design /computer engineering. Still have my signed copy of Inside the Machine on my bookshelf :)

Thanks a ton!


Thanks for the info! Now there's a pundit I can follow and read the books he authored.


Uh, it's technically interesting. If HN consisted of just articles like this, it would be a better site.


Is it? It's 3 pages of overly emphatic text revealing that an Intel chip is based around an x86 CPU. By the author's own admission, their conclusion is: "Nothing really, I just found this funny and wanted to share".

Also I'm a bit baffled that they wrote a tool to measure the entropy of the machine code, tried hand-disassembling and considered that it might have been an encrypted binary format before guessing that an Intel chip could be running an Intel CPU. But they did try "EVERY POSSIBLE RISC ARCHITECTURE [they] KNOW" because apparently nobody ever used CISC on embedded devices. Nobody tell him about the GameBoy.

Of course I'm a bit harsh, it's easy to mock in hindsight but it's still not very interesting technically.


obviously you're not into baseband reversing, otherwise you would have known that for the past 10+ years, basebands have almost always been RISC CPUs, and almost always ARM...

moreover, all previous iterations of Intel basebands were custom ARM cores based around Infineon IP acquired by Intel to be competitive in the baseband market...you did not even read my document, because I said this about the old baseband version

moreover, by the nature of baseband itself, it requires a CPU capable of real-time or near-real-time processing; as a matter of fact, other vendors are using Cortex-R CPUs, which are ARM CPUs made for real-time OSes, giving you predictable timings, especially for interrupt processing and memory access

for example, Cortex-R gives you a special kind of memory, called TCM (Tightly-Coupled Memory), which gives you predictable memory access timings, something that you cannot obtain with a simple cache

by the way, Cortex-R is also used in WiFi chipsets, because the type of processing required is very similar (check the excellent writeup done by Google's Project Zero about this)

so yes, it is interesting to see how Intel managed to implement this kind of feature in an x86 CPU, which was never designed for such requirements

I suggest you take a look at the References in my document, they might provide some useful information on the matter

of course if you're not interested in baseband reversing, then I guess you're right, it's not technically interesting material


It's interesting that x86 is making inroads into mobile, yes.


Sorry, having learned on VAX and 68K, followed by several years of ARM, I will never lose my instinctive revulsion to the creeping unkillable multitentacled horror that is x86. :)


It was a great read, I wouldn't nuke it.


The article mentions that the old Intel baseband processor was running ARM. It seems like Intel is slowly trying to migrate all their processors to x86, given that they migrated their Management Engine from ARC to x86 a few years ago and now this.


Would you please tell me what you mean by baseband here? Are you referring to the base radio frequency? Thanks.


Baseband usually refers to the firmware running the CPU/chip dedicated to operating the radio(s), generally the WAN radio (LTE, 3G, etc.) for a cellular connection, although some are integrated with other types (WiFi, Bluetooth, NFC, etc.).


Thanks!


oh not really, as I already said, Intel basebands were ARM CPUs...iPhone was using the Hexagon platform only for CDMA versions, you can check it by downloading a random ipsw from previous years' iPhone models

they were Hexagon only for a few models (iPhone 5 and 5s I think), before that, they were using the Infineon baseband, which guess what...it's what Intel bought :)

btw, for the last 10+ years basebands were mostly ARMs, with very few exceptions (the already mentioned Hexagon), check also Mediatek and Huawei basebands

and yes, I don't like having an x86 as an embedded CPU, but that's my problem, I guess...


sure, booting in real mode, A20, EIP rollover. For real Intel wins, look no further than the Puma cable modem chipsets that switched over from ARM to x86 - 1 Gbit links can be DoSed with a 1 kb/s stream.


Intel is obviously going to use the x86 core they own so they don't have to pay license fees for embedded ARM cores


I believe that all of Intel's mobile networking products were purchased from Infineon. So it wouldn't be surprising if they used ARM until now, given that it takes time for these things to change. They even manufactured these chips at TSMC for a while.


Intel is a pretty large ARM customer, so them using ARM here wouldn't be that surprising either.


On a 200M unit shipment and an average of $0.20 per unit, that is a $40M saving per year for Intel. And of course the R&D cost of this tiny x86 (assuming it is new) can also be spread across its use in many other places.


Remember when Intel sold StrongARM division? :)


Is that the same as XScale, or something else?


I presume they mean so; they bought StrongARM, later replaced it with Intel designed IP (marketed as XScale), and sold on that team. I don't actually know that the StrongARM team and the XScale team are one and the same.


They were - I was on that team. StrongARM came to Intel when it acquired DEC. The CPU was developed thanks to an 'architecture license' from ARM. Intel renamed the processor XScale and later sold it to focus on embedded x86. https://en.wikipedia.org/wiki/StrongARM


What a brilliant decision that was /s


To be fair that was a year or two before the iPhone and ARM processors took off like a rocket. At the time it was used in PDAs and smartphones, both of which sold in pathetic numbers relative to what was about to happen.


I remember they acquired StrongARM. Why did they sell it?


Because when they were planning where to allocate their fab capacity, it made more $ense to allocate it to Cores/Xeons...


I was under the impression that x86 is not energy-efficient enough to be used on a phone. But I guess that applies to the modern variant of x86 with a quite bloated instruction set. Who knows if this version of x86 has a more restricted instruction set.


Cortex family chips take compressed Thumb instructions and translate them into ARM. PentiumPro and up does roughly the same with x86 -> uOps. There is no telling what the internal instruction representation is on these parts and how CISCy it actually is. Though Intel has been using updated low-power 486 cores as microcontrollers for the past few years, it could just as readily be an Atom.


The Quarks (RIP) had a reduced and improved x86-compatible instruction set and they were quite low powered... they really failed to catch on because Intel didn't think to put them in a friendly package to be integrated into anything - they thought teeny tiny pitch BGA was fine for the Maker community that still loves their through-hole components. (And they were kinda buggy chips, but they had mostly been shaken down by the next silicon spins.) I had kinda hoped the Quark would stick around, just so Intel could start sundowning some of those terrible old instructions and execution modes and start properly decrufting x86 - it's 2018, we don't need 32-bit Real Mode anymore, we can emulate it a thousand times on a PC and still have processor power left to play video games.

But, that being said, these things probably have a "Mobile Core" processor (i.e. an Atom), which are quite low powered still, and they probably don't run them at all that high of a clock rate either, saving more power.


What is the relationship between the instruction set and the power efficiency?


People like to get in hot fights about this all the time, and I could be taking the bait, but there is an overhead to instruction decoding and fetching that changes with the complexity of the encoding, and there is also a cost of implementing all the instructions.

Simpler instruction set means fewer gates means less power draw, with some hand waving. The ARM 1 had 25,000 transistors and the ARM 2 had 30,000. The contemporaneous Intel 386 had 280,000 and the 486 had 1,200,000. Intel had the engineering resources to design powerful chips that people bought in droves; ARM had a very small number of engineers and had to design something smaller, but as a consequence they ended up with very low power consumption, which wasn’t important until people put it in phones. Since then Intel has optimized for power consumption, but they were catching up for a long time.

In most modern CPUs, other factors dominate and instruction set is less important. Nowadays we have plenty of gates to spare and the question is how much can you accomplish at a given price and power envelope. And it’s irrelevant to talk about power consumption of a baseband processor unless you’re also talking about the power consumption of the radio, since they get used together.


"Empirical Study of Power Consumption of x86-64 Instruction Decoder"

https://www.usenix.org/system/files/conference/cooldc16/cool...

From the conclusion:

"The result demonstrates that the decoders consume between 3% and 10% of the total processor package power in our benchmarks. The power consumed by the decoders is small compared with other components such as the L2 cache, which consumed 22% of package power in benchmark #1. We conclude that switching to a different instruction set would save only a small amount of power since the instruction decoder cannot be eliminated completely in modern processors."


Yep, that's pretty much it. Back in 1985 when the 386 and ARM 1 came out, neither had any cache at all.

But I think the paper is a bit narrow in scope, relative to the discussion here. For any given application, you want to find a cheap part that can run that application. If you can run the application on an 8051 with a few K of ROM, then you can save a lot of money and reduce power consumption by switching to the 8051. If you need a powerful DSP to do some SDR for your cell phone radio, you're going to pick a different instruction set and pick a part that draws a lot more power.

I think the paper is taking as fixed the part functionality and considering how the encoding can be changed, but for a core running code that is not user accessible, the engineers are free to choose a core with functionality that suits their particular needs (which you can't do with the main CPU).

What's surprising to some people is that Intel has successfully scaled down their x86 cores so you can use them as embedded cores in larger ASICs, essentially. You can reuse a successful core design like the Pentium 4 and adapt it to a modern 14nm or 10nm process and you end up with something cheap and easy to use. 10 years ago that wouldn't have worked. Even recently it was much more common to use dedicated DSPs everywhere, but these days I feel like people are ditching the DSPs for cheap and ubiquitous general purpose CPUs and microcontrollers.


> You can reuse a successful core design like the Pentium 4 and adapt it to a modern 14nm or 10nm process

The Pentium 4 is not a technical success. It was an inflexible hot rod with a compromised architecture driven by marketing. That is why its architecture was abandoned.

The 486 started out on a 1000nm process. Shrinking it to a modern process and enhancing it with modern goodies like a larger cache is comparatively easy.


> That is why its architecture was abandoned.

I'm curious where you're getting this information... from what I understand the P4 has stuck around.


Core and up is derived from Pentium-M merged with P4 I/O.


> the engineers are free to choose a core with functionality that suits their particular needs

Doesn't Intel pay license fees on every ARM core sold?


Yes, and they also have a bunch of old x86 microarchitectures and core designs lying around which they can adapt to new process nodes and drop into random ASICs, plus the accompanying expertise. Intel had StrongARM and then XScale until they sold it off back in 2006.


The 386 had an option for external cache


Wow, I wonder how slow that would've been in practice.


I had a 386DX-33 with a 387 FPU “math co-processor” and 256 kbytes of external cache... speed is relative, and compared to 386SX machines it was amazingly fast.


It depends on the regime you're operating in. In a processor that's superscalar but in order the higher cost of decoding two x86 instructions versus decoding two ARM instructions is actually significant. But if you're reading instructions in at one or two bytes per clock tick x86's more complicated instructions make a lot of sense since they save on instruction stream size. And if you're dealing with a modern wide issue out of order machine the decoding costs are just lost in the noise.

EDIT: Oh, and load-op-store instructions make the level below superscalar a bit harder to design, at least. And x86's strong versus ARM's weak memory ordering guarantees have effects even in OoO-land, though which one is better is a very complicated issue I'm not going to try venturing an opinion on.


There's give and take here. Instruction set density is correlated with variable length instructions (which makes sense from an information theory sorta huffman encoding perspective). Even most modern RISC instruction sets designed for code density are variable length (looking at Thumb2 and RISC-V C here). So your choices are either you pay the power cost on the increased I$ size, you pay it on the decoder, or you take a hit on perf.


A more complex instruction set needs more silicon to decode, and thus is less power efficient. However, I'd imagine that Intel are pretty good at decoding x86 at this point; they've had plenty of practice, so it might balance out.


A 7nm process Pentium III is going to use almost no power and take up very little space. An 80486 would be even tinier.


Could even be a modernized 386 with cache and an integrated co-processor, effectively a 486DX without all the additional instruction support.


yes, but usually complex = variable length encoding. size is also a factor.


It's probably something like a Quark, which is basically a Pentium SoC that can leverage modern manufacturing techniques to get the die size and power consumption down.


This is just baseband. It doesn't need an Atom, hell no. It could be like a 486 core manufactured on a ten year old process and it'd still be ridiculously overpowered for the task. More likely it's a P5 core because Intel have repeatedly used that core to produce things -- Bonnell and Larrabee. Anyways it probably runs at like 100 MHz or such and consumes a few tenths of a watt. Maybe an ARM consumes less but compared to the main CPU it's negligible anyways.


>It could be like a 486 core manufactured on a ten year old process and it'd still be ridiculously overpowered for the task.

I think you vastly underestimate the amount of processing power required by modern 4G LTE computation requirements, even with a modern DSP.


It's IMHO not a big deal, but still interesting. Intel has seemingly invested less and less into their small embedded stuff in the past few years, and is known to use ARM in various parts, so it's interesting to see that they've made this switch here and are using their embedded developments in their integrated products.


Definitely interesting.. I mean a 486dx class processor with modern mfg, bigger cache and higher clocks than in the late 80's would totally sip power and be very capable... even the p2/3 designs could work well in a lot of embedded scenarios.


It's not a big deal, you just wouldn't expect that your iPhone is running x86 instructions.


[flagged]


Come on. Every Mac available for almost 15 years now runs on Intel x86 arch


My guess is simply that Intel has wanted to be on the iPhone for a long time instead of ARM and, in a very tiny way, it’s finally happened.

Just an interesting find/factoid more than anything earth shattering or important.


There was no change. Presumably Intel always used an x86 CPU for their baseband, because they own all the patents on that stuff so they don't need to pay any royalty on it.

Chances are this isn't even a true x86 CPU, just some ad-hoc smaller version of it. Wonder what cpuid returns on it :)


unlikely. intel's baseband originates from infineon which intel bought a couple of years back. Hence it probably was a pure ARM baseband until intel purchased them and insisted on using x86 cores?! very weird move, particularly as i'd assume it also still has ARM cores in the baseband, making it a mix of x86/ARM cores.


A lot of the acquired expertise is just the analog stuff, which obviously doesn't much care what the processor is.

If you're shipping one of the highest-dollar chips in the iPhone, I'm sure cutting out ARM is one of the juiciest moves possible. That would have been a very big target for them.


Illegal instruction exception, then randomly mocking the instruction stream?


x86 is considered a much more complex instruction set than arm, with multitudes more side-effects. This makes baseband firmware exploits potentially more likely.

That said, modern ARM architectures are also quite complex and calling them RISC is a stretch.


There isn't anything to really indicate that this processor has many more instructions than an 8086, befitting its role as an embedded device. I'm not aware of any difference in x86 and ARM semantics in embedded that might cause a problem.

Now, the fact that an x86 instruction stream isn't self-synchronizing can represent a danger in theory. That is, you can craft a sequence of x86 instructions such that if you start executing from 0x0000 they're safe and friendly, but if you start executing from 0x0001 you'll see a different, equally valid, stream of instructions that might do something malicious. But that doesn't seem like a credible attack vector in this case given the embedded nature of the code.
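To make that concrete, a minimal sketch using the Capstone Python bindings (assuming they're installed; the byte string is my own illustrative example, not from the article) showing the same bytes decoding as two different, equally valid streams depending on the starting offset:

    # Minimal sketch: one byte string, two valid x86 decodings.
    from capstone import Cs, CS_ARCH_X86, CS_MODE_32

    CODE = b"\xb8\x01\xc3\x05\x90\x90\x90\xc3"
    md = Cs(CS_ARCH_X86, CS_MODE_32)

    for start in (0, 1):
        print(f"--- starting at offset {start} ---")
        for insn in md.disasm(CODE[start:], start):
            print(f"{insn.address:#04x}: {insn.mnemonic} {insn.op_str}")

    # Expected, by the encoding rules: offset 0 decodes as
    #   mov eax, 0x9005c301 / nop / nop / ret
    # while offset 1 decodes as
    #   add ebx, eax / add eax, 0xc3909090

A fixed-width, aligned ISA like MIPS simply can't be re-read at a one-byte offset like this.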


Well, the disassembly in the article contains LGDT, so it is certainly a significantly more complex core than an 8086.


Not just that, the second listing used "e" registers, so 32-bit mode (and not just 8086).


Given how the i386 instruction encoding works, that is not significant: in 16b mode, mov ax, bx has the same encoding as mov eax, ebx in 32b mode (in reality it is somewhat more complex, but this is the gist of it), and thus how it gets disassembled depends on the configuration of the disassembler. One thing that would certainly point to the code being for 32b mode would be stores into control registers (eg. mov cr0, eax instead of smsw ax, which have different encodings, and slightly different effects, with the second one being somewhat nonsensical in 32b mode).
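You can see this for yourself with a minimal sketch using the Capstone Python bindings (assuming they're installed), feeding the same two bytes to a 16-bit and a 32-bit decoder:

    # Minimal sketch: 89 /r (MOV r/m, r) decodes as 16-bit or 32-bit
    # register width depending only on the decoder's mode.
    from capstone import Cs, CS_ARCH_X86, CS_MODE_16, CS_MODE_32

    CODE = b"\x89\xd8"

    for name, mode in (("16-bit", CS_MODE_16), ("32-bit", CS_MODE_32)):
        md = Cs(CS_ARCH_X86, mode)
        for insn in md.disasm(CODE, 0x0):
            print(f"{name}: {insn.mnemonic} {insn.op_str}")

    # Expected, per the encoding rules described above:
    #   16-bit: mov ax, bx
    #   32-bit: mov eax, ebx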


On the other hand, the 32b mode code in the disassembly snippet containing lgdt looks reasonable; if it were 16b code disassembled as 32b it would lead to nonsense, whereas this combines 16b and 32b instructions in a meaningful way.


Sure, lgdt implies 32 bit protected mode.


LGDT is valid even for 286-style 16b protected mode.


Oh, you're totally right.


I'll throw out there that unaligned, unintended instructions can give an attacker more options from an ROP perspective. But it's only a small benefit for the attacker. And for something like this 30MB binary it's six of one, half dozen of another practically speaking.


What does "self-synchronizing" mean in this context? I've never heard of a self-synchronizing instruction stream. Does it just mean "memory-aligned"?


No.

Memory-aligned means that instructions can only exist on certain addresses. (It also implies a minimum instruction length.) On most RISC architectures, instructions are also a constant size, with the same alignment. For example, a MIPS instruction stream consists of 4-byte instructions aligned to every fourth byte in memory. Memory alignment and constant instruction size ensure that any given instruction stream can only be executed as itself, or part of itself. Note that this is not self-synchronization - if you removed a byte from an instruction stream, you would get a wildly different instruction stream.

(I said MIPS because ARM instruction streams can be THUMB, which permits a different instruction interpretation for the same stream, and thus defeats the security advantages of memory alignment requirements.)

Self-synchronizing means that each subdivided unit (usually byte) of a stream also indicates if it is the start of a new symbol within that stream. For example, UTF-8 is an example of a self-synchronizing byte stream. In UTF-8, encoded codepoints have variable-length representations. However, the upper bits of each byte clearly indicate not only if the byte is the start of a new codepoint, but how many bytes follow to complete the codepoint. This means that a deleted or altered byte will only delete or alter the symbol it's a part of, and not any other part of the stream.

Self-synchronization and alignment requirements address the same problem - ambiguous instruction streams - by different methods. Memory alignment prohibits you from reinterpreting the stream; while self-synchronization makes doing so less valuable.
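To make the UTF-8 case concrete, a minimal sketch (the function names are my own) of how a decoder resynchronizes just by looking at the top two bits of each byte:

    # Minimal sketch: UTF-8 continuation bytes are 10xxxxxx, so a decoder
    # that lands mid-codepoint can skip forward to the next start byte.
    def is_continuation(b: int) -> bool:
        return (b & 0xC0) == 0x80

    def resync(stream: bytes, pos: int) -> int:
        """Return the index of the next codepoint start at or after pos."""
        while pos < len(stream) and is_continuation(stream[pos]):
            pos += 1
        return pos

    data = "héllo".encode("utf-8")     # b'h\xc3\xa9llo'
    # Pretend we landed inside the 2-byte 'é' sequence at index 2:
    print(resync(data, 2))             # prints 3, the index of 'l'

The analogous trick is impossible in a plain x86 byte stream: nothing about an individual byte tells you whether it starts an instruction.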


But do any self-synchronizing instruction sets exist? (And are in commercial silicon?) It seems like it would be very annoying for density and constant-encoding reasons with 8 bit code units, and none of the variable 16/32 bit encodings I can name do it either.


The PIC18 instruction set has this property. Most instructions take up a single 16-bit instruction word, but the few that take up 2 words all have 0b1111 in the high bits of the second word, which would execute as a NOP if branched into.

This encoding is presumably a consequence of the instruction set containing conditional skip instructions. That is, the skip instructions don't have to parse the instruction stream for 2-word instructions but can always just skip a single instruction word and let the second word execute as a NOP.


Ohh, thanks! Now I recall I'd heard the term regarding UTF-8 at some point but completely forgotten about it.

It reminds me very much of prefix-freeness. Is it fair to say that that's what we really need (without alignment)? It seems self-synchronizing is a bit stronger (no pun intended) but not strictly necessary to ensure you can't jump into the middle of an instruction?


Well, just for one:

Spectre affects almost all CPUs. It affects ARM CPUs.

But Meltdown only affects x86 CPUs. [1]

[1] https://en.wikipedia.org/wiki/Meltdown_(security_vulnerabili...

Maybe Intel used such an old core design in their baseband that it's not affected? Nah, that'd just be too good to be true.


Your source that Meltdown only affects x86 CPUs literally starts with "Meltdown is a hardware vulnerability affecting Intel x86 microprocessors, IBM POWER processors, and some ARM-based microprocessors.[1][2][3]" and notes that iOS needed to be patched.

Also worth noting AMD x86 CPUs were not affected. Meltdown was definitely more a "how good was your implementation" question not a "are you x86" question.


Can you describe a situation where an adversary would be able to run user code on a phone's baseband processor? This is a serious question - I don't know anything about smartphones. Do apps have access to the baseband processor? As an uninformed bystander I would think not...


It might be possible to find and exploit a buffer overflow by impersonating a cell phone tower. Not exactly trivial to pull off, but certainly not a priori impossible.


This comment is down-voted because other archs are vulnerable to Spectre/Meltdown, not just x86. True.

Yet, as of right now Intel has not fixed all Spectre/Meltdown issues in silicon. Given the R&D / QA time on a specific arch, I'd bet it is still hardware-flawed if they took an off-the-shelf in-house x86 design. Atom based maybe?

So I do not feel confident about a chip that runs the baseband, and moreover one that is not auditable by the end user (unlike a main CPU would be).

I would be super interested in knowing if the reverse engineering shows traces of retpolines.


It looks strongly like a quark, in which case it's not vulnerable to Spectre and Meltdown. That core is fundamentally a 486 and doesn't do the kind of speculation behind these vulnerabilities.


I was working as a Postdoc at Intel around 2005 when they decided to sell their X-Scale business (the ARM CPU they got when they bought DEC). My manager said that embedded x86 will make its way into smart phones. It was a long long shot back then. I admire folks at Intel for their persistence!


One thing that strikes me as highly weird is x86 code in ARM ELFs. It is usually the other way: code for some random embedded custom almost-RISC in ELFs claiming to be for i386 :)


they're not standard ELF files, more likely they're using the ELF format just to have a list of "load address-size-data" stuff assembled with some custom linker script, and they did not bother to change it, probably because of integrity checks or sanity checks along the assembly line

would have been much more fun if they switched to PE format though, like they did with EFI/UEFI :D


>"What I discovered sent a shiver of horror down my spine", "I couldn't believe my eyes", "Holy shit"

These seem like a lot of histrionics for a summary that reads:

>"Conclusions Nothing really, I just found this funny and wanted to share."


In the closed box that this chip is in, as long as it meets the power usage spec it doesn't really matter what arch it is. Not sure what the big deal here is other than some personal bias the author has against x86.


This is the first time Intel is the sole baseband modem supplier for Apple's new iPhone.

This is also the first time Intel has manufactured their baseband modem in their own 14nm fab.

This is also the first time any Intel modem has had support for CDMA / TDS-CDMA.

This is also the first time Intel has put x86 into their baseband modem. All previous modems, in the 8 years since Infineon was sold to Intel, have been ARM based.

So yes, lots of hope for this new modem. And fingers crossed Intel don't mess this up.


I wonder if the Dual SIM Dual Standby feature has anything to do with this, even as one of the minor reasons to switch to x86. Even though standby mode itself is usually the least demanding, and so it will just mean doubling of memory...

Seems very unlikely, but from a product perspective it’s one (and maybe the only one) of the new features that’s related to the baseband.


Doubtful, as Qualcomm baseband chipsets have been in dual SIM phones for ages. There are more likely two reasons:

1) Intel somehow is doing CDMA now; that’s a major reason previous generations were split between Qualcomm and Intel

2) The major reason Intel has a seat at the table is Apple and Qualcomm’s very public fight. Intel and Apple don’t have an entirely happy relationship, but it’s a far better relationship than Apple and Qualcomm’s


They also added 5G as well.


Could this be a preemptive move by Apple to produce more parts in the US to avoid the new tariffs being imposed? How many years of R&D are required before production of a new phone these days? I hope this change will result in new low-price SoC board PCs running x86 cores to compete with RPis entering the market soon.


I suspect the new baseband processors are powerful enough that you can use one part for everything, rather than a different baseband processor depending on your network, which makes things cheaper. Tariffs might be a factor but Qualcomm works with Global Foundries which has fabs both in the US and elsewhere.


Arm scares me far more than Intel or AMD. The way they license their chip designs is what's created the fragmented mobile ecosystem we have today. I remember watching an interview a while back about ARM chip technology with someone from their company, about how their revolutionary virtualization technology could be used to run any OS on their chips... then the spokesperson laughed and said how this would never happen and would instead be used by licensees to lock down their processors even further. I'm really not a big fan of ARM or the company in general. Intel does some shady things, but ARM is a whole different beast altogether. It's designed with arbitrary software lockouts in mind and their licensing scheme is not conducive to open development.


So while I am sure the author checked this, it bears mentioning that disassembling CISC is more of a black art than RISC. You can feed any binary file into a disassembler and get x86 code out, even if that code is invalid.

For example, here is a "program", except it's really a meme gif off my phone. https://imgur.com/gallery/hoDKeC9


Yeah, but it's really clear that his stuff is real x86. lgdt followed by setting up all the data segment registers followed by a long jump to the code segment is about as x86 as you can get.


Are you experienced with reading x86 assembly?

It's crystal clear that the gif from your phone is gibberish (or extremely obfuscated), whereas the code from the article is normal-looking.


I am, and I am not faulting the original author, just pointing out that you can get disassemblers to come up with x86.

From the comment right under the picture I took:

> yeah, pretty typical function prolog, what's the question ?

Except we know it is not.

I am more saying to people: be careful pushing any old binary blob through capstone without considering what it might produce. I get this at $DAYJOB, where people disassemble VAX from things that are just data.
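For anyone who wants to reproduce the effect, a minimal sketch (the file path is a placeholder): feed any non-code file to Capstone in x86 mode and you'll usually get a plausible-looking listing back, because almost every byte sequence decodes to something.

    # Minimal sketch: disassemble arbitrary (non-code) bytes as 32-bit x86.
    # "It disassembles" is weak evidence on its own; it's the surrounding
    # structure (prologues, sane control flow, lgdt/segment setup, etc.)
    # that makes the article's listing convincing.
    from capstone import Cs, CS_ARCH_X86, CS_MODE_32

    with open("not_a_program.gif", "rb") as f:   # placeholder path
        blob = f.read(64)

    md = Cs(CS_ARCH_X86, CS_MODE_32)
    md.skipdata = True   # keep going past undecodable bytes
    for insn in md.disasm(blob, 0x0):
        print(f"0x{insn.address:04x}: {insn.mnemonic} {insn.op_str}")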


Since Qualcomm and Intel are the baseband/modem manufacturers and Apple has little or maybe nothing to do with developing these x86 baseband modules, this should be mostly Intel/Qualcomm's responsibility to tighten up the security, no? It's like we can't fault Boeing for a plane crash if it's a CFM56 engine failure, right?


I don't think the CFM56 is a good analogy here. It does not appear to have been designed to fit the bolt pattern of a Cessna 172.

The main thing the x86 instruction set has going for it, is backwards compatibility. (Including the fact that there are a lot of highly optimized CPU designs around that instruction set.)


I'm buying the product from Apple. If some of their suppliers screw up, I'm blaming Apple, not Intel or Qualcomm.


They’ve been in embedded with Atom processors for a while. VIA/Centaur beat them to it with the C3, etc. I’d have assumed Intel would be in mobile eventually if they weren't already. Wonder why x86 is that surprising.

Also, the early Nokia 9000 Communicator had an x86 CPU. I think it was a 386. So this is mobile returning to x86, not going to x86 for the first time.


Until 2015 my phone was a Motorola Razr i, which used an Intel CPU with the x86 architecture, so it's not that uncommon. Also, the PS4 today uses x86 for its main CPU. It's not mobile, but comparing it to the Z80 is exaggerating a bit.


Anyone know if it’s the Intel XMM 7560? Very likely based on specs mentioned at the keynote, but won’t know for sure until the tear down next week I guess.


yes that's the model "ICE7560_XMM7560_RFDEV_UB_FLASHLESS" :)


Reading the spec sheet on it has me in awe of the antenna designer. This chipset claims to be able to simultaneously tune in on 850 / 900 / 1500 / 1700 / 1800 / 1900 / 2200 / 2800 / 3500 / 5000 MHz. Having gone through the “black art” of antenna design for just a few of those frequencies before, I can’t imagine trying to cover all of them well, but I also know if anyone does it well, it’s XX’s team (not outing the person as I don’t know if it’s well known who leads that team at Apple).


Can someone recommend a good resource to analyze entropy in a file as the author discussed?


The easy way: compress it with whatever you want (xz? gzip?) and compare the sizes. If it doesn't get significantly smaller, it's probably encrypted or already compressed.
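Something like this works as a quick first pass (a minimal sketch using Python's zlib, which is roughly what gzip uses; the path and the rule-of-thumb thresholds are mine):

    # Minimal sketch: estimate whether a blob is "high entropy" by how well
    # it compresses. Encrypted/compressed data barely shrinks; code and text do.
    import zlib

    def compression_ratio(blob: bytes) -> float:
        return len(zlib.compress(blob, 9)) / len(blob)

    with open("firmware.bin", "rb") as f:   # placeholder path
        ratio = compression_ratio(f.read())
    print(f"compressed to {ratio:.0%} of original size")
    # Rough rule of thumb: well under ~70% suggests code or text; close to
    # 100% suggests already-compressed or encrypted data.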


binwalk -E

Will show a graph of the file offset vs Shannon entropy.
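If you'd rather not install binwalk, the underlying calculation is only a few lines (a minimal sketch; the window size and path are arbitrary):

    # Minimal sketch: per-window Shannon entropy in bits/byte, which is what
    # binwalk -E plots against file offset. ~8 bits/byte means the bytes look
    # uniformly random (encrypted/compressed); machine code sits noticeably lower.
    import math
    from collections import Counter

    def shannon_entropy(window: bytes) -> float:
        counts = Counter(window)
        total = len(window)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    def entropy_profile(blob: bytes, window: int = 4096):
        for off in range(0, len(blob), window):
            yield off, shannon_entropy(blob[off:off + window])

    with open("firmware.bin", "rb") as f:   # placeholder path
        for off, ent in entropy_profile(f.read()):
            print(f"0x{off:08x}  {ent:.2f} bits/byte")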


read "A Mathematical Theory Of Communication" by Shannon himself, it's the only source you will ever need, it's available for free


Much ado about nothing - the Intel-furnished modem on the iPhone uses x86 instructions!!!


the news is that intel has competitive modems .. would have never expected that


to me the news is: intel has competitive modems that use intel instructions


Were they used often in recent years, regardless of the arch of their modem chip? It's been a long time since I heard their name.


Yes, Intel has supplied a baseband chipset for iPhones for a while now (at least since the iPhone 7, can’t remember if earlier), just not exclusively.

Edit:

And as the original iPhones used Infineon chips (which Intel acquired), depending on your perspective, you could twist it and say they’ve been there since the beginning. Bit disingenuous to me, but I could see someone making that claim.


Well it's a battery powered phone. Decoding x86 instructions and converting them to the processor's internal microcode, unless it's not really the full x86 ISA, is not energy efficient. This means there may be a noticeable battery life difference between the GSM and CDMA versions of the new iPhone.


> Decoding x86 instructions and converting them to the processor's internal microcode, unless it's not really the full x86 ISA, is not energy efficient.

That's a very antiquated view of modern CPU design. Unless you're designing a coin-cell-powered CPU, the x86 decode stage is effectively free and can be disregarded. Besides: most ARM CPUs (and nearly every other modern RISC CPU) do the same thing, decoding the instructions to internal micro-ops. x86 variable-length instructions vs RISC multiple instructions can be thought of as different instruction compression schemes, and which one treats your L1-I cache better depends on workload.

CPU ISAs are effectively an ABI for hardware. Except for the extremely low end, no one directly executes the ISA anymore and hasn't for years.

I suppose if anyone were designing new ISAs they'd design them with that assumption, to avoid baking-in temporary implementation details they'd regret later (see: branch delay slots).


> I suppose if anyone were designing new ISAs they'd design them with that assumption, to avoid baking-in temporary implementation details they'd regret later (see: branch delay slots).

Yep. See RISC-V's decision making process.


So does IBM Power.


There's some gross dark corners in Power. Their page table format is the stuff of angry elder gods. Additionally the shared CR register is a complication on OoO cores.


Those benefits are really overstated. Even the slightly bigger RISCs will generally convert into internal micro-ops for various reasons. You can even see this openly in the RISC-V BOOM HDL.


All the new phones use the Intel modem...


numbers please


This is literally the reason all mobile devices run on some variant of ARM or MIPS, and why UMPCs from the 00s never really caught on.


There are some Android devices that use Intel Atom, so it's not quite right to say that "all" devices use ARM or MIPS.


Wait, what? Where is this article suggesting conversion of x86 to ARM assembly prior to execution? Stop making things up. Unless otherwise proven, it's an x86 executable running on x86 Intel cores within the modem or related compute.


I don't read anything of the sort in his comment.


Then you don't read right. "Decoding x86 instructions and converting them to the processor's internal microcode ..."


Modern x86, for the past 25 years or so, doesn't implement the instructions directly. There's an internal, intermediate, proprietary reduced instruction set. So outwardly it's a CISC processor, with variable-length instructions and a lot of different instruction modes, but internally it's not.


And even before then they decoded in microcode. That goes all the way back to the 8086, which had five banks of microcode IIRC.


Speculation about how an embedded x86 works. Just because their desktop-focused, non-power-optimized products do something doesn't necessarily mean their embedded line-up does the same.

FYI - Apple is a hard-driving customer; they do not compromise on power, performance, or cost/pricing.


Every x86, from 8086 on, has been microcoded.


Well, you're up against basic physics here: bigger die required to decode full x86 ISA, more power, more heat.


But a 486DX with more integrated cache could be incredibly efficient with modern manufacturing processes. A modern i5-8250 uses under 15W of power... we're talking about something with less than 1/1000 the complexity.


The parent comment never mentioned ARM. Modern x86 CPUs translate instructions to non-x86 internal microcode.


But what core? Intel Atom?

And does this mean it can run Doom?


Certainly not as big as an Atom, that would be an application core. Possibly a Quark[1] or possibly something custom cut down even further.

[1]https://en.wikipedia.org/wiki/Intel_Quark


I did lots of coding for x86 assembly in college, so this article was interesting. Actually, it was fun reading it.



