The issue for me is that ARM is also really old now. I mean, just look at the ISA Apple has to use to run macOS on it: it's littered with NEON extensions and more cruft than you can shake a stick at. Simply put, Apple's implementation of ARM is decidedly CISC. On top of this, I'm still dumbfounded by the fact that they didn't go for a chiplet design where ARM could truly shine: if Apple had gone the chiplet route, the M1 could have had a much higher IO ceiling and might have a shot at addressing more than 16 gigs of RAM.
Apple has a much bigger issue, though. ARM doesn't scale: it's a fundamental conceit of the architecture, one that a lot of people are probably willing to take on a laptop that will mostly be used for Twitter and YouTube. This presents issues for the rest of the market though, and it will be fascinating to see how Apple retains their pro userbase while missing out on the high-performance hardware sector entirely.
I think x86 is pretty terrible too, if it's any consolation, but really it's the only option you've got as a programmer in the 21st century. I hopped on the Raspberry Pi bandwagon when I was still in middle school; I grew up rooting for the little guy here. Looking out on the future landscape of computer hardware, though, I really only see RISC-V. ARM is an improvement on x86, but I don't think it's profound enough to make people care. RISC-V, on the other hand, blows both of them out of the water. On consumer hardware, it's able to accelerate pretty much any workload while sipping a few mW. On professional hardware, you can strap a few hundred of those cores together and they'll work together to create highly complex pipelines for data processing. On server hardware, it will probably move like gangbusters. Even assuming that cloud providers pocket half the improvements, a 5x price/performance increase will have the business sector racing to support it.
So yeah, it is a pretty complex situation. Apple did a cool thing with the M1, but they have a long way to go if they want to dethrone x86 in its entirety.
As far as I understand it, “CISC” doesn’t mean “has a lot of instructions”, it means the individual instructions are themselves complex/composable/expressing more than one hardware operation. For instance, on x86 you can write an instruction like ‘ADD [rax + 0x1234 + 8*rbx], rcx’ that performs a multi-step address calculation with two registers, reads from memory, adds a third register, and writes the result back to memory — and you can stick on prefix bytes to do even more things. ARM doesn’t have anything like that; it is a strict load/store architecture where every instruction is fixed-width with a regular format and either accesses memory or performs a computation on registers.
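To make that concrete, here's a rough sketch of what that one x86 instruction costs in AArch64 terms (register assignments are purely illustrative: assume x0 holds the base pointer, x1 the index, and x2 the value being added):

    ; x86-64: address calculation, load, add, and store in one instruction
    add     [rax + 0x1234 + 8*rbx], rcx

    // AArch64: the same work as separate fixed-width instructions
    mov     x9, #0x1234              // materialize the displacement
    add     x9, x0, x9               // x9 = base + displacement
    ldr     x10, [x9, x1, lsl #3]    // load from [x9 + index*8]
    add     x10, x10, x2             // add the third register
    str     x10, [x9, x1, lsl #3]    // store the result back

(A modern x86 core cracks that one instruction into similar micro-ops internally; the difference is in what the ISA lets you encode in a single instruction, not in what the hardware ultimately executes.)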
Specialized instructions like the AES/SHA hardware primitives, or the FJCVTZS “JavaScript instruction”, don’t make a processor CISC. They all encode trivial, single-cycle hardware operations that would otherwise be difficult to express in software (even though they may be a bit more specialized than something like “add”, they’re not any more complex). x86 is CISC because the instruction encoding is more complicated, specifying many hardware operations with one software instruction.
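The FJCVTZS case is a good illustration: the whole “JavaScript conversion” is still just one register-in, register-out operation (the registers below are arbitrary, chosen only for illustration):

    // AArch64: double in d0 -> int32 in w0 with JavaScript's ToInt32 wrap-around semantics
    fjcvtzs w0, d0

Without it, a JS engine has to emit a convert plus range checks and fix-ups for out-of-range values; the instruction itself is still a single, simple register-to-register operation, not a multi-step memory/ALU sequence like the x86 example above.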
I’m not exactly sure what all the “cruft” is in ARM that you’re referring to. The M1 only implements AArch64, which is less than 10 years old and is a completely new architecture that is not backwards-compatible with 32-bit ARM (it has been described as being closer to MIPS than to arm32). NEON doesn’t strike me as a good example of cruft because SIMD provides substantial performance gains for math-heavy programs, and in any case 10 years of cruft is much better than 45.
I’m curious as to why RISC-V is different or better? I don’t know much about RISC-V — but looking at the Wikipedia article, it just looks like a generic RISC similar to MIPS or AArch64 (and it’s a couple years older than AArch64 as well). Is there some sort of drastic design difference I’m missing?
The only advantage I’ve heard put forward for RISC-V on single threaded applications is the existence of compressed instructions - which could reduce cache misses albeit at the expense of a slightly more complex decoder. I’m a bit sceptical as to whether this is material though as cache sizes increase.
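For reference, here's roughly what that looks like in practice; the “C” extension just adds 16-bit encodings for the most common operations, and the assembler picks the short form automatically when the operands allow it:

    add   a0, a0, a1      # base RV32I/RV64I encoding, 32 bits
    c.add a0, a1          # RVC encoding of the same operation, 16 bits

So it's purely a code-density (and therefore I-cache) win, at the cost you mention in the decoder.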
Of course the flexibility of the RISC-V model allows approaches such as that being pursued by Esperanto [1] with lots and lots of simpler cores.
ARM had THUMB, which definitely improved performance back in the GameBoy days — but they dropped that with AArch64, so presumably they decided it wasn’t beneficial anymore.
> On top of this, I'm still dumbfounded by the fact that they didn't go for a chiplet design where ARM could truly shine: if Apple had gone the chiplet route, the M1 could have had a much higher IO ceiling and might have a shot at addressing more than 16 gigs of RAM.
Remember that the M1 is just a mobile SoC meant for the iPad and MacBook Air. It's exceptionally good, so people tend to assume the M1 is targeted at the higher end. 16GB max is fine for a mobile SoC in 2021. I can't wait for the M1X.
If you don't think ARM can scale any further, why do you think x86 can? They could easily double all the specs in the "M2" and slap two or more of them into a Mac Pro.