
And there are people who say the x86 ISA is just fine, that such kludges are absolutely normal...

I get that x86 is immensely successful, but being popular is not synonymous with being well engineered. Or engineered.




Right, because ARM cores exhibit completely predictable behavior with, say, assignments to IP or instruction predicate bits?

All architectures are insane. MIPS, probably the epitome of a minimal/clean architecture, still has a vestigial 1980s pipeline stage (the branch delay slot) baked right into the ISA such that it can never be removed.

Is the x86 ISA a big mess? Yeah. But ARM is hardly better (anyone remember Jazelle?). Basically, if you don't want to look at the assembly, stick to your compiler. That's what it's there for. But don't pretend that this stuff isn't present everywhere -- real engineering involves tradeoffs.


As someone who's fought for months with the ARM Cortex-A9 cache prefetch behaviour and the behaviour of the various memory attributes (not intuitive and only partially documented), I wholeheartedly agree with you.

These are bits of hardware that are supposed to run at GHz speeds. They might look clean enough on the surface, but when you dig deep enough, "hic sunt dracones".


x86 is great. You'll absolutely see kludges on RISC architectures (e.g. anything having to do with loading a 32-bit or 64-bit constant into a register). Then you have branch delay slots on MIPS, instruction alignment issues on certain PPC processors (G5), etc.
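To make the constant-loading kludge concrete: a fixed-width 32-bit instruction has no room for a full 32-bit immediate, so on MIPS, for example, the constant gets built from two 16-bit halves with `lui` followed by `ori`. Here's a rough Python model of that two-instruction sequence. The function names are made up for illustration and only mimic what the instructions do to a register value; this is a sketch, not an emulator.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits, like a real register


def split_constant(value):
    """Split a 32-bit constant into its (upper 16 bits, lower 16 bits)."""
    value &= MASK32
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def lui(imm16):
    """Model of MIPS 'lui': put imm16 in the top half, zero the bottom half."""
    return ((imm16 & 0xFFFF) << 16) & MASK32


def ori(reg, imm16):
    """Model of MIPS 'ori': OR the register with a zero-extended 16-bit immediate."""
    return (reg | (imm16 & 0xFFFF)) & MASK32


def load_constant(value):
    """Build a 32-bit constant the way a lui/ori pair would (hypothetical helper)."""
    upper, lower = split_constant(value)
    reg = lui(upper)        # first instruction: upper half
    return ori(reg, lower)  # second instruction: fill in the lower half


if __name__ == "__main__":
    print(hex(load_constant(0xDEADBEEF)))  # prints 0xdeadbeef
```

Two instructions (and two immediates) for what x86 does with a single `mov` and an inline 32-bit immediate.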

One of the nice things about x86 is that while the encoding is a little wonky, the semantics are great. No imprecise exceptions, no branch delay slots, strong memory ordering, etc. The last one in particular is absolutely wonderful in this day and age of multicore processors (scumbag RISC: makes you use memory fences everywhere; makes memory fences take 300 cycles).


This is an issue with AMD's branch predictor, not with the x86 ISA. The code would work fine with a single "ret" instead; however, people found that using "rep ret" made it nanoseconds faster. It just shows how you can find optimizations in the most unexpected of places.


Good point. This is not ISA-mandated behavior, but an implementation quirk.



