AMD would be able to do DRAM on package for the lowest-wattage "ultrabook" chips, at the cost of producing a very different package for them vs. the bigger laptops that are expected to have upgradable SODIMMs. But I doubt that this "co-location" is that big a deal for performance. Whatever memory frequency and timings Apple is using are likely achievable through a regular mainboard PCB, maybe at the cost of slightly more voltage. DDR4 on desktop is overclockable to crazy levels, and that signal path goes through a lot more (CPU package - socket pins - board traces - DIMM slots - DIMMs).
> stacked modules such that the side wall of the Mac Pro is a grid of 4 or more such modules each with co-located memory like the M1 has
Quad or more package NUMA topology?? The latency would absolutely suck.
Why would latency suck? 64 cores are only beneficial for parallelizable algorithms anyway -- and the most common class of parallelizable algorithm is data-parallel ... So -- shouldn't the hardware and OS be able to present the programmer with the illusion of uniform memory, and just automatically arrange for the processing to happen on the compute resources closest to the RAM / move the memory closer to the appropriate compute resource as required?
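For reference, that placement is mostly manual today. A rough sketch of what "put the work next to the memory" looks like from userspace on Linux with libnuma -- the node choice and buffer size below are arbitrary placeholders, nothing Apple- or AMD-specific:

```c
/* Rough sketch: explicit NUMA placement with libnuma on Linux (link with -lnuma).
 * This is the manual work a "uniform memory" illusion would have to automate. */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this system\n");
        return 1;
    }

    int node = numa_max_node();       /* arbitrary example node */
    size_t len = 64UL << 20;          /* 64 MiB working set, arbitrary */

    /* Allocate pages on that node and run the calling thread there,
     * so the compute sits next to the memory it touches. */
    char *buf = numa_alloc_onnode(len, node);
    if (buf == NULL) {
        fprintf(stderr, "numa_alloc_onnode failed\n");
        return 1;
    }
    numa_run_on_node(node);

    memset(buf, 1, len);              /* touch the memory locally */
    printf("touched %zu bytes on node %d\n", len, node);

    numa_free(buf, len);
    return 0;
}
```

Without that explicit pinning, the kernel's default first-touch policy decides where pages land, which is where most of the cross-node latency complaints come from.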
Yeah, I'm no kernel developer, but I've been replying to anyone saying 'just stick n * M1 in it' that even AMD has been trying to move back toward more predictable memory access latency and fewer NUMA woes.
But in general we're moving toward even less uniform memory, with some of it living on a GPU. NUMA systems pretended all memory had the same latency, because C continues to pretend we're on a faster PDP-11, and that pretense seems like a step in the wrong direction for where high-performance computing is headed.
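To make the non-uniformity concrete: memory on a discrete GPU is a separate pool you allocate and copy into explicitly. A minimal sketch using the CUDA runtime C API (sizes arbitrary, nothing here is specific to any particular machine):

```c
/* Minimal sketch: GPU memory is its own pool, reached by explicit copies.
 * Uses the CUDA runtime API; build with nvcc. Sizes are arbitrary. */
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    size_t len = 16UL << 20;                  /* 16 MiB, arbitrary */
    float *host = (float *)malloc(len);
    float *dev = NULL;

    if (host == NULL) return 1;

    /* Device memory lives on the GPU; the CPU can't just dereference it. */
    if (cudaMalloc((void **)&dev, len) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        free(host);
        return 1;
    }

    /* Moving data between the pools is an explicit (and comparatively slow)
     * copy over PCIe or whatever interconnect the system has. */
    cudaMemcpy(dev, host, len, cudaMemcpyHostToDevice);
    cudaMemcpy(host, dev, len, cudaMemcpyDeviceToHost);

    cudaFree(dev);
    free(host);
    return 0;
}
```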