Getting bigger caches closer to the CPU seems to be the main issue here. SRAM takes up more die real estate than DRAM, so on-chip DRAM is being considered, but it doesn't look likely. Further out from the CPU, technologies faster than flash but slower than DRAM are coming along.
This article doesn't address the architectural issues of what to do with devices which look like huge, but slow, RAM. The history of multi-speed memory machines is disappointing. Devices faster than flash are too fast to handle via OS read and write, but memory-mapping the entire file system makes it too vulnerable to being clobbered. The Cell processor tried lots of memory-to-memory DMA, and was too hard to program.
Maybe hardware key/value stores, so you can put database indices in them and have instructions for accessing them.
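Purely as a sketch of what that could look like from the software side (the device and its API are hypothetical; nothing here corresponds to real hardware): the point is that an index lookup becomes a single device operation instead of pointer-chasing through slow RAM.

    # Hypothetical driver-level interface to a hardware key/value store.
    # The dict is a stand-in for on-device storage, and put/get stand in
    # for what would be single device commands or instructions.
    class HardwareKVStore:
        def __init__(self, capacity_bytes):
            self.capacity = capacity_bytes
            self._table = {}  # stand-in for on-device storage

        def put(self, key, value):
            # On real hardware this would be one store command,
            # not a software hash insert.
            self._table[key] = value

        def get(self, key):
            # Likewise one lookup command: no pointer-chasing in DRAM.
            return self._table.get(key)

    # A database index kept in the device instead of main memory:
    index = HardwareKVStore(capacity_bytes=64 << 30)
    index.put(b"user:42", b"row@0x7f3a9c00")
    print(index.get(b"user:42"))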
I'm waiting for Zen, but I won't hold my breath on the APU. Even if they integrate HBM, it's going to be a rather large die with low yields if they try to cram in a GPU comparable to a modern mid-to-high-end card (which already has a die size bigger than most modern CPUs). Still, even a moderately powerful APU would work nicely in tandem with a future RX 490 or whatever AMD puts out for their Vega cards.
"And further on, there are larger SRAMs for L3 caches, where they are possible."
I believe L3 is always SRAM, so I'm confused by this. Also, I thought larger L3 caches were only an issue in terms of cost and perhaps power budgeting for the die. Are there other issues with moving to larger L3 caches?
A reduction in price isn't enough on its own; the growth in software memory usage also has to be slower than that price decline, and it has to stay slower for a long time.
Even then, it's rough. Consider that the IBM 704 from 1954 (the first machine to run Lisp) had a memory bandwidth of around 375 kB/s. That means that, in theory, it could need up to 12 TB to run for a year without reclaiming memory, although real-world usage would surely be much less.
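Quick back-of-the-envelope check of that 12 TB figure, assuming the machine writes fresh data at full bandwidth for a whole year and nothing is ever reclaimed:

    # Worst-case memory touched in a year at ~375 kB/s
    # (the 704 moved a 36-bit word roughly every 12 microseconds).
    bandwidth_bytes_per_sec = 375_000
    seconds_per_year = 365 * 24 * 60 * 60   # about 31.5 million seconds

    total_bytes = bandwidth_bytes_per_sec * seconds_per_year
    print(f"{total_bytes / 1e12:.1f} TB")   # -> 11.8 TB, i.e. roughly 12 TB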
So, just the most important ones, then. In a sense, processes are just a form of arena allocation. They don't eliminate the need for memory management; they are memory management (among many other things).
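To make the analogy concrete, here's a minimal arena-allocator sketch (names and sizes are mine, purely illustrative): nothing allocated from the arena is freed individually, everything goes at once when the arena is torn down, which is essentially what happens to a process's memory when it exits.

    # Minimal arena allocator: objects are carved out of one buffer with a
    # bump pointer, and the only "free" is throwing the whole arena away,
    # much like the OS reclaiming a process's address space on exit.
    class Arena:
        def __init__(self, size):
            self.buf = bytearray(size)  # single backing buffer
            self.offset = 0             # bump pointer

        def alloc(self, nbytes):
            if self.offset + nbytes > len(self.buf):
                raise MemoryError("arena exhausted")
            view = memoryview(self.buf)[self.offset:self.offset + nbytes]
            self.offset += nbytes
            return view

        def release(self):
            # No per-object frees: everything is reclaimed together.
            self.buf = bytearray(0)
            self.offset = 0

    arena = Arena(1024)
    a = arena.alloc(64)
    b = arena.alloc(128)
    arena.release()  # "process exit": all allocations disappear at once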
Even if memory is cheap, it still uses power, and therefore produces heat that has to be dissipated. In many high-end server designs, power and heat are the limiting factor, not the cost.
I have always wondered if it is possible to do TSV / stacked SRAM on top of or under the CPU die.
So instead of having 8 MB sitting on the same plane, the same die area of SRAM, more likely 16-32 MB, could sit under or on top of the CPU die.
The hurdle is heat - we can do something like this in mobile SoCs, which commonly place DRAM on top of the CPU (package-on-package). But for a TDP > 10 watts, the memory layer effectively insulates the main die from whatever thermal management is used, making it unworkable. Unfortunately, this problem stands to get worse as transistors shrink, since power density will keep going up.