
> or 1 channel (128 bits wide)

The DDR4 bus is 64-bit, how can you have a 128-bit channel??

Single-channel DDR4 is still 64-bit; it's only using half of the bandwidth the CPU supports. This is why everyone is perpetually angry at laptop makers that leave a SODIMM slot unfilled or (much worse) use soldered RAM in single-channel mode.
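To put rough numbers on the "half of the bandwidth" point, here is a minimal sketch of theoretical peak bandwidth for one versus two 64-bit channels. The DDR4-3200 transfer rate (3200 MT/s) is an assumption picked for illustration, not something stated above, and `peak_bandwidth_gb_s` is a made-up helper name.

```python
def peak_bandwidth_gb_s(channels, bus_bits=64, mega_transfers_per_s=3200):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    Assumes DDR4-3200 by default; real sustained bandwidth is lower.
    """
    bytes_per_transfer = bus_bits // 8
    return channels * bytes_per_transfer * mega_transfers_per_s / 1000


single = peak_bandwidth_gb_s(1)  # one 64-bit channel populated
dual = peak_bandwidth_gb_s(2)    # both 64-bit channels populated
print(f"single channel: {single:.1f} GB/s, dual channel: {dual:.1f} GB/s")
```

Under that assumption the single-channel peak is exactly half the dual-channel peak, which is the whole complaint about unfilled slots.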

> The big win is you can have 8 cache misses in flight instead of 2

Only if your cache line is that small (16 bit) I think? Which might have downsides of its own.




> The DDR4 bus is 64-bit, how can you have a 128-bit channel??

I'm less familiar with what's normal on laptops, but most desktop chips from AMD and Intel have two 64-bit channels.

> Which might have downsides of its own.

Typically for each channel you send an address (a row and column, actually), wait for the DRAM latency, and then get a burst of transfers (one per bus cycle) with the result. So for a 16-bit-wide channel @ 3.2 GHz with a 128-byte cache line you get 64 transfers, one every 0.3125 ns, for a total of 20 ns.
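The arithmetic can be checked with a few lines of Python. This is only a sketch of the burst-time calculation for the numbers quoted here (16-bit channel, 3.2 GT/s, 128-byte line); `burst_time_ns` is a hypothetical helper name, and it deliberately leaves out the DRAM latency that comes before the burst.

```python
def burst_time_ns(line_bytes, channel_bits, giga_transfers_per_s):
    """Return (transfers, total burst time in ns) for one cache line.

    Only covers the data burst; the DRAM access latency before the
    burst is not included.
    """
    transfers = (line_bytes * 8) // channel_bits
    ns_per_transfer = 1.0 / giga_transfers_per_s
    return transfers, transfers * ns_per_transfer


transfers, total_ns = burst_time_ns(128, 16, 3.2)
print(f"{transfers} transfers, about {total_ns:.1f} ns per cache line")
```

Plugging in a 64-bit channel instead gives 16 transfers, so the narrower channel trades a longer burst for the ability to have more channels working independently.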

Each channel operates independently, so multiple channels can each have a cache miss in flight. Otherwise nobody would bother with independent channels; they would just stripe them all together.
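A deliberately crude model of why that matters: treat each independent channel as handling one miss at a time (latency, then burst) and compare eight 16-bit channels against the same pins striped into one 128-bit channel. The 50 ns DRAM latency is a hypothetical figure chosen for illustration, `lines_per_us` is a made-up name, and the model ignores bank-level parallelism and any request pipelining within a channel, so take the absolute numbers with a grain of salt.

```python
def lines_per_us(misses_in_flight, burst_ns, dram_latency_ns):
    """Cache lines completed per microsecond in a toy model where each
    independent channel serves one miss at a time: latency, then burst."""
    return misses_in_flight * 1000.0 / (dram_latency_ns + burst_ns)


LATENCY_NS = 50.0  # hypothetical DRAM latency, not a measured value

# Eight independent 16-bit channels: 64 transfers per 128-byte line (20 ns burst).
independent = lines_per_us(8, 20.0, LATENCY_NS)

# Same width striped into one 128-bit channel: 8 transfers per line (2.5 ns burst),
# but only one miss in flight in this model.
striped = lines_per_us(1, 2.5, LATENCY_NS)

print(f"independent: {independent:.1f} lines/us, striped: {striped:.1f} lines/us")
```

In this model the striped setup finishes each burst faster, but the latency dominates, so having more misses in flight wins on throughput.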

Here's a graph of cache line throughput vs number of threads.

https://github.com/spikebike/pstream/blob/master/png/apple-m...

So going from 1 to 2 threads you see an increase in throughput; the multiple channels are helping. 4 threads is the same as 2, so maybe the L2 cache is a bottleneck. But 8 threads is clearly better than 4.


> two 64 bit channels

Yeah, I'm saying you can't magically unify them into a single 128-bit one. If you only use a single channel, the other one is unused.


It's pretty common for hardware to support both. On the Zen 1 Epycs, for instance, some software preferred the consistent latency of striped memory over the NUMA-aware latency of separate channels, where the closer DIMMs have lower latency and the farther DIMMs have higher latency.

I've seen similar on Intel servers, but not recently. However, this isn't typically something you can do at runtime, only at boot time, at least as far as I've seen.



