With `likwid-bench -i 100 -t load -w M0:5GB:1 -w M1:5GB:1 -w M2:5GB:1 -w M3:5GB:1 -w M4:5GB:1 -w M5:5GB:1 -w M6:5GB:1 -w M7:5GB:1` we get 187976.60
Obvious there's a bottleneck either going on somewhere - at 33.5GB/s per channel, that would get close to 400GB/s, what you'd expect, but the reality is that it doesn't get to half of that. Bad MC? Bottleneck w/ the MB? Hard to tell, not sure that without swapping hardware there's much more that can be done to diagnose things.
I see. I am out of other ideas besides trying to play with BIOS tweaks wrt memory and CPU. I can see that there are plenty of them, for worse or for the better.
At a quick glance, some of them look interesting such as "Workload tuning" where you can pick different profiles. There is "memory throughput intensive" profile. You can also try to explicitly disable DIMMs that are not in use given you use only half of them. I wouldn't hold my breath that any of these will make a big difference but you can give it try.
Another idea: AFAICS there have been a few memory-bw zen-related bugs reported to likwid and, in particular, https://github.com/RRZE-HPC/likwid/issues/535 may suggest that you could be hitting a similar bug but with another CPU series.
The bug report used AMDuProf to confirm that the bandwidth is actually ~2x than what likwid reported. You could try the same.
With `likwid-bench -i 100 -t load -w M0:5GB:1 -w M1:5GB:1 -w M2:5GB:1 -w M3:5GB:1 -w M4:5GB:1 -w M5:5GB:1 -w M6:5GB:1 -w M7:5GB:1` we get 187976.60
Obvious there's a bottleneck either going on somewhere - at 33.5GB/s per channel, that would get close to 400GB/s, what you'd expect, but the reality is that it doesn't get to half of that. Bad MC? Bottleneck w/ the MB? Hard to tell, not sure that without swapping hardware there's much more that can be done to diagnose things.