also bear in mind that AMD's standards for these "challenges" have always involved some "funny math". Take their previous 25x20 goal: they considered a 5.02x average performance gain (10x in CB R15, 2.5x in 3DMark 11) at iso-power (same TDP) to be a "32x efficiency gain", because they divided it by idle power or some shit like that.
But a 5x average performance gain at the same TDP doesn't mean you're doing 32x as much computation for the same amount of power. Except in AMD marketing world. But it sounds good!
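If you back out their methodology (my reconstruction, so treat the ~6.3x "typical energy" drop as a back-solved assumption, not an AMD-published figure), the arithmetic looks roughly like:

    import math

    # Benchmark gains cited above; the geometric mean is how you get ~5x overall.
    perf_gain = math.sqrt(10.0 * 2.5)   # CB R15 x 3DMark 11 -> ~5x (AMD quoted 5.02x)

    # AMD then divides by an idle-heavy "typical use" energy figure instead of
    # the iso TDP. The ~6.3x is back-solved from their headline, not published.
    typical_energy_drop = 6.3
    print(f"{perf_gain:.1f}x perf -> {perf_gain * typical_energy_drop:.1f}x 'efficiency'")
    # 5.0x perf -> 31.5x 'efficiency', i.e. the "32x"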
Like, even bearing in mind that that's coming from a Bulldozer derivative on GF 32nm (which is probably closer to Intel 40nm), a 5x gain in actual computational efficiency is still a lot, and the gain is even bigger in pure CPU workloads, but AMD marketing can't help but stretch the truth with these "challenges".
https://www.anandtech.com/show/15881/amd-succeeds-in-its-25x...
To be fair, idle power is really important for a lot of use cases.
In a compute-focused cloud environment you might be able to keep most of your hardware pegged most of the time, but outside of that, CPUs spend most of their time either far below 100% utilization or totally idle.
To actually calculate real efficiency gains, though, you'd probably have to measure power usage under various scenarios, not just whatever weird math they did here.
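Something like this, say (every number below is invented, just to show the shape of the measurement):

    # Compare daily energy over a representative duty cycle instead of using a
    # single synthetic ratio. Tuples: (hours/day, old_chip_watts, new_chip_watts)
    scenarios = [
        (18, 30.0, 5.0),     # idle
        (4,  45.0, 15.0),    # light load
        (2,  95.0, 95.0),    # full load at iso-TDP
    ]
    old_wh = sum(h * old for h, old, new in scenarios)   # 910 Wh/day
    new_wh = sum(h * new for h, old, new in scenarios)   # 340 Wh/day
    perf_gain = 5.0   # new chip does 5x the work during its loaded hours
    print(f"{perf_gain * old_wh / new_wh:.1f}x work per joule")   # ~13.4x

And note the answer moves with whatever duty cycle you pick, which is exactly why a single headline multiplier is suspect.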
That's not really being fair, because the metric is presented to look like traditional perf/watt. And idle power is not so important in supercomputers and cloud compute nodes, which get optimized to keep them busy at all costs. But even in cases where it is important, averaging between the two might be reasonable; multiplying the loaded efficiency by the idle efficiency increase is ludicrous. A meaningless unit.
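To put a number on how meaningless (toy model, all figures assumed): the multiplied figure only falls out in the degenerate limit where essentially all of the machine's energy goes to idling.

    # Fix the loaded perf gain at 5x (iso-TDP) and the idle power drop at 6.3x,
    # then sweep what fraction of the old chip's energy was spent idle.
    perf_gain, idle_drop = 5.0, 6.3
    for idle_frac in (0.0, 0.5, 0.9, 0.99):
        new_energy = (1.0 - idle_frac) + idle_frac / idle_drop  # old energy = 1
        print(f"{idle_frac:.0%} idle energy -> {perf_gain / new_energy:.1f}x")
    # 0% -> 5.0x, 50% -> 8.6x, 90% -> 20.6x, 99% -> 29.9x; the full
    # 5 * 6.3 ~= 31.5x needs a box that effectively never does any work.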
I can't see any possible charitable explanation for this stupidity. MBAs and marketing department run amok.
Yep, 100% agree with you - see my last sentence. Just trying to clarify that the issue here isn't that idle power consumption is unimportant; it's the nonsense math.
Wow, that's stupid; I didn't look that closely. So it's really a 5x perf/watt improvement. I assume it will be the same deal here: around a 5-6x perf/watt improvement, which does make more sense. FP16 should already be pretty well optimized on GPUs today, so 30x would be a huge stretch, or else require specific fixed-function units.
It's an odd coincidence (there's no reason the numbers would be related; there's no idle-power factor here or anything), but 5x also happens to be about the expected gain from NVIDIA's tensor core implementation in real-world code, AFAIK. Sure, they advertise a much higher number, but that's a microbenchmark looking at just that specific bit of the code, not the program as a whole.
It's possible that the implication here is similar: AMD ships a tensor accelerator or something, hits "30x" in the microbenchmark, but you end up with speedups similar to NVIDIA's tensor accelerator implementation in practice.
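A quick Amdahl's-law sanity check shows how an advertised microbenchmark number deflates in a whole program (the runtime fractions and unit speedups here are my assumptions, not NVIDIA's figures):

    # Speed up the fraction f of runtime that is matmul by s, and the program
    # as a whole only gains 1 / ((1 - f) + f / s).
    def overall_speedup(f: float, s: float) -> float:
        return 1.0 / ((1.0 - f) + f / s)

    print(f"{overall_speedup(0.85, 8.0):.1f}x")    # "8x" unit, 85% of runtime -> ~3.9x
    print(f"{overall_speedup(0.90, 16.0):.1f}x")   # "16x" unit, 90% of runtime -> ~6.4x

Either way you land in the same ~4-6x ballpark for the end-to-end run.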
I've seen tensor cores really shine in... tensor operations. If your workload can be expressed as convolutions, and it matches the dimension and batching requirements of the tensor cores, there's a world of wild performance out there...
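For a concrete taste of the dimension caveat (the multiple-of-8 rule is the commonly cited requirement for FP16 tensor-core GEMM paths, e.g. in cuBLAS; the helper itself is hypothetical):

    # Hypothetical helper: round a layer dimension up so an FP16 GEMM can hit
    # the tensor-core path (M, N, K as multiples of 8 is the usual rule of thumb).
    def pad_to_multiple(dim: int, multiple: int = 8) -> int:
        return ((dim + multiple - 1) // multiple) * multiple

    print(pad_to_multiple(1024))   # already aligned -> 1024
    print(pad_to_multiple(1003))   # 1003 -> 1008; unpadded, it falls off the fast path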