"SLIDE doesn’t need GPUs because it takes a fundamentally different approach to deep learning. The standard “back-propagation” training technique for deep neural networks requires matrix multiplication, an ideal workload for GPUs. With SLIDE, Shrivastava, Chen and Medini turned neural network training into a search problem that could instead be solved with hash tables"
This seems to rely on an adaptive access pattern that GPUs don't currently support but could be made to. I suspect the next generation of TPUs will add support for it, with GPUs following close behind.
No, their approach changes the fundamental access pattern into something anathema to GPU and TPU architectures.
In ELI5 or layman's terms: current GPU/TPU accelerators are specialized for doing very regular, predictable calculations very fast. In deep learning, many of those calculations are unnecessary, like multiplying by zero. This approach exploits that and performs only the minimal necessary calculations, but doing so makes the workload very irregular and unpredictable. Regular CPUs are better suited to that kind of irregular work, because most general-purpose software behaves the same way.
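To make the contrast concrete, here is a rough sketch (not the paper's actual C++/OpenMP implementation), where active_ids is the candidate set a hash-table lookup like the one above would return:

```python
import numpy as np

def dense_forward(W, x):
    # What a GPU is great at: one big, regular matrix multiply,
    # even though most outputs may be zeroed out by the activation.
    return np.maximum(W @ x, 0.0)

def sparse_forward(W, x, active_ids):
    # The SLIDE-style alternative: compute only the few neurons the
    # hash lookup flagged as likely-active. The gather W[active_ids]
    # is an irregular, data-dependent access pattern -- cheap on a CPU
    # with large caches, hostile to GPU/TPU pipelines built around
    # uniform, predictable memory access.
    out = np.zeros(W.shape[0])
    out[active_ids] = np.maximum(W[active_ids] @ x, 0.0)
    return out
```

The win comes entirely from skipping work: if only a few percent of neurons are active per input, the sparse path does a few percent of the FLOPs, at the cost of exactly the irregularity described above.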
"SLIDE doesn’t need GPUs because it takes a fundamentally different approach to deep learning. The standard “back-propagation” training technique for deep neural networks requires matrix multiplication, an ideal workload for GPUs. With SLIDE, Shrivastava, Chen and Medini turned neural network training into a search problem that could instead be solved with hash tables"
https://insidehpc.com/2020/03/slide-algorithm-for-training-d...