OK, I was obviously oversimplifying, but my point is that since we can only speculate, it's clear that when you know the specific algorithms, math operations, memory layouts, and applications you want to optimize for, you can build dedicated chips that do exactly that, and do it quickly. The fact that bitcoin miners are all dedicated chips and run circles around GPUs demonstrates exactly this.
Furthermore, the fact that ML can be error tolerant means you also get to trade accuracy for speed or energy efficiency in certain floating point operations. NVIDIA doesn't get to make that trade-off in its general-purpose linear algebra support.
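To make that concrete, here's a rough sketch (assuming NumPy; purely illustrative, not how any actual accelerator implements it) of what trading precision for speed/energy looks like: do the same matrix multiply in float16 instead of float32 and measure how much accuracy you gave up.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((512, 512)).astype(np.float32)
b = rng.standard_normal((512, 512)).astype(np.float32)

# Reference result at full float32 precision.
exact = a @ b

# Same multiply with inputs rounded to float16 (half precision).
approx = (a.astype(np.float16) @ b.astype(np.float16)).astype(np.float32)

# Relative error introduced by the lower-precision arithmetic.
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(f"relative error from float16 matmul: {rel_err:.2e}")
```

The error this prints is small enough that a neural net typically doesn't care, but it would be unacceptable for, say, a scientific solver, which is exactly why a chip built only for ML gets to exploit it.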
Bitcoin mining is an extremely well-defined task compared to machine learning. It remains to be seen how general these TPUs are in practice, i.e. whether they will support the neural network architectures that are common two years from now.