I don't think GP was implying that brains are the optimum solution. I think you can interpret GP's comments like this: if our brains are more efficient than LLMs, then clearly LLMs aren't optimally efficient. We have at least one data point showing that better efficiency is possible, even if we don't know what the optimal approach is.
I agree. Spiking neural networks are usually mentioned in this context, but there is no hardware ecosystem behind them that can compete with Nvidia and CUDA.
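For anyone unfamiliar with SNNs: the basic unit is usually a leaky integrate-and-fire neuron, which communicates in sparse binary spikes rather than dense floating-point activations. A minimal illustrative sketch (the constants and the `simulate_lif` name are mine, not from any SNN library):

```python
import numpy as np

def simulate_lif(current, dt=1e-3, tau=20e-3, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0):
    """Leaky integrate-and-fire neuron: the membrane potential leaks
    toward rest, integrates the input, and emits a binary spike when
    it crosses threshold. Constants are illustrative, not biological."""
    v = v_rest
    spikes = []
    for i_in in current:
        # Euler step of  dv/dt = (-(v - v_rest) + i_in) / tau
        v += dt * (-(v - v_rest) + i_in) / tau
        if v >= v_thresh:    # threshold crossing -> spike
            spikes.append(1)
            v = v_reset      # reset after spiking
        else:
            spikes.append(0)
    return np.array(spikes)

# A constant supra-threshold input yields a regular, sparse spike train.
spikes = simulate_lif(np.full(200, 1.5))
print(spikes.sum(), "spikes over 200 steps")
```

The hardware point: activity is event-driven and mostly zeros, so in principle you only pay for the spikes. That's exactly the access pattern GPUs are bad at, and why SNNs keep waiting on dedicated silicon (Intel's Loihi being the usual research example).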
New HW for an unproven SW architecture is never going to happen. The SW has to work first, on existing hardware, and demonstrate better performance. Of course, as with the original deep neural net work, it took computers getting sufficiently advanced to demonstrate that was possible; a different SW architecture would have to be dramatically more efficient before anyone builds silicon for it. Moreover, HW and SW evolve in tandem - HW takes existing SW and tries to optimize it (e.g. by adding an abstraction layer), or SW tries to leverage existing HW to run a new architecture faster. A brand-new HW/SW combo seems unlikely given the cost of bringing HW to market. If AI-assisted chip design ever delivers the speedups Jeff Dean expects, the cost of prototyping might come down enough to make these kinds of bets worth trying.
Nvidia has a big lead, and hardware is capital-intensive. I guess an alternative would make sense in the battery-powered regime, like robotics, where Nvidia's power-hungry machines are at a disadvantage. This is how ARM took on Intel: starting from the low-power niches the incumbent couldn't serve.