I call this “the bitter cycle”, after Rich Sutton's famous essay “The Bitter Lesson”.
1. Someone finds a new approach, often based on intuition or understanding.
2. People throw more data at it, claiming that only data matters.
3. The new method eventually saturates, and everybody starts talking about a new AI winter.
We had this with perceptrons, conv nets, and RNNs. Now we see it with transformers.
My guess is that the next iteration will be liquid networks or KANs (Kolmogorov–Arnold networks), depending on which one we figure out how to train efficiently first.
The good thing is that people have been working for the last 20 years to build an understanding of why these things work, so the period between cycles keeps getting shorter.