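Fwiw, I used to think this way too, but LLMs are more RAG-like internally than we initially realised. "Attention Is All You Need" ~= RAG is one big attention mechanism: attention is a soft lookup over keys and values, and retrieval is a harder version of the same lookup over an external store. Models also have the reversal curse, memorisation issues, etc., which is what you'd expect if facts are being looked up rather than fully reasoned over. I personally think of LLMs as a kind of decomposed RAG. Check out DeepMind's RETRO paper for an even closer integration.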
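To make the "attention ~= retrieval" point concrete, here's a toy sketch in plain Python. It is not how any real model or RAG stack is implemented; the function names (`soft_lookup`, `hard_lookup`), the tiny 2-d vectors, and the `temperature` knob are all made up for illustration. It only shows that a softmax-weighted read over key/value pairs collapses to a top-1 retrieval as the softmax gets sharper.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def soft_lookup(query, keys, values, temperature=1.0):
    """Attention-style read: softmax-weighted average of all values."""
    weights = softmax([dot(query, k) / temperature for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def hard_lookup(query, keys, values):
    """Retrieval-style read: value of the single best-matching key."""
    best = max(range(len(keys)), key=lambda i: dot(query, keys[i]))
    return values[best]

# Hypothetical toy "memory": three key/value pairs.
keys = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 0.1]

soft = soft_lookup(query, keys, values)                     # blends all three values
sharp = soft_lookup(query, keys, values, temperature=0.01)  # nearly one-hot weights
hard = hard_lookup(query, keys, values)                     # top-1 retrieval

print(soft, sharp, hard)
```

With a low temperature the soft read and the hard read land on essentially the same value, which is the sense in which I'd call retrieval a limiting case of attention.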