Regardless of personal opinions about his style, Marcus has been proven correct on several fronts, including the diminishing returns of scaling laws and the lack of true reasoning (out-of-distribution generalization) in LLM-type AI.
These are issues the industry initially denied, only to acknowledge them years later as their "own recent discoveries" as soon as they had something new to sell (chain-of-thought prompting, RL-based LLMs, and so on).
Care to explain further? He has made far more claims about the limitations of LLMs that have since been proven false.
> diminishing returns of scaling laws
This was so obvious it didn't need mentioning. What Gary really missed is that all you need are more axes to scale over, and you can still make significant improvements. Think of where we are now versus 2023.
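The "diminishing returns" part was baked into the published scaling-law form itself. A toy sketch below makes that concrete: with the commonly cited power law L(N) = a·N^(-b) + c, the irreducible term c guarantees each extra 10x of parameters buys less. The coefficients here are made up purely for illustration, not fit to any real model.

```python
import numpy as np

# Toy illustration only: these constants are invented, not fit to any
# real model. The point is structural: the power-law form
# L(N) = a * N**(-b) + c has an irreducible floor c, so each further
# 10x of parameters yields a smaller absolute loss reduction.
a, b, c = 10.0, 0.1, 1.7

params = np.logspace(8, 13, 6)    # 1e8 .. 1e13 parameters
loss = a * params ** (-b) + c
gains = loss[:-1] - loss[1:]      # loss improvement per 10x of scale

for n, g in zip(params[:-1], gains):
    print(f"{n:.0e} -> {n*10:.0e} params: loss drops by {g:.3f}")
```

Each decade of scale prints a smaller drop, which is exactly why labs moved to scaling other axes (data quality, inference-time compute, RL post-training) instead of parameters alone.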
> lack of true reasoning (out of distribution generalizability) in LLM-type AI
To my understanding, this is one he got wrong. LLMs do have internal representations, exactly the kind he predicted they wouldn't have.
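The standard evidence for this comes from probing: train a simple linear classifier on a model's hidden states and check whether a concept is linearly decodable from them. Here's a minimal sketch of the technique; synthetic vectors stand in for real hidden states so it's self-contained, and the planted "concept direction" is fabricated for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Sketch of a linear probe, the standard test for internal representations.
# Real probing work extracts hidden states from an actual LLM; here we
# generate synthetic states with a planted concept direction instead.
rng = np.random.default_rng(0)
d = 64                                  # hidden-state dimensionality
direction = rng.normal(size=d)          # hypothetical "concept" direction

labels = rng.integers(0, 2, size=1000)  # binary concept label
hidden = rng.normal(size=(1000, d)) + np.outer(labels - 0.5, direction)

probe = LogisticRegression(max_iter=1000).fit(hidden[:800], labels[:800])
print("probe accuracy:", probe.score(hidden[800:], labels[800:]))
# High held-out accuracy means the concept is linearly decodable from
# the states, i.e. the model encodes it internally.
```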
> These are issues that the industry initially denied, only to (years) later acknowledge them
The industry denies all its limitations for the sake of hype; the academic literature has them all listed plain as day. Gary isn't wrong because he contradicted the hype of the tech labs; he's wrong because his short-term predictions were proven false in the very literature he used to publish in. All of this was in service of peddling neurosymbolic architectures, which were quickly displaced by tool use.
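For what "tool use" means concretely: the model emits a structured call, the runtime executes it, and the result is fed back into the context, giving you symbolic computation without a neurosymbolic architecture. The sketch below is hypothetical throughout; `fake_model` is a stub standing in for a real LLM API call, and the tool registry and call format are invented to mirror the common pattern.

```python
import json

# Minimal tool-use loop. `fake_model` is a stand-in for a real LLM call;
# the registry and call format are hypothetical but follow the usual
# pattern: model emits a structured call, runtime executes it, result
# goes back into the message history.
TOOLS = {"add": lambda a, b: a + b}

def fake_model(messages):
    # A real system would query an LLM here; this stub asks for a tool
    # once, then answers using the tool result.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"answer": f"The result is {messages[-1]['content']}"}

messages = [{"role": "user", "content": "What is 2 + 3?"}]
while True:
    out = fake_model(messages)
    if "answer" in out:
        print(out["answer"])
        break
    result = TOOLS[out["tool"]](**out["args"])  # execute the tool call
    messages.append({"role": "tool", "content": json.dumps(result)})
```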
The hype is coming from startups, big-tech press releases, and grifters with a vested interest in raising a ton of money from VCs and stakeholders, just as with blockchain and the metaverse. The difference is that deep learning sits on a large, legitimate body of research that has been there for many years and remains (somewhat) healthy.
I would argue that the claim "LLMs will never be able to do this" is crazy without solid mathematical proof, and risky even with significant empirical evidence. Unfortunately, several professionals have resorted to exactly this kind of language.