
Regardless of personal opinions about his style, Marcus has been proven correct on several fronts, including the diminishing returns of scaling laws and the lack of true reasoning (out of distribution generalizability) in LLM-type AI.

These are issues that the industry initially denied, only to acknowledge them years later as their "own recent discoveries" as soon as they had something new to sell (chain-of-thought prompting, RL-based LLMs, etc.).





Care to explain further? He has made far more claims about the limitations of LLMs that have been proven false.

> diminishing returns of scaling laws

This was so obvious it didn't need mentioning. And what Gary really missed is that all you need are more axes to scale over and you can still make significant improvements. Think of where we are now vs 2023.

> lack of true reasoning (out of distribution generalizability) in LLM-type AI

To my understanding, this is one that he has gotten wrong. LLMs do have internal representations, exactly the kind that he predicted they didn't have.

> These are issues that the industry initially denied, only to (years) later acknowledge them

The industry denies all their limitations for hype. The academic literature has all of them listed plain as day. Gary isn't wrong because he's contradicted the hype of the tech labs, he's wrong because his short-term predictions were proven false in the literature he used to publish in. This was all in his efforts to peddle neurosymbolic architectures which were quickly replaced by tool use.


I’m just trying to find where all this hype is

I think the hype is coming from people who have no idea what is going on and just feeding on each other

Much like blockchain, the metaverse, or whatever else, which were dominated by know-nothings who spoke confidently to people even dumber than them.

No professionals with real experience or research credentials have made any crazy claims.


The hype is coming from startups, big tech press releases, and grifters who have a vested interest in raising a ton of money from VCs and stakeholders, same as blockchain and metaverse. The difference is that there is a large legitimate body of research underneath deep learning that has been there for many years and remains (somewhat) healthy.

I would argue that the claim of "LLMs will never be able to do this" is crazy without solid mathematical proof, and is risky even with significant empirical evidence. Unfortunately, several professionals have resorted to this language.



