
It's not even been 2 years, and you think things are coming to a halt?


Yes. The models require training data, and they have already been fed the internet.

More and more of the content generated since then is itself LLM-generated and useless as training data.

The models get worse, not better, when fed their own output, and right now they are out of training data.
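(A toy sketch of that effect, purely illustrative and assuming nothing beyond NumPy: fit a distribution to some data, resample the next generation's training data from the fit, repeat. The fitted spread tends to drift toward zero, which is the simplest analogue of the collapse being claimed here; the names and parameters are made up for the example, not taken from any real training pipeline.)

    import numpy as np

    # Toy illustration of "model collapse": a model fit to its own samples
    # progressively loses the spread (tails) of the original human data.
    # Simplified analogue only, not a claim about any specific LLM.
    rng = np.random.default_rng(0)
    data = rng.normal(loc=0.0, scale=1.0, size=20)  # generation 0: "human" data

    for gen in range(1, 51):
        mu, sigma = data.mean(), data.std()              # "train": fit a Gaussian
        data = rng.normal(loc=mu, scale=sigma, size=20)  # "generate": next generation's corpus
        if gen % 10 == 0:
            print(f"gen {gen:2d}: fitted std = {sigma:.3f}")  # typically shrinks over generations

Real pipelines are obviously far more complicated, but that is the direction of the effect being described.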

This is why Reddit just went profitable: AI companies buy its text to train their models because it is at least somewhat human-written.

Of course, even Reddit is crawling with LLM-generated text, so yes. It is coming to a halt.


Data is not the only factor. Architecture improvements, data filtering, etc. matter too.


I know for a fact they are, because the rate _and_ quality of improvement are diminishing exponentially. I keep a close eye on this field as part of my job.



