
> I don't know what you mean by timing (relative to what?) or "simple indexing" (they scanned the complete contents of books), but I am, and was already aware, of the wiki article and the role of recaptcha.

Timing as in: it started in 2004, when the most advanced AI most people used was a spam filter, so it wasn't seen as a training issue (in the way that LLM training is) *at the time*.

As for training rights, I agree with you: there's no clarity on how such data could be used *today* by the people who have it, especially as the arguments in favour of LLM training are often made by comparison to search engine indexing.



