Also, it isn't unique to Sam. They all use copyrighted material as training data. By "all", I mean all LLMs (to my knowledge). They don't do it intentionally, but it gets scooped up with everything else.
Hmmm, just thinking... Adam D'Angelo is one of the board members of OpenAI. He has the entire corpus of Quora content to use as training data, i.e. the rights to it are his. But I doubt that only Quora content was used by OpenAI during the past 8 years or so since it was founded! And the content on Quora isn't that great anyway...