I tell you what nialv7 I feel ya. Not only that, it makes me wonder how many great things have gone unnoticed. Partly why I'm glued to HN is bc how on earth do you find these gems otherwise??
Same here. Its biased sampling, also my prompt had generalized from GPT4 to Google’s own model - Bard. And was directly sampling, without having to go through the state when the model produces a repeating token. At least back then.
Should be a good food for the lawsuits. Some lawsuits were based on a hallucinated acknowledgement of the model that it used some particular materials, and this was clearly nonsense. Here, this is a bit more solid ground, provided that copyrighted material can be sampled and an owner would be interested in a class action.
> Screw those journals with their peer-reviewed, yet irreproducible, papers without code or data.
Seriously! I've spent so many years exploring for solutions, finding them, but only getting a description and images of the framework they boast about. For anyone thinking it should be incumbent on me to turn that into code again, screw you. If their results are what they claim, there is no god damn reason why I should be expected to recreate the code they already made. If I were a major journal, I'd tell their asses, "No code. No data. No published paper bitches!". It really makes me question what their goal is. Apparently, it's not to further their field of research by making the tools their so proud of available for others. So what is it?
By the way, one way to frequently find the code is to find the names on the paper of the 3 most published researchers, go to their homepage, and you'll typically find them eagerly making their code and data available. It frequently won't be their university page, either. For years, it was always some sort Google Sites page. I guess to make sure they maintain a homepage that won't be taken down if they switch universities.
To be fair, they did write things down. It’s more a matter of explaining why GPT was behaving the way it was (ie, because it was regurgitating its training data). Also, I’d personally respect a blog post just as a much as a peer reviewed journal article on something like this where it’s pretty easy to reproduce yourself, not to mention that I and I’m sure many others have observed this behaviour before.
https://www.reddit.com/r/ChatGPT/comments/156aaea/interestin...