This exact scenario is why our company is so wary of putting AI into production without making it completely clear to users that the results could be wrong. But if there’s even a chance that it could be wrong, why are we offering it to the user? How much due diligence does the user need to do? Do the benefits outweigh the cons?
The AI should be required to cite its sources. But that won’t work for LLMs, though, since their output is just words strung together in a statistically probable way.
In this case the problem is caused by citing sources - Google finds a list of links using their existing search tech and then produces a response using RAG based on the search results.
That’s where the problem is - originally it cites a Reddit post where someone recommends it as a joke, and then Business Insider, which picked the joke up from the original post through citogenesis.
Pure LLMs (no RAG) don’t make this mistake - Claude will tell you it’s a bad idea and that it will taste bad.
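Roughly, the pipeline being described looks like the sketch below. The `search_web` and `llm_complete` functions are made-up stand-ins, not real APIs; the point is just that whatever the search step returns, joke posts included, gets pasted into the prompt the model is told to ground its answer in.

```python
# Minimal sketch of the search-then-RAG pattern described above.
# search_web() and llm_complete() are hypothetical stand-ins, not real APIs.
from dataclasses import dataclass


@dataclass
class Snippet:
    url: str
    text: str


def search_web(query: str, top_k: int = 5) -> list[Snippet]:
    # Stand-in for the ordinary search step. A joke forum post ranks
    # alongside everything else and is returned like any other result.
    return [
        Snippet("reddit.com/r/.../joke_comment", "(sarcastic advice posted as a joke)"),
        Snippet("businessinsider.com/...", "(article repeating the joke as a quirky tip)"),
    ][:top_k]


def llm_complete(prompt: str) -> str:
    # Stand-in for the generation step; a real system calls an LLM API here.
    return "(answer grounded in, and citing, the retrieved snippets)"


def answer_with_rag(query: str) -> str:
    snippets = search_web(query)
    context = "\n\n".join(f"[{i + 1}] {s.text} ({s.url})" for i, s in enumerate(snippets))
    prompt = (
        "Answer the question using only the sources below, and cite them.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    # The model is explicitly told to stay grounded in the retrieved context,
    # so if the context is a joke, the joke comes back out as a cited answer.
    return llm_complete(prompt)


print(answer_with_rag("example question"))
```

Nothing in that loop checks whether the retrieved source was serious, which is exactly the gap.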
> But if there’s even a chance that it could be wrong, why are we offering it to the user?
Corporations push code written by fresh junior devs into production every day, breaking stuff that could cost them tens of thousands. Do they care? On paper, very much so; in practice, they dgaf.