> LLMs also love to double down on solutions that don't work.
“Often wrong but never in doubt” is not proprietary to LLMs. It’s off-putting and we want them to be correct and to have humility when they’re wrong. But we should remember LLMs are trained on work created by people, and many of those people have built successful careers being exceedingly confident in solutions that don’t work.
When it comes to programming. Tell me you don't know so I can do something else. I ended up just refactoring my UX to work around it. In this case it's a personal prototype so it's not a big deal.
That is definitely an issue with many LLMs. I've had limited success including instructions like "Don't invent facts" in the system prompt and more success saying "that was not correct. Please answer again and check to ensure your code works before giving it to me" within the context of chats. More success still comes from requesting second opinions from a different model -- e.g. asking Claude's opinion of Qwen's solution.
To the other point, not admitting to gaps in knowledge or experience is also something that people do all the time. "I copied & pasted that from the top answer in Stack Overflow so it must be correct!" is a direct analog.
So now you have an overconfident human using an overconfident tool, both of which will end up coding themselves into a corner? Compilers at least, for the most part, offer very definitive feedback that act as guard rails to those overconfident humans.
Also, let's not forget LLMs are a product of the internet and anonymity. Human interaction on the internet is significantly different from in person interaction, where typically people are more humble and less overconfident. If someone at my office acted like some overconfident SO/reddit/HN users I would probably avoid them like the plague.
A compiler in the mix is very helpful. That and other sanity checks wielded by a skilled engineer doing code reviews can provide valuable feedback to other developers and to LLMs. The knowledgeable human in the loop makes the coding process and final products so much better. Two LLMs with tool usage capabilities reviewing the code isn't as good today but is available today.
The LLMs overconfidence is based on it spitting out the most-probable tokens based on its training data and your prompt. When LLMs learn real hubris from actual anonymous internet jackholes, we will have made significant progress toward AGI.
“Often wrong but never in doubt” is not proprietary to LLMs. It’s off-putting and we want them to be correct and to have humility when they’re wrong. But we should remember LLMs are trained on work created by people, and many of those people have built successful careers being exceedingly confident in solutions that don’t work.