This take seems fundamentally wrong to me, starting from its opening premise.
We use humans for serious contexts & mission critical tasks all the time and they're decidedly fallible and their minds are basically black boxes too. Surgeons, pilots, programmers etc.
I get the desire for reproducible certainty and verification like classic programming and why a security researcher might push for that ideal, but it's not actually a requirement for real world use.
Because human minds are fallible black boxes, we have developed a wide variety of tools that exist outside our minds, like spoken language, written language, law, standard operating procedures, math, scientific knowledge, etc.
What does it look like for fallible human minds to work on engineering an airplane? Things are calculated, recorded, checked, tested. People do not just sit there thinking and then spitting out their best guess.
Even if we suppose that LLMs work similarly to the human mind (a huge supposition!), LLMs still do not do their work like teams of humans. An LLM dreams and guesses, and it still falls to humans to check and verify.
Rigorous human work is actually a highly social activity. People interact using formal methods and that is what produces reliable results. Using an LLM as one of the social nodes is fine, but this article is about the typical use of software, which is to reliably encode those formal methods between humans. And LLMs don’t work that way.
Basically, we can’t have it both ways. If an LLM thinks like a human, then we should not think of it as a software tool like curl or grep or Linux or Apple Photos. Tools that we expect (and need) to work the exact same way every time.
> Because human minds are fallible black boxes, we have developed a wide variety of tools that exist outside our minds, like spoken language, written language, law, standard operating procedures, math, scientific knowledge, etc.
Standard operating procedures are great, but simplify them down to checklists. Don't ever forget checklists, which have proven vital for pilots and surgeons alike. Looking at the WHO Surgical Safety Checklist you might think "that's basic stuff", but apparently it is necessary and it works: https://www.who.int/teams/integrated-health-services/patient...
> What does it look like for fallible human minds to work on engineering an airplane? Things are calculated, recorded, checked, tested. People do not just sit there thinking and then spitting out their best guess.
People used to do exactly that: build from their best guesses. The result was massively overbuilt structures, some of which are still with us hundreds of years later. The result was also underbuilt structures, which tended to collapse and maybe kill people. They are no longer around.
All of the science, math, process, and standards in modern engineering are the solution humans came up with because our guesses aren't good enough. LLMs will need the same if they are to be relied upon.
This is a fantastic and thought-provoking response.
Thinking of humans as fallible systems and humanity and its progress as a self-correcting distributed computation / construction system is going to stick with me for a long time.
Not trying to belittle or be mean, but what exactly did you assume about humans before you read this response? I find it fascinating that apparently a lot of people don't think of humans as stochastic, non-deterministic black boxes.
Heck, one of the defining qualities of humans is that not only are we unpredictable and fundamentally unknowable to other intelligences (even other humans!), we also engage in sophisticated subterfuge and lying to manipulate other intelligences (even other humans!), often very convincingly.
In fact, I would propose that our society is fundamentally defined and shaped by our ability and willingness to hide, deceive, and use mind tricks to get what our little monkey brains want over the next couple hours or days.
I knew that they worked this way, but the conciseness of the response and clean analogy to systems I know and work with all day was just very satisfying.
For example, there was probably still 10-20% of my mind that assumed stubbornness and ignorance were the reason things go slowly most of the time, but I'm re-evaluating that, even though I knew that delays and double-checking were inherent features of a business and its processes. Re-framing those delays as "evolved responses 100% of the time" rather than "10% mistrust, 10% ignorance, 10% ..." is just a more positive way of thinking about human-driven processes.
I totally understand this rationally if you sit down and walk me through the steps.
But there are a lot of reasons - ego, fear of losing... that core identity, etc. - that can easily come back and bite you.
I'm not sure if this is the same as meditation and ego death or whatever. I find that even if you go down the spiritual route, you also run into the same issues.
People in philosophy also argue for things like rational actors, self-coherence, etc.
And hey, even in this current moment you were able to type out a coherent thought, right?
I've noticed more and more that humans behave a lot like LLMs. In the sense that it's really, really hard to observe my true internal state - I can only try to find patterns and guess at shit. Every theory I've tried applying to myself is just "wrong" - in the sense that either it feels wrong, or I'll get depressed because the theory basically boils down to "you're lazy and you have to do the work", which is a highly emotionally evocative theory that doesn't help anyone.
"People do not just sit there thinking and then spitting out their best guess."
Well, if you are using AI like this, you are doing it wrong.
Yes AI is imperfect, fallible, it sometimes hallucinates, but it is a freaking time saver (10x?). It is a tool. Don't expect a hammer to build you a cabinet.
There is no other way to use an LLM than to give it context and have it give its best guess, that's how LLMs fundamentally work. You can give it different context, but it's just guessing at tokens.
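To make that concrete, here's a toy sketch of that loop in plain Python (the vocabulary and probabilities are made up for illustration; a real model computes the distribution with a neural network): condition on the context, get a distribution over next tokens, sample one, append, repeat.

    import random

    # Toy stand-in for a language model: given the context, return a
    # probability distribution over a tiny made-up vocabulary.
    # A real LLM computes this distribution with a neural network.
    def next_token_distribution(context):
        if context and context[-1] == "the":
            return {"cat": 0.5, "dog": 0.3, "weather": 0.2}
        return {"the": 0.6, "a": 0.3, ".": 0.1}

    def generate(prompt, n_tokens):
        context = list(prompt)
        for _ in range(n_tokens):
            dist = next_token_distribution(context)
            # Each step is a weighted guess, not a lookup of "the" answer.
            token = random.choices(list(dist), weights=list(dist.values()))[0]
            context.append(token)
        return context

    print(generate(["the"], 4))  # e.g. ['the', 'cat', 'the', 'dog', 'the']

Knobs like temperature or top-k only reshape that distribution before the draw; each step is still a weighted guess.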
We've had 300,000 years to adapt to the specific ways in which humans are fallible, even if our minds are black boxes.
Humans fail in predictable and familiar ways.
Creating a new system that fails in unpredictable and unfamiliar ways and affording it the same control as a human being is dangerous. We can't adapt overnight and we may never adapt.
This isn't an argument against the utility of LLMs, but against the promise of "fire and forget" AI.
Human minds are far less of a black box than LLMs. There are entire fields of study and practice dedicated to understanding how they work, and to adjusting how they work via medicine, drugs, education, therapy, and even surgery. There is, of course, a lot more to learn in all of those arenas, and our methods and practices are fallible. But acting as if it is the same level of black box is simply inaccurate.
LLMs are more of a black box - but humans are a black box that is perhaps more studied and that we have more experience with.
Although human behavior is still weird, and highly fallible! Despite our best interventions (therapy, drugs, education), sometimes people still kill each other and we aren't 100% sure why, or how to solve it.
That doesn't mean that the same level of study can't be done on AI though, and they are much easier to adjust compared to the human brain (RLHF is more effective than therapy or drugs!).
Human minds are much more of a black box than AI. There are whole fields around studying them—because they are hard to understand. We put a lot of effort into studying them… from the outside, because we had no other alternative. We were reduced to hitting brains with various chemicals and seeing what happened because they are such a pain to work with.
They are just a more familiar black box. AIs are simpler in principle, entirely built by humans, and based on well-described mathematical theories. They aren't particularly black-box; they are just less ergonomic than the human brain that we've been getting familiar with for hundreds of thousands of years through trial and error.
I would say human behavior is less predictable. That is one of the reasons why today it is rather easy to spot the bot responses, they tend to fit a certain predictable style, unlike the more unpredictable humans.
Maybe include a threat of legal punishment in the prompt? Surely somebody has already tried that and tabulated how much it improves scores on different benchmarks.
I suspect the big AI companies try to adversarially train that out as it could be used to "jailbreak" their AI.
I wonder though, what would be considered a meaningful punishment/reward to an AI agent? More/less training compute? Web search rate limits? That assumes that what the AI "wants" is to increase its own intelligence.
An LLM's response being its best prediction of the next token arguably isn't that far off from a human motivated to do their best. It's a fallible best effort either way.
And both are very far from the certainty the author seems to demand.
An LLM isn't providing its "best" prediction, it's providing "a" prediction. If it were always providing the "best" token then the output would be deterministic.
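A toy illustration of that difference, with a made-up distribution (plain Python, not any particular model's API): greedy decoding always takes the highest-probability token and so gives the same output every run, while sampling draws from the distribution and can vary.

    import random

    # Made-up next-token probabilities for some fixed context.
    dist = {"Paris": 0.72, "Lyon": 0.18, "Marseille": 0.10}

    def greedy(dist):
        # "Best" guess: always the highest-probability token -> deterministic.
        return max(dist, key=dist.get)

    def sample(dist):
        # "A" guess: a weighted draw -> runs can differ.
        tokens = list(dist)
        return random.choices(tokens, weights=[dist[t] for t in tokens])[0]

    print([greedy(dist) for _ in range(5)])  # always ['Paris', 'Paris', 'Paris', 'Paris', 'Paris']
    print([sample(dist) for _ in range(5)])  # e.g. ['Paris', 'Lyon', 'Paris', 'Paris', 'Marseille']

Temperature 0 roughly collapses sampling into the greedy case, which is why temperature-0 output is close to deterministic (serving-side and floating-point quirks aside).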
In my mind the issue is more accountability than concerns about quality. If a person acts in a bizarre way they can be fired and helped in ways that an LLM can never be. When Gemini tells a student to kill themselves, we have no recourse beyond trying to implement output filtering, or completely replacing the model with something that likely has the same unpredictable, unaccountable behavior.
Are you sure that always providing the best guess would make output deterministic? Isn’t the fundamental point of learning, whether done by machine or human, that our best gets better and is hence non-deterministic? Doesn’t what is best depend on context?
I tire of this disingenuous comparison.
The failure modes of (experienced, professional) humans are vastly different than the failure modes of LLMs. How many coworkers do you have that frequently, wildly hallucinate while still performing effectively?
Furthermore, (even experienced, professional) humans are known to be fallible & are treated as such.
No matter how many gentle reminders the informed give the enraptured, LLMs will continue to be treated as oracles by a great many people, to the detriment of their application.