That doesn't really prove anything. I could create a Markov chain with a random seed that doesn't always answer the same question the same way, but that doesn't prove the human brain works like a Markov chain with a random seed.
One thing humans tend not to do is confabulate to the degree that LLMs do. When humans do, it's considered a mental illness. Simply saying the same thing in a different way is not the same as producing random, syntactically correct nonsense. Most humans will not, now and then, answer that 2 + 2 = 5, or that the sun rises in the southeast.
I'm not making any claim about how the human brain works. The only thing I'm saying is that humans also produce somewhat randomized output for the same question, which is pretty uncontroversial I think. That doesn't mean they're unintelligent. Same for LLMs.
You have a big opaque box with a slot where you can put text in and you can see text come out. The text that comes out follows some statistical distribution (obviously), and isn't always the same.
Can you decide just from that if there's an LLM or a human sitting inside the box? No. So you can't make conclusions about whether the box as a system is intelligent just because it outputs characters in a stochastic manner according to some distribution.
Okay... I objected to your use of the word "token". Humans don't think in tokens or even write in tokens, so obviously what you wrote is not a fact.
That shouldn't even be controversial, I don't think?
You wrote "The text that comes out follows some statistical distribution".
At the risk of being in over my head here: did you mean the text can be described statistically, or that it "follows some statistical distribution"? Are these two concepts the same thing? I don't think so.
A program by design follows some statistical distribution. A human is doing whatever electrochemical thing it's doing that can be described statistically after the fact.
Regardless my point was pretty simple, I know this will never happen but I wish tech people would drop this tech language when describing humans and adopt neuroscience language.
> Humans don't think in tokens or even write in tokens so obviously what you wrote is not a fact.
Doesn't matter what they think in. A token can be a letter or a word or a sound. The point is that the box takes some sequence of tokens and produces some sequence of tokens.
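To make the granularity point concrete, here's a minimal sketch (not any particular LLM's tokenizer; the strings and splits are illustrative assumptions): the same text can be cut into "tokens" at the level of letters or of words, and the box's interface is just a sequence of tokens in, a sequence of tokens out, whatever the unit.

```python
# Illustrative only: two different token granularities for the same text.
# Neither is "the" tokenization; the point is the unit is arbitrary.

text = "the sun rises"

char_tokens = list(text)       # letters (and spaces) as tokens
word_tokens = text.split()     # words as tokens

print(char_tokens)  # ['t', 'h', 'e', ' ', 's', 'u', 'n', ' ', 'r', 'i', 's', 'e', 's']
print(word_tokens)  # ['the', 'sun', 'rises']
```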
> You wrote "The text that comes out follows some statistical distribution".

> At the risk of being over my head here did you mean the text can be described statistically or "follows some statistical distribution". Are these two concepts the same thing? I don't think so.

> A program by design follows some statistical distribution. A human is doing whatever electrochemical thing it's doing that can be described statistically after the fact.
Again, it doesn't matter how the box works internally. You can only observe what goes in and out and observe its distribution.
> Regardless my point was pretty simple, I know this will never happen but I wish tech people would drop this tech language when describing humans and adopt neuroscience language.
My point is neuroscience or not doesn't matter. People make the claim that "the box just produces characters with some stochastic process, therefore it's not intelligent or correct", and I'm saying that implication is not true because there could just as well be a human in the box.
You can't decide whether a system is intelligent based only on the method with which it communicates.
I think we are talking past each other but this has been entertaining.
I'd say anybody who writes "the LLM just produces characters with some stochastic process, therefore it's not intelligent or correct" is making an implicit argument about the way the LLM works and the way the human brain works. There might even be an implicit argument about how intelligence works.
They are not making the argument that you can't make up statistical models to describe a box, a human generated text, or an expert human opinion. But that seems to be the claim you are responding to.
Also, you could always pick the most likely token at each step to make an LLM deterministic if you really wanted.
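A toy sketch of that last point (the distribution and its probabilities are made up for illustration, not taken from any real model): sampling from a next-token distribution is stochastic, while always taking the argmax (greedy decoding, i.e. temperature 0) makes the output deterministic.

```python
import random

# Made-up next-token distribution for some fixed context.
dist = {"4": 0.90, "5": 0.05, "four": 0.05}

def sample_token(dist):
    # Stochastic decoding: draw a token according to its probability.
    r = random.random()
    acc = 0.0
    for tok, p in dist.items():
        acc += p
        if r < acc:
            return tok
    return tok  # fallback for floating-point rounding

def greedy_token(dist):
    # Greedy decoding: always take the single most likely token.
    return max(dist, key=dist.get)

print(greedy_token(dist))  # always "4", on every call
```

With `sample_token` the box occasionally emits "5", which is the stochastic behavior the thread is arguing about; with `greedy_token` the same box becomes fully deterministic.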