What do you think your brain does when deciding the next word to speak? It is scoring words based on their appropriateness given the context and all the relevant known facts, as well as your communicative intent. But it is not obvious that there is nothing like communicative intent in LLMs. When you prompt one, you engage some subset of the network relevant to the prompt, which induces a generative state disposed to produce a contextually appropriate response. The properties of this "disposition to contextually appropriate responses" are themselves sensitive to the context: in a Q&A context, the disposition is to produce an acceptable answer; in a therapeutic context, it is to produce a helpful or sensitive response. The point is that communicative intent is within the solution space of text prediction when the training data was produced with communicative intent. We should expect communicative intent to improve the quality of text prediction, and so we cannot rule out that LLMs have recovered something in the ballpark of communicative intent.
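For concreteness, "scoring words" here is not a metaphor: at each step an LLM assigns a score to every item in its vocabulary, conditioned on the context so far. A minimal sketch, assuming the Hugging Face transformers library and GPT-2 as a stand-in model (the prompt is just an illustrative example):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = "The capital of France is"
ids = tok(context, return_tensors="pt").input_ids

with torch.no_grad():
    # One score (logit) per vocabulary item, given the context.
    logits = model(ids).logits[0, -1]
probs = torch.softmax(logits, dim=-1)

# The highest-probability continuations are the "contextually appropriate" ones.
top = torch.topk(probs, 5)
for p, i in zip(top.values, top.indices):
    print(f"{tok.decode(int(i))!r}: {p.item():.3f}")
```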
> What do you think your brain does when deciding the next word to speak? It is scoring words based on their appropriateness given the context and all the relevant known facts
I mean, it's not. It's visualizing concepts internally and then using a grammar model to turn those into speech.
>It's visualizing concepts internally and then using a grammar model to turn those into speech.
First off, not everyone "visualizes" thought. Second, what do you think "using a grammar model to turn those into speech" actually consists of? Grammar is the set of rules by which sequences of words are mapped to meaning and vice versa. But that mapping has to be implemented mechanistically, in terms of higher activation for some words and lower activation for others. One such mechanism is scoring each word explicitly. Brains may avoid explicitly scoring irrelevant words, but that's just an implementation detail; all such mechanisms are computationally equivalent.
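To make the "implementation detail" point concrete, here is a toy sketch (made-up scores, not a real model): explicitly scoring every word in the vocabulary and scoring only a shortlist of relevant candidates select the same word; they differ only in how much work they do.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["cat", "dog", "sat", "mat", "ran", "the", "on", "quantum", "spleen"]

# Toy "activations": how well each word fits the current context.
# In a real system these come from the network state; here they are invented.
scores = dict(zip(vocab, rng.normal(size=len(vocab))))
scores["sat"] = 3.0  # pretend "sat" fits the context best

def softmax(xs):
    xs = np.asarray(xs, dtype=float)
    e = np.exp(xs - xs.max())
    return e / e.sum()

# Mechanism A: explicitly score every word in the vocabulary.
full = dict(zip(scores, softmax(list(scores.values()))))
pick_a = max(full, key=full.get)

# Mechanism B: only ever consider a small "relevant" candidate set.
candidates = ["sat", "ran", "mat"]
sub = dict(zip(candidates, softmax([scores[w] for w in candidates])))
pick_b = max(sub, key=sub.get)

print(pick_a, pick_b)  # same choice either way
```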