Just because GPT exhibits a behavior does not mean it performs that behavior. You are using those weasel words for a very good reason!
Language is a symbolic representation of behavior.
GPT takes a corpus of example text, tokenizes it, and models the tokens. The model isn't based on any rules: it's entirely implicit. There are no subjects and no logic involved.
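As a rough sketch of what "modeling the tokens" means here (a toy illustration, not how GPT is actually built: real models use subword tokenizers and learned neural weights rather than raw co-occurrence counts), a bigram model assembled purely from counts picks up patterns in the text without encoding a single explicit rule:

```python
from collections import Counter, defaultdict
import random

corpus = "the cat sat on the mat . the dog sat on the rug ."
tokens = corpus.split()  # stand-in for a real subword tokenizer

# The entire "model" is nothing but counts of which token follows which.
follows = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    follows[prev][nxt] += 1

def generate(start, n=8):
    out = [start]
    for _ in range(n):
        counts = follows[out[-1]]
        if not counts:
            break
        # Pick the next token in proportion to how often it followed the last one.
        choices, weights = zip(*counts.items())
        out.append(random.choices(choices, weights=weights)[0])
    return " ".join(out)

print(generate("the"))  # e.g. "the dog sat on the mat . the cat"
```

The output can look locally sensible, yet nothing in the model knows why "sat" follows "cat"; it only knows that it did in the corpus.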
Any "understanding" that GPT exhibits was present in the text itself, not GPT's model of that text. The reason GPT can find text that "makes sense", instead of text that "didn't make sense", is that GPT's model is a close match for grammar. When people wrote the text in GPT's corpus, they correctly organized "stuff that makes sense" into a string of letters.
The person used grammar, symbols, and familiar phrases to model ideas into text. GPT used nothing but the text itself to model the text. GPT organized all the patterns that were present in the corpus text, without ever knowing why those patterns were used.
In what sense is your "experience" (mediated through your senses) more valid than a language model's "experience" of being fed tokens? Token input is just a type of sense, surely?
It's not that I think multimodal input is important. It's that I think goals and experimentation are important. GPT does not try to do things, observe what happened, and draw inferences about how the world works.
I would say it's not a question of validity, but of the additional immediate, unambiguous, and visceral (multi-sensory) feedback mechanisms to draw from.
If someone is starving and hunting for food, they will quickly learn to associate the cause and effect of certain actions/situations.
A language model that works only with text may still have an unambiguous overall loss function to minimize, but because that loss is a single scalar, the model may minimize it in a way that works for the large majority of the training corpus yet falls apart in ambiguous/tricky scenarios.
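To make the "simple scalar" point concrete (a minimal sketch with hand-picked hypothetical probabilities, not measurements of any real model), average cross-entropy collapses the whole error profile into one number, so a model that is uniformly mediocre and a model that aces most tokens but is badly wrong on a tricky one can end up with nearly the same loss:

```python
import math

# Hypothetical probabilities each model assigns to the correct next token.
uniformly_ok = [0.60, 0.60, 0.60, 0.60]  # decent everywhere
mostly_great = [0.95, 0.95, 0.95, 0.15]  # great, except on the tricky token

def mean_cross_entropy(probs):
    # The entire training signal collapses to this single scalar.
    return sum(-math.log(p) for p in probs) / len(probs)

print(round(mean_cross_entropy(uniformly_ok), 3))  # ~0.511
print(round(mean_cross_entropy(mostly_great), 3))  # ~0.513
```

The scalar barely distinguishes the two, even though only one of them "falls apart" on the hard case.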
This may be why LLMs have difficulty with spatial reasoning/navigation, for example.
Whatever "reasoning ability" that emerged may have learned _some_ aspects to physicality that it can understand some of these puzzles, but the fact it still makes obvious mistakes sometimes is a curious failure condition.
So it may be that having "more" senses would allow an LLM to build better models of reality.
For instance, perhaps the LLM has reached a local minimum with its probabilistic modelling of text, which is why it still fails probabilistically when answering these sorts of questions.
Introducing unambiguous physical feedback into its "world model" might provide what it needs to anchor its reasoning abilities and stop failing in the probabilistic way LLMs currently tend to.
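As a concrete picture of "failing in a probabilistic way" (a toy sketch with made-up numbers, not the output of any actual model), even when the correct answer gets most of the probability mass, sampling from that distribution still returns a wrong answer a noticeable fraction of the time:

```python
import random
from collections import Counter

# Hypothetical next-token distribution for "The key is ___ the box".
next_token_probs = {"inside": 0.70, "outside": 0.15, "under": 0.10, "above": 0.05}

random.seed(0)
tokens, weights = zip(*next_token_probs.items())
samples = Counter(random.choices(tokens, weights=weights, k=1000))

# Roughly 30% of the sampled answers are wrong, even though "inside" dominates.
print(samples)
```

One way to read the suggestion above is that grounding the model in physical feedback would, in effect, sharpen that distribution toward the answers that actually match reality.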
You used evolution, too. The structure your brain grows into is the result of complex DNA instructions that have been mutated, with those mutations filtered over billions of iterations of competition.
There are some patterns of thought that are inherent to that structure, and not the result of your own lived experience.
For example, you would probably dislike pain, with responses similar to your original pain experiences, and similar to my lived pain experiences as well. Surely, there are some foundational patterns that define our interactions with language.
> The model isn't based on any rules: it's entirely implicit. There are no subjects and no logic involved.
In theory an LLM could learn any model at all, including models and combinations of models that use logical reasoning. How much logical reasoning (if any) GPT-4 has encoded is debatable, but don't mistake GPT's practical limitations for theoretical limitations.
> In theory an LLM could learn any model at all, including models and combinations of models that use logical reasoning.
Yes.
But that is not the same as GPT having its own logical reasoning.
An LLM that creates its own behavior would be a fundamentally different thing than what "LLM" is defined to be here in this conversation.
This is not a theoretical limitation: it is a literal description. An LLM "exhibits" whatever behavior it can find in the content it modeled. That is fundamentally the only behavior an LLM performs.