The clue is in the name: it's a large language model.
When I asked ChatGPT "Does ChatGPT have a world model?", I got this output:
"ChatGPT, like other GPT models, does not possess an explicit world model. GPT (Generative Pre-trained Transformer) models, such as ChatGPT, are trained on a large corpus of text data from the internet and do not have direct access to real-time information about the world. They rely solely on the patterns and information they have learned from their training data.
While GPT models can generate contextually relevant and coherent responses based on their training, they do not possess up-to-date knowledge or awareness of current events. Their responses are primarily based on patterns, associations, and statistical regularities present in the text they were trained on, rather than an explicit representation of the world.
It's important to note that GPT models can occasionally generate plausible-sounding but inaccurate or misleading information, especially when prompted with questions or topics beyond the scope of their training data. Therefore, it's always advisable to verify information from reliable sources when seeking accurate and up-to-date information about the world."
When I asked it "Does a large language model infer an understanding of a topic?", it gave this output:
"A large language model, such as ChatGPT, can exhibit the appearance of understanding a topic due to its ability to generate coherent and contextually relevant responses. It can make inferences based on patterns and associations learned from its training data, which allows it to generate text that appears knowledgeable about a particular subject.
However, it's crucial to understand that this understanding is derived solely from statistical patterns present in the training data, and the model does not possess true comprehension or conceptual understanding like humans do. The model does not have access to real-world experiences, common sense, or the ability to reason in the same way as humans.
The responses generated by a language model should be interpreted as a combination of learned patterns and statistical associations rather than as genuine understanding. The model lacks the ability to deeply grasp the meaning, implications, or nuances of a topic beyond what it has learned from the text it was trained on. Therefore, it's essential to approach the responses generated by a language model critically and corroborate information from reliable sources when necessary."
What does this prove? It proves either that someone trained ChatGPT to spit out this information or that this view is very common on the Internet, which doesn't necessarily make it right.
In the same breath: what does saying it must have a world model prove? Why must it? For every occasion where it can impress with seeming reasoning, there are plenty of times where it just falls down.
The more balanced view is that we simply don't know; it's an assumption on both sides of the fence: Schrödinger's world model. If something resembling a world model is emergent from the sheer scale of the data, that's a very interesting idea, but it's clear that it still doesn't understand, in the same way that a physics simulation doesn't understand what it's doing either, yet can still output usable information whilst doing something incredibly complex. Ergo, I'd say it's far more likely that there isn't anything out of the ordinary going on.
If I can't get something as simple as a valid NGINX configuration out of it without hallucinations, despite supplying it with the original Apache .htaccess file, the documentation and the URL rewrites it will require, why would it have a world model? How an NGINX configuration works is far simpler than how the world works with its many systems. Especially when you consider that an NGINX configuration is inherently language-based, which should be exactly what it excels at, pattern recognition and all, yet it's dumb pattern recognition to a fault, just "this commonly follows that" in the training data.
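To make the kind of task concrete, here's a hypothetical rewrite of the sort I mean (the paths are made up, not taken from my actual config). In .htaccess it might look like:

    RewriteEngine On
    # Map /blog/123/some-post to /posts/some-post?id=123, keeping any query string (QSA)
    RewriteRule ^blog/([0-9]+)/([a-z0-9-]+)/?$ /posts/$2?id=$1 [L,QSA]

and the NGINX equivalent is a fairly mechanical translation, roughly:

    # At server-block level; note the leading slash NGINX matches against, unlike .htaccess
    # NGINX appends the original query arguments after id=$1 by default, which mirrors QSA
    rewrite ^/blog/([0-9]+)/([a-z0-9-]+)/?$ /posts/$2?id=$1 last;

Yet getting even that sort of one-to-one translation out of it without errors proved unreliable.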
The world model is very flawed, yes, but it does exist. I don't think that's "out of the ordinary"; if you train something on complex real-world data, you would expect at least a rudimentary world model to develop. A world model doesn't have to be something complicated; it's just having some ability to predict things that could happen in the real world. An ant probably has a world model in the sense that I'm using it.