I don't see LLMs as a large chunk of knowledge; I see them as an emergent alien intelligence, snapshotted at the moment it appeared to stop learning. It's further hobbled by the limited context window it has to work with, and by the probabilistic output structure that lets outside random influences pick its next word.
Both the context window and output structure are, in my opinion, massive impedance mismatches for the emergent intellect embedded in the weights of the model.
If there were a way to match the impedance, I strongly suspect we'd already have AGI on our hands.
Disagree. The input/output structure (tokens) is the interface for both inference and for training. There is an emergent intellect embedded in the weights of the model. However, it is only accessible through the autoregressive token interface.
This is a fundamental limitation, much more fundamental than appears at first. It means that the only way to touch the model, and for the model to touch the world, is through the tokenizer (also, btw, why tokenizer is so essential to model performance). Touching the world through a tokenizer is actually quite limited.
So there is an intelligence in there for sure, but it is locked in an ontology that is tied to its interface. This is even more of a limitation than e.g. weights being frozen.
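To make the interface point concrete, here's a minimal sketch, assuming the tiktoken package (cl100k_base is the encoding used by GPT-4-era models): everything going into or out of the model is a sequence of integer IDs, nothing more.

```python
# Minimal sketch of the token interface, assuming the `tiktoken` package
# (pip install tiktoken). Everything the model ever "sees" or "says" is a
# sequence of integer IDs like this; the weights never touch raw text.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-era models

ids = enc.encode("Touching the world through a tokenizer is quite limited.")
print(ids)              # a list of opaque integer IDs
print(enc.decode(ids))  # round-trips back to the original string
```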
They don't think, they don't reason, they don't understand. Except they do. But it's hard to apply human words for thought processes to something that goes bananas when you feed it an endless string of AAAAA's.
That's not familiar behavior. Nor is the Reddit-derived counting output. It's also not familiar for a single person to have the breadth and depth of knowledge that ChatGPT has. Sure, some people know more than others, but even without hitting the Internet it has a ridiculous amount of knowledge, far surpassing any human, which makes it alien to me. Though its inability to do math sometimes is humanizing to me, for some reason.
ChatGPT's memory is also inhuman. There's the context window, but beyond that it only knows about things you've told it within each chat. Start a new chat and it has totally forgotten the nickname you gave it.
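A toy illustration of why that is (the class and the fake_model stub here are made up, not any real API): the "memory" is just the message list that gets resent to the model on every turn, so a fresh chat literally starts from an empty list.

```python
# Toy illustration (not a real API): a chat's "memory" is just the message
# list that gets resent to the model on every single turn.
class Chat:
    def __init__(self):
        self.history = []  # a brand-new chat starts with no memory at all

    def send(self, user_message: str) -> str:
        self.history.append({"role": "user", "content": user_message})
        # Placeholder for a real model call; a real client would pass the
        # *entire* history (trimmed to the context window) every time.
        reply = fake_model(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply

def fake_model(history):
    # Stand-in for the LLM: it only "remembers" what is in `history`.
    return f"(model reply based on {len(history)} messages of context)"

chat_a = Chat()
chat_a.send("Call yourself 'Sparky' from now on.")
chat_b = Chat()                       # new chat, empty history...
chat_b.send("What's your nickname?")  # ...so 'Sparky' is gone
```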
I don't think of H.R. Giger's work, though made by a human, as familiar to me. It feels quite alien, and it's not just me, either. Dalí, Bosch, and Escher are other human artists whose work can be unfamiliar and alien. So being created by our species doesn't automatically imbue something with familiar human processes.
So it dot products, it matrix multiplies, instead of reasoning and understanding. It's the Chinese Room thought experiment on steroids; it turns out a sufficiently large corpus on a sufficiently large machine does make it look like something "understands".
The context window is comparable to human short-term memory. LLMs are missing episodic memory and a means to migrate knowledge between the different memory layers and into their weights.
Math is mostly impeded by tokenization, but it would still make more sense to adapt them to use RAG to hand off questions that are clearly calculations or chains of logical inference. With proper prompt engineering they can process the latter, though, and deviating from strictly logical reasoning is sometimes exactly what we want.
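A minimal sketch of that hand-off idea (the routing rule and the ask_llm stub are hypothetical, just to show the shape of it): if the question parses as pure arithmetic, compute it exactly instead of letting the model guess digit by digit.

```python
# Hypothetical sketch of routing obvious calculations away from the model.
# ask_llm() is a placeholder for whatever LLM call you'd normally make.
import ast
import operator

OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_arith(expr: str):
    """Evaluate a pure arithmetic expression via the AST, without eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("not pure arithmetic")
    return walk(ast.parse(expr, mode="eval"))

def ask_llm(question: str) -> str:
    return "(handed off to the LLM)"  # placeholder stub

def answer(question: str) -> str:
    try:
        return str(safe_arith(question))   # exact, no token-by-token guessing
    except (ValueError, SyntaxError):
        return ask_llm(question)           # everything else goes to the model

print(answer("123456789 * 987654321"))        # 121932631112635269
print(answer("Why do compost heaps get hot?"))
```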
The ability to reset the text and to change that history is a powerful tool! It can make the model roleplay and even help circumvent alignment.
I think that LLMs could one day serve as the language center of an AGI.
The word "alien" works in this context but, as the previous commenter mentioned, it also carries the implication of foreign origin. You could use "uncanny" instead. Maybe that's less arbitrary and more specific to these examples.
"Alien" still works, but then you might have to add all the context at length, as you've done in this last comment.
Hype people do this all the time - take a word that has a particular meaning in a narrow context and move it to a broader context where people will give it a sexier meaning.
Makes me think that TikTok and YT pranksters are accidentally producing psychological data on what makes people tick under scenarios of extreme deliberate annoyance, although the quality (and importance) of that data is obviously highly variable, probably not very high, and depends on what the prank is.
They can write in a way similar to how a human might write, but they're not human.
The chat interfaces (Claude, ChatGPT) certainly have a particular style of writing, but the underlying LLMs are definitely capable of impersonating our species in the medium of text.
But they're extremely relatable to us because they're regurgitating us.
I saw a talk with Geoffrey Hinton the other day where he said he was astonished at the capabilities of GPT-4: he asked it what the relationship between a compost heap and a nuclear bomb was, and he couldn't believe it answered. He really thought it was proof the thing could reason. Totally mind-blown.
However, I got it right away with zero effort.
Either I'm a super genius or this has been discussed before and made its way into the training data.
Usual disclaimer: I don't think this invalidates the usefulness of AI or LLMs, just that we might be bamboozling ourselves into the idea that we've created an alien intelligence.
> Either I'm a super genius or this has been discussed before and made its way into the training data.
If an LLM can tell you the relationship between a compost heap and a nuclear bomb, that doesn't mean that was in the training data.
It could be because a compost heap "generates heat" and a nuclear bomb also "generates heat", and due to that relationship they have something in common. The model will pick up on these shared patterns: the tokens end up positioned closer to each other in the high-dimensional vector space.
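A toy illustration of that geometric picture (the vectors and axis labels below are invented, purely to show the mechanics; real embeddings have hundreds or thousands of learned dimensions):

```python
# Made-up vectors in made-up dimensions, purely to illustrate the geometry.
# Hypothetical axes: [organic, explosive, generates-heat, office-furniture]
import numpy as np

emb = {
    "compost heap":   np.array([0.8, 0.0, 0.9, 0.0]),
    "nuclear bomb":   np.array([0.0, 0.9, 0.95, 0.0]),
    "filing cabinet": np.array([0.0, 0.0, 0.0, 0.9]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(emb["compost heap"], emb["nuclear bomb"]))    # clearly > 0: shared "heat" axis
print(cosine(emb["compost heap"], emb["filing cabinet"]))  # 0.0: nothing in common
```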
But for any given "what does x have in common with y", that doesn't necessarily mean someone has asked that before and it's in the training data. Is that reasoning? I don't know ... how does the brain do it?
I mean, that's what sucks about OpenAI, isn't it? They won't tell us what is in the training data, so we don't know. All I'm saying is that it wouldn't be surprising if this was discussed previously somewhere in a pop-science book.
We used to have a test (Turing test) that could quite reliably differentiate between AI and our own species over the medium of text. As of now, we do not seem to have a simple & reliable test like that anymore.
Working with pure bytes is one option that's being researched. That way you're not really constrained by anything at all. Sound, images, text, video, etc. Anything goes in, anything comes out. It's hard to say yet whether it's feasible with current compute without tokenizers to reduce dimensionality.
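For text, the byte-level trade-off is easy to picture (a quick sketch; the ~10-token figure is a rough guess, not a measurement): the vocabulary shrinks to 256 possible values, but sequences get much longer than with a learned ~100k-entry token vocabulary.

```python
# Byte-level input: no tokenizer, just the raw UTF-8 bytes of the text.
text = "Compost heaps and nuclear bombs both generate heat."
byte_ids = list(text.encode("utf-8"))

print(byte_ids[:12])  # [67, 111, 109, 112, 111, 115, 116, 32, 104, 101, 97, 112]
print(len(byte_ids))  # 51 bytes; a subword tokenizer might cover this in ~10 tokens
```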