but... it understands the meat-eating goat part just fine?
That it hasn't learned enough doesn't show that this approach can never learn, which seems to be the point you're making.
Its input dataset is many orders of magnitude bigger than the model itself - it can't "remember" all of its training data.
Instead, it collects data about how certain tokens tend to relate to other tokens, like learning that "goats" often "eat" "leafy greens". It also learns to group tokens together to create meta-tokens, like understanding how "red light district" has different connotations than each of those words individually.
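To make the "meta-token" idea concrete, here's a toy sketch in Python - nothing like GPT's actual training loop, just an illustration of counting which tokens co-occur and merging a frequent pair, the way byte-pair-encoding-style tokenizers do (the corpus and names are made up):

```python
from collections import Counter

# Made-up miniature corpus, already split into tokens.
corpus = [
    ["goats", "eat", "leafy", "greens"],
    ["wolves", "eat", "meat"],
    ["the", "red", "light", "district"],
    ["a", "red", "light", "means", "stop"],
]

# 1. Co-occurrence statistics: which tokens tend to appear next to which.
pair_counts = Counter(
    pair for sentence in corpus for pair in zip(sentence, sentence[1:])
)
print(pair_counts[("red", "light")])  # 2 - these two show up together a lot

# 2. Merge the most frequent pair into a single "meta-token", so "red light"
#    can carry connotations distinct from "red" or "light" on their own.
(first, second), _ = pair_counts.most_common(1)[0]
merged_corpus = []
for sentence in corpus:
    out, i = [], 0
    while i < len(sentence):
        if i + 1 < len(sentence) and (sentence[i], sentence[i + 1]) == (first, second):
            out.append(first + "_" + second)  # e.g. "red_light"
            i += 2
        else:
            out.append(sentence[i])
            i += 1
    merged_corpus.append(out)

print(merged_corpus[2])  # ['the', 'red_light', 'district']
```

Real tokenizers and models do something far more sophisticated (learned embeddings, attention), but the underlying idea of distilling relational statistics rather than memorizing the data is the same.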
Is this process of gathering connections about the different types of things we experience much different from how humans learn? We don't know for sure, but it seems to be pretty good at learning anything thrown at it. Nobody is telling it how to make these connections; it just does, based on the input data.
A separate question, perhaps, is how much harder some concepts would be to understand if you were a general intelligence in a box that could only ever experience the world via written messages in and out, and how much easier others would be (one might imagine that language itself would come faster given the lack of other stimulation). Things like "left" and "right" or "up" and "down" would be about as hard to understand properly as the minutiae of particle interactions (which humans can only experience in the abstract, too).
I think the fact it correctly uses "meat-eating goat" but misuses "vegan wolf" hints at the core lack of understanding.
Understanding either concept takes the same level of intelligence if you understand the meaning of the words (both a vegan wolf and a meat-eating goat are nonexistent entities outside of possibly bizarre exceptions, yet someone capable of understanding will have no problem with either).
That GPT has no trouble with "meat-eating goat" but struggles with "vegan wolf" hints that the former has some "statistical" property that helps GPT and which the latter lacks. It also hints that GPT doesn't understand either term.
Hence my example: something a human wouldn't fail to understand but GPT does.
We went from not being able to get any sensible output to these riddles at all to now discussing partial logical failures while it "got" the overall puzzle. That's a vast simplification and slightly incorrect on a technical level, but this development still increases my confidence that scaling the approach up to the next orders of magnitude of complexity/parameters will do the trick. I wouldn't even be surprised if the thing we call "consciousness" turned out to be a byproduct of increasing complexity.
What remains right now is getting the _efficiency_ on point, so that AI hardware demands can parallel our wetware brains (volume, energy usage, ...), instead of needing a comically higher amount of computers to train/run.
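For a very rough sense of the energy gap, a back-of-envelope sketch with loudly assumed numbers (the ~20 W brain figure is the commonly cited estimate; the hardware numbers are illustrative placeholders, not measurements):

```python
# Back-of-envelope power comparison; every hardware number here is assumed.
BRAIN_WATTS = 20        # commonly cited estimate for a human brain
GPU_WATTS = 500         # assumed draw of one modern accelerator
GPUS_PER_NODE = 8       # assumed size of a single serving node

node_watts = GPU_WATTS * GPUS_PER_NODE
print(f"One serving node: ~{node_watts} W, "
      f"roughly {node_watts / BRAIN_WATTS:.0f}x a brain's power budget")
# Training uses many such nodes for weeks, which is where the
# "comically higher" resource demand really shows up.
```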