Because it's a set of puzzles on a 2D grid. We don't live on a 2D grid, so it's already on the wrong track. A set of puzzles on a 3D sphere wouldn't get us any closer to AGI either, but at least it would be a more realistic representation of the world and of how a general-purpose problem solver should approach reality. Even Minecraft would be a better test, and lately people have started testing LLMs in virtual worlds, which is a much better test case than ARC.
Insofar as ARC is being used as a benchmark for code synthesis, it might be somewhat successful. But people don't seem to be using code synthesis to solve the puzzles, so it's not really clear how much success on ARC will advance the state of the art in AI and in code synthesis from a logical specification.
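To make the code-synthesis framing concrete, here is a minimal sketch of what a program-synthesis approach to an ARC-style task could look like: enumerate a tiny DSL of grid transformations and keep the first program consistent with every training pair. The DSL, names, and toy task below are illustrative assumptions of mine, not taken from any real solver; actual ARC tasks need a far richer program space.

```python
# A minimal, illustrative sketch of program synthesis for an ARC-style task.
# The "DSL" here is just a handful of whole-grid transformations; this is a
# hypothetical toy, not how any published ARC solver actually works.

import numpy as np

# Tiny DSL of candidate grid-to-grid programs.
CANDIDATES = {
    "identity":  lambda g: g,
    "flip_lr":   lambda g: np.fliplr(g),
    "flip_ud":   lambda g: np.flipud(g),
    "rotate_90": lambda g: np.rot90(g),
    "transpose": lambda g: g.T,
}

def synthesize(train_pairs):
    """Return the name of the first candidate program that maps every
    training input grid to its output grid, or None if none fits."""
    for name, program in CANDIDATES.items():
        if all(np.array_equal(program(np.array(inp)), np.array(out))
               for inp, out in train_pairs):
            return name
    return None

# A toy "task": the hidden rule is a left-right flip.
train = [
    ([[1, 0], [2, 0]], [[0, 1], [0, 2]]),
    ([[3, 4], [0, 0]], [[4, 3], [0, 0]]),
]
print(synthesize(train))  # -> "flip_lr"
```

The point of the approach is that the search happens over programs, not pixels; scaling it beyond toy transformations like these is exactly the open problem.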
> Because it's a set of puzzles on a 2D grid. We don't live on a 2D grid, so it's already on the wrong track.
I don't see what this has to do with anything. Intelligence is about learning patterns and generalizing them into algorithmic understanding, where appropriate. The number of dimensions latent in the dataset is ultimately irrelevant. Humans live in a 4D world, or 3D if the holographic principle is true, and we regularly deal with mathematics in 27 or more dimensions. LLMs build models with at least hundreds of thousands of dimensions.
Show me an LLM that is doing any of the things you mentioned. Furthermore, I'm willing to bet none of that will be possible after ARC is solved either. How much money would you be willing to bet?
The only position I took issue with, and still do, is the one in the closing paragraph of my last post. Your argument for why ARC solvers wouldn't generalize doesn't even make sense.
No point in arguing. If you think it will generalize, then there is no reason to convince random people on the internet that an ARC-AGI solver will get you closer to AGI.
You're on the right track in your arguments, but don't underestimate how hard the puzzles in ARC actually are.
It takes considerable depth of reasoning to see and work through the patterns, problems, and solutions.
Try doing a few of them by hand to see what I mean.
Simulated worlds are complex enough to hide their own flaws, just like LLMs are complex enough to lead us to believe they can reason when most of the time they are pattern matching.
Humans are not generally intelligent. The adjective "general" in "AGI" does not mean it is equivalent to human intelligence; it means it's above and beyond human intelligence.
I think “general” should be taken to mean “has an average human child’s common sense and causal reasoning,” since common sense and causal reasoning are at some level shared by all vertebrates. It seems like the focus on “above and beyond human intelligence” is how you get AIs which appear to understand algebraic topology, yet utterly fail at counting problems designed for pigeons. It should be scientific malpractice to compare an AI to human intelligence without making any effort to compare it to rat (etc.) intelligence. (I guess investors wouldn’t be happy if Sam Altman said “in 20 years I believe we’ll reach ARI.”)
In general, tech folks are far too beholden to an instinctual and unscientific idea of intelligence compared between humans, one which mostly uses linguistic ability and surface knowledge as a proxy. This proxy might sometimes be useful in human group decision-making, but it is also how dumb, confident people manage to fail upward, and it works about as well for a computer as it does for a rat (though it mismeasures in the opposite direction).
Not at all. Humans are fundamentally limited by our finite state space and bandwidth. Classifying systems that can generalize at least as well as a human but can exceed those limits as superintelligent is a meaningful distinction.
I agree that "equivalent to human intelligence" is not a robust way to define general intelligence, but humans are a general intelligence.