> "Designed" is not right. What gives "AI models" (i.e. deep neural nets) a hard time is that there are very few examples in the public training and evaluation set
No, he actually made a list of cognitive skills humans have and is targeting them in the benchmark. The list of "Core Knowledge Priors" contains object cohesion, object persistence, object influence via contact, goal-directedness, numbers and counting, and basic geometry and topology. The dataset is designed to be easy for humans to solve, but it targets areas that are hard for AI.
> "A typical human can solve most of the ARC evaluation set without any practice or verbal explanations. Crucially, to the best of our knowledge, ARC does not appear to be approachable by any existing machine learning technique (including Deep Learning), due to its focus on broad generalization and few-shot learning, as well as the fact that the evaluation set only features tasks that do not appear in the training set."
Thanks, I know about the core knowledge priors and François Chollet's claims about them (I've read his white paper, although it was long and long-winded, and I don't remember most of it). The empirical observation, however, is that none of the systems with positive performance on ARC, on Kaggle or the new leaderboard, have anything to do with core knowledge priors. Which means core knowledge priors are not needed to solve any of the ARC tasks solved so far.
I think Chollet is making a syllogistic error:
a) Humans have core knowledge priors and can solve ARC tasks
b) Some machine X can solve ARC tasks
c) Therefore machine X has core knowledge priors
That doesn't follow; and, like I say, it is refuted by empirical observation to boot. This is particularly so for his claim that ARC "does not appear to be approachable" by deep learning: there are plenty of neural-net based systems on the ARC-AGI leaderboard.
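To make the logical point concrete, the invalid step is a form of affirming the consequent. A minimal countermodel in Lean (the proposition names are mine, for illustration):

```lean
-- Read P as "has core knowledge priors", S as "solves ARC tasks",
-- H as "is human". With P := False, S := True, H := False, the
-- premises "H → P ∧ S" (vacuously) and "S" both hold, yet the
-- conclusion P fails, so "X solves ARC, therefore X has priors"
-- is not a valid inference.
example : ¬ (∀ (P S H : Prop), (H → P ∧ S) → S → P) :=
  fun h => h False True False (fun hh => hh.elim) True.intro
```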
There's also no reason to assume that core knowledge priors present any particular difficulty to computers (i.e. that they're "hard for AI"). The problem seems to be more with the ability of humans to formalise them precisely enough to be programmed into a computer. That's not a computer problem, it's a human problem. But that's common in AI. For example, we don't know how to hand-code an image classifier, but we can train very accurate ones with deep neural nets. That doesn't mean computers aren't good at image classification: they are, and CNNs are the proof. It's humans who suck at coding it.

Except nobody's insisting on image classification datasets with only three or four training examples per class, so it was possible to develop those powerful deep neural net classifiers. Chollet's choice to allow only very few training examples creates an artificial data bottleneck that restricts no one in the real world, so it tells us nothing about the true capabilities of deep neural nets.
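For a sense of how tight that bottleneck is: each ARC task ships as a small JSON file with just a handful of demonstration pairs, versus the on-the-order-of-a-thousand labelled images per class a CNN classifier is normally trained on. A minimal sketch of loading one (the file path and task ID are illustrative):

```python
import json

# An ARC task is a small JSON file: a few demonstration pairs under
# "train" and held-out pairs under "test"; grids are 2D lists of
# integers 0-9 (colours). The path below is illustrative.
with open("ARC/data/training/0a938d79.json") as f:
    task = json.load(f)

print(len(task["train"]))  # typically just 2-5 demonstration pairs
for pair in task["train"]:
    gi, go = pair["input"], pair["output"]
    print(f"{len(gi)}x{len(gi[0])} -> {len(go)}x{len(go[0])}")
```

That handful of pairs is the entire training set a solver gets for a task.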
Cthulhu. I never imagined I'd end up defending deep neural nets...
I have to say this: Chollet annoys me mightily. Every time I hear him speak, he makes gigantic statements about what intelligence is and how to create it artificially, as if he knows what tens of thousands of researchers in biology, cognitive science, psychology, neuroscience, AI, and who knows what other fields don't. That is despite the fact that he has created just as many intelligent machines as everyone else so far, which is to say: zero. Where that self-confidence comes from, I have no idea, but the results on his "AIQ test" indicate that he, just like everyone else, has no clue what intelligence is, yet he persists with absurd self-assurance. Insufferable arrogance.