
We know that computers are capable of things that humans aren't - anything related to brute-force computation, search, and memory, for example.

So, just because a human can't do something, or struggles to do it, doesn't mean that the task requires a huge IQ or generality - it may just require a lot of compute/memory, such as DeepBlue playing chess.

In the case of these ARC puzzles, they are easy for a human, so the "absence of evidence" argument doesn't even apply. It's also worth noting that one could brute-force solve them by trying all applicable solution techniques (as indicated by the examples and the challenge description) in combinatorial fashion, or just (as Chollet notes) generate a massive training set and train an LLM on it, solving them via recall rather than active inference - which again proves nothing about AGI.
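To make the brute-force point concrete, here's a toy sketch of that kind of combinatorial search. The primitive transformations and the grid representation are invented for illustration - a real ARC brute-forcer would need a far richer DSL (recoloring, object extraction, etc.):

    from itertools import product
    import numpy as np

    # Hypothetical primitive transformations a brute-forcer might try.
    PRIMITIVES = {
        "identity":  lambda g: g,
        "rotate90":  lambda g: np.rot90(g),
        "flip_h":    lambda g: np.fliplr(g),
        "flip_v":    lambda g: np.flipud(g),
        "transpose": lambda g: g.T,
    }

    def brute_force_solve(train_pairs, max_depth=3):
        """Enumerate compositions of primitives until one maps every
        training input grid to its output grid."""
        names = list(PRIMITIVES)
        for depth in range(1, max_depth + 1):
            for combo in product(names, repeat=depth):
                def apply(grid, combo=combo):
                    for name in combo:
                        grid = PRIMITIVES[name](grid)
                    return grid
                if all(np.array_equal(apply(np.array(i)), np.array(o))
                       for i, o in train_pairs):
                    return combo        # "program" found purely by enumeration
        return None

    # Example: the hidden rule is a 90-degree rotation; no reasoning involved.
    train = [([[1, 0], [0, 0]], [[0, 0], [1, 0]])]
    print(brute_force_solve(train))     # ('rotate90',)

Nothing in that loop understands anything; it just enumerates compositions of primitives until one reproduces the training pairs.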

The point of the ARC challenge is to encourage advances in active inference (i.e. reasoning/problem solving), which is what LLMs lack. It's HOW you solve them that matters if you want to demonstrate general intelligence. Even in the realm of static inference, which is what they are built for, LLMs are really closer to DeepBlue than to something intelligent - they brute-force extract the training set's rules using gradient descent. The interesting thing is that they have any learning ability at all at inference time (in-context learning), but it's clearly no match for a human, and they are also architecturally missing all the machinery, such as working memory and looping/iteration, needed to perform any meaningful try/fail/backtrack/try-again active inference while learning the whole time.
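For what it's worth, the control structure I have in mind is nothing exotic - something like the following toy loop, where propose and evaluate are hypothetical stand-ins and the list of failures plays the role of working memory:

    def active_inference_solve(puzzle, propose, evaluate, max_attempts=100):
        """Toy try/fail/remember/try-again loop.

        propose(puzzle, failures)   -> a candidate solution, conditioned on past failures
        evaluate(puzzle, candidate) -> (solved: bool, feedback: str)
        """
        failures = []                               # crude working memory of what didn't work
        for _ in range(max_attempts):
            candidate = propose(puzzle, failures)   # try
            solved, feedback = evaluate(puzzle, candidate)
            if solved:
                return candidate                    # success
            failures.append((candidate, feedback))  # remember the failure, then try again
        return None                                 # give up

The point is the outer loop and the memory of failures, not the details.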

It'll be interesting to see to what extent pre-trained transformers can be combined with other components (maybe some sort of DeepBlue/AlphaGo MCTS?) to get closer to human-level problem-solving ability, but IMO it's really the wrong architecture. We need to stop using gradient descent and find a learning algorithm that can also be used at inference time.
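To sketch what that combination might look like structurally (purely illustrative - policy_fn and value_fn are hypothetical stand-ins for a pretrained model that proposes next reasoning steps and scores states, and this says nothing about how you'd train them):

    import math

    class Node:
        def __init__(self, state, prior=1.0, parent=None):
            self.state = state        # e.g. a partial chain of reasoning steps
            self.prior = prior        # policy's prior probability for this step
            self.parent = parent
            self.children = []
            self.visits = 0
            self.value_sum = 0.0

    def ucb(node, c=1.4):
        # AlphaGo-style selection: exploit estimated value, explore under-visited children.
        q = node.value_sum / node.visits if node.visits else 0.0
        u = c * node.prior * math.sqrt(node.parent.visits) / (1 + node.visits)
        return q + u

    def mcts(root_state, policy_fn, value_fn, n_sims=200):
        """Hypothetical search over reasoning steps proposed by a pretrained model.

        policy_fn(state) -> list of (next_state, prior)  # e.g. transformer samples + probs
        value_fn(state)  -> float estimate of how promising the state is
        """
        root = Node(root_state)
        for _ in range(n_sims):
            node = root
            # 1. Select: descend by UCB until a leaf.
            while node.children:
                node = max(node.children, key=ucb)
            # 2. Expand: ask the policy for candidate next steps.
            node.children = [Node(s, prior=p, parent=node) for s, p in policy_fn(node.state)]
            # 3. Evaluate the leaf with the value estimate instead of a full rollout.
            value = value_fn(node.state)
            # 4. Backpropagate the value up the selected path.
            while node is not None:
                node.visits += 1
                node.value_sum += value
                node = node.parent
        # Act on the most-visited child, as AlphaGo does.
        best = max(root.children, key=lambda n: n.visits, default=root)
        return best.state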



I disagree that gradient descent brute-force extracts the training set. That kind of "overfitting" claim has been shown to be false many times. Transformers learn predictive models of their input that go beyond what their training set contains.

But in general I agree about active inference. Clearly there is something missing there.

Doing AlphaGo-style MCTS would be interesting, but how would you approach training the policy and value nets? It's not like we can take snapshots of people's thought processes as they read text, the way you can perform arbitrary rollouts with a game engine.



