
An AGI should obviously be able to do them. But an AI being able to do those 100 percent wouldn't be evidence of AGI; it is a very narrow domain.


Why not? If the only thing that can solve problem X is AGI (e.g. humans), and something else comes along that solves it, then rationally that should be evidence that the something else is AGI right?

Unless you have strong prior beliefs (like "computers can't be AGI") or something else that's problem specific ("these problems can be solved by these techniques which don't count as AGI"). So I guess that's my real question.


That makes no sense at all. Any problem is initially only solvable by humans, until some technology is developed to solve it. Calculating a logarithm was at some point only doable by humans, and then digital computers came along. This would be in your view evidence that digital computers are AGI!? As in, an 8086 with some math code is AGI. We've had it for decades now, only nobody noticed :)


It's just Bayes theorem - there are basically two variables that control how strong the evidence is:

* How likely you think AGI is in general.

* How solvable you think the problem is, independently of what's solving it.

In the cases you've brought up that latter probability is very high, which means that they are extremely weak evidence that computers are AGI. So we agree!

In this case the latter probability seems to be quite low - attempts to solve it with computers have largely failed so far!
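The two-variable argument above can be sketched with Bayes' theorem directly. The numbers below are purely illustrative (made up for the sketch, not claims about real probabilities):

```python
def posterior(prior_agi, p_solve_given_agi, p_solve_given_not_agi):
    """P(AGI | problem solved) via Bayes' theorem."""
    p_solve = (p_solve_given_agi * prior_agi
               + p_solve_given_not_agi * (1 - prior_agi))
    return p_solve_given_agi * prior_agi / p_solve

# Easy problem (e.g. calculating a logarithm): non-AGI solutions are
# very likely to exist, so solving it barely moves the needle.
weak = posterior(0.01, 1.0, 0.9)      # stays close to the 1% prior

# Hard problem (decades of failed attempts): non-AGI solutions seem
# rare, so solving it is much stronger evidence.
strong = posterior(0.01, 1.0, 0.01)   # jumps to roughly 50%
```

Same prior, same observation, wildly different evidence strength; the only thing that changed is how solvable the problem looks independently of what's solving it.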


We don't agree. You're now saying anything is evidence of anything, which just makes the word "evidence" meaningless.

In real life, when people say "A is evidence of B" they mean strong evidence, or even overwhelming evidence. You just backpedalled by redefining evidence to mean anything and nothing, so you can salvage an obviously false claim.

Nobody in the real world says "rain is evidence of aliens" with the implicit assumption that it's just extremely weak evidence. The way English is used by people makes that sentence simply false, as is yours that anything previously not solved is evidence of AGI.


We're talking about a specific problem here - the competition in the OP. Not aliens in the rain.


This flies directly in the face of technologies such as Deep Blue and AlphaGo. They excel in tiny domains previously thought to be the pinnacle of intelligence, and now they dominate humans. Are they AGI in your definition?


See my response to the other commenter. In these cases as well I would conclude it's very weak evidence of AGI, so I don't think we disagree.

Edit: I think maybe the disagreement here is about the nature of evidence. I think there can be evidence that something is AGI even if it isn't, in fact, AGI. You seem to believe that if there's any evidence that something is AGI, it must be AGI, I think?


I personally don't find this line of rhetoric useful or relevant. Let's agree to disagree.


Okay, that's fair. But to be clear - this is a theorem of probability theory, not rhetoric.


> If the only thing that can solve problem X is AGI (e.g. humans), and something else comes along that solves it, then rationally that should be evidence that the something else is AGI right?

No.

Because there might be undiscovered ways to solve these problems that no one would claim are AGI.

The definition of AGI is notoriously fuzzy, but nonetheless if there were a 10-line Python program (with no external dependencies or data) that could solve it, few would argue that was AGI.

So perhaps there is an algorithm that solves these puzzles 100% of the time and can be easily expressed.

So I agree that only being able to solve these problems doesn't define AGI.


I think I agree with you, but consider these two cases:

1. Only humans are known to have solved problem X, and we've spent no time looking for alternative solutions.

2. Only humans are known to have solved problem X, and we've spent hundreds of thousands of hours looking for alternative solutions and failed.

Now suppose something solves the problem. I feel like in case 2 we are justified in saying there's evidence that something is a human-like AGI. In case 1 we probably aren't justified in saying that.

To me this seems evident regardless of what the problem actually is! Because if it's hard enough that thousands of human hours cannot find a simple/algorithmic solution it's probably something like an "AGI-complete" problem?


Maybe (2). But it took ~50 years of work to build systems that can beat people at poker, and I don't think people argue poker bots are AGI.

To be clear, I think we have AGI (LLMs with tool use are generalized enough) and we are currently finding edge cases that they fail at.


That's a good point. In my head I was considering stuff like chess, where even though it took a long time to reach superhuman performance on computers, the issue was mainly compute. People basically knew how to do it algorithmically before then (pruning tree search).
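For what it's worth, the "pruning tree search" referred to here is alpha-beta minimax. A toy sketch of the idea (the game tree and leaf values are invented for illustration; real chess engines add move ordering, transposition tables, and enormous compute):

```python
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    """Minimax with alpha-beta pruning over nested lists; leaves are numbers."""
    if isinstance(node, (int, float)):   # leaf: static evaluation
        return node
    best = float("-inf") if maximizing else float("inf")
    for child in node:
        val = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            best = max(best, val)
            alpha = max(alpha, best)
        else:
            best = min(best, val)
            beta = min(beta, best)
        if beta <= alpha:                # cut-off: remaining children can't matter
            break
    return best

tree = [[3, 5], [2, [9, 1]]]
result = alphabeta(tree, True)
```

The pruning doesn't change the answer, it just skips subtrees that provably can't affect it; the rest was "mainly compute", as noted above.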

I guess the underlying issue with my argument is that we really have no idea how large the search space is for finding AGI, so applying something like Bayes theorem (which is basically my argument) tells you more about my priors than reality.

That said, we know that human AGI was a result of an optimisation process (natural selection), and we have rudimentary generic optimisers these days (deep neural nets), so you could argue we've narrowed the search space a lot since the days of symbolic/tree search AI.


> we know that human AGI was a result of an optimisation process (natural selection)

I don't think this is obviously correct.

Three things:

1) Many actions we think of as "intelligence" are just short-cuts based on heuristics.

2) While there's probably an argument that problem solving is selected for, it's not clear to me how far this goes at all. There's little evidence that smarter people end up in more powerful positions, for example. It seems like there is perhaps a cut-off beyond which intelligence is just a side effect of the problem-solving ability that is actually useful.

3) Perhaps humans individually aren't (very?) intelligent and it is only a society of humans that are.

(also perhaps human GI? Nothing artificial about it.)

> no idea how large the search space is for finding AGI, so applying something like Bayes theorem (which is basically my argument) tells you more about my priors than reality.

There are plenty of imaginable forms of intelligence that are often ignored during these conversations. One in common use is "an intelligent footballer" which applies to sport for someone who can read a game well. There are other, non-human examples too (Dolphins, crows, parrots etc).

And then in the world of speculative fiction there's a range of different types of intelligence. Vernor Vinge wrote about intelligences which had motivations that people couldn't comprehend (and Vinge is generally credited with the concept of the singularity). More recently Peter Watts's Blindsight contemplates the separation of intelligence and sentience.

Basically I don't think your expression of Bayes' theorem had nearly enough possibilities in it.


> While there's probably an argument that problem solving is selected for it's not clear to me how far this goes at all. There's little evidence that smarter people end up in more powerful positions for example.

Evolution hasn't had enough time to adapt us to our newfangled lifestyle of the last few hundred years, or few thousand for that matter, and anyway in the modern world people are not generally competing on things affecting survival, but rather on cultural factors that affect the number of children we have.

Humans and most (all?) intelligent animals are generalists, which is why we need a big brain and intelligence - to rapidly adapt to a wide variety of ever-changing circumstances. Non-generalists such as herbivores and crocodiles don't need intelligence and therefore don't have it.

The main thing that we need to survive & thrive as generalists - and what evolution has evidently selected for - is the ability to predict so that we can plan ahead and utilize past experience. Where will the food be, where will the water be in a drought, etc. I think active reasoning (not just LLM-like prediction/recall) would also play a large role in survival, and presumably parts of our brain have evolved specifically to support that, even if the CEO probably got his job based more on height/looks and golf handicap.


I strongly agree that the predictive and planning ability is very important - things like agriculture rely on it and it must have been selected for at that point.

But the point has been made elsewhere that humans developed large brains long (~1.5M years?) before agriculture, and for a long time the only benefit seemed to be fire and flint tools.

The causal link here isn't widely understood - there are other species that have large brains but haven't developed these skills. So it's not clear exactly which facets of intelligence are selected for.


> also perhaps human GI? Nothing artificial about it.

Lol, thanks, that's quite funny. I should spend less time on the internet.

> While there's probably an argument that problem solving is selected for it's not clear to me how far this goes at all.

Yeah, I meant something much more low brow which is that _humans_, with all of our properties (including GI), are a result of natural selection. I'm not claiming GI was selected for specifically, but it certainly occurred as a side-effect either way. So we know optimisation can work.

> There are plenty of imaginable forms of intelligence that are often ignored during these conversations.

I completely agree! I wish there was more discussion on intelligence in the broad in these threads. Even if you insist on sticking to humans it's pretty clear that something like a company or a government is operating very intelligently in its own environment (business, or politics), well beyond the influence of its individual constituents.

> Basically I don't think your expression of Bayes' theorem had nearly enough possibilities in it.

Another issue with Bayes in general is that you have a fixed probability space in mind when you use it, right? I can use Bayes to optimise my beliefs against a fixed ontology, but it says nothing about how or when to update the ontology itself.

And no doubt my ontology is lacking when it comes to (A)GI...


> I think we have AGI

That seems a pretty extreme position!

What's your definition of AGI ?


> That seems a pretty extreme position!

Not really.

Jeremy Howard has said the same thing for example.

> What's your definition of AGI ?

Things that we consider intelligent when humans do them.

Basically we had all these definitions of AGI that we have surpassed (Turing test etc). Now we are finding more edge cases where we go "ahh... it can't do this so therefore it isn't intelligent".

But the issue with that is that lots of humans can't do them either.

I think the ARC challenge is valid. But I'd also point out that there are substantial numbers of people who won't be able to solve them either (blind people for example, as well as people who aren't good at puzzles). We make excuses there ("oh we can explain it to a blind person" or for many physical problems things like "Oh Stephen Hawking couldn't solve this but that is an exception") but we don't allow the same excuses for machine intelligence.

I don't think the boundary of AGI is a hard line, but if you went back 10 years and showed people what we have now, I think they would say "Oh wow, you have AI!".


OK, so where we differ is in defining AGI. To me, and I think most people, it's referring to human-level (or beyond) general intelligence. Shane Legg from DeepMind has also explicitly defined it this way, but I'm not sure where others in the industry stand.

LLMs do have a broad range of abilities, so not narrow AI, but clearly it's not general intelligence (or at least not human level), else they would not be failing or struggling on things that to us are easy - general means universal (not confined to specific types of problem), not just multi-capability.

The lack of reasoning ability, especially since it is architecturally based, seems more than a matter of patching up corner cases that aren't handled well. This shoring up of areas of weakness by increasing model size, adding targeted synthetic data and post-training is mostly just addressing static inference, much like adding more and more rules to CYC.

To make an LLM capable of reasoning it needs to go beyond a fixed N layers of compute and support open-ended exploration, and probably replace gradient descent with a learning mechanism that can also be used at inference time. In a recent interview John Schulman (one of the OpenAI co-founders) indicated that they hoped that RL training on reasoning would improve it, but that is still going to be architecturally limited. You can learn a repertoire of reasoning templates that can be applied in gestalt fashion, but that's not the same as being able to synthesize a solution to a novel problem on the fly.

LLMs are certainly amazing, and as you say 10 years ago we would have regarded them as AI, but of course the same was true of expert systems and other techniques - we call things we don't know how to do "AI", then relabel them once we move past them to new challenges. Just as we no longer regard expert systems as AI, I doubt in 20 years we'll regard LLMs (which in some regards are also very close to expert systems) as AI, certainly not AGI. AGI will be the technology that can replace humans in many jobs, and when we get there LLMs will in hindsight look very limited.


Humans can do infinitely many things because we have general intelligence.

Testing whether an AI can play chess or solve Chollet's ARC problems, or some other set of narrow skills, doesn't prove generality. If you want to test for generality, then you either have to:

1) Have a huge and very broad test suite, covering as many diverse human-level skills as possible.

and/or,

2) Reductively understand what human intelligence is, and what combination of capabilities it provides, then test for all of those capabilities both individually and in combination.

As Chollet notes, a crucial part of any AGI test is solving novel problems that are not just templated versions (or shallow combinations) of things the wanna-be AGI has been trained on, so for both of the above tests this is key.


I suspect trying to reductively understand intelligence is a bit like trying to reductively understand biology - every level of abstraction is causally influenced by every other level of abstraction, so there just aren't simple primitives you can break everything down into.


You can get pretty far with an understanding of biochemical reaction cycles, genetic theory, and protein molecular interactions.


Trying to express the high-level behaviour of an organism in those reductive terms is way beyond science right now, if it's even possible at all. Like have a look at a chart of human metabolic pathways - it's absolute insanity. And those are extremely simplified already!


"A implies B" doesn't mean that B implies A. That's a basic logical fallacy.

AGI can add 1+1 correctly, but an ability to do that is not a test for AGI.


This is not what I'm saying. Consider the following statement:

"Absence of evidence is evidence of absence."

Presumably you would call this a simple logical fallacy for the same reason, but a little reflection would show that in many cases such a statement is true! It depends on context, in this case your estimate of how well your search covered the possible search space.

Evidence is a continuous variable - things can be weak evidence, strong evidence... There's a whole spectrum. I just take issue with statements like "X is zero evidence of Y" because often you can do a lot better than that with the information at hand.
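The "absence of evidence" point can be made concrete with one more Bayes update. Again the numbers are illustrative only: the claim is just that the posterior depends on how much of the search space you've covered.

```python
def p_exists_after_search(prior, coverage):
    """P(thing exists | not found), after fruitlessly searching
    a fraction `coverage` of the space where it could be."""
    p_not_found = (1 - coverage) * prior + (1 - prior)
    return (1 - coverage) * prior / p_not_found

# No search yet: absence of evidence tells you nothing.
p_exists_after_search(0.5, 0.0)   # unchanged from the prior

# Thorough search: the same absence is now real evidence of absence.
p_exists_after_search(0.5, 0.9)   # posterior drops well below the prior
```

With zero coverage the posterior equals the prior; with 90% coverage it falls to roughly 9%. Same observation, different evidential weight, which is the "it depends on context" point above.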


We know that computers are capable of things that humans can't - anything related to brute force computation, search and memory for example.

So, just because a human can't do something, or struggles to do it, doesn't mean that the task requires a huge IQ or generality - it may just require a lot of compute/memory, such as DeepBlue playing chess.

In the case in point of these ARC puzzles, they are easy for a human, so "absence of evidence" doesn't even apply. It's also worth noting that one could brute-force solve them by trying all applicable solution techniques (as indicated by the examples and challenge description) in combinatorial fashion, or just (as Chollet notes) generate a massive training set and train an LLM on it, solving them via recall rather than active inference, which again proves nothing about AGI.
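The combinatorial brute-force idea mentioned here can be sketched as follows. The primitives and the toy puzzle are invented for illustration; the real ARC transformations are far richer:

```python
from itertools import product

# A tiny, hypothetical library of grid transforms (stand-ins for the
# "applicable solution techniques" one might enumerate).
def rot90(g): return [list(r) for r in zip(*g[::-1])]
def flip(g):  return [row[::-1] for row in g]
def ident(g): return g

PRIMITIVES = [ident, rot90, flip]

def solve(train_pairs, max_depth=3):
    """Search compositions of primitives consistent with every training pair."""
    for depth in range(1, max_depth + 1):
        for combo in product(PRIMITIVES, repeat=depth):
            def apply_all(g, combo=combo):
                for f in combo:
                    g = f(g)
                return g
            if all(apply_all(i) == o for i, o in train_pairs):
                return apply_all
    return None

# Toy "puzzle": the output is the input rotated 180 degrees.
pairs = [([[1, 2], [3, 4]], [[4, 3], [2, 1]])]
rule = solve(pairs)   # finds rot90 composed with itself
```

No understanding is involved: it's exhaustive search over a fixed menu of tricks, which is exactly why solving the puzzles this way would prove nothing about general intelligence.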

The point of the ARC challenge is to encourage advances in active inference (i.e. reasoning/problem solving), which is what LLMs lack. It's HOW you solve them that matters if you want to show general intelligence. Even in the realm of static inference, which is what they are built for, LLMs are really closer to DeepBlue than something intelligent - they brute force extract the training set rules using gradient descent. The interesting thing is that they have any learning ability at all (in-context learning) at inference time, but it's clearly no match for a human and they are also architecturally missing all the machinery such as working memory and looping/iteration to perform any meaningful try/fail/backtrack/try-again (while learning the whole time) active inference.

It'll be interesting to see to what extent pre-trained transformers can be combined with other components (maybe some sort of DeepBlue/AlphaGo MCTS?) to get closer towards human-level problem solving ability, but IMO it's really the wrong architecture. We need to stop using gradient descent and find a learning algorithm that can be used at inference time too.


I disagree that gradient descent brute force extracts the training set. That "overfitting" kind of thing has been shown to be false many times. Transformers learn predictive models of their input, beyond what their training set contains.

But in general I agree about active inference. Clearly there is something missing there.

Doing AlphaGo-style MCTS would be interesting, but how would you approach training the policy and value nets? It's not like we can take snapshots of people's thought processes as they read text in the same way you can perform arbitrary rollouts of your game engine.


Yes, a narrow domain, but the core capability it is testing for (explorative combination/application of learned patterns and skills) is a general one that in a meaningful AGI would be available across domains.



