
A well-defined goal is the big one. We wanted a big bomb.

What does AGI do? AGI is up against a philosophical barrier, not a technical one. We'll continue improving AI's ability to automate and assist human decisions, but how does it become something more? Something more "general"?




"General" is every activity a human can do or learn to do. It was coined along with "narrow" to contrast with the then decidedly non-general AI systems. This was generally conceived of as a strict binary - every AI we've made is narrow, whereas humans are general, able to do a wide variety of tasks and do things like transfer learning, and the thinking was that we were missing some grand learning algorithm that would create a protointelligence which would be "general at birth" like a human baby, able to learn anything & everything in theory. An example of an AI system that is considered narrow is a calculator, or a chess engine - these are already superhuman in intelligence, in that they can perform their tasks better than any human ever possibly could, but a calculator or a chess engine is so narrow that it seems absurd to think of asking a calculator for an example of a healthy meal plan, or asking a chess engine to make sense of an expense report, or asking anything to write a memoir. Even in more modern times, with AlexNet we had a very impressive image recognition AI system, but it couldn't calculate large numbers or win a game of chess or write poetry - it was impressive, but still narrow.

With transformers, demonstrated first by LLMs, I think we've shown that the narrow-general divide as a strict binary is the wrong way to think about AI. Instead, LLMs are obviously more general than any previous AI system, in that they can do math or play chess or write a poem, all using the same system. They aren't as good as our existing superhuman computer systems at these tasks (aside from language processing, which they are SOTA at), not even as good as humans, but they're obviously much better than chance. With training to use tools (like calculators and chess engines) you can easily make an AI system with an LLM component that's superhuman in those fields.

But there are still things that LLMs cannot do as well as humans, even when using tools, so they are not fully general. One example is making tools for themselves to use - they can do a lot of parts of that work, but I haven't seen an example yet of an LLM actually making a tool for itself that it can then use to solve a problem it otherwise couldn't. This is a subproblem of the larger "LLMs don't have long-term memory and long-term planning abilities" problem: you can ask an LLM to use Python to make a little tool for itself to do one specific task, but it's not yet capable of adding that tool to its general toolset to enhance its capabilities going forward (a rough sketch of what that would even involve is below). It can't write a memoir, or a book that people want to read, because LLMs suck at planning and at refining drafts, and they have limited creativity because they're typically a blank slate in terms of explicit memory before they're asked to write. They have a gargantuan store of implicitly remembered things from training, which is where what creativity they do have comes from, but they don't yet have a way to accrue and benefit from experience.
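
To make that concrete, here's a rough sketch of the missing piece: somewhere for model-written helpers to live outside the context window so a later session can reuse them. Everything below is hypothetical and purely illustrative - the toolbox layout and the ask_llm_for_code stand-in aren't any real API, just the shape of the capability I mean.

    # Sketch only: what "accruing tools across sessions" might look like mechanically.
    import json, pathlib

    TOOLBOX = pathlib.Path("toolbox")   # persists between runs, unlike the model's context

    def ask_llm_for_code(task: str) -> str:
        # Stand-in for a real model call; returns canned source so the sketch runs.
        return "def count_words(line):\n    return len(line.split())\n"

    def save_tool(name: str, source: str, description: str) -> None:
        # Store the generated helper and record it in a small index the model could be shown later.
        TOOLBOX.mkdir(exist_ok=True)
        (TOOLBOX / f"{name}.py").write_text(source)
        index = TOOLBOX / "index.json"
        entries = json.loads(index.read_text()) if index.exists() else {}
        entries[name] = description
        index.write_text(json.dumps(entries, indent=2))

    def load_tool(name: str):
        # Convention for this sketch: each file defines a function with the same name.
        namespace = {}
        exec((TOOLBOX / f"{name}.py").read_text(), namespace)
        return namespace[name]

    # "Session 1": the model writes itself a helper once and it gets stored.
    save_tool("count_words", ask_llm_for_code("write count_words(line)"), "counts words in a line")

    # "Session 2", later: a different conversation can rediscover and reuse it.
    count_words = load_tool("count_words")
    print(count_words("the quick brown fox"))

The particular mechanism doesn't matter; the point is just that this kind of accumulation has to live somewhere outside the model's weights and context, and current LLMs don't manage it on their own.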

A thought exercise I think is helpful for understanding what the "AGI" benchmark should mean is: can this AI system be a drop-in substitute for a remote worker? As in, any labour that can be accomplished by a remote worker can be performed by it, including learning on the job to do different or new tasks, and including "designing and building AI systems". Such a system would be extremely economically valuable, and I think it should meet the bar of "AGI".


> LLMs are obviously more general than any previous AI system, in that they can do math or play chess or write a poem, all using the same system

But they can't, they still fail at arithmetic and still fail at counting syllables.

I think that LLMs are really impressive but they are the perfect example of a narrow intelligence.

I think they don't blur the lines between narrow and general, they just show a different dimension of narrowness.


>But they can't, they still fail at arithmetic and still fail at counting syllables.

You are incorrect. These services are free; you can go and try them for yourself. LLMs are perfectly capable of simple arithmetic, better than many humans and worse than some. They can also play chess and write poetry, and I made no claims about "counting syllables", but they seem perfectly capable of doing that too. See for yourself - this was my first attempt, no cherry-picking: https://chatgpt.com/share/ea1ee11e-9926-4139-89f9-6496e3bdee...

I asked it a multiplication question and it used a calculator to complete the task correctly; I asked it to play chess and it did well; I asked it to write me a poem about it and it did that well too. It did everything I said it could, which is significantly more than a narrow AI system like a calculator, a chess engine, or an image recognition algorithm could do. The point is that it can do reasonably well at a broad range of tasks, even if it isn't superhuman (or even average-human) at any given one of them.

>I think that LLMs are really impressive but they are the perfect example of a narrow intelligence.

This doesn't make any sense at all. You think an AI artifact that can write poetry, write code, play chess, control a robot, recommend a clutch to go with your dress, compute sums, etc. is "the perfect example of a narrow intelligence" while a chess engine like Stockfish or an average calculator exists? There are AI models that specifically and only recognise faces, but the LLM multitool is "the perfect example of a narrow intelligence"? Come on.

>I think they don't blur the lines between narrow and general, they just show a different dimension of narrowness.

You haven't provided an example of what "dimension of narrowness" LLMs show. I don't think you can reasonably describe an LLM as narrow without redefining the word - just because something is not fully general doesn't mean that it's narrow.


For arithmetic, this is ChatGPT:

    how much is (0.2 + 0.1) * 10?
    The result of (0.2+0.1)×10 is approximately 3, with a slight rounding difference leading to 3.0000000000000004.   
My 10yo does not make this error; ChatGPT does, because it does not understand math but knows how to use Python.
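
That 3.0000000000000004 is just IEEE 754 floating point relayed verbatim, which is rather the point: it handed the arithmetic to Python and didn't notice the rounding artifact. Plain Python reproduces it, and exact decimal arithmetic avoids it:

    >>> 0.2 + 0.1
    0.30000000000000004
    >>> (0.2 + 0.1) * 10
    3.0000000000000004
    >>> from decimal import Decimal
    >>> (Decimal("0.2") + Decimal("0.1")) * 10
    Decimal('3.0')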

For poetry: counting syllables is a significant part of most poetry forms, so if you can't count syllables, you can't do poetry.

Let's say you want a 5-7-5 haiku; this is ChatGPT:

    write a 5-7-5 haiku about windstorms

    Fierce winds howl and spin,
    Branches bend, clouds race the sky,
    Storm leaves quiet calm. 
This is not a 5-7-5 haiku.

LLMs are not general, but they show that a specific specialization ("guess next token") can solve a lot more problems than we thought it could.


This argument generalises to all possible AI systems and thus proves way too much.

>[AI system]s are not general, but they show that a specific specialization ("[process sequential computational operations]") can solve a lot more problems than we thought it could.

Or if you really want:

>Humans are not general, but they show that a specific specialization ("neuron fires when enough connected neurons fire into it") can solve a lot more problems than we thought it could.

This is just sophistry - the method by which some entity achieves things doesn't matter; what matters is whether or not it achieves them. If it can achieve multiple tasks across multiple domains, it's more general than a single-domain model.



