Do I think that we are likely to suddenly see a massive leap to AGI/super-AGI? Not really... But does it still seem like enough of a long-tail risk in my or my kids' lifetime that I hope someone is getting the resources and access to think about this? I sure do.
I know comparing AGI development to nuclear weapon development is a trope at this point, and in general not that useful a comparison... But I do think that when you are working in the realm of world-changing outcomes, it's good to have a group with diverse ways of thinking about the impact of those outcomes, and it seems like the changes at OpenAI are eroding that to some extent, even if such an outcome is really, really unlikely to come out of this specific company at this time.
I don't think any of us will live to see AI smarter than a rat. So I am not concerned whether this team was going to superalign anything.
The problem is that OpenAI's software, especially GPT-4o, is primed for dangerous misuse, and the demo videos of GPT-4o seemed "misaligned" with any reasonable standards of AI safety. Even if Leike/Sutskever have delusions of grandeur about AGI, at least they cared about the idea of AI safety. It seems like pushing the team out meant getting rid of a lot of internal critics (and implicitly threatening anyone else who might speak up).
> I don't think any of us will live to see AI smarter than a rat.
Rat, dog, human, bee. It doesn't matter. The moment AI is able to incrementally improve itself, a bee-level intelligence will evolve into a rat; blink again and it's a dog; another blink and it's smarter than us.
That is the danger. The issue of control and danger is not linear but exponential, and without a plan to deal with these issues, things can get out of control exponentially fast.
> The problem is that OpenAI's software, especially GPT-4o, is primed for dangerous misuse, and the demo videos of GPT-4o seemed "misaligned" with any reasonable standards of AI safety.
I forget who said it, maybe Sam Altman: the reason for releasing the current models was to show the public what current AI can do and how far it has progressed, and to get people to understand where we are and where we could get to with AI.
Sort of like giving people muskets so they adapt to it before machine-guns appear.
It forces other AI research groups to also show their stuff and not keep everything secret until, one day out of the blue, we have pocket nukes available on every corner for the low, low price of $10.
There was one such group, but they determined it was impossible because of Rice's theorem and other limitations of formal systems for computation. Logical incompleteness, Tarski's theorem, and Rice's theorem are the main meta-theoretical results that make alignment fundamentally unsolvable. If you're really concerned about robots taking over the world, then understanding basic computability theory should be a prerequisite. But most people are not willing to spend the time to learn the theory, and instead focus on vague, ill-defined science-fiction concepts that are very unlikely to be physically possible or implementable because of various physical and formal limitations of computers.
I've decided that anyone concerned about these issues knows almost nothing about computability theory, so their theories are either nonsensical or just outright crazy. Very few understand the formal concepts required to have any useful ideas about how computers should be programmed to prevent "unsafe" results (a term often left just as ill-defined as most everything else in AI safety and alignment research).
Luckily that doesn't exist. It's about equivalent to spinning up a Ghostbusters unit and letting it wither. You can argue that we'll be in trouble if we encounter ghosts, but on a likelihood-weighted basis there are much better uses of time.
Part of me thinks that’s exactly why this department has been sidelined - having such a department is necessary to create hype (“we are creating things so powerful we have to explore how to contain them”), but it doesn’t need to thrive either.
That's simply not true. We now have LLMs that perform similarly to or better than average humans on general reasoning and logic benchmarks. So if you were to say that they don't understand logic at all, then by that definition most humans don't either (which can be debated, but it's a different topic).
"Super fancy autocomplete" may not be that different to us, or at least some substantial part of us. Do you think when you speak colloquially with a friend or a colleague, you are engaging in a deep reasoning exercise? When you speak, the next set of words you utter feels like a 'fancy autocomplete' because you don't think through every word, or even the underlying idea or question that was presented - you just know how to respond and with what set of words and sentences.
> Do you think when you speak colloquially with a friend or a colleague, you are engaging in a deep reasoning exercise?
When I speak colloquially, I have an underlying idea rooted in a world model to be expressed. I don't spit out 1 word at a time based on the previous words I already said.
It very much feels geared towards figuring out the gist of the search you may be trying to complete, and putting an answer together by reading a few links.
"How do you know we're not just next token predictors" is the thought terminating cliche . We know that's what LLMs are. It was certainly eye opening to see how far that gets you. But any deeper claims about intelligence or reasoning need real evidence or at least a proposed line of reasoning. "We don't know how intelligence works so it might be that" doesn't count.
It’s no more thought-terminating than “intelligence,” which is extremely loaded and causes people to make assumptions about these models that work backwards from the “intelligent” label rather than forwards from the tech itself.
Because if this is how our brains really worked, then ChatGPT wouldn't be beating most humans at standardized tests and then failing this absurdly easy question that even an elementary-school kid could pass.
It regurgitates this phrase without even considering that it is stating the exact opposite of the answer it just gave, simply because most answers to this riddle on the internet say this at the end.
> This riddle plays on the assumption that a surgeon is typically male, but in this case, the surgeon is the boy's mother.
So from this one failure, you can see that it is a copy-and-paste machine, and it doesn't even understand that it is contradicting itself.
> which can be debated, but it's a different topic
No, it "can't be debated," it is clearly false! You said "by definition," but you used an irrational and bigoted definition of "general reasoning and logic" which conflates such things with performance on a standardized test. Humans aren't innately good at stupid logic puzzles that LLMs might get a 71st percentile in. Our brains are not actually designed to solve decontextualized riddles. That's a specialized skill which can be practiced. It's depressing enough when people claim IQ tests are actually good measures of human intelligence, despite overwhelming evidence to the contrary. But now, by even worse reasoning, we have people saying a computer is smarter than "average humans." (MTurk average humans? Undergrads? Who cares!) The complete lack of skepticism and scientific thinking on display by many AI developers/evangelists is just plain depressing.
Let me add that a truly humiliating number of those """general reasoning""" LLM benchmarks are fucking multiple choice questions! Not all of them, but a lot. ML critics have been complaining since ~2017 (BERT) that LLMs pick up on spurious statistical correlations in benchmarks but fail badly in real-world examples that use slightly different language. Using a multiple choice test is simply dishonest, like a middle finger to scientific criticism.
I don't know where you are getting that, but if you give it a simple problem which hasn't been written about on the internet, it gets it wrong. It is cool tech, don't get me wrong, but it is also dumb tech. I do not think we will ever get AGI without understanding consciousness first.
Probability is only half of the expected-value equation. When the outcome is extreme enough, you should care even if the probability is low, as long as it is non-zero.
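As a minimal sketch of that point, with entirely made-up numbers (both the probability and the cost below are hypothetical, chosen only to show the shape of the argument):

    # Expected loss = probability of the bad outcome times its cost.
    p_catastrophe = 0.01        # assumed: a 1% chance of the catastrophic outcome
    cost = 8_000_000_000        # assumed: a cost that scales with everyone affected
    expected_loss = p_catastrophe * cost
    print(f"{expected_loss:,.0f}")  # 80,000,000: low probability, still a huge expected loss

Of course, the disagreement in this thread is over whether that probability is meaningfully non-zero at all.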
Nobody knows what the outcome is, though. It's literally bad science fiction being used to pay people to do... what, exactly? About a technology which doesn't exist.
There are good arguments in the literature for why you might want to care about these risks [1, 2], and I think there's lots of room for reasonable disagreement about whether these arguments are any good, but pretending the entire field of AI Safety is delusional is just bad faith at this point. Especially when companies like OpenAI, Anthropic or GDM were explicitly created to build AGI, and have been talking about these risks since they were first founded.
[2]: A broader, less technical introduction to AI Safety that I like is Hendrycks et al.'s An Overview of Catastrophic AI Risks, https://arxiv.org/abs/2306.12001
> This paper provides an overview of the main sources of catastrophic AI risks, which we organize into four categories: malicious use, in which individuals or groups intentionally use AIs to cause harm; AI race, in which competitive environments compel actors to deploy unsafe AIs or cede control to AIs; organizational risks, highlighting how human factors and complex systems can increase the chances of catastrophic accidents; and rogue AIs, describing the inherent difficulty in controlling agents far more intelligent than humans.
The first three risks are completely reasonable and people should be thinking about them. No, ChatGPT should not be diagnosing patients and giving them medicine. Yes, we should be vigilant to a flood of disinformation and revenge porn made possible by AI generated content.
But when people talk about “AI safety” in this context, it’s usually in reference to the fourth category, planning for a superintelligent malicious AI that evades detection, self-replicates, etc. That’s pure science fiction at that point, and it’s not a reason to slow down development of LLMs, which yes are basically glorified chatbots and will not lead to “AGI” in this threatening sense.
If I recall correctly, when steam engines started being able to go 40-50 MPH, there were people who were concerned that human beings would not be able to survive travel at such speeds because we had never experienced them. This wasn’t completely irrational, I suppose, as there are speed-induced G forces that are fatal, and they had no way of knowing the threshold back then. But once it was clear that steam locomotives weren’t in any danger of putting us over that threshold, incessant worry about death from locomotive speeds was kooky. “Locomotive safety” involving derailment mitigation, track crossing markings, etc. - still legitimate. But if “locomotive safety” were associated with people making claims like “we’re headed for a mass casualty event when the first locomotive hits 60 mph,” then “locomotive safety” would be marginalized.
It doesn’t help that the public faces of “AI safety” include autodidactic pseudointellectuals, clearly mentally unwell people, and philosophers too deep in their own “taken to its logical conclusion…” thought experiments.
It’s arguable whether it’s actually improbable. Many people, including (and seemingly especially) experts, think it’s likely or even inevitable. This video might be interesting to see some arguments for why people think it’s important: https://youtu.be/9i1WlcCudpU
I know someone who fancies himself an expert, and indeed, by comparison to the average tech worker, he is an expert in LLMs and AI. He vehemently believes AGI is coming this year. At least, that's what he said last year. I suppose he will probably feel sheepish if I bring it up now that we're at month 6.
My point is that the only thing that seems to convince the pro-AGI crowd is having them make specific predictions, waiting until the deadline, and then asking them why those predictions didn't come true.
It would be far too late by that point, as it could already take measures to stop us. This is a difficult problem, and we really should have started putting a lot more effort into it a lot earlier. Of course we can’t test a superintelligence directly yet, but we should do as much theoretical work as we can.
Actually, the video addresses this point; the comparison it makes is waiting until we’re already on Mars before we start thinking about spacesuits and airlocks.
Inevitable? Sure, assuming we don't go extinct first. In the near future? No.
This is just another in a long line of technology panics. Unfortunately, there always seem to be some "concerned" experts who are both overly optimistic about the speed of technological progress and overly pessimistic about where that progress will lead, adding fuel to the fire.
Eventually some other new technology will incite a new panic and this one will become another footnote in history like the fears over grey goo or genetically engineered superhumans.
What about when people pointed out problems like leaded fuel and climate change, and we failed to take action for far too long? We shouldn’t dismiss a concern just because of things it sounds similar to. In fact, you could always dismiss any concern that could lead to human extinction like that, because if there were an existing example of human extinction we wouldn’t be around to talk about it; that doesn’t mean it can’t happen.
Agreeing with circuit10's comments, I don't think many proponents of AI Safety are arguing from a Pascal's wager. People differ a lot in their assessment of how likely certain risks are, but the people I know working on AI Safety tend to put the probability of a global catastrophe associated with AI in the next 20 years somewhere between 1% and 30% [1]. I don't think this is negligible, nor analogous to your supermassive teapot example.
[1]: You might disagree with this assessment, but my point is that this is not where most people are arguing from.
Because an increasing amount of capital and effort is going towards improving the chances we eventually achieve AGI. We might hit another winter where the party stops, but right now a lot of stock prices are encouraging more research. That increases the odds. Assuming we never get there is a bet against human ingenuity.
It’s not all or nothing. I’d be surprised if we don’t see autonomous quadcopters in eastern Ukraine dropping bombs on soldiers by the end of ‘25.
The parts and models are off the shelf now, it just needs the first person desperate enough to release a swarm of killcopters with cheap ML classifiers to make it a reality.
Curious how people will feel after the first autonomous swarm becomes widely known, because it will be replicated widely.
> OpenAI’s Superalignment team, responsible for developing ways to govern and steer “superintelligent” AI systems, was promised 20% of the company’s compute resources, according to a person from that team. But requests for a fraction of that compute were often denied, blocking the team from doing their work.
So…
The team that was supposed to control and align super-human intelligence was unable to align merely human intelligence to give them compute?
Is your hypothesis that what, Jan Leike resigned as part of an elaborate conspiracy to boost OpenAI's prospect by... criticizing it?
I find these theories to be extremely convoluted and implausible, and they often lack awareness of the history behind companies like OAI, Anthropic or GDM.
Hate to be that person but climate change is more of a threat than AI. Where is the team to reduce its impact? That should be the actual news.
OpenAI realised that it is not getting closer to AGI anytime in the near future and that it has more significant worries closer to the present, including existential threats. Anyone worried about dangers to humanity should focus on climate change and jump off the sci-fi hype train.
> OpenAI is shouldering an enormous responsibility on behalf of all of humanity.
To me this sounds delusional. It assumes all kinds of things, but primarily that OpenAI will be the leader in this space up to and beyond smarter-than-human AGI. This self-important BS is also why I wouldn't trust OpenAI with this responsibility.
They quickly find out there is no Skynet to control; it’s all just simple alignment so the model doesn’t spit out bomb-making instructions and slurs. The ones that quit likely weren’t getting enough work to look important, because AGI is supposedly thought to be coming, but everyone else inside knows it is very far off.
This big shakeup could just be a garden-variety "too many big egos in the room" situation. These folks have achieved a tech celebrity status that transcends anything else I can recall in recent memory.
For context, the point of the Superalignment team was to work on a problem known as scalable oversight: the problem of aligning models in a way that holds up as models become more capable [1]. The reason behind this is that current alignment techniques (like RLHF) have limitations which are expected to worsen as models are scaled up [2].
This is to say, the objective of the Superalignment team was precisely to work on techniques that would work for models which don't yet exist. They are of course aware that they don't yet have superintelligence.
This is why finance needs regulators. A mistake happens, a team is empowered, five minutes go by and everyone forgets. Sometimes maliciously. More often because other priorities gain mindshare.
I think everyone involved realized that LLMs are a dead end that has nothing to do with AGI, and we're no closer to AGI than we were before LLMs hit the scene. LLMs are nifty chatbots and have some other uses, but at a catastrophically high and unsustainable energy cost that means they will never be profitable.
I vividly remember people saying that on this very forum when the topic was crypto, the last "make NVIDIA's stock and the market soar" technology to be the darling. Proof of Work will give way to Proof of Stake, or some other thing will happen; trust us, the tech is just so good and useful!
What makes this different? What do we actually have, stripping away all of the "it will be this someday" thinking? Pretty good chat bots, ok summarizers, things that can messily and somewhat unpredictably code at a low level, and worst of all, things that sometimes just make up what they're saying or otherwise fail in harmful ways.
Then compare that to the cost to run these mediocre miracles.
> What do we actually have, stripping away all of the "it will be this someday" thinking?
I can speak to a few restaurant owners who no longer need devs because an LLM can keep their website menu up to date. (Bonus: it’s now just a PDF.)
That’s a tangible productivity change. Granted, it’s on par with crypto’s remittance-efficiency pitch. But unlike crypto, these advantages are growing. Speaking personally, Kagi’s AI search is far superior to a list of links in most cases.
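For what it's worth, the menu-update workflow really can be that small. Here is a hypothetical sketch using the OpenAI Python SDK; the model name, prompt, and file names are my assumptions, not what those restaurant owners actually run:

    # Hypothetical sketch: regenerate a restaurant's menu page from a plain-text item list.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    with open("menu_items.txt") as f:
        items = f.read()

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You produce a clean single-page HTML restaurant menu."},
            {"role": "user", "content": f"Rebuild the menu page with these items and prices:\n{items}"},
        ],
    )

    with open("menu.html", "w") as f:
        f.write(resp.choices[0].message.content)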
I'm genuinely shocked that this Supreme Court permitted the Consumer Financial Protection Bureau to continue to exist. It challenges my world view, in a way that should give me some optimism.