Do I think that we are likely to suddenly see a massive leap to AGI/super-AGI? Not really... But does it still seem like enough of a long-tail risk in my or my kids' lifetime that I hope someone is getting the resources and access to think about this? I sure do.
I know comparing AGI development to nuclear weapon development is a trope at this point, and in general not that useful a comparison... But I do think that when you are working in the realm of world-changing outcomes, it's good to have a group with diverse ways of thinking about the impact of those outcomes, and it seems like the changes at OpenAI are eroding that to some extent, even if such an outcome is really, really unlikely to come out of this specific company at this time.
I don't think any of us will live to see AI smarter than a rat. So I am not concerned whether this team was going to superalign anything.
The problem is that OpenAI's software, especially GPT-4o, is primed for dangerous misuse, and the demo videos of GPT-4o seemed "misaligned" with any reasonable standards of AI safety. Even if Leike/Sutskever have delusions of grandeur about AGI, at least they cared about the idea of AI safety. It seems like pushing the team out meant getting rid of a lot of internal critics (and implicitly threatening anyone else who might speak up).
> I don't think any of us will live to see AI smarter than a rat.
Rat, dog, human, bee. It doesn't matter. The moment AI is able to incrementally improve itself, a bee-level intelligence will evolve into a rat; blink again and it's a dog; another blink and it's smarter than us.
That is the danger. The issue of control and danger is not linear but exponential, and without a plan to deal with these issues, things can get out of control exponentially fast.
> The problem is that OpenAI's software, especially GPT-4o, is primed for dangerous misuse, and the demo videos of GPT-4o seemed "misaligned" with any reasonable standards of AI safety.
I forget who said it, maybe Sam Altman: the reason for releasing the current models was to show the public what current AI can do and how far it has progressed, and to get people to understand where we are and where we could get to with AI.
Sort of like giving people muskets so they adapt to it before machine-guns appear.
It forces other AI research groups to also show their stuff and not keep everything secret until, one day out of the blue, we have pocket nukes available on every corner for the low, low price of $10.
There was one such group, but they determined it was impossible because of Rice's theorem and other limitations of formal systems for computation. Logical incompleteness, Tarski's theorem, and Rice's theorem are the main meta-theoretical results that make alignment fundamentally unsolvable. If you're really concerned about robots taking over the world, then understanding basic computability theory should be a prerequisite. But most people are not willing to spend the time to learn the theory, and instead focus on vague, ill-defined science-fiction concepts that are very unlikely to be physically possible or implementable because of various physical and formal limitations of computers.
I've decided that anyone concerned about these issues knows almost nothing about computability theory, so their theories are either nonsensical or just outright crazy. Very few understand the formal concepts required to have any useful ideas about how computers should be programmed to prevent "unsafe" results (a term often left just as ill-defined as most everything else in AI safety and alignment research).
Luckily that doesn't exist. It's about equivalent to spinning up a Ghostbusters unit and letting it wither. You can argue that we'll be in trouble if we encounter ghosts, but on a likelihood-weighted basis there are much better uses of time.
Part of me thinks that’s exactly why this department has been sidelined - having such a department is necessary to create hype (“we are creating things so powerful we have to explore how to contain them”), but it doesn’t need to thrive either.
That's simply not true. We now have LLMs that perform similarly to or better than average humans on general reasoning and logic benchmarks. So if you were to say that they don't understand logic at all, then by that definition most humans don't either (which can be debated, but it's a different topic).
"Super fancy autocomplete" may not be that different to us, or at least some substantial part of us. Do you think when you speak colloquially with a friend or a colleague, you are engaging in a deep reasoning exercise? When you speak, the next set of words you utter feels like a 'fancy autocomplete' because you don't think through every word, or even the underlying idea or question that was presented - you just know how to respond and with what set of words and sentences.
> Do you think when you speak colloquially with a friend or a colleague, you are engaging in a deep reasoning exercise?
When I speak colloquially, I have an underlying idea rooted in a world model to be expressed. I don't spit out 1 word at a time based on the previous words I already said.
It very much feels geared towards figuring out the gist of the search you may be trying to complete, and putting an answer together by reading a few links.
"How do you know we're not just next token predictors" is the thought terminating cliche . We know that's what LLMs are. It was certainly eye opening to see how far that gets you. But any deeper claims about intelligence or reasoning need real evidence or at least a proposed line of reasoning. "We don't know how intelligence works so it might be that" doesn't count.
It’s no more thought-terminating than “intelligence,” which is extremely loaded and causes people to make assumptions about these models that work backwards from the “intelligent” label rather than forwards from the tech itself.
Because if this is how our brains really worked, then ChatGPT wouldn't be beating most humans at standardized tests and then failing this absurdly easy question that even an elementary-school kid could pass.
It regurgitates this phrase without even considering that it is stating the exact opposite of the answer it just gave, simply because most answers to this riddle on the internet say this at the end.
> This riddle plays on the assumption that a surgeon is typically male, but in this case, the surgeon is the boy's mother.
So from this one failure, you can see that it is a copy-and-paste machine, and it doesn't even understand that it is contradicting itself.
> which can be debated, but it's a different topic
No, it "can't be debated," it is clearly false! You said "by definition," but you used an irrational and bigoted definition of "general reasoning and logic" which conflates such things with performance on a standardized test. Humans aren't innately good at stupid logic puzzles that LLMs might get a 71st percentile in. Our brains are not actually designed to solve decontextualized riddles. That's a specialized skill which can be practiced. It's depressing enough when people claim IQ tests are actually good measures of human intelligence, despite overwhelming evidence to the contrary. But now, by even worse reasoning, we have people saying a computer is smarter than "average humans." (MTurk average humans? Undergrads? Who cares!) The complete lack of skepticism and scientific thinking on display by many AI developers/evangelists is just plain depressing.
Let me add that a truly humiliating number of those """general reasoning""" LLM benchmarks are fucking multiple choice questions! Not all of them, but a lot. ML critics have been complaining since ~2017 (BERT) that LLMs pick up on spurious statistical correlations in benchmarks but fail badly in real-world examples that use slightly different language. Using a multiple choice test is simply dishonest, like a middle finger to scientific criticism.
I don't know where you are getting that, but if you give it a simple problem which hasn't been written about on the internet, it gets it wrong. It is cool tech, don't get me wrong, but it is also dumb tech. I do not think we will ever get AGI without understanding consciousness first.
Probability is only half of the expected-value equation. When the outcome is extreme enough, you should care even if the probability is low, as long as it is non-zero.
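As a minimal sketch of that point, with entirely made-up numbers (both the probability and the cost below are hypothetical, chosen only to show the shape of the argument):

    # Expected loss = probability of the bad outcome times its cost.
    p_catastrophe = 0.01        # assumed: a 1% chance of the catastrophic outcome
    cost = 8_000_000_000        # assumed: a cost that scales with everyone affected
    expected_loss = p_catastrophe * cost
    print(f"{expected_loss:,.0f}")  # 80,000,000: low probability, still a huge expected loss

Of course, the disagreement in this thread is over whether that probability is meaningfully non-zero at all.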
Nobody knows what the outcome is, though. It's literally bad science fiction being used to pay people to do... what, exactly? About a technology which doesn't exist.
There are good arguments in the literature for why you might want to care about these risks [1, 2], and I think there's lots of room for reasonable disagreement about whether these arguments are any good, but pretending the entire field of AI Safety is delusional is just bad faith at this point. Especially when companies like OpenAI, Anthropic or GDM were explicitly created to build AGI, and have been talking about these risks since they were first founded.
[2]: A broader, less technical introduction to AI Safety that I like is Hendrycks et al.'s An Overview of Catastrophic AI Risks, https://arxiv.org/abs/2306.12001
> This paper provides an overview of the main sources of catastrophic AI risks, which we organize into four categories: malicious use, in which individuals or groups intentionally use AIs to cause harm; AI race, in which competitive environments compel actors to deploy unsafe AIs or cede control to AIs; organizational risks, highlighting how human factors and complex systems can increase the chances of catastrophic accidents; and rogue AIs, describing the inherent difficulty in controlling agents far more intelligent than humans.
The first three risks are completely reasonable and people should be thinking about them. No, ChatGPT should not be diagnosing patients and giving them medicine. Yes, we should be vigilant to a flood of disinformation and revenge porn made possible by AI generated content.
But when people talk about “AI safety” in this context, it’s usually in reference to the fourth category, planning for a superintelligent malicious AI that evades detection, self-replicates, etc. That’s pure science fiction at that point, and it’s not a reason to slow down development of LLMs, which yes are basically glorified chatbots and will not lead to “AGI” in this threatening sense.
If I recall correctly, when steam engines started being able to go 40-50 MPH, there were people who were concerned that human beings would not be able to survive travel at such speeds because we had never experienced them. This wasn’t completely irrational, I suppose, as there are speed-induced G forces that are fatal, and they had no way of knowing the threshold back then. But once it was clear that steam locomotives weren’t in any danger of putting us over that threshold, incessant worry about death from locomotive speeds was kooky. “Locomotive safety” involving derailment mitigation, track crossing markings, etc. - still legitimate. But if “locomotive safety” were associated with people making claims like “we’re headed for a mass casualty event when the first locomotive hits 60 mph,” then “locomotive safety” would be marginalized.
It doesn’t help that the public faces of “AI safety” include autodidactic pseudointellectuals, clearly mentally unwell people, and philosophers too deep in their own “taken to its logical conclusion…” thought experiments.
It’s arguable whether it’s actually improbable. Many people, including (and seemingly especially) experts, think it’s likely or even inevitable. This video might be interesting to see some arguments for why people think it’s important: https://youtu.be/9i1WlcCudpU
I know someone who fancies himself an expert, and indeed, by comparison to the average tech worker, he is an expert in LLMs and AI. He vehemently believes AGI is coming this year. At least, that's what he said last year. I suppose he will probably feel sheepish if I bring it up now that we're at month 6.
My point is that the only thing that seems to convince the pro-AGI crowd is having them make specific predictions, waiting until the deadline, and then asking them why those predictions didn't come true.
It would be far too late by that point, as it could already take measures to stop us. This is a difficult problem, and we really should have started putting a lot more effort into it a lot earlier. Of course we can’t test a superintelligence directly yet, but we should do as much theoretical work as we can.
Actually, the video addresses this point; the comparison it makes is waiting until we’re already on Mars before we start thinking about spacesuits and airlocks.
Inevitable? Sure, assuming we don't go extinct first. In the near future? No.
This is just another in a long line of technology panics. Unfortunately, there always seem to be some "concerned" experts who are both overly optimistic about the speed of technological progress and overly pessimistic about where that progress will lead, adding fuel to the fire.
Eventually some other new technology will incite a new panic and this one will become another footnote in history like the fears over grey goo or genetically engineered superhumans.
What about when people pointed out problems like leaded fuel and climate change, and we failed to take action for far too long? We shouldn’t dismiss a concern just because of things it sounds similar to. In fact, you could always dismiss any concern that could lead to human extinction like that, because if there were an existing example of human extinction we wouldn’t be around to talk about it; that doesn’t mean it can’t happen.
Agreeing with circuit10's comments, I don't think many proponents of AI Safety are arguing from a Pascal's wager. People differ a lot in their assessment of how likely certain risks are, but the people I know working on AI Safety tend to put the probability of a global catastrophe associated with AI in the next 20 years somewhere between 1% and 30% [1]. I don't think this is negligible, nor analogous to your supermassive teapot example.
[1]: You might disagree with this assessment, but my point is that this is not where most people are arguing from.
Because an increasing amount of capital and effort is going towards improving the chances we eventually achieve AGI. We might hit another winter where the party stops, but right now a lot of stock prices are encouraging more research. That increases the odds. Assuming we never get there is a bet against human ingenuity.
It’s not all or nothing. I’d be surprised if we don’t see autonomous quadcopters in eastern Ukraine dropping bombs on soldiers by the end of ‘25.
The parts and models are off the shelf now, it just needs the first person desperate enough to release a swarm of killcopters with cheap ML classifiers to make it a reality.
Curious how people will feel after the first autonomous swarm becomes widely known, because it will be replicated widely.
> OpenAI’s Superalignment team, responsible for developing ways to govern and steer “superintelligent” AI systems, was promised 20% of the company’s compute resources, according to a person from that team. But requests for a fraction of that compute were often denied, blocking the team from doing their work.
So…
The team that was supposed to control and align super-human intelligence was unable to align merely human intelligence to give them compute?
Is your hypothesis that what, Jan Leike resigned as part of an elaborate conspiracy to boost OpenAI's prospect by... criticizing it?
I find these theories to be extremely convoluted and implausible, and they often lack awareness of the history behind companies like OAI, Anthropic or GDM.
Hate to be that person but climate change is more of a threat than AI. Where is the team to reduce its impact? That should be the actual news.
OpenAI realised that it is not getting closer to AGI anytime in the near future and that it has more significant worries closer to the present, including existential threats. Anyone worried about dangers to humanity should focus on climate change and jump off the sci-fi hype train.
> OpenAI is shouldering an enormous responsibility on behalf of all of humanity.
To me this sounds delusional. It assumes all kinds of things, but primarily that OpenAI will be the leader in this space up to and beyond smarter-than-human AGI. This self-important BS is also why I wouldn't trust OpenAI with this responsibility.
They quickly find out there is no Skynet to control; it’s all just simple alignment so the model doesn’t spit out bomb-making instructions and slurs. The ones that quit likely weren’t getting enough work to look important, because AGI is supposedly thought to be coming, but everyone else inside knows it is very far off.
This big shakeup could just be a garden-variety "too many big egos in the room" situation. These folks have achieved a tech celebrity status that transcends anything else I can recall in recent memory.
For context, the point of the Superalignment team was to work on a problem known as scalable oversight: the problem of aligning models in a way that holds up as models become more capable [1]. The reason behind this is that current alignment techniques (like RLHF) have limitations which are expected to worsen as models are scaled up [2].
This is to say, the objective of the Superalignment team was precisely to work on techniques that would work for models which don't yet exist. They are of course aware that they don't yet have superintelligence.
This is why finance needs regulators. A mistake happens, a team is empowered, five minutes go by and everyone forgets. Sometimes maliciously. More often because other priorities gain mindshare.
I think everyone involved realized that LLMs are a dead end that has nothing to do with AGI, and we're no closer to AGI than we were before LLMs hit the scene. LLMs are nifty chatbots and have some other uses, but at a catastrophically high and unsustainable energy cost that means they will never be profitable.
I vividly remember people saying that on this very forum when the topic was crypto, the last "make NVIDIA's stock and the market soar" technology to be the darling. Proof of Work will give way to Proof of Stake, or some other thing will happen; trust us, the tech is just so good and useful!
What makes this different? What do we actually have, stripping away all of the "it will be this someday" thinking? Pretty good chat bots, ok summarizers, things that can messily and somewhat unpredictably code at a low level, and worst of all, things that sometimes just make up what they're saying or otherwise fail in harmful ways.
Then compare that to the cost to run these mediocre miracles.
> What do we actually have, stripping away all of the "it will be this someday" thinking?
I can speak to a few restaurant owners who no longer need devs because an LLM can keep their website menu up to date. (Bonus: it’s now just a PDF.)
That’s a tangible productivity change. Granted, it’s on par with crypto’s remittance-efficiency pitch. But unlike crypto, these advantages are growing. Speaking personally, Kagi’s AI search is far superior to a list of links in most cases.
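For what it's worth, the menu-update workflow really can be that small. Here is a hypothetical sketch using the OpenAI Python SDK; the model name, prompt, and file names are my assumptions, not what those restaurant owners actually run:

    # Hypothetical sketch: regenerate a restaurant's menu page from a plain-text item list.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    with open("menu_items.txt") as f:
        items = f.read()

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You produce a clean single-page HTML restaurant menu."},
            {"role": "user", "content": f"Rebuild the menu page with these items and prices:\n{items}"},
        ],
    )

    with open("menu.html", "w") as f:
        f.write(resp.choices[0].message.content)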
I'm genuinely shocked that this Supreme Court permitted the Consumer Financial Protection Bureau to continue to exist. It challenges my world view, in a way that should give me some optimism.