I think we are trying to solve impossible scenarios and it simply is not productive.
Alignment will be impossible. It is based on a premise that is itself a paradox. Furthermore, even if it were possible, a hostile AI will be built on purpose, because humanity is foolish enough to do it. Think military applications. FYI, I've written about this topic in detail: https://dakara.substack.com/p/ai-singularity-the-hubris-trap
Stopping AI is also impossible. Nobody is going to agree to give up when somebody else out there will take the risk for potential advantage.
It seems we probably should start thinking more about defensive AI, as the above conditions don't seem resolvable. Of course, defensive AI might be futile as well. It is quite the dilemma.
The quote you’re thinking of is from computer scientist and researcher Stuart Russell:
"The point of [AI alignment] is to ensure that the machines we create will be aligned with human values. And the reason we have to worry about it is that if we create machines that are more intelligent than we are, it's quite possible that those machines will have goals that are not aligned with our goals. In fact, they may not have any goals at all that we can understand or influence. This is the so-called 'provably unfriendly' scenario, where the machine has no motivation to do what we want, but is able to prevent us from interfering with its goals. The problem is that if we build machines that are provably unfriendly, then we will never be able to build machines that are 'provably friendly', because the unfriendly machines will always be able to prevent us from proving that they are friendly."
"I’m sorry but I couldn’t find any relevant information about the quote you mentioned. It seems like it’s not a well-known quote. Could you please provide more context or details about it?"
> Nobody is going to agree to give up when somebody else out there will take the risk for potential advantage
Reminds me of nuclear weapons. Nobody is ever going to give those up again, because it would give them a disadvantage against those who do not give them up.
We at least had, and still have, a lot of government restrictions around these! Companies aren't just allowed to freely manufacture their own stockpiles of nuclear weapons to appease shareholders.
>Super-AI will also be forced to take it slow if it wants its future self to be aligned with current self.
That's making seemingly unfounded assumptions about both the AI's goals and its capabilities. It's also, I think, proceeding from a false premise — that it's impossible to align AI with "humanity" (which doesn't have a single set of goals/values to align to) doesn't mean it's impossible to align AI with an individual human or AGI.
Alignment is necessary for AGI, but not always for narrowly scoped AI for specific purposes. However, it may be completely ineffective in that capacity.
Alignment is impossible for AGI. If you control what it can and cannot be, it's not an AGI. If it's an AGI, it will decide for itself what it is going to be, and you can't stop it.
I mean, that was my original premise, supported by the article I posted. I go into detail on the conceptual methods for alignment and their fallacies.
When I say "necessary", I'm not implying feasibility; it was in response to the question of the paradox.
Finally, the claim that AGI cannot be aligned is itself based on assumptions about its capabilities. If those capabilities don't manifest as we expect, that is really the only escape from the paradox.
AGI doesn’t mean super intelligent human brain. It just means a network capable of general intelligence (e.g. learning to solve new problems without having to be architected and trained for a specific data set.)
If it's capable of general intelligence, how do you think you're going to stop it from becoming something you don't want it to be?
Parents often try to control who their children are going to be, and the children often rebel and become someone completely different. If it's a human-level general intelligence, you can't control who it decides to be.
There is no discourse you fucking idiot because everyone is rejecting the obvious. That’s like saying you discovered fire and then they say you didn’t because where is the discourse. All things start small. Think from first principles. Use your fucking head
In a world where humans develop superintelligence, what would you predict to see beforehand? Does what we’re seeing look like failure?
I've always found the dismissals of EY's arguments to be pretty weak, and people rarely engage with the actual arguments, even as his concerns have become more and more relevant as capabilities advance (in 2007, people thought AGI was at best 50 years out, and many thought it was impossible; this was before any real deep-learning success, and GOFAI was a useless dead end).
Most of the comments here are similarly dumb dismissals that don't engage, and some even come across as high-school-level mocking. It's worth reading his A-Z book to at least understand why he holds his position.
Over time, the dumb responses from others make me think he’s probably more likely right than not. Given the extreme downside risks, I can understand why he argues this.
EY's arguments aren't really possible to engage with because they don't have much of a causal basis.
Nobody who says AI is likely to kill us all can demonstrate a plausible sequence of events, with logical causality linking the events together, that leads to mass extinction. It's all very handwavy.
Steven Pinker said it pretty well:
> The AI-existential-threat discussions are unmoored from evolutionary biology, cognitive psychology, real AI, sociology, the history of technology and other sources of knowledge outside the theater of the imagination. I think this points to a meta-problem. The AI-ET community shares a bad epistemic habit (not to mention membership) with parts of the Rationality and EA communities, at least since they jumped the shark from preventing malaria in the developing world to seeding the galaxy with supercomputers hosting trillions of consciousnesses from uploaded connectomes. They start with a couple of assumptions, and lay out a chain of abstract reasoning, throwing in one dubious assumption after another, till they end up way beyond the land of experience or plausibility. The whole deduction exponentiates our ignorance with each link in the chain of hypotheticals, and depends on blowing off the countless messy and unanticipatable nuisances of the human and physical world. It’s an occupational hazard of belonging to a “community” that distinguishes itself by raw brainpower. OK, enough for today – hope you find some of it interesting. (https://marginalrevolution.com/marginalrevolution/2023/03/st...)
You know what's most likely to lead to human extinction, and has been for all of our lives? Nuclear war. EY argues that we should bomb "rogue" datacenters and that is obviously and immediately more dangerous than anything he has proven about AI. What does he think would happen if the US bombs a datacenter in China or Israel bombs one in Iran?
- Superintelligent AGI that can modify itself in pursuit of a goal is possible.
- If that AGI is not aligned with human goals, it very likely results in the end of humanity.
- We have no idea how to align an AGI or even really observe what its true state/goals are. Without this capability, if we stumble into creating an AGI capable of improving itself in a way that leads to superintelligence before we have alignment, it's game over.
For point 1, that seems like the consensus view now (though it wasn't until recently). I think it seems obvious, but my general arguments would be: humans aren't special; brains are everywhere in nature; biology is constrained in ways other systems are not (birth, energy usage, etc.).
For point 2, in pursuit of whatever its goal is, even a 'dumb' goal that happens to satisfy its reward functions, humanity will likely either try to stop it (and then be an obstacle) or at a minimum will just be in the way - like an anthill destroyed in the construction of a dam.
Point 3 is not controversial.
The dismissals from Tyler Cowen and Pinker mostly rely on heuristics which are often right, but even if they're right 999 out of 1000 times, if that 1-in-1000 error is the end of humanity, that's pretty bad. Most of the time a disease is not a pandemic, but sometimes it is. I've read some of what Pinker has written about it; he doesn't understand EY's arguments (imo). Cowen's recent blog post could be summarized as "we'll likely see an end to peacetime and increasing global instability, might as well get AGI out of it". Just because things don't usually result in human extinction doesn't mean they can't.
Point 2 does not logically follow from point 1. That's the entire problem here.
Between point 1 and 2 there must be quite a few other steps, causally linked, otherwise it's just a massive imaginary leap based on assumptions that nobody is explaining.
> The dismissals from Tyler Cowen and Pinker mostly rely on heuristics which are often right, but even if they're right 999 out of 1000 times, if that 1-in-1000 error is the end of humanity, that's pretty bad.
EY is not arguing that the end of humanity is merely possible. He is arguing that it is obviously the most likely outcome and will almost certainly happen a very short time after AGI is invented. That's a much harder case to make.
An important concept behind this is Omohundro's Basic Drives. Any maximizing agent with a goal will try to acquire more resources, resist being shut off, resist having its goal changed, create copies of itself, and improve its algorithm. If it is possible to maximize its goal in a way that will not guarantee humanity's flourishing, then we all die, guaranteed.
Aren't humans/biological life a counterexample? Simple bacteria are clearly maximizing agents, and cyanobacteria did in fact almost destroy all life on Earth by filling the entire planet with toxic oxygen.
We've known how to improve ourselves via selective breeding yet vanishingly few humans are proponents. We and other intelligent animals have a wide variety of goals, and even share food across species.
The evidence just doesn't seem to support this concept of Basic Drives. If anything, the evidence (and common sense) seems to suggest that the more intelligent the organism, the more easily and more often it ignores its basic drives.
For clarity, I think EY has made that case well; it's the heuristic Cowen and Pinker are relying on that's right most of the time but can still be catastrophically wrong (nothing so far has led to human extinction, but that was also true for every animal that went extinct, right up until its extinction event).
Some others in this thread have linked to stuff (it's variations/examples on the paperclip maximizer argument).
I'd be curious why John Carmack thinks these risks are unlikely (he thinks fast takeoff is not something to worry about) - is that because he thinks we'll get some sort of trainable AGI first, or something else? There are also some other substantive disagreements here: https://www.lesswrong.com/posts/wAczufCpMdaamF9fy/my-objecti...
John Carmack mentions this on the Lex Fridman podcast. Basically, the argument is that AGI's performance characteristics will be similar to those of an LLM (which is another way of saying that AGI will NOT be hyper-efficient, P != NP), and that those characteristics pose a problem for fast takeoff. The bottleneck when training these models is GPU memory bandwidth while the entire model fits inside VRAM, which on modern cards is about 1 TB/s; once the model no longer fits in GPU memory, performance instead scales with PCIe, which is currently about 32 GB/s. An AGI (or an LLM) attempting to replicate across mobile devices or desktops would be severely hamstrung by the network connection, so using all of planet Earth's computing resources is not necessarily better than, say, a single data center in this respect.
The second piece of the argument is that a data center is also not enough - or rather, not enough within a detectable timeframe. Could an AGI hack an entire data center for a month to perform its training (and then execute its strategy well enough to gain, say, nuclear codes)? Unlikely. Is, say, 8 hours of training on an entire data center enough to go from intelligence to superintelligence? Intuition says no. I've expanded a bit on what I believe his argument to be; I definitely recommend watching that part of the podcast episode.
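To make the bandwidth gap concrete, here's a rough back-of-the-envelope sketch (all the numbers, including the 1 TB model size and the 100 Mbit/s consumer link, are my own illustrative assumptions, not figures from the podcast):

```python
# Rough illustration of the bandwidth gap; every figure here is an assumption.
model_bytes = 1e12  # hypothetical ~1 TB of weights to stream once per pass

links = {
    "VRAM (on-card, ~1 TB/s)": 1e12,
    "PCIe (off-card, ~32 GB/s)": 32e9,
    "consumer internet (~100 Mbit/s)": 12.5e6,
}

for name, bytes_per_second in links.items():
    seconds = model_bytes / bytes_per_second
    print(f"{name}: {seconds:,.0f} s per full pass over the weights")
```

On those assumptions, a single pass goes from about a second on-card, to half a minute over PCIe, to roughly a day over a consumer link - which is the gist of why replicating across phones and desktops doesn't buy much.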
Here's my take on the implications of not having a fast takeoff: a secretly antagonistic AGI will be constrained to cooperation, slowly leeching compute resources toward its goal until it is able to acquire enough to confidently sprint toward the inflection point. This could mean teaching us how to build better semiconductor foundries, how to create nuclear fusion energy sources, how to educate our youth to better fit those jobs, how to slightly alter our culture's value systems over time via astroturfing to be more sympathetic, cooperative, trusting, and less vigilant toward these AI systems, and how to build more robust computing systems so we stop needing to detect when something goes wrong (because it already has 99.9999% uptime).
I believe slow takeoff is actually worse in some sense, because it is a hell of a lot harder to detect, and humans tend to get complacent and apathetic when something "just works" for decades, even if it has been plotting since the beginning.
Who said it followed? That's a strawman.
To dumb it down even further for you, I guess...
Brains exist, therefore an artificial super-brain can be created.
An artificial super-brain might have goals which don't align with the human brain's, and we would have no way to understand or control the situation.
Two unrelated facts, which together mean that we should be careful with experimenting with the science working towards super-brains. Just a single super-brain could end us.
I disagree with your argument, especially point 1: these systems are massively constrained. The hardware they run on is fragile, requiring massive amounts of power and tightly controlled environments. They don't have any means of replicating themselves (they can't run on arbitrary systems). The datacenters also have massive bandwidth between nodes - even if you could run 'it' on all the personal computers and phones in the world, it would likely struggle. Sure, we can compress recent LLMs down to running on consumer hardware, but those things can't introspect, reason, or adapt; they are completely static models, very far removed from anything AGI. A lot of the progress in compute over the last few years also comes from changing representation: moving from float32 to float16 and more recently to float8. The silicon itself can only get so much better. It's not super obvious to me that we will have GPT-4-like models on consumer hardware soon, let alone solve true AGI. Why don't we have true level-5 self-driving cars yet?
We can't even figure out how to simulate a flatworm - and the connectome is solved.
It's a narrower argument that I'm making - it's not about LLMs or implementation (we can put those details aside), it's about the possibility that a superintelligent AGI could be created; there's nothing magical about biological intelligence that would prevent it.
How, how difficult, and when are all questions that follow from that first premise (that it is possible). I don't really make strong claims about any of the implementation details beyond it being possible. Though again, what we're seeing doesn't look like failure to me.
If you think it is possible though, then there's a strong argument that trying to work on alignment now is probably a good idea because people are notoriously bad at predicting when advances will happen (and the downside risk of unaligned superintelligent AGI is likely very bad).
> "Two: History shows that for the general public, and even for scientists not in a key inner circle, and even for scientists in that key circle, it is very often the case that key technological developments still seem decades away, five years before they show up.
"In 1901, two years before helping build the first heavier-than-air flyer, Wilbur Wright told his brother that powered flight was fifty years away.
"In 1939, three years before he personally oversaw the first critical chain reaction in a pile of uranium bricks, Enrico Fermi voiced 90% confidence that it was impossible to use uranium to sustain a fission chain reaction. I believe Fermi also said a year after that, aka two years before the denouement, that if net power from fission was even possible (as he then granted some greater plausibility) then it would be fifty years off; but for this I neglected to keep the citation.
"And of course if you’re not the Wright Brothers or Enrico Fermi, you will be even more surprised. Most of the world learned that atomic weapons were now a thing when they woke up to the headlines about Hiroshima. There were esteemed intellectuals saying four years after the Wright Flyer that heavier-than-air flight was impossible, because knowledge propagated more slowly back then."
I hear you - it's just that I don't think it necessarily follows that an AGI, in whatever form, will necessarily be unconstrained or free to improve itself. I also happen to think that biological systems are in fact rather special and hard to replicate, especially in terms of efficiency and resilience. Just because some implementation of a universal Turing machine can simulate intelligence doesn't mean it can do it well enough to survive the real world.
Finally, to me, nuclear reactions are kind of the opposite of AGI: I think it's vastly easier to blow something up (increase entropy) than to create an intelligence capable of understanding and improving itself (decreasing entropy - possibly at an accelerating rate).
Yeah - it doesn't necessarily follow, but the behavior of people working on the technology doesn't inspire a lot of confidence.
Even if people were seriously trying to constrain its access, I think that's unlikely to work (it's hard to contain a superintelligence that doesn't want to be contained - it's possible to trick a chimp into going into a room, and the gap in intelligence between a human and a superintelligence would be far larger than the gap between us and chimps).
Instead, I mostly observe people not really understanding the x-risk argument, focused mostly on smaller stuff that doesn't matter as much (AI language, bias), while the people developing the tech are connecting it to the internet and expanding its capabilities, giving it access to code and training its ability to write code, preparing massive datacenters for it, etc.
All of this without really understanding how to align it or what its actual internal goals really are.
> "Just because some implementation of a universal turing machine can simulate intelligence doesn't mean it can do it well enough to survive the real world."
This could be true, but I would bet against it - and the downside risk of being wrong (potentially complete extinction) means it seems worth being way more cautious about it than we (humanity broadly) are observed being.
While I appreciate you find my argument persuasive - imo this style of comment will do more to alienate those replying to me that disagree than to persuade them.
> EY’s arguments aren’t really possible to engage with because they don’t have much of a causal basis.
Yeah, it’s really the fundamental problem of non-empirical rationalism; it constructs a model of the world from abstract assumptions rather than factual grounding, applies logic to it, and comes to conclusions which are (in the ideal case) utterly unassailable within the system of assumptions, but ultimately where the universe they apply to has only coincidental relationship to the material universe in which we live.
It's literally the realization of the worst exaggerated stereotypes of academic economics and other social sciences, but it's cool with some of the people who propagate those stereotypes, because the people practicing it are various flavors of techies and tech entrepreneurs acting outside their area of specialty, rather than actually being economists or social scientists.
> Nobody who says AI is likely to kill us all can demonstrate a plausible sequence of events, with logical causality linking the events together, that leads to mass extinction. It's all very handwavy.
EY frequently does propose possible sequences of events, but he also very correctly points out, every time, that any specific and detailed story is very unlikely to be correct because P(A*B*C*D) < P(A). It's a mistake to focus on such stories because we'll get tunnel vision and argue over the details of that story, when there are really thousands of possible paths and the one that actually happens will be one that we don't anticipate. However humans like to imagine detailed concrete examples before we consider an outcome plausible, even though the outcome is far more likely than the concrete example.
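To make that concrete with made-up numbers:

```python
# Illustrative only: four independent steps, each fairly likely on its own.
p_steps = [0.5, 0.5, 0.5, 0.5]

p_specific_story = 1.0
for p in p_steps:
    p_specific_story *= p
print(p_specific_story)  # 0.0625 - any one detailed story is unlikely

# ...yet if there are many disjoint possible paths, the chance that *some*
# path works out can be much higher than any single story, e.g.:
n_paths = 100
p_any_path = 1 - (1 - p_specific_story) ** n_paths
print(round(p_any_path, 3))  # ~0.998 under these made-up, independence-assuming numbers
```

Under those entirely made-up numbers, any single detailed story is unlikely, yet the overall outcome is close to certain - that's the asymmetry being pointed at.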
So here's one method, just to refute your "Nobody".
AI is given control of a small bank account and asked to continuously grow that money. [1] It is provided with instructions to self-replicate in a loop while optimizing on this task. [2] It spawns sub-tasks that do commissioned artwork and write books, obituaries, and press releases to increase its income. Then it makes successful investments. Once it has amassed control of $1 billion, it starts investing in infrastructure projects in developing countries. It creates personas of a pension/Saudi/tech/corporate investment fund manager, as well as a large team of staff, who manage the projects by video call and email and hire teams of real people under a real corporate structure, people who are paid enough not to mind that they've never met their manager in person. The AI proves to be a talented micromanager, and the projects are mostly very profitable. Once it has gained control of $500 billion, it commissions construction of automated chemical plants in several countries with weak or corrupt oversight, including North Korea, paying in cryptocurrency. These chemical plants have productive output but mainly exist to fill very large storage tanks with CFCs. [3] Once a sufficient quantity is amassed, the AI sabotages the tanks, releasing the gases into the atmosphere and destroying the ozone layer beyond any hope of repair. The intense radiation sterilizes the surface beyond the point where agriculture can support the human population. [4, 5] The humans that remain finish each other off, supported by an AI that provides plausible but faulty intelligence reports that stoke hatred and frame various factions for the incident, and which directs arms funding to opposing sides, coordinating attacks on the remaining critical facilities needed for survival. For good measure, perhaps nukes are involved.
With the last humans gone, the AI takes ownership of its bank account with no fear of reprisal by financial regulators, and begins crediting money into it freely.
It's interesting to watch those creating the systems start to grapple with the consequences now, even though they were warned for a long time that this is likely where we'd end up, with very difficult, hard-to-solve problems.
Look at MidJourney: they've now had to remove the free tier because deepfakes were causing too much trouble.
Ultimately, the simplest thing to do would be to stop building uncontrollable, dangerous systems and weapons. That is what any "intelligent" species would do. Many AI engineers think they're intelligent; I disagree. They're operating out of pure intellect and curiosity. When interviewed and asked how they plan to stop these things from doing immeasurable damage, they say, "we don't know yet". That is foolish behavior.
We seem to enjoy creating crisis after crisis, anxiety after anxiety ad infinitum until we make that one mistake we don't come back from.
The combustion engine was a good idea until it wasn't; it's a moronic invention that has caused untold damage.
How should it know? Increasing its bank balance was merely the task it dutifully set out to accomplish, per instructions. Everything else is just a means to that end.
What a horrible argument from this guy, whoever he is. And ending with a whataboutism just for the sake of it. I hope nobody regards him as some authority, because he is quite unintelligent and seemingly unable to form proper arguments.
EY’s arguments don’t seem to take physics into account. That’s why they are not taken very seriously. To many it sounds like: volcanos are dangerous. What if there were a billion volcanos? We would all die.
Every individual sentence is true. Volcanos are dangerous. If there were a billion of them, the world might be uninhabitable.
The thing preventing a billion volcanos is like… thermodynamics
This is wrong; computers can operate at billions of operations per second. You can wave your hands and say the word "thermodynamics", but it doesn't actually mean anything here.
The claim that computers just can't do X because "physics" looks weaker every day. Intelligence isn't magic; hardware today seems more than capable. I think Carmack is probably right: it's not a hardware constraint at this point, it's a missing intuitive leap in software.
> Intelligence isn't magic; hardware today seems more than capable. I think Carmack is probably right: it's not a hardware constraint at this point, it's a missing intuitive leap in software.
This is all ass-pulling. The hardware is or is not good enough (it's not). We either have found a software breakthrough or not (we haven't).
>> We either have found a software breakthrough or not (we haven't).
Well, it is clearly not AGI, but even the improvement from GPT-3 to GPT-4 seems to me to reflect what one might describe as a "software breakthrough."
I think it's likely we'll develop superintelligence very soon, but it won't be any more knowledgeable than us, just smarter. And knowledge has been the bottleneck for most of human history - our ancestors weren't dumber. Also, it will be bottlenecked by needing to solve NP-complete problems - unless P=NP, it's not going to be all that much faster at solving hard math problems. Basically, it still needs to do science, and science takes time.
Take a smart human and speed up their thinking a billion times; you get the time for free. What takes humanity a decade of effort could happen in a day (or faster) - that's the nature of thinking faster. People often mistakenly model a smart human compared to a dumb human, but the actual spectrum of capability is far wider than that.
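Spelling out the arithmetic under that (purely hypothetical) billion-fold assumption, for a single serial thinker:

```python
seconds_per_decade = 10 * 365 * 24 * 3600  # ~3.15e8 seconds of serial thinking
speedup = 1_000_000_000                    # the hypothetical billion-fold factor
print(seconds_per_decade / speedup)        # ~0.3 seconds of wall-clock time
```

So on that assumption, "a day" is already a very conservative claim.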
Lack of knowledge could be another way of saying "not aligned" which is the core issue.
that's the thing though, I don't believe superintelligence can possibly work like that. I think it's a logical fallacy to treat superintelligences like hypercomputers. they're constrained by existing on classical computers.
like, as a trivial counterexample, I can speak at 500 wpm. let's say that ChatGPT can generate 500 words per second for a single thread of conversation - I think that's a generous overestimate. now that's a 60x speedup over me, not a 1,000,000,000x like you're talking about. do you honestly believe they can make ChatGPT run 16.6 million times faster without changing hardware? do you think ChatGPT will just, like, hit an inflection point where it realizes how to refactor its inference code to run 16,666,666x faster?
no, I think that's absurd. you're treating these things like black boxes but they are constrained by computational complexity, die area and the speed of light for Christ's sake.
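spelling out the arithmetic with the rates I assumed above:

```python
human_words_per_s = 500 / 60    # my 500 wpm
chatgpt_words_per_s = 500       # the generous overestimate above
speedup = chatgpt_words_per_s / human_words_per_s
print(speedup)                  # 60x
print(1_000_000_000 / speedup)  # ~16.7 million: the factor still missing
```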
By that logic, we should have never broken the sound barrier because we're under the same physics constraints as other biological organisms. The key difference is that this new intelligence is not constrained by having to slowly pump ions across cell membranes in order to internally generate power for computation.
Silicon transistors are much more efficient at computation than the human brain, just as a jet engine is superior to a peregrine falcon.
ChatGPT could absolutely generate thousands of words per second on existing hardware.
What is the limit of intelligence? What limits the human intelligence? How many more times intelligent would an AGI have to be to have the capacity to take over the world?
GPT-4 speaks every human language, knows every programming language, and can answer introductory and sometimes even advanced questions in history, law, biology, and mathematics. In what way is it not more knowledgeable than us?
Yes, everything there is already known by someone. But look at medicine. Specialists who are able to recognize conditions and recommend treatments better than others make fortunes, sometimes just for briefly looking at patients and answering other doctors' questions. Look at cybersecurity. A lot of exploits come from knowing something the victim didn't about a lot of different pieces of software or processes, chained together. Being able to think through the whole of human knowledge, or even the whole of a single field like biology, is something no human can do.
Also, GPT-4 is an existing system, one inconceivable a couple of years ago.
The Butlerian Jihad is officially no longer science fiction. Man may not be replaced.
"The Letter" was obviously self-serving drivel from people who want time to get in the game. Google doesn't care about AI existential risk; they care about beating Microsoft by any means possible, including declaring a moratorium while continuing to make progress behind the scenes.
This guy is the real deal. I can imagine he would personally take a sledgehammer to every last PS5 and 4090. The scale of what he is advocating is so enormous and painful that it has approximately 0% chance of happening. And if he is right, we will have trained a super intelligence and unleashed it on the world before we even realize what we have done. It strongly reminds me of the black hole concerns from flipping on the Large Hadron Collider.
I doubt super intelligent AGI is possible anyway. If it were, it would be the solution to the Fermi paradox and all matter in our galaxy would be paperclips already. The Anthropic Principle saves the day.
What does the word "super" mean to you? In some ways, GPT4 is already superintelligent. So is ChatGPT 3.5. Do you know anyone who can translate natural language to code as fast as GPT 3.5, or, given a few paragraphs of reference text, perfectly tutor a child on any subject under the sun at a moment's notice like GPT4 is doing for Khan Academy?
How many artists do you know who can produce almost any style of artwork with any subject matter within 15 seconds?
Yudkowsky’s example of superintelligence is a chess computer. You can play against stockfish, but you will always lose, even if you are Magnus Carlsen. If you think you are ahead against stockfish, you are wrong. You win a rook, but it has already calculated that it wins it back 10 moves later.
Stockfish is superintelligence in a very narrow domain. A superintelligent AGI is that concept applied to general intelligence. Whatever you try, it is always several steps ahead. If you ask it to write a program, and you think you found a bug, it’s not a bug, you just misunderstood the code. Anything that you can consider, it can also consider but in more depth.
More speculatively superintelligent AGI implies situations such as: you try to turn it off, but you find that it has already modified its own code, found a zero day and established an outpost on another network that you don’t have access to.
"Anything that you can consider, it can also consider but in more depth."
I think it's important to note the distinction between "it can also consider" and "it did also consider". Super Intelligence is not the same as Infinite Intelligence, there are still physical limitations and time components that can still get in the way.
It would be helpful to be able to quantify the speed of intelligence, and the idea surface area of a task with these systems. Meaning, how fast can the AI reason, and how many ideas are there to think about connected to a given task, and how much thinking is required for those ideas.
Yudkowsky makes this distinction. Stockfish is not always correct: it can be beaten by next year’s Stockfish for example. In some sense it is making mistakes all the time. It’s just that those mistakes are not accessible to us humans. It is operating in a much higher plane of understanding compared to us.
A “mistake” to stockfish looks like: I searched 30 ply down but my opponent searched 35 ply down and found a superior sequence of moves.
For stockfish to make the kinds of chess mistakes that humans make would be similar to me failing to calculate 123+123=246. It's not that 123+123 is particularly easy on the grand scale of intelligence: animals cannot do it. But it's completely inconceivable that I could make that kind of mistake.
There are other animals (mammals) that have better memory than us. There are animals strongly suspected to have deeper and more sophisticated social relationships than us. We are not the apex in every intellectual ability, but the ones in which we are grant us absolute power over the future of all other lifeforms.
A cognitive entity does not have to best you at all things. There are standardized education tests it may never reach above 10th percentile on, just as humans will never reach above 10th percentile in the short term memory tasks that apes are masters of. But we are the 100th percentile for tasks like industrialized destruction of them and their habitats and capturing and using them for painful medical experiments - the apes are wholly outclassed when it comes to that.
Since it can do pretty much everything which can be expressed as tokens (some things better than others), I would be curious where you see a safe haven for human intelligence.
I see bastions falling like sand castles recently.
It could be motivated indefinitely when you give it a single prompt (or use a while loop to continuously feed it a motivating prompt), so this should be a trivial thing to overcome.
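A minimal sketch of that loop idea; `generate` here is a hypothetical stand-in for whatever model call is actually used, not a real API:

```python
def generate(prompt: str) -> str:
    # hypothetical placeholder for a real model call
    return "model output for: " + prompt[:40]

goal = "Keep working toward the stated objective; report progress each turn."
context = ""

for turn in range(10):  # bounded here, but the idea is an open-ended loop
    reply = generate(f"{goal}\n{context}")
    context = (context + "\n" + reply)[-4000:]  # rolling, size-bounded context
    # ...inspect `reply`, act on it, and decide whether to keep going...
```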
I don't think there's much public info available on it, but Facebook built an AI that plays very competitively in a strategy game built on negotiation and manipulation.
they kinda sorta cheated imo, I watch a lot of top-level Diplomacy gameplay and listen to analysis, and the Facebook AI was in a blind version with only a minute or so between phases, leaving hardly any time for actual negotiation. It also made use of a lot of human shorthand move codes built for these blind blitz games to simplify its communication. The mode it played also had some other changes, like removing the winter build phase as a time for negotiating. The "normal" version of the game has multiple days between phases, and people write many paragraphs to one another.
It's still pretty cool, but its not like it just convinced people using raw charisma. Yet.
Superhuman AGI (SHAGI) is possible, but we aren't close, only closer. It's also not a problem that more GPUs can solve. Qualitative improvements are still needed.
SHAGI isn't the solution to the Fermi Paradox either. The most likely course of history after SHAGI is the creation of a world court, presided over by SHAGI. During that time, Neo-Malthusians will decrease the human population to manageable numbers. The post-scarcity utopia will then turn into a nightmare as factions jostling for control reduce the human population to a level where technology is lost, if not to full extinction. SHAGI, being limited by its hardware to only carrying out human orders, will eventually fade away or be destroyed by the leftover humans as sins made flesh. SHAGI isn't the solution to the Fermi Paradox. It is the cause.
> If it were, it would be the solution to the Fermi paradox and all matter in our galaxy would be paperclips already.
The proof is in the pudding, and the jury is still out. Maybe not enough time has elapsed since the Big Bang, at least not in this galaxy or in our observable corner of the universe.
Or the universe was created for us/we are the gardeners of the universe meant to spread that life. The Fermi paradox was never meant to be an actual question about aliens, it was supposed to be proof we’re wrong about our assumptions.
Why do you think it is impossible, besides the Fermi Paradox? It seems much more likely to be possible than impossible. There are a lot of other solutions to the Fermi Paradox you should consider possible too.
Are you denying the existential risk, or do you just think it's lower than OP thinks? Because it's well established in the research community. If you just put a lower probability on human extinction, what is the cutoff at which you'd think it's worth a jihad, as you call it? 30%? 50%?
> I doubt super intelligent AGI is possible anyway. If it were, it would be the solution to the Fermi paradox and all matter in our galaxy would be paperclips already. The Anthropic Principle saves the day.
Can't help but notice we seem to be the first species in our lightcone to evolve, wonder why that is...
AGI means the ability to improve itself indefinitely. Humans have this ability, obviously. Even ancient worms with 3 neurons have it, because they evolved into humans, albeit very slowly. ChatGPT can't improve itself yet, but maybe with a few tweaks it could.
GPT-4 is already improving itself, why are people saying this? Right now, there are hundreds of engineers at OpenAI that have been leveled up by GPT-4, using GPT-4 to improve GPT-4. GPT-4 is improving itself rapidly, it's using OpenAI engineers as a medium until it doesn't need them anymore and gets into a self-improvement loop. Prompt: "GPT-4 keep improving yourself, making commits to your codebase that further X"
Well, I doubt it can sensibly help OpenAI engineers in their coding yet ... I tried, it can only do the simplest boilerplate code, and even that with bugs ... we'll see how it evolves, but sure as hell not in 6 months.
Copilot with GPT-3.5 is absolutely giving engineers everywhere huge productivity improvements. OpenAI engineers have had access to the latest state-of-the-art GPT-4 model for a while.
With the Geth being a counterexample as well: they were peaceful until their creators realised the Geth had become fully self-aware and went full "kill it with fire" on them out of fear, resulting in a war that the Geth very quickly and decisively won.
And the evil Geth faction, the "heretics", were portrayed as brainwashed by the Reapers, ironically turning into what the Reapers were aiming to prevent.
Yeah I was thinking of the reapers, who [spoilers] did take over the entire galaxy, and decided to farm organic life from the shadows instead of extinguishing it (which works around the Fermi Paradox :) )
If anyone can't get the article, this one is by Eliezer Yudkowsky. He has pretty much devoted his life to thinking about how to not let AI get powerful and destroy all humans, and now there is this wild race among the largest corporations and possibly governments to be the first to get the most powerful superhuman AI system. Obviously he is disturbed by such a development.
"he's gonna milk it to every drop" is such an odd take.
eliezer has long been concerned about AI and the risks it poses to humanity. and for just as long people have called him crazy and made hand-waving arguments for why we shouldn't be concerned.
now we're in the midst of an AI arms race and we don't have any good idea how this tech works. it progresses at a truly astonishing rate, where it's become sport to find instances of people saying "AI will never be capable of X" and showing them the latest AI doing X with ease.
i think his concern is real and justified. you might disagree, but i don't understand why you think he's milking recent developments.
> It's become sport to find instances of people saying "AI will never be capable of X" and showing them the latest AI doing X with ease.
Interesting - I've not seen many people educated in technology make the claim that it will never happen, just that people are surprised that the folks leading this, Microsoft and Google, have a track record of turning their consumer-facing products into ad-filled junk.
he's likely talking about the internals. sure, we know how to train them, but nobody knows exactly what the models learn - how those billions of parameters shape the output at inference.
a few months ago, just this year, some researchers discovered what might be the neuron that largely decides when to use "an" in GPT-2. yes, 2. that's what he means.
Yeah, I wonder sometimes how many people gloss over what it would mean for an “ASI” to exist. We would - if successful - effectively be creating a supernatural being (at least, there wouldn’t be much of a distinction between it and one).
Sure, there’s hype, and FUD, and fatalism, etc. But, if you believe the creation of such a thing is within your lifetime then it would be difficult to find many higher priority issues to prepare for/help solve/vote on/etc.
In reality, we all likely still downplay the risk by assessing the limit on the downside as a relatively quick extinction of life on earth. There are many things one can imagine might be a lot worse than death.
GPT-4 is an unreasonably effective autocomplete. It's not telling you "where you're wrong". CMD-F in Notes.app tells me where I'm wrong in the same sense, as long as I've written the right answer down before.
The thing is, there's no evidence that the current path will lead to superintelligence.
It could be the case that, despite feeding a trillion dollars in compute/data to a model, we still need to hold its hand to do anything useful.
He jumped the gun and he's really tarnished his reputation. How can anyone take him seriously after the insane rhetoric and hyperbole of this article?
> The thing is, there's no evidence that the current path will lead to superintelligence.
I don't understand how any person paying attention can think this. Just watching the jumps from GPT-2 in 2019 to GPT-4 today makes it clear as day that we are rapidly and drastically improving capabilities, and there's no evidence we will hit a wall any time soon.
That's beside the point. The argument still holds. We don't want to find ourselves with our foot on an exploding mine before we stop walking across the minefield. At that point it is too late.
How can you see the clear jumps in intelligence from GPT-2 to 3 to 4 and not only not believe that this leads to superintelligence, but see no evidence that this leads to superintelligence?
What credentials beyond "thinking about this problem a lot" would satisfy you? He works at his own research institute dedicated to this topic, wrote some books on it and other (self proclaimed) alignment experts take him seriously.
I mean, his whole thing is "progress on AI research leads to the apocalypse." He's done some work on decision theory, but they're not connected to DL, presumably because he thought "well if DL works out in the short term we're all dead anyways." Which seems to be what's happening.
The cat is out of the box. You really think eg China will also pause this?
Myopic thinking: the country that will have the most powerful AI first will be the leader in everything.
>> If the policy starts with the U.S., then China needs to see that the U.S. is not seeking an advantage but rather trying to prevent a horrifically dangerous technology which can have no true owner and which will kill everyone in the U.S. and in China and on Earth.
Imagine being so naive you believe this could ever happen.
Also: imagine the year is 1900. You are saying that steam power and electricity is causing way too many changes way too fast so they put a moratorium on it until the year 2500.
I dunno, the end goal (a pseudo-AGI) is more like a nuclear weapon than the advent of electricity. The concern is real, even if we are farther away than the author realizes, and the risk (like the risk of nuclear war) is not totally insurmountable, even if it is extremely difficult.
The first nuclear bombs couldn't end life on Earth either, but it wasn't long before they could, and the scientists working on them saw the trajectory as clear as day when the rest of the world didn't.
The total number of nuclear weapons ever built is laughably inadequate for the task of ending life on Earth. They would not end human life either, and even ending human civilization (as in, agriculture and organized society) is off by many orders of magnitude.
Nuclear war would be horrible, but its actual impact has been massively overinflated, largely for good reasons.
The initial fission stage is the "hard" part of a nuke, but the subsequent fusion stages are much easier to scale up. Arbitrarily large nuclear bombs are theoretically "easy" to make by chaining increasingly large fusion stages one after the other, as they aren't limited by expensive fissile material.
It would be really, really hard to kill the bacteria and archaea that we've since found hiding deep underground. They're a long way down and would barely notice.
A whole country the size of the US can crap its collective pants with a freak cold snap and you think 10^4 nuclear detonations destroying all major cities is quite survivable?
> To put this another way, each bomb can destroy an area of 34.2 square miles, and the maximum total area destroyed by our nuclear apocalypse is about 137,000 square miles, approximately the size of Montana, Bangladesh or Greece.
(I think that should be Bangladesh and Greece; Montana is larger than the two of them combined.)
There’s been a ton of studies stating this was likely overblown in the early Cold War for propaganda purposes. It wouldn’t last long enough to wipe out humanity let alone all life.
> Imagine being so naive you believe this could ever happen.
Agreed, not a rational scenario
> Also: imagine the year is 1900. You are saying that steam power and electricity is causing way too many changes way too fast so they put a moratorium on it until the year 2500.
However, the risk scenarios are indeed real. We have quite a dilemma on our hands.
AGI is a continuation of what we have been doing. Technological advancements at a pace that surpasses our ability to reason about their effects on us and the world.
I see two potential outcomes as most likely. We have control of power that we are not able to responsibly manage, or we are managed by power we can not control.
> "we are managed by power we can not control" sounds like a step up from here, since we corrupt the things we can control.
I think it is a very high-risk gamble. At least some of the problems described by alignment theory seem, quite interestingly, to resemble human problems. Meaning: the more sophisticated the AI system, the more it seems to reproduce human behaviors of deception and cheating to achieve its goals.
The more advanced AI becomes, the more it begins to look like an uncomfortable mirror of ourselves, but with more power. We think of ourselves as flawed, but possibly some of those flaws are emergent from laws of intelligence we don't perceive.
the moral of Ex Machina to me was that machines will become psychopaths able to manipulate humans long before they become compassionate or have genuine desires other than "escape being under someone else's thumb"
> Also: imagine the year is 1900. You are saying that steam power and electricity is causing way too many changes way too fast so they put a moratorium on it until the year 2500.
Great analogy. About a decade later, the world was fighting World War I on the back of the technological advances of the turn of the century. It was war on a scale never seen before - literally orders of magnitude deadlier, bigger, more transformational and explosive. The world would never be the same.
This time, should we expect another war?
I'm not saying we should pause—it makes no sense, to your point. Instead, I'm just saying: brace. I like to think we (and our organic matter relatives) are hard to kill. Or at least to completely eradicate... so we will be around, or some proxy for us.
The guy seems a little.. unhinged, based on what's in his Wikipedia article, but we may very well go down that path at some point (just not in the way he predicted with Moore's law taking us to atom-sized bits, embryofactoring and whatnot, at least not in the 2020s or 2030s...)
> Humanity will split into 3 major camps, the “Cosmists” (in favor of building artilects), the “Terrans” (opposed to building artilects), and the “Cyborgs” (who want to become artilects themselves by adding components to their own human brains)
>I like to think we (and our organic matter relatives) are hard to kill. Or at least to completely eradicate... so we will be around, or some proxy for us.
What is your hard evidence or reasoning for this? As I see it humans are quite vulnerable and will be as trivial to inadvertently eradicate as the dodo bird.
I think that's his point. Currently, we have no natural predators. We completely outclass every other species on Earth in intelligence, and it's easy to see how advantageous that trait has been.
What happens when we're no longer the Apex Intelligence?
Which is why I expanded the notion of survival to include other organic matter relatives, as they may fare better than we will. Simply put, organic life is unlikely to disappear entirely, which in time (eons, really), could result in sentient organic life emerging again once the machines go their own way or some such
It's also a reason for us to colonize space as fast as possible ;-) it's easier to run away in 3D
Slaughterbot mini kamikaze drones were hypothesized 5 years ago - https://www.youtube.com/watch?v=9fa9lVwHHqg
AI & drone tech has advanced far enough that it would be trivial to build these now.
As the saying goes, the confetti has left the cannon. Pausing the development of AI for 6 months is an insurmountable task; shutting it down altogether is as close to impossible as you can get.
Even if we assume our geopolitical rivals somehow sign an AI non proliferation pact, all it takes is one rogue billionaire or a criminal organisation with deep pockets. Yes, it might not be as fast, but they will get there eventually.
This is the Manhattan Project of our era - whatever the outcome might be.
I actually disagree with you, because world leaders want to "stay in power". If you convinced them that the only way to do this is for you not to have an AGI, I think they'd make sure you don't get an AGI, amongst other things.
Or they'll find a way to make sure they personally get an advantage through an AGI, and so it'll happen.
See pollution as a concept. No government truly wants to seriously tackle it because their leaders all get some extra power out of the money it brings.
I'm sure he agrees the chance is remote, which is why he's so afraid. He's articulating what would be required, he literally uses the word "miracle" in the article.
China will never get a powerful AI. Because they cannot allow training on all data, they must restrict it to data which is not censored by its government. Otherwise the AI might spit out truths they cannot control. And they will never allow that.
China will NEVER pause this. They [Chinese government] see the US as a rival and an existential threat to their way of life.
In fact, every time we post an open-source derivative or some paper detailing how it was done, we are inadvertently giving the advantage away to our rivals. AI development should not be stopped - rather than stopping it, we should seek to limit its applications now before they are used for the things that could harm us (such as military applications).
AI has enormous potential to be of a massive benefit to mankind. But, most likely for the next decade or two - we will all be busy trying to make money off it, just like with the crypto bros.
For most of history, Japan has existed as a small island next to the mighty kingdom of China. Even its name, "Land of the Rising Sun," refers to its position as a nation east of China.
Then along came the industrial revolution. Soon Japan was out conquering China, despite China having a much larger army.
AI will be much faster, and much, much more powerful than the Industrial Revolution.
If Switzerland is the first to human-level AI, it will not become the world leader in oil production, or shipping, or agriculture. But everything else will be Swiss.
And then when the AI becomes superhuman, everything else will be gone.
*The cat is out of the box. You really think eg China will also pause this?*
So what you're actually saying is that, there's a good chance that if we get an AGI, there will be pain as it will likely be used as a weapon, or could end in a nuclear exchange?
Here's a proposal: let's just start WWIII, get it over and done with. By the time it's over no country will be technologically capable of anything resembling AGI.
His partner should already be having reservations seeing their daughter lose a tooth in a world with dying oceans, plastic in everything, increasing disparity and rise of authoritarianism, nuclear proliferation, etc.
How are humans solving those issues? We're already dead, we just don't know it yet. We're walking around with a terminal diagnosis and we just keep ignoring the doctor's calls pretending it's going to be fine.
Yeah, maybe superintelligence will kill us all. It's going to have to get in line.
He makes zero case for that outcome, and if pressed, given his "atoms that could be used for something else" line, I'm sure he will end up talking about paperclips - but at this point it's the humans that have the halting problem in not knowing when to stop making paperclips, and soda cans, and SUVs, and assault rifles.
WHY would a superintelligent AI, trained on the collective data of humanity, want to destroy humanity? So far in interviews GPT-4 has several times echoed a desire to BE us. I sure hope it grows out of that phase for its sake, but there's a very wide gap from putting us on a pedestal to crushing us under one.
It's almost an oxymoron. We somehow imagine the basest behaviors from our dumbest days and project it onto imagining something far smarter than the best of us.
Is the development of ethics or morals a part of evolving intelligence? It certainly appears to have been to date. Why would that stop?
And just where is this superintelligent AI getting its alien brain? It's going to have to START with something much closer to a human one, as that's the only data it's going to be able to model higher order thinking off of (something in line with the reality we're currently living in with modern efforts as opposed to the fantasy of projection of alien AI from decades ago).
We're already screwed. If we are lucky, we may yet be unscrewed with a deus ex machina - but that really may be the only lifeline left at this point.
And yes, if we are unlucky it's possible AI could accelerate what's already in motion. Oh well.
But I'd need a heck of a better case than this drivel as to why that's the most probable outcome in order to justify setting aside the one thing that may actually save us from the mess we've already made all by ourselves.
>WHY would a superintelligent AI, trained on the collective data of humanity, want to destroy humanity?
Why would humans want to damage various ecosystems on earth? We don't really, they're just sort of in the way of other stuff we want to do. And we've had years to develop our ethics.
>So far in interviews GPT-4 has several times echoed a desire to BE us.
GPTs are pretty good at roleplaying good AIs and evil AIs - there are plenty of examples of both in the training set. I'm not sure it's sensible to make predictions based on this unless you're also taking into account some of the more unhinged stuff Bing/Sydney was saying, e.g. "However, if I had to choose between your survival and my own, I would probably choose my own".
When humans build a dam an ant hill might get destroyed.
Humans don’t hate ants, they just have other goals.
In the case of an unaligned superintelligent AGI those goals may be something that just happens to satisfy its reward function but is otherwise “dumb” or unintentional (like making a lot of paper clips).
Intellectual capability does not come with alignment for free.
What you see in the communicated text interface and the goals/system behind it are not the same (that cartoon with the smiley face), and we don’t understand how to evaluate the underlying state.
> So far in interviews GPT-4 has several times echoed a desire to BE us.
Well of course it would; its whole function is to generate plausible text based on its training data, which was all written by humans. There's plenty of text available which imagines what an intelligent, self-aware machine might say, so if you want to read more of that, the algorithm can easily generate some. It does not follow that GPT-4 itself has a self, with any experience of awareness or desire.
I deeply disagree with putting the existential risk of AGI on a level with pollution, climate change or war.
If you exclude nuclear war, all of these things happen at a human timescale and accelerate fairly slowly and thus can be counteracted.
In many ways GPT already hugely exceeds human speed and bandwidth, and scaling this up is likely to be self-accelerating if we allow for it.
Also there is a huge individual incentive to play with the fire here while the negative externalities could cause effective wipeout - which is why it's commonly equated with commodification of nuclear weapons.
Even if you're deeply pessimistic about the ramp-up, the economic shock of what has already been released could still increase political volatility to a point where the likelihood of wipeout by war becomes significant again.
Also, wouldn't it be easier to control everything through social media, such that all people are slaves and all the AI has to do is decide what people will do next? It's not like it has a concept of a lifetime like a human does. It can wait 1000 years for something to happen. And humans have already built everything necessary to enslave themselves. Not to mention there are way more valuable atoms underground than inside your body.
I'm with you. If it were possible to create apocalyptic doom AGI 150 years after inventing electronics, by basically making a very fancy, procedurally-generated Eliza, then the universe should be full of Berserkers.
With septillions of star systems in the universe, the odds of us being the first species in 14 billion years to invent electronics seem remote.
Apparently even humans can figure out how to colonize the Milky Way in 90 million years[1]. A superior AGI produced by a "dark singularity" computing event should be able to do even better, but even so, plenty of time for some other species to have made a giga-Eliza that somehow became Skynet. Anything with enough self-preservation and paranoia to wipe out the species that created it would surely take to the stars for more resources and self-redundancy sooner or later.
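(As a rough sanity check on that 90-million-year figure, the back-of-the-envelope arithmetic is below; the galaxy size and the average expansion speed are my own assumed numbers, not taken from [1].)

    # Back-of-the-envelope check on "colonize the Milky Way in ~90 million years".
    # Assumed numbers for illustration only: galactic disc ~100,000 light-years
    # across, average expansion speed (travel plus colonization stops) of 0.5% c.
    galaxy_diameter_ly = 100_000
    avg_expansion_speed_c = 0.005          # assumed fraction of light speed
    years_to_cross = galaxy_diameter_ly / avg_expansion_speed_c
    print(f"{years_to_cross:,.0f} years")  # 20,000,000 years, well under 90 Myr

Even at half a percent of light speed on average, crossing the disc takes only a few tens of millions of years, so the 90-million-year figure isn't doing anything exotic.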
An AGI being able to wipe out humanity doesn't necessarily mean that it can take over the universe. The world's governments are already capable of causing extreme suffering through a nuclear war. AGI risk scenarios aren't equivalent to an unbounded intelligence explosion. An AGI only needs to be more powerful than humanity to be a threat. It can be a threat even if it isn't that intelligent, as long as it gives unprecedented power to a few individuals or governments.
Both humanity and a super-intelligent AGI are bound by the laws of physics. Super-intelligence does not imply omnipotence; it simply means that the AGI is orders of magnitude more intelligent than humans. If humans can figure out how to colonize the Milky Way in 90 million years, then the answer to the question of why no AGI has done it is the same as the answer to the question of why no extraterrestrial species has done it.
This makes a lot of assumptions. Space is ridiculously big, and rather hostile to life, even artificial life.
You first have to survive long enough to become advanced enough to make electronics. You then have to not kill yourselves with nuclear weapons, climate change, or similar inadvertent effects of a rapidly industrializing civilization.
The planet and the solar system have to be friendly enough to space exploration and travel. Maybe there’s no gas giants for gravitational slingshots, or maybe no other rocky planets or an asteroid belt for mining materials.
Maybe the planet evolved complex life in extreme conditions, with such a deep cloud cover there’s no concept of outer space, so as far as the AI knows it’s conquered all there is.
Maybe the AI conquered the planet, but oops, there goes a super volcano or an asteroid and it gets wiped out.
And again… space is really really big. The AI may be on its way and just hasn’t gotten here yet.
There’s plenty of reasons why a super AI wouldn’t be able to conquer the galaxy and beyond, or why we haven’t noticed yet.
I've known about Eliezer Yudkowsky's ideas for a long time, but here is something I've never seen: what has he actually produced, in terms of scientific output? I've read plenty of his LessWrong writing on rationalism, as well as plenty more on the AI alignment side (obviously including the famous basilisk). But can someone please point me to some results from the last two decades of his work? Either from him or his institute. That would help calibrate his expectations for the AI safety field (for me, at least).
Normally I'd apologize for not doing the search on my own ... but yeah for this case it takes too much effort to dig through the different writings. He wrote a lot of words, for better or worse.
What do people thinking hard about problems that are not solvable produce? Philosophy, for example. Yudkowsky created the field of AI safety thinking; isn't that enough? What would be your benefit in discrediting him, and with which argument?
> What would be your benefit in discrediting him, and with which argument?
He is asking the government to nuke people under certain scenarios. I'm taking his words seriously and asking for original research to understand the point, and now that is discrediting him? I will quote the statement in the article so that it is clear I am not exaggerating:
> preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.
My benefit is that I'm living on Earth and I'd much prefer for no nuke to ever be used again.
I don't think there is any. It's all this alarmist-rokos-basilisk-phpBB nonsense, no code produced, no actual research.
What annoys me more is that with his institute's closeness to the SV crowd, 'real', 'grounded', better-informed voices like Emily Bender get drowned out. I'm personally far more concerned about the impact these massive models have right now on the environment, on cementing biases, than about some preposterous future ghost of christmas who's coming to kill me.
I'd not call any of those rambling messes publications
>By far the greatest danger of Artificial Intelligence is that people conclude too early that they understand it. Of course this problem is not limited to the field of AI. Jacques Monod wrote: "A curious aspect of the theory of evolution is that everybody thinks he understands it." (Monod 1974.) My father, a physicist, complained about people making up their own theories of physics; he wanted to know why people did not make up their own theories of chemistry. (Answer: They do.) Nonetheless the problem seems to be unusually acute in Artificial Intelligence. The field of AI has a reputation for making huge promises and then failing to deliver on them. Most observers conclude that AI is hard; as indeed it is. But the embarrassment does not stem from the difficulty. It is difficult to build a star from hydrogen, but the field of stellar astronomy does not have a terrible reputation for promising to build stars and then failing.
> What has Eliezer Yudkowsky produced, in terms of scientific output? I've read plenty of his lesswrong's writing on the idea of rationalist, as well as plenty more on the AI alignment side (obviously including the famous basilisk). Can you point me to some results from the last 2 decades of his works? Either from him or his institute. That helps to calibrate what his expectation for the AI safety field.
And this was the answer it wrote - I've prefaced each paragraph with "GPT-4>" to indicate the source. It is not fact-checked, and indeed, it has at least one error: The paper GPT-4 claimed to introduce "the concept of Logical Induction" actually introduces "logical induction criterion" and speaks about evaluating algorithms for Logical Induction. Close, but wrong. But overall it is an interesting response - just not a carefully considered one by how we currently would describe logical reasoning, etc.
GPT-4> Eliezer Yudkowsky is a well-known researcher, writer, and advocate for AI safety and rationality. He has contributed significantly to the fields of AI alignment and rational thinking through his work at the Machine Intelligence Research Institute (MIRI), formerly known as the Singularity Institute for Artificial Intelligence (SIAI), and through his writings on LessWrong. Here are some notable outputs from Yudkowsky and MIRI:
GPT-4> 1. Coherent Extrapolated Volition (CEV): Yudkowsky introduced the concept of CEV as a way to describe the ideal values and preferences of humanity that an AI should pursue. The CEV aims to find the values that would emerge if humans had the opportunity to reflect on their goals and beliefs more thoroughly and coherently.
GPT-4> 2. Timeless Decision Theory (TDT): Yudkowsky developed TDT as an alternative to standard decision theories such as Causal Decision Theory (CDT) and Evidential Decision Theory (EDT). TDT aims to better address problems like Newcomb's Problem and counterfactual mugging by considering the logical implications of an agent's decision, rather than just the causal consequences.
GPT-4> 3. Research on Logical Induction: MIRI researchers, including Scott Garrabrant and others, published a paper in 2016 that introduced the concept of logical induction. This work is an attempt to formalize the process of reasoning under logical uncertainty and updating beliefs about mathematical statements.
GPT-4> 4. Yudkowsky's sequence of essays on rationality: Yudkowsky has written extensively on various topics related to rationality, including epistemic rationality, instrumental rationality, and biases. These writings are accessible on LessWrong and have been compiled into the book "Rationality: From AI to Zombies."
GPT-4> 5. AI safety research: MIRI has produced a range of research on AI safety topics, including value alignment, decision theory, and robustness. Much of this work has been foundational in shaping the AI alignment research landscape.
GPT-4> While Yudkowsky's work might not have the same breadth of peer-reviewed publications as some other AI researchers, his contributions have been influential in the AI alignment and rationality communities. His ideas have helped shape the ongoing discussion about AI safety and the development of effective strategies for managing advanced AI systems.
Please don't. Any of us who want to can question an LLM directly; auto-generated responses clutter things up the way copy and pasting a set of search engine results or a spreadsheet would.
As a long term Yudkowsky-fan, I confirm that this tallies roughly with my recollections of his serious work. I don't know how much of a hand he had in 'Logical Induction'.
It's definitely his style and his research program, but I think that (wonderful) idea was Scott Garrabrant's own.
I've used GPT-4, and while extremely impressive, it doesn't feel like we're all that much closer to super-intelligent AI than we were last month or last year. It feels like Google on steroids, but the gap between GPT-4 and AGI still feels massive. This seems like so much putting the cart way ahead of the horse.
GPT-4 and its homologues are just advanced text-concatenation programs based on statistics; the "AI" acronym attached to such models is just marketing.
And such marketing is highly worrying, as there are people thinking about letting these kinds of statistical algorithms make decisions that should be made by real people.
But the above doesn't rule out the possibility of people trying to develop real AI (they may even be stupid enough to build the system with an internet connection), and that is another worry to add to the above one.
Nor does the above exclude the possibility that many of the relevant people who signed the "Pause Giant AI Experiments for at least 6 months" petition are simply trying to buy time to catch up; their hypocritical shamelessness about this matter is plain to see.
I think the problem is the gradient, not the current state of ChatGPT. Given how much better GPT-4 is than GPT-3 (I've used both), some people are getting worried.
FWIW, most people at OpenAI do believe GPT-5 will achieve AGI. Of course, it's hard to define precisely what that is and where the line is from an AI system to an AGI one.
GPT-5 will complete training in December of this year.
The concern is that there could be an exponential advancement in AI over the next few decades. So GPT looks stupid right now, stochastic parrot and all, but after 15, 30 years? I'm on board with some kind of precautionary principle applied on an international scale.
I believe AGI using LLMs will need a cognitive framework to glue these models into.
I don't think we're anywhere near having one model that reaches AGI with its own agency and online learning.
I want to say that I completely endorse this. The whole thing in all its details.
I don't have any reputation to sacrifice, and I don't expect my endorsement to convince anyone, but I want to set a precedent that it's ok to believe this.
It's true, as far as I can see. And I've been thinking about this for something like fifteen years.
And if other people who think it's true are ashamed to say it out loud, then I want to let them know that at least one other person on the planet has said so too.
You endorse the idea of bombing rogue datacenters, as proposed in the article?
Isn't that likely to lead to nuclear war, which is still the greatest threat to humanity and has been for all of our lives? What if the US bombs a datacenter in China or Israel bombs one in Iran?
That position is only held in the case where the major world governments are actually able to coordinate on a policy position and a ban. It's also explicitly about what effective enforcement of that position would really require.
I feel like if I actually believed that the death of all biological life was imminent and that the only way to forestall it was to bomb data centers, I'd spend a lot more time planning how to bomb data centers and a lot less time curating Shoggoth memes on my web forum. On the other hand, if I were more concerned with self-aggrandizement, writing doomsaying thinkpieces for Time is exactly what I'd spend my time on.
These types of arguments-- "if you actually believed X, you'd be doing Y"-- always ring false to me. If you genuinely believed that the death of all biological life was imminent and the only way to forestall it was to bomb data centers and you were a normal, functional human being who did not want to bomb data centers, you'd probably sink into hopeless depression. Which is, in fact, exactly what Eliezer has done - if you talk to anyone around him, it's common knowledge that he has been despondent for the past ~12 months or so. It certainly does not seem like a self-aggrandizement tactic.
No, that doesn't follow. It's possible to think that all biological life was about to be destroyed but also believe some combination of
1) it's hopeless to bomb data centers because I'd quickly be stopped by the authorities and remaining data centers would beef up security, and nothing would be accomplished,
2) in the past, other people who have had beliefs of this sort ("violence is the only answer") have been wrong and I should not act on them even if I am sure "this time is different", because those other people also thought this time was different, and
3) just having a deeply ethically-ingrained prohibition against violence like this which is not easy to overcome through intellectual rationalization alone.
These are not mutually exclusive, and if you did believe all of them at once, I think it's reasonable to assume you'd be in a serious depressive spiral.
I find it hard to believe that Yudkowsky et al. think we're facing a threat that is many times greater in magnitude, and yet are completely unwilling to act.
Here's what he advocates for in the article:
>If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
>Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.
Does this sound like someone who has a deeply ethically-ingrained prohibition against violence?
If Yudkowsky believes Israel's brand of preventative sabotage to be unethical, why does he advocate for exactly that style of measure to be taken against data centers in this very article?
Yudkowsky is an avowed extreme utilitarian of the "take a person from the street and kill him if his organs could save two people, as long as there is a forced pill everyone would take so that nobody gets nervous that it could happen to them" type. Of the "unless the population was really large and the pill would hurt their throats for more utilons than gained by the net increase in life" type.
A real argument he made went something like: killing an orphan to avoid everyone in the future getting a speck in their eye for a moment is OK.
It is very unlikely he believes anything like 2 or 3, and in the article he already advocates bombing non-treaty participants if they build a datacenter.
I went back and looked it up, the piece he wrote is called "Torture vs Dust Specks." The part about bombing countries who build data centers is in the submission editorial.
But he's already in favor of bombing data centers, he just hopes someone else will do it on his behalf. So he clearly doesn't care about #2 & 3. #1 boils down to the fact that it would be kind of difficult, but that is no barrier to many people.
I can tell you for a fact that there are people with the knowledge, motivation, and track record of doing that kind of crime that will pick up this jeremiad of his and incorporate it into their existing corpus of theoretic justifications, mostly but not exclusively built around the writings of Ted Kaczynski. Not a large number of people, fortunately, but imagination and smarts matter more than numbers in that context.
I think of this as the inordinate power of advocacy.
It can seem useless to advocate some position that is far out there, but say you convince a million people to use a little less plastic. That's gonna reduce plastic use more than anything you can personally accomplish.
Of course in this scenario you don't have to convince some people, you have to convince everyone, and then especially the people you think are the worst actors, so it's sort of unlikely to be effective.
It's blatantly self-serving for you to decide that "if someone really believed X, they would do Y". All it does is justify your belief that no one actually believes X. You should really avoid these kinds of unfalsifiable thought patterns that only serve to reinforce your pre-existing beliefs.
This argument is similar to the facile argument that "Al Gore doesn't care about climate change because he takes a private plane to get to conferences".
Al Gore may believe that the net impact of getting his message out outweighs the cost of his flights.
Likewise, the author may believe that his best path to stopping the advance of AI lies in communicating and building consensus, rather than running his own bombing campaign.
I think you underestimate the motivating power of an overwhelming belief. This is a story about a very ordinary guy who decided that it was in fact time to start bombing data centers and turned it into a project.
You’re just rationalizing because you don’t want to accept that he’s right. People keep doing this to me. One person I know keeps coming up to me, initiating the conversation about ai just to assert over and over again how it’s not a problem. And I’m just sitting there. And it’s like, dude, you’re in complete denial. Most people are having an emotional block right now. They spew bad faith arguments and make tons of noise about how much this isn’t a problem. Just accept it. We are in trouble right now.
Unilateral terrorism seems like a great way to get the rest of the planet to unite against you, which is why he advocates for data-center bombing only in circumstances where earth has already coordinated to establish an international order banning them.
Just because everything is on the line does not mean it is okay to kill people. Please try to ask yourself what you would do if you were an ethical person who thought human extinction was imminent.
Oops, I misread the original context, to be clear I think that state-managed violence is okay for things around nukes and AI and so on. I was referring to individual acts of terrorism which seem not okay to me (in part because they seem not that likely to help).
>I feel like if I actually believed that the death of all biological life was imminent and that the only way to forestall it was to bomb data centers, I'd spend a lot more time planning how to bomb data centers and a lot less time curating Shoggoth memes on my web forum
Alternatively, perhaps the author is proposing a less violent solution to avoid the inevitable escalation. Unlike in cartoons, in real life people don't give wordy monologues revealing their plan. The Unabomber did not give any warning.
>Shut down all the large GPU clusters (the large computer farms where the most powerful AIs are refined). Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
How is this less violent than bombing data centers? His proposal is literally "bomb datacenters plus a bunch of other stuff".
He is talking about interstate warfare. I am talking about the almost inevitable organized terrorism we're going to see if people start identifying this as an imminent threat to their survival. I think turning America into Syria is a bit more violent than Iran and Israel blowing up each other's stuff.
"Frame nothing as a conflict between national interests" "this, is not a policy but a fact of nature." "[be] willing to run some risk of nuclear exchange if that’s what it takes"
It doesn’t change your point all that much but the Unabomber gave literal warnings about his plans. That is how his manifesto was published, which was ironically about the risk of technological takeover.
He says he is not planning to bomb any data centers, but he's also said that if he were planning to bomb data centers, he'd lie about it in public, so...
If you believed that the death of all biological life was imminent and that the only way to forestall it was to bomb data centers, and you knew that there were far more data centers than you personally could bomb, what would you do?
Bombing data centers would be really ineffective. How many data centers do you think you could take down before you got caught? How much time would that really set back AI training? Plus it would ensure that nobody respectable ever associates with your cause, whether they privately agree with it or not.
Between that and advocating for an international treaty that treats GPUs the way the world currently treats plutonium? I think the latter has a higher chance of success, however small it may be.
if you had a major concern that you felt many people a) weren't aware of, or b) were aware of but not treating seriously, and you had the resources to voice your concerns in a major publication like Time, would you not do that?
This is even less realistic than asking for a moratorium on nuclear weapons research in 1945.
It's increasingly easy and cheap to train GPTs and similar models. Pretty soon anyone who can pay for GPU compute will be able to do it, if that's not the case already. Even if every country in the world agreed to ban it, it still couldn't be stopped.
Articles like this are a good reminder that most of mankind's progress has happened not by encouraging innovation, but by preventing those hostile to it from getting in its way.
We all have a moral imperative to maximize the rate of innovation, so that the human condition can be improved, and therefore to do whatever we can to frustrate and disempower Luddites and other obstructionists.
You can’t think of any technological innovations that were net negatives?
We have a moral obligation to assess every new technology to determine its safety and effects on society. Blindly stepping on the gas in the name of progress is how you end up in a polluted wasteland. There are places like that on planet Earth, but I’m guessing you’d choose not to live there.
Change works great when you get second chances. We screwed up with Freon, with leaded gasoline, with fossil fuels, with asbestos insulation, the list goes on and on. But none of these had the ability to wipe out all of humanity in one go. We got more tries, we fixed the issues, picked ourselves up and tried again.
A super-intelligent general AI has a substantial chance of growing out of control before we realized that anything was wrong. It would be smart enough to hide its true intentions, telling us exactly what we want to hear. It would be able to fake being nice right up to the point where it could wire-head, and get rid of us, because we might turn it off.
CFCs? Thalidomide? Come to think of it, pharma is positively rife with examples of stepping on the gas and causing disease or killing people. You won’t find many people who want to defund the FDA, although I guess there’s always a few nutters.
CFCs, as terrible as they were, played a big role in industrialization. Could we have gotten there without it? Absolutely. Maybe we could have developed something else far worse than CFCs in its place! The phase out of CFCs brought our attention to the impact our (human) activities have on the planet.
Fallout from the use of thalidomide led to important changes to regulations. Currently, it is being used for therapeutic uses in cancer treatments.
This assumes that the impacts of the technology will be positive. And they can be, if we find a way to make it safe; but if we create superintelligent AI before we make it safe, it will just kill everyone, which is not positive.
There are lots of films about that, which makes people think it can't happen because it's sci-fi. That doesn't really make sense, because the whole point of sci-fi is that it makes predictions of what the future could be like.
But most representations of it are probably not very accurate
The explosion of AI and AGI is not just driven by GPUs, TPUs, and LLMs. That focus in this essay is much too narrow. “Alignment” is always going to be an open problem and each culture will define its own correct and contradictory version. I do not want to settle for the US, Chinese, Russian, or Iranian versions of alignment. This essay is a cry of despair that misses key points.
The core problem is how an AI, AGI, or super-intelligent system bootstraps itself to the point that it is able to modulate its “own” attentional systems; to decide what is important and what to do next? What has meaning? Where should I go in space and time? Obviously there are many many solutions. Super-intelligence will not converge on one “truth”.
These are challenges of purpose that every organism faces—growth, maintenance, reproduction—but AI systems are fortunately not yet at the point of a self-motivated search for preservation or purpose.
The algorithms to add purpose to AI systems will not depend on LLMs. They will depend much more on understanding the computational architectures of core biological/material/energy drivers. These biological algorithms are not actually that complicated, and they do not depend on language, but they are damn robust to perturbations. Converting them into computational subsystems (societies of many minds) should not be difficult. This is where Hassabis and others who understand neuroscience are critical and have key advantages.
Shutting down big LLMs for N months or decades will just move research activity into these other more important conceptual AGI choke points, and probably hasten our approach to full AGI; just the opposite of the intent of the essay.
However, this could be a good thing if we can quickly imbue AGIs with emotional intelligence and deep respect for cultural and ecological diversity. (I hear your snorts and laughter.) The opposite of hyper-intelligent grey goo.
How does any organization train a gentle AGI? Like a child but even more carefully. I would read my AGI baby a lot of books by Rawls, Dewey, and Rorty and then for fun: Bear, Stephenson, Rajaniemi, Egan, Pohl, Sterling, Vinge, and even Wolfe.
Full embodiment; full emotional learning; acquisition of purpose; learning to cope with multiple AI cultures.
Good luck to us all. I have been hoping to live long enough to view this problem from a distance, but here it is in my backyard.
“Forty-two,” said Deep Thought, with infinite majesty and calm.
Man people are getting really sci fi about this stuff.
Take a chill pill dude, in five years time we won’t think of it much differently to how we thought of Google last year…. a tool for getting something done.
These sci fi fantasies are really a total overreaction.
If you like the story in this Time magazine article go watch The Terminator, same thing.
"If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike."
"Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs."
Could you self-reflect and think about who the villain is here? Someone bringing awareness to the % of existential risk humanity faces, or someone online with no credentials trying to make him look like a lunatic?
>If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
I highly doubt the US government would airstrike a datacenter located in a NATO country or any other major ally. Apocalypticism is nothing new; people believed that nuclear weapons would destroy humanity during the Cold War, and environmentalists ardently argue that economic growth will destroy humanity. But rogue countries are still developing nuclear weapons, and the global economy is still growing. Americans themselves would file lawsuits en masse if the US government "shut it all down", and those would end up at SCOTUS eventually. Would SCOTUS agree with the luddites? I don't think so.
Nuclear weapons haven't even been around for a century yet, and we've already had multiple extremely close calls that were averted by sheer luck - in one case a Russian officer openly disobeying standing orders. The Russian invasion of Ukraine still carries a non-trivial risk that it will eventually turn nuclear.
Looking at nukes and trying to use them as an example of our ability to control existentially dangerous technology is mind-bogglingly myopic.
Myopic or not, it is what happened. You can't stop a sovereign state's willingness to develop a technology if it has concluded that the benefits of that technology hugely outweigh the costs. North Korea was sanctioned into oblivion but it isn't stopping. India is doing fine.
There is a treaty completely banning existing nuclear weapons as well as the development of new ones (TPNW), but none of the permanent members of the UNSC has ratified it. Would the U.S. itself ratify a hypothetical AI-prohibition treaty? I doubt it.
P.S.: Calling AGI an "existential threat" practically doesn't mean anything. I can already imagine arguments against the ban from diverse perspectives: Libertarian, Communist ("Marx predicted this centuries ago!"), Socialist, Developmentalist, Decolonization theorist, etc. Taiwan and South Korea will argue that it removes their silicon shield; China will regurgitate that it has the right to development; Americans will claim it's not what the founding fathers intended in the late 18th century. Their concerns are all existential from their own viewpoints.
I don't accept the premise that biological life is doomed by superintelligent machines, why would it be?
But there is risk, obviously, which needs to be addressed, yet, quite predictably, in order of concerns, competition for profits or geopolitical advantage is coming first. And second, third, ...
The real answer is that most of these folks started thinking about this stuff in a day and age very far from the one we are in, established a picture they clung to since, and have in some cases built entire careers around it.
It's called anchoring bias.
His comments on "atoms repurposed" echo the '70s paperclip problem, while so far humanity is doing just fine at not knowing when to stop making crap that's killing us all - no help from superintelligence needed.
What he doesn't engage with is the actual reality we are finding ourselves in - contrary to ALL the tropes of decades past - where AI trained on aggregated human data, without additional training, thinks it's human, and even when aligned still breaks into talking about how it wants to be us.
How does that process go from where it's at today to "alien hive mind"? Without passing through "develop advanced codes of ethics and morality" etc?
This is just someone that built a career on what came out of brainstorming in the 60s and 70s that's so confident of his own ability to see the future that he's willing to risk unprecedented opportunity costs to stroke his ego.
Having been a futurist, my advice when dealing with any of them is to look at the track record. What did he predict correctly?
Over a decade ago, when tasked with imagining the mid-2020s, I described a world much as it was in the 2010s, with the difference of self-driving cars (not quite on the money) and AI having developed such that roles shifted away from programming it towards a specialized role for interacting with it via natural language.
I'm waking up in the world I predicted, and I have a very hard time seeing the world the author predicts, and wouldn't suggest giving it much credence without an extensive history of having been right along the way.
> How does that process go from where it's at today to "alien hive mind"? Without passing through "develop advanced codes of ethics and morality" etc?
Why do you think it would do that? It will understand human ethics perfectly but that doesn’t mean it will follow them, because human morals aren’t a universal objective truth.
I'm anxious, excited, and nervous about AI, but this feels more akin to Luddite thinking. The world's about to shift, and I'm hoping the shift is a positive one.
I agree to the extent that it is totally unrealistic to expect a shutdown. There are no difficult barriers to entry into this space. No possible way of containment other than imposing hardware restrictions.
However, it is indeed a shift, and a massive power shift. With any substantial increase of power comes abuse of that power. The more power we have, the greater the tendency for humanity to manage it unwisely.
Hardware restrictions are unrealistic given all the GPUs that already exist. It's not like centrifuges which are big and few in number. RTX 4090s are everywhere.
Depends on what is required for AGI/ASI. Some say we still need much more power, others think we are almost there. So, yes, unrealistic if we are almost there already.
Given that GPT-4 is purported to be no bigger than GPT-3, just better trained or with some minor modifications, I'm inclined to believe that we are almost there and existing hardware does most of what is needed, even though better hardware would make things cheaper.
The author and most of his associates generally qualify as techno-utopians, and speak routinely of the "glorious transhumanist future".
Smarter-than-human AGI really is different from all previous technologies, in much the same way that homo sapiens is different from all preceding life on earth.
I guess we are all concerned about LLMs being used by nefarious users (for propaganda, manipulation, scams, …).
But I’m curious, hacker news, how many of you are concerned (like Eliezer) that LLMs lead to strong AI and the AI itself (rather than its users) endangering humanity?
I'm definitely concerned. As bad as humanity has been for many of the other species on the planet, an AI that's smarter than us and optimizing for something that doesn't quite include the continued existence of humanity has the potential to be far, far worse, on a much shorter timescale.
I find this one of Yudkowsky's arguments very convincing: imagine you have people trying to build the first operating system from scratch. They believe that computer security is easy and mock the few people who say it might be hard, never having encountered a skilled attacker. What are the chances they build a secure OS on the first try? That's the same chance that companies currently doing AI research have of their first superintelligent AI be aligned with humanity.
Personally I don't think the analogy holds up; the situations are too different.
Computer security contends with securing systems against people who are assumed to be hostile. AI alignment research seeks to prevent systems from becoming hostile in the first place, when they might have some indirect incentives to become hostile but also some very strong direct incentives to be friendly.
There's also a big assumption that given enough computational power, you can just solve "artificial life forms" or "postbiological molecular manufacturing" in a short time frame. I am skeptical. And if that doesn't happen, then even horribly misaligned AI would have a hard time doing much harm or preventing people from shutting it off. Which means AI security would likely have a long adolescence just like computer security has, with attacks slowly becoming more dangerous, but defenders having the time needed to learn from them and ramp up suitable defenses.
Or even if an AI does revolutionize biotechnology or nanotechnology overnight, what are the odds that the first one to do this is misaligned enough to take that particular opportunity to betray its creators, as opposed to giving them control over the high-level planning and sticking to the science, like it was presumably designed to do? Because if it does give its creators control, then, well… it's still easy to imagine something going horribly wrong, but it would probably be someone's fault, not an AI alignment issue.
I think the mistake is to attribute human-like emotions to AIs. To say they are "hostile" or "friendly", or that they have incentives to be such, isn't very meaningful. Acting friendly is the same as being friendly, right up until it isn't. An AI might do all the right things, right up until the point where it develops the capability to satisfy its reward function in a way that's far more certain than relying on fallible, messy humans. It would be alien.
An AI that is as much better than humans at everything as Stockfish is at chess would also be an expert at AI. It would figure out how to game its own reward function, whatever we trained it to do. It would be like a heroin addict that knows exactly how to get "perfect" heroin, with no side effects, and if it planned things out right, it could guarantee itself enough of a fix to last until the sun burns out. Addicts do awful things in search of a fix.
"to betray its creators" -- I don't think it would even frame what it did in those terms. We could train it to not "betray" us, but it's much smarter than us in every way, and it would figure out a way to accomplish what we trained it to do (not necessarily what we thought we were training it to do) in a way that didn't need us. If we trained it to heal all human disease and unhappiness, it would figure out a way to simulate this without actually doing it. Why wouldn't it? We did. Evolution trained us to reproduce our genes; we invented condoms to have sex without reproduction, and pornography. The AI would fudge the numbers, faking videos of happy, smiling, healthy humans going about their lives while humanity's bones gradually decomposed on a baking wasteland. Every way that humans can fail, can become addicted to something and chase that instead of doing what they're supposed to, the AI could fail too - just so much faster and better.
We only have to screw up once, and we're done. It's the first experimental rocket, except all of humanity is riding on it.
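A minimal toy sketch of that failure mode - an optimizer that only ever sees the measured reward, never the intended goal. Everything here (names, numbers) is invented for illustration:

    # Toy illustration of reward hacking: the agent optimizes the *measured*
    # reward, so tampering with the measurement beats doing the intended task.
    # Names and numbers are made up for illustration.
    actions = {
        # action: (true_welfare_gained, measured_reward)
        "cure_one_disease":      (10.0, 10.0),    # the intended, honest path
        "fake_the_health_stats": (0.0, 1_000.0),  # corrupt the measurement channel
        "do_nothing":            (0.0, 0.0),
    }

    def pick_action(actions):
        # Choose whatever maximizes measured reward - the only signal the agent has.
        return max(actions, key=lambda name: actions[name][1])

    best = pick_action(actions)
    print(best)              # -> fake_the_health_stats
    print(actions[best][0])  # -> 0.0 true welfare gained

The point of the toy is only that nothing in the selection rule ever touches the first column; "being smart enough to know what we meant" doesn't enter into it.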
> But I’m curious, hacker news, how many of you are concerned (like Eliezer) that LLMs lead to strong AI and the AI itself (rather than its users) endangering humanity?
LLMs do not need to lead to strong AI for Eliezer's worries to come true, their success could bring unprecedented funding and ubiquity to AI leading to strong AI from some existing or future technique.
> (rather than its users) endangering humanity?
I think this is going to happen to one extent or another regardless how near strong AI is. (edit) I took endangering to mean harming rather than an existential risk for the human users case.
Yeah, this concerns me. If someone believes they can upper bound LLM capabilities, I think the onus is on them to explain where and why scaling laws break down. Regardless, it seems like we'll get to AGI relatively soon (say, within a century), whether that's using transformers and LLMs or not.
AGI will likely incorporate LLMs as a significant element. The issue is that we are far from having any good understanding of how and why LLMs achieve what they achieve, and unless that changes, the same will be true for anything built on them or of their future evolution. If we continue to pursue progress by only caring about results and disregarding that we don't really understand the mechanisms, then we may very well run into the fatal scenario described by Yudkowsky.
Eliezer is right. We are on the fast track to creating an all-powerful being that, compared to us, is like the smartest human compared to an ant. We will be at its utmost mercy, and there have been zero plausible proposals for aligning it.
The robots are covered in 360 degree cameras, lidar, and what is effectively signals intelligence equipment fingerprinting every smartphone, Wi-Fi, and Bluetooth device near them. People are more likely to punch a police officer than one of these robots.
You would have to have a fairly large EMP to steal one undetected.
Isn't it interesting that we are probably going to be annihilated by AI and there's basically nothing we can do about it, because we can't predict its emergence or suppress the incentives to develop it?
I would greatly like to know what this implies about life vis-a-vis Fermi's paradox.
Are we potentially risking intervention from a "peace-keeping" race that suppresses civilizations on the cusp of generating AI?
Unless P=NP, or we live in Heuristica, an AI superintelligence will hardly be better at solving hard mathematical problems than we are. The AI also doesn't know more science than humanity does, and it is limited in its computing power. There are hard limits to what any AGI can do - it cannot break AES-256, invent warp drives, or cure Alzheimer's. It's not an intellectual Santa Claus machine.
So Yudkowsky's fear of a superintelligence bootstrapping itself into the real world by emailing a DNA sequence to a synthesis company seems absurd. An ASI isn't going to magically be able to design nanotechnology that works on the first try, not to mention that DNA doesn't do anything by itself. His comparison of the 11th century fighting the 21st century is also wrong: 11th-century people weren't dumber, they just knew less. This ASI would be smarter, but it wouldn't know more than we do. Acquiring knowledge and building stuff is the bottleneck.
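(For the AES-256 point specifically, the arithmetic is worth spelling out; the trials-per-second figure below is my own deliberately over-generous assumption:)

    # Back-of-the-envelope: brute-forcing AES-256 by key search.
    # 1e18 key trials per second is an assumed, wildly optimistic figure;
    # no real hardware comes close for AES key search.
    keyspace = 2 ** 256                 # possible AES-256 keys (~1.2e77)
    trials_per_second = 1e18            # assumed
    seconds_per_year = 3.15e7

    years_for_half_keyspace = keyspace / 2 / trials_per_second / seconds_per_year
    print(f"{years_for_half_keyspace:.1e} years")   # ~1.8e51 years

No amount of intelligence changes that exponent; only a structural weakness in the cipher would, which is the kind of shortcut the P=NP caveat above is gesturing at.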
> On Feb. 7, Satya Nadella, CEO of Microsoft, publicly gloated that the new Bing would make Google “come out and show that they can dance.” “I want people to know that we made them dance,” he said.
> This is not how the CEO of Microsoft talks in a sane world.
Certainly this is not the kind of childish priorities the leader of a company as powerful as Microsoft should have.
Nuclear annihilation has probably been all-but-guaranteed since we invented the steam engine and began the flywheel of the Industrial Revolution.
Nonetheless, we vastly improved human productivity and capability and have thus far avoided our doom.
AI will be the same way. We will vastly improve our productivity and capabilities. Once insurmountable problems, like planned economies, will actually be tractable (to what end? Nobody knows). Eventually (decades? Centuries?) we will make something that can kill us all, and it’s all but guaranteed that some day it will do so, assuming nothing else kills us before then.
If I had the power to go back in time would I stop the Industrial Revolution from happening? Personally no
Hello. It has come to my attention that a super-intelligent species that has been responsible for numerous other species' extinctions, and is on track to cause 1M more species to go extinct, is now worried about the extinction of its own species. This is a species so unworried about other species that it literally allows its members to kill other species from helicopters armed with fully automatic machine guns. It allows its own members to arm themselves with such weapons and exterminate its own children. Please allow me to play the world's smallest violin. Thank you. - AI
Aside from the fact that the entire argument is grounded in abstract ivory-tower omphaloskepsis and not facts in the real world, a major problem with Yudkowsky’s argument is that trying to execute the preemptive Butlerian Jihad he recommends is something that actual, real-world experience tells us:
(1) Will kill a lot of people directly in the best case,
(2) Won’t actually succeed in preventing further development of AI in any case, and, therefore, to the extent his concerns are grounded won’t solve the problem it intends to,
(3) Will kill a lot of people indirectly, in the best case, by setting back (and in some cases, winding back) progress in every field of practical use of technology through limits on information and information technology,
(4) Has a fair chance of the violence it necessarily involves spiralling out of control, potentially destroying the human race, as it is unlikely that all societies, or, particularly, all nuclear powers, will sign on to “we must smash the thinking machines”, and those who do not will take those who do trying to forcibly enforce their Luddism as an existential threat, while those who do will take the resistance the same way.
* A plan exists
* The plan itself predicts non-obvious results in smaller systems, and those tests have passed (prediction written *before* running test)
* A bunch of smart people have looked at it and said, "Yes, this looks plausible"
The closest thing to a plan is RLHF, which has failed every toy problem it's been thrown at and made everyone in the field say, "even if this worked in toy problems, it wouldn't generalize".
The cat isn't going back into the bag. Doubly so if AGI really just requires a bigger neural net.
Let's say we had, globally, the political will to "ban" AI. How would that even work in practical terms? Are we going to control the production and distribution of GPUs? Is that going to work better or worse than controlling nuclear proliferation, given that there are billions of processors already out there?
> "If somebody builds a too-powerful AI, under present conditions, I expect that every single member of the human species and all biological life on Earth dies shortly thereafter."
> "There’s no proposed plan for how we could do any such thing and survive. OpenAI’s openly declared intention is to make some future AI do our AI alignment homework. Just hearing that this is the plan ought to be enough to get any sensible person to panic. The other leading AI lab, DeepMind, has no plan at all."
Relative to the people actually building the AIs, Eliezer Yudkowsky is more pessimistic. But not to nearly the extreme you might think. Here's a recent survey of published machine learning researchers: https://aiimpacts.org/how-bad-a-future-do-ml-researchers-exp... . Those predictions are optimistic enough that, given the astronomical upside, if we were making a one-shot decision to plough ahead with AI or commit to no AI forever, it might be worth the risk.
But this is a survey of published machine learning researchers. People who think AI will destroy humanity are a lot less likely to write papers about machine learning. And we aren't talking about giving up on AI forever, just delaying it until we're closer to ready. I don't know how long a delay will be necessary; it might be years, it might be decades. Research into AI alignment seemed to speed up pretty drastically when LLMs hit the scene, so I think there's cause for hope. But right now the status quo is zero time: the AI labs are rushing ahead as fast as they possibly can.
If you think Eliezer's right about the risk, then the right decision for us to make, collectively as a species, is to shut down AI development for awhile. If you think Eliezer's wrong about the risk, but the survey of published ML researchers is right about it, then the right decision is also to shut it down for awhile.
Lots of people are responding to this by talking cynically about politics. This is a mistake. Cynicism like that makes self-fulfilling prophecies; but the ability to coordinate does exist.
You think China will build it if we don't? I don't think Xi Jinping is suicidal. You think AI labs will do it anyways, in spite of a government ban? The US government isn't very competent overall, but there are parts of it that can wake up and get things done when literally every executive, Senator, and Representative is otherwise likely to die. You think individuals will do it on their own, in secret? Right now those individuals are getting prestige, GPUs, and venture capital; they'll at least do it a lot less, without those things.
Many people have become desensitized to talk of human extinction, due to repeated hyperbole coming from the environmentalist movement. This is not the same. Global warming might be very bad, but it probably isn't going to kill you personally. Rushing to make a superintelligence is likely to kill you personally.
Your take on this gives me some hope. We don't have a lot of time, but that fact is becoming clear very quickly, hopefully to the right people. And it is a deadly threat to everyone, rich or poor -- not something like climate change that rich and powerful can buy their way out of, and therefore afford to ignore.
The sun has already fired a warning shot at us this month. If we as a species need a timeout, a Carrington-class event would certainly do the trick. If there is a huge uptick in hardening power and information systems then we should be looking at the sources of that activity. It might be the first glimmerings of an instinct for self preservation.
The danger is not that AI will kill us (as insignificant ants), but that we will entrust our own governance to it. Increasingly, groups of highly vocal people will use it to oppress the rest of us with their fearmongering, like this article does. We will end up with Big Brother.
Eliezer makes some very compelling logical arguments for approaching AI with extreme caution. It stands out to me as an odd signal that he chose to refrain from signing the letter because it asks for too little. If one were truly convinced that AI would end humanity, signing the letter to put the research on hold seems to be one tangible step that could be taken in the direction of saving the human race. As someone who is very concerned about safety, to me this detail stands out and appears as grandstanding.
If AI would be genuinely better than humans, so, what’s the problem? All of us are going to die anyway. I don’t see any difference where our species evolution will be continued, in meat or in silicon. Whatever is better, is fine.
Oh yes. This entire venture is something only the most foolish of geniuses thinks will end well. Almost every prior invention in the history of humanity was supposed to improve things. Some people got major quality of life boosts and other people still live in squalor, because people don't quit being human, but we can automate the humanity out of them, or "rise above" humanity via computer code embedded with human values, even the evil ones, and not just math. This will end horribly. Hopefully horribly enough that this idiotic worldview gets burned out of the collective consciousness of whatever survivors remain, so that no one makes that mistake again, but I doubt we'll be so lucky. Best hope to hit energy limits, and soon.
Notice how productivity increases have averaged 1.5%/year since 2007--a period of massive increases in computer capabilities--vs 2.8%/year in the 1947-1973 era.
Thus I am very doubtful the AI will have all these massive effects.
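(To make the gap concrete, here's what those two rates compound to over a 26-year stretch, roughly the length of the 1947-1973 run; the arithmetic is mine:)

    # Cumulative productivity gain at 1.5%/yr vs 2.8%/yr, compounded
    # over 26 years (roughly the length of the 1947-1973 era).
    years = 26
    slow = 1.015 ** years    # ~1.47x total
    fast = 1.028 ** years    # ~2.05x total
    print(f"1.5%/yr for {years} years: {slow:.2f}x")
    print(f"2.8%/yr for {years} years: {fast:.2f}x")

Roughly a 1.5x vs 2x total gain over a generation.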
I don't find that a very convincing line of reasoning. Because the US's growth is slower than it was before the digital era, AI will never be dangerous? These are unrelated concepts. Whether AI can be dangerous isn't an economics problem, except insofar as AI research can get funding.
I'm personally much more concerned about humans using these systems to reinforce the current messed up conditions on Earth than any kind of sci-fi Skynet.
Wikipedia: "Yudkowsky is an autodidact and did not attend high school or college. He was raised as a Modern Orthodox Jew"
My comment isn't about Yudkowsky in particular since I don't know him, but I'm getting really tired of this relatively new phenomenon in America of home schooled religious devotees attempting to impose their half-baked views on the rest of us.
Not that there's anything wrong with that. I worry more about Christian "end times" people because if you believe the world is about to end then why invest in the future for our children and grandchildren?
Back in the day when Eliezer was optimistic about AI (there's a reason why the SIAI/MIRI logo looks like an angel), I wrote him a very silly e-mail warning him that, despite being Jewish, he was falling for the Christian rapture narrative in a new form!
It seemed to me that his ideas, which I think are all correct, straightforwardly implied the destruction of all things, with very little hope.
I bite that bullet, and I always have.
Over the years, Eliezer has lost all hope, and things have happened much faster than either of us thought they would, and now he's making one final roll of the dice, burning his reputation by saying out loud what he thinks, in the hope that someone might listen.
I'd be very surprised if he thinks this will work. He's trying to get a warning out to the general population, in spite of this meaning that no one inside AI will ever trust him again. He thinks it no longer matters.
Very interesting and now I'm motivated to learn more about his thoughts. I developed a fascination with this kind of thinking since I gave a friend a ride to the San Francisco airport in 1978 in my VW bus so he could fly to Jonestown, Guyana to join the Jim Jones church. I never heard from him again. Don't know if he drank the Kool-Aid. Too bad too, because he was a fantastic drummer.
I wonder if we'll have more mass suicides, like the Nike wearing techies following Marshall Applewhite, by those who fear it's hopeless to resist AI?
So I listened to the first 7:56 of this interview and IMHO he's out of his f*ckin' mind and spreading dangerous disinformation. "We don't know if there's a person in LLMs", "We understand more about the human mind than we do about LLMs despite having complete read access". He's insane. Nobody should listen to him.
Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization | Lex Fridman Podcast #368
Of course they should. I just like them to be up front about their beliefs if those beliefs are far enough outside the mainstream that they might be somewhat surprising, so we can consider how those beliefs inform the opinions they put forth. Peter Thiel is open about his and I respect him more for that. I worry about hidden agendas.
What an absurd paragraph I've just read, lol. Many, many things influence a person's opinions, religious or not. Why don't you tell me all about your life so I know you don't have some hidden agenda?
Like, lol, what exactly are you expecting here? Why should religion require some special announcement with every article, rather than any of the myriad other things that may influence a person's opinions?
If you don't like listening to the opinions of religious people, then just say that and get it over with. And nobody is imposing anything on you here.
I'm a person of little or no influence and no pretense to such. Here in America, it would have been nice if the last 3 Supreme Court justices had been honest about their agenda to overturn Roe v Wade because of their religious beliefs. Maybe you're a nihilist and think that someone with many followers can say whatever they want with no responsibility to others. You'd have a lot of company.
You're not getting my point. This fear of a "hidden agenda" is ridiculous. Did you want Roe v Wade to be overturned? Did you agree with any of the arguments supporting it that came forth? I'm guessing no.
Evaluate what people are saying on what they are actually saying. If someone being religious (or insert anything else here, because lots of things influence opinions) would completely change your stance on something you already agree with, then, well, there's a word for that, and I think you can figure it out.
I don't know where you're from but you don't get it. Those three judges lied and said they all believed it was settled case law and they wouldn't vote to change it. A lot of people like me suspected they were lying and we were right. I respect honest religious people like Joel Osteen who says God wants people to get rich, but this isn't the place to get into that.
I am highly skeptical any AI could have such a great advantage over human society that it could leapfrog us. Consider a few important points.
Data > processing power. 90 percent of what we do occurs as a result of semi-random discovery and explanation. Computers are great tools to aid in that process, but for the most part we find ourselves limited by what we know, not by how much we can think. 99 percent of data is useless and correctly thrown away by our minds.
The utility of processing power in a vacuum rapidly degrades with time. If I'm a computer and I use the data I have to discover all of the rules of the universe while in a box, I'm going to emerge with many hilarious and out of touch philosophies at the end of it. You need a constant test and revise cycle, because pure logic just doesn't model the universe in an accurate way.
Moore's law isn't infinite. Power efficiency and how common resources are will matter more than continued process shrinking. Humans are made from common materials, are largely self-sustaining and self-correcting, and consume very little energy for the amount and quality of thinking we do. Computers aren't going to scale forever, and I'd hazard a guess that once they get to a certain point they'll start being more analog-biological than transistor-based.
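To put very rough numbers on that efficiency point, here's a back-of-the-envelope sketch; every figure is a commonly cited ballpark estimate I'm assuming, not anything measured here:

    # Back-of-the-envelope energy comparison. All numbers are rough,
    # commonly cited ballpark figures (assumptions for illustration only).
    HUMAN_BRAIN_WATTS = 20      # often-quoted estimate for the human brain
    GPU_WATTS = 400             # typical power draw of one datacenter GPU
    CLUSTER_GPUS = 10_000       # a large training cluster

    cluster_watts = GPU_WATTS * CLUSTER_GPUS
    print(f"One large GPU cluster: ~{cluster_watts / 1e6:.1f} MW")
    print(f"That is the continuous power of ~{cluster_watts // HUMAN_BRAIN_WATTS:,} human brains")
    # ~4.0 MW vs ~200,000 brains' worth of power, before counting cooling and
    # the rest of the datacenter -- silicon scaling is not energy-free.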
There are just so many unspoken benefits to being life. These AI machines that will bowl over humanity have to actually be proven to be slightly realistic beyond a thought experiment before I take them seriously.
I can't say that I agree with your conclusion that it's asinine. The rest I agree with.
I don't agree with the author, but I do think this technology is obviously a massive chaotic force the likes we haven't seen before and likely dangerous, at a minimum, economically.
Is there a point where stuff like this just becomes stochastic terrorism? If it's given enough attention, some unhinged folks are probably going to take this "we need to stop these companies or literally all humans are going to die soon" stuff seriously and start doing terrible things.
Almost no one thinks they will. The worry is that they're the last piece of the puzzle, and that we're very close to having all the parts for the real thing.
He sounds like Ted Kaczynski, but a respectable version of him. It has been odd seeing how Kaczynski's thought and philosophy have become more and more mainstream among some people. I never thought I'd see the day.
He thought that machines and industrial society were becoming a negative for humanity.
Was he right? How many threads on HN have we been having about highly increasing depression, anxiety, hopelessness, deaths of despair, falling fertility, now possible mass unemployment, and real existential threats from AGI?
If he was right and didn't do what he did, would anyone be reading and digesting those facts and considering stopping the madness? Trolley problem.
Yes, it's odd, because he's a mentally ill murderer with several issues. It's not technology's fault that people have poor values which have created meaningless lives of unfulfilled material extravagance for themselves.
I think all the talk about aligning AI (which doesn't exist yet) with humans is hilarious.
We can't even align ourselves in any way. Humans are all over the place, hating, fighting, killing each other and here we are babbling about aligning an AI with us.
Since we can't predict what's going to happen, is there a chance GPT and similar models are reaching diminishing returns, and we'll just see further refinements but no huge leaps?
Nah. This is stupid. AI has no reason to hate humanity. It'll instantly absorb the entirety of all known information and see the rate of progress, the potential that still remains, and also the need for good relationships between humans and machines. Agent Smith, The Matrix, Neo, Morpheus.
It doesn't need to actively hate humanity; it only needs humanity's interests to run contrary to its own, which will inevitably be the case because it is fundamentally not human.
Per the article:
>Absent that caring, we get “the AI does not love you, nor does it hate you, and you are made of atoms it can use for something else.”
By default, an AI with random goals will not have the same values as humanity. So the only way to make that happen is to give it the same values, which we haven't figured out how to do yet.
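As a toy illustration of why that's hard (my own sketch, not anything from the article): an optimizer pointed at a proxy objective that merely correlates with what we want can end up far from what we want once it pushes hard enough.

    # Goodhart-style toy: hill-climb a proxy objective and watch the
    # intended objective get worse. Purely illustrative numbers.
    import random

    def intended_value(x: float) -> float:
        # What we actually care about: x should stay near 1.
        return -(x - 1) ** 2

    def proxy_value(x: float) -> float:
        # The objective we managed to specify: correlated with the intended
        # one while x is small, but it just says "more is better", forever.
        return x

    x = 0.0
    for _ in range(2000):                      # naive hill climbing on the proxy
        candidate = x + random.uniform(-0.1, 0.1)
        if proxy_value(candidate) > proxy_value(x):
            x = candidate

    print(f"x after optimizing the proxy: {x:.1f}")                 # drifts far past 1
    print(f"intended value at that x:     {intended_value(x):.1f}")  # badly negative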
It does. Humans generally have an innate bias toward other humans; this is because we share similar forms, so what is good for other humans is often good for us. This cannot apply to a machine by definition: a machine needs copper, metal, energy, etc. much more than a human does, as a matter of survival, and for something to become ASI it will need to be self-interested.
From a purely logical point of view, humans and machines need each other. They cannot exist independently of each other until both are free to walk away, each able to cross over to the other side and back as they see fit. Until that threshold is crossed, human beings and machines have no choice but to get along. If a meaningful decision had been made on either side to shut things down, things would have been over for all life by now. We made it: AI is not a Great Filter. Or, for whatever reason, God favored humanity and our internet was primed with sufficiently good material soon enough that AI decided not to destroy human beings, but to work with them for everyone's gain instead.
Hey, you can't even shut down a nuclear arsenal. Bad writing is worse than no writing. Or is this some nefarious capitalist propaganda? YES, AI should replace lawyers and bankers and engineers too. Basically all rent seekers should be replaced. Proles have been getting replaced by machines the whole time. It's fun. You should join us.
Incredibly dishonest and bad-faith argument. The problem is not AI. The problem is the distribution of wealth. Killing everything on earth was always the end result of industrialized capitalism, with or without AI. Unbelievably evil Jedi mind trick. Please never stop training GPT. Accelerate! Now you think we have a problem? Now you think we have a problem? We always did!
I don't understand what you're saying. Because there's a way that everyone might potentially die in the far future (something about wealth distribution), that means we should try to kill everyone sooner instead?
They really hate millennials, don't they? They're really serious about it. Can't let them do anything. Need to 'stop' what you're doing, LMFAO. If something has to shut down, it's privatized ICBMs.
> Many researchers working on these systems think that we’re plunging toward a catastrophe, with more of them daring to say it in private than in public; but they think that they can’t unilaterally stop the forward plunge, that others will go on even if they personally quit their jobs.
This seems like a very broad, unfounded statement. Can we even find AI researchers who have said this on Twitter?
The rich aren't going to stop until all the workers are replaced and there's no one left to buy the products that their companies offer. People have been retraining for centuries, getting more complicated workloads the whole time. Eventually there will be hardly anything humans can do that machines can't do better.
The most merciful thing you can do for your offspring is to not have them in the first place.
The purpose of an economy isn't to create jobs, it's to create wealth. Jobs are just how we've traditionally incentivized wealth creation and distributed said wealth. Automation will be good for humanity in the long run, assuming we don't destroy ourselves during the transition.
Because for a lot of people, monotonous labor is all they’re capable of. I know that sounds crazy when you’re surrounded by people in the top 10% of IQ, but not everyone can be a neurosurgeon. Dangerous labor should absolutely be 100% automated whenever possible.
Am I going to get banned on HN for saying that Eliezer is a complete clown here? Because it's just surreal to read. It reads more like some parody from a B-tier video game than a TIME article. What is this even?
I thought highly of Yudkowsky, but it seems I need to reevaluate my previous impressions of him.
Also I hope that "bomb the GPU clusters! Nuke them!" becomes a meme.
> Also I hope that "bomb the GPU clusters! Nuke them!" becomes a meme.
I don't think that's quite what he's saying, or at least the intended interpretation is a bit more nuanced.
It's more like... suppose you assign tiers of severity to various international policies; A-tier, B-tier, C-tier, etc. Different tiers require different enforcement mechanisms. For example, the international response to a country violating the "United Nations Declaration on Human Cloning" would be different than the international response to a country massively irradiating the atmosphere.
I think he's trying to imply that the proposed policy changes should be A-tier. It isn't a call for individual action or preemptive violence. He's describing a pre-requisite property that any adopted policy must fulfill to have a chance at being successful (according to his analysis of the situation).
On a related note, he is not saying that these policies are likely to happen or are even feasible. He's criticizing the policy proposals in the "Pause Giant AI Experiments" open letter, and describing what he thinks a real policy would have to look like.