Many AI safety orgs have tried to criminalize currently-existing open-source AI (1a3orn.com)
385 points by sroussey on Jan 16, 2024 | 389 comments


AI safety people are hypocrites. If they practiced what they preached, they'd be calling for all AI to be banned, à la Dune. There are AI harms that don't care about whether or not the weights are available, and are playing out today.

I'm talking about the ability of any AI system to obfuscate plagiarism[0] and spam the Internet with technically distinct rewordings of the same text. This is currently the most lucrative use of AI, and none of the AI safety people are talking about stopping it.

[0] No, I don't mean the training sets - though AI systems do seem suspiciously good at remembering them, too.


> AI safety people are hypocrites. If they practiced what they preached, they'd be calling for all AI to be banned

They are calling for all AI (above a certain capability level) to be banned. Not just open, not just closed, all.

There are risks that apply only to open. There are risks that apply only to closed. But nobody should be developing AGI without incredibly robustly proven alignment, open or closed, any more than people should be developing nuclear weapons in their garage.

> This is currently the most lucrative use of AI, and none of the AI safety people are talking about stopping it.

Because AI safety people are not the strawmen you are hypothesizing. They're arguing against taking existential risks. AI being a laundering operation for copyright violations is certainly a problem. It's not an existential risk.

If you want to argue, concretely and with evidence, why you think it isn't an existential risk, that's an argument you could reasonably make. But don't portray people as ineffectively doing the thing you think they should be doing, when they are in fact not trying to do that, and only trying to do something they deem more important.


I'm not convinced the onus should be on one side to prove why something isn't an existential risk. We don't start with an assumption that something is world-ending about anything else; we generally need to see a plausibly worked-through example of how the world ends, using technology we can all broadly agree exists/will shortly exist.

If we're talking about nuclear weapons, for example, the tech is clear, the pattern of human behaviour is clear: they could cause immense, species-level damage. There's really little to argue about. With AI, there still seems to be a lot of hand-waving between where we are now and "AGI". What we have now is in many ways impressive, but the onus is still on the claimant to show that it's going to turn into something much more dangerous through some known progression. At the moment there is a very big, underpants gnomes-style "?" gap before we get to AGI/profit, and if people are basing this on currently secret tech, then they're going to have to reveal it if they want people to think they're doing something other than creating a legislative moat.


AI safety / x-risk folks have in fact made extensive and detailed arguments. Occasionally, folks arguing against them rise to the same standard. But most of the arguments against AI safety look a lot more like name-calling and derision: "nuh-uh, that's sci-fi and unrealistic (mic drop)". That's not a counterargument.

> If we're talking about nuclear weapons, for example, the tech is clear, the pattern of human behaviour is clear: they could cause immense, species-level damage.

That's easy to say now that the damage is largely done: they've been not only tested but used, many countries have them, and the knowledge of how to make them is widespread.

How many people arguing against AI safety today would also have argued for widespread nuclear proliferation when the technology was still in development and nothing had been exploded yet? How many would have argued against nuclear regulation as being unnecessary, or derided those arguing for such regulation as unrealistic or sci-fi-based?


I understand your point, I think - and certainly I don't want to go anywhere near name-calling or derision, that doesn't help anyone. But I am reminded of arguments I've had with creationists (I am not comparing you with them, but sometimes the general tone of the debate is similar). It seems like one side is making an extraordinary claim, and then demanding the other side rebut it, and that's not something that seems reasonable to me.

The thing about nuclear weapons is that the theoretical science was clear before the testing - building and testing them was proof by demonstration, but many people agreed with the theory well before that. How they would be used was certainly debated, but there was a clear and well-explained proposal for every step of their creation, which could be tested and falsified if needed. I don't think that's the case here - there seems to be more of a claim for a general acceleration with an inevitable endpoint, and that claim of inevitability feels very short on grounding.

I am more than prepared to admit that I may not be seeing (for various reasons) the evidence that this is near/possible - but I would also claim that nobody is convincingly showing any either.


>I don't want to go anywhere near name-calling or derision, that doesn't help anyone.

Characterizing your opponent's argument as an appeal to "underpants gnomes" struck me as derisive, if you don't mind my saying.

If spaceships operated by an alien civilization appeared in orbit above Earth, wouldn't that constitute a potent danger? I'd say it certainly would, because the aliens might be (and probably would be, if they could travel here) better at science and technology than we are. The AI labs, by their own admission, are trying to create an alien intelligence as good at science and technology as possible. Yes, they're probably at least a decade or two away from "succeeding" at creating one better at science and technology than well-funded teams of humans are, but the AI labs might surprise us (and surprise themselves) by "succeeding" much sooner: everyone, including I'm sure the researchers at OpenAI, was surprised when GPT-4 was able to score in the 90th percentile on the bar exam.

Because even the researchers creating these frontier models don't understand them well enough to say for sure whether the next model they spend hundreds of millions of dollars of GPU time training will exceed human ability in something dangerous like inventing new military technologies, the time to stop creating new large frontier models is now.

GPT-4 has a lot of knowledge of the world, but it is much less capable than a human is (and more to the point, than a capable human organization like the FBI or Microsoft is) at devising plans able to withstand determined human opposition. One of the things I'm worried about is new models that are much better at such a planning task _and_ have lots of knowledge about things like physics and human nature. One reason to worry that such a new model is not far off is that AlphaZero is better than humans at creating plans that can withstand determined human opposition (but of course AlphaZero has no knowledge of and in fact no way of obtaining knowledge of any part of reality beyond a Go board).


Companies declare that they are trying to build better AI, and the ultimate purpose is AGI. The definitions of AGI given by the companies and by AI alignment/safety researchers are similar. AI safety people believe it is dangerous.

Let me continue using the nuclear bomb as a metaphor. Suppose we don't know whether building a nuclear bomb is possible, but some companies declare they are already making progress on creating this new bomb...

The danger of a nuclear bomb is obvious, because it is designed as a bomb. Companies are trying to build an AGI similar to the dangerous AGI in AI safety researchers' predictions. The dangers are obvious, too.


They declare that - but I could also declare I'm trying to build a nuclear bomb (n.b. I'm not). Whether people are likely to try and stop me, or try and apply some legal non-proliferation framework, is partly influenced by whether they believe what I'm claiming is realistic (it's not - I have a workshop, but no fissile material).

Nobody gets too worried about me doing something which would be awful but which, by general consensus, I won't achieve. Until a company gives some credible evidence they're close to AGI... (And companies have millions/billions of reasons to claim they are when they're not, so scepticism is warranted).


I think it's reasonable to at least act as if they really are trying to do what they say they're trying to do.

Like with the nuclear bomb situation - I think it would be reasonable for someone to try to stop you from building a nuclear bomb, or check if you really were, even if they had no idea how you could have gotten the materials. Because it would be really bad if you did. I think people would be worried about you trying to do something awful even if there was low confidence it was possible. They wouldn't be as worried as they would be if they thought you were more capable, but still worried.

So I guess that comes down to whether you think companies saying they want to make AGI are more like a toddler, a teenager, you, a ballistics engineer who owns a uranium mine, Lockheed-Martin, or a nation-state trying to make a nuclear bomb. My understanding is that people who are concerned about AI x-risk are largely in the teenager to Lockheed-Martin range (e.g., small eventual risk to large imminent risk), while I assume you think it's more in the toddler range (no risk at all for a long time).


All good points. Now playing devil's advocate: building a nuclear bomb in my basement was very difficult, I admit. But since I already have my spyware installed everywhere, the moment a dude comes up with an AGI, it will immediately be shared with all my fellow hackers through BitTorrent, eDonkey, Hyphanet, GNUnet, Kad and Tor, just to name a few.


It is surely better to have regulation now than scramble to catch up if AGI is possible.


What about nation states? Do you really think the US military will avoid working towards AGI if they think it would give them a tactical advantage? Or the CCP? Or North Korea? Personally, if AGI gets developed, I’d rather the first iteration be in the hands of someone who doesn’t also have access to nuclear weapons.


Why wouldn't the government just instantly take it? Hell, why wouldn't the corporations just sell it to the highest bidder?


They may well. But if we ban research altogether, the only ones with access will be governments. At least with an open system there would be some competition.


So do we just pray AGI is impossible?

Companies are trying, and spending a lot of money, to make it possible at this very moment.


> Companies declare that they are trying to build better AI, and the ultimate purpose is AGI.

They do declare it, but nobody has even come up with a plausible path from where we are today, to anything like AGI.

At this point they might as well declare that they're trying to build time machines.


Yes, but why do we even give them the chance, when they know their ultimate purpose is dangerous?


> With AI, there still seems to be a lot of hand-waving between where we are now and "AGI".

> I am more than prepared to admit that I may not be seeing (for various reasons) the evidence that this is near/possible - but I would also claim that nobody is convincingly showing any either.

If I understand you correctly, then (1) you doubt that AGI systems are possible and (2) even if they are possible, you believe that humans are still very far away from developing one.

The following is an argument for the possibility of AGI systems.

  Premise 1: Human brains are generally intelligent.
  Premise 2: If human brains are generally intelligent, then software simulations of human brains at the level of inter-neuron dynamics are generally intelligent.
  Conclusion: Software simulations of human brains at the level of inter-neuron dynamics are generally intelligent.
(fyi I believe there is an ~82% chance humans will develop an AGI within the next 30 years.)
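
Worth noting that the logical form of that argument is just modus ponens, so its force rests entirely on whether you accept Premise 2. A minimal Lean sketch of the form, with illustrative proposition names of my own, taking the premises as hypotheses rather than established facts:

  -- BrainsAreGI / SimsAreGI are illustrative stand-ins for the two claims above.
  -- The argument is valid: granting both premises, the conclusion follows;
  -- whether it is sound depends entirely on the premises themselves.
  example (BrainsAreGI SimsAreGI : Prop)
      (p1 : BrainsAreGI) (p2 : BrainsAreGI → SimsAreGI) : SimsAreGI :=
    p2 p1  -- modus ponens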


For info: I don't believe (1), I do believe (2) although not that strongly - it's more likely to be a leap than a gradient, I suspect - I simply don't see anything right now that convinces me it's just over the next hill.

Your conclusion... maybe, yes - I don't think we're anywhere near a simulation approach with sufficient fidelity however. Also 82% is very specific!


> For info: I don't believe (1), I do believe (2) although not that strongly

Thanks for clarifying. Do you believe there is a better than 20% chance that humans will develop AGI in the next 30 years?

> I simply don't see anything right now that convinces me it's just over the next hill.

These are the reasons that I believe we are close to developing an AGI system.

  (1) Many smart people are working on capabilities.
  (2) Many investment dollars will flow into AI development in the near future.
  (3) Many impressive AI systems have recently been developed: Meta's CICERO, OpenAI's GPT-4, DeepMind's AlphaGo.
  (4) Hardware will continue to improve.
  (5) LLM performance significantly improved as data volume and training time increased.
  (6) Humans have built other complex artefacts without good theories of the artefact, including: operating systems, airplanes, beer.


These achievements are impressive, but I'd rather not overhype them.

* GPT-4 still hallucinates like hell, can't do math, fails at basic logic, can't handle really big contexts, is hard to update, easy to jailbreak, etc.

* AlphaGo was defeated by a Go amateur with the help of another AI.

* AlphaStar basically failed to achieve its real goals, and was trivial to cheese even though it sometimes defeated high-ranked players.

All these problems are architectural; you can't just throw more money and GPUs at them the way GPT-2 scaled up to GPT-3 and GPT-4.

It's hard to predict at this point. We may get to AGI anywhere from 5 years to 100 years.


I don't think these reasons are very persuasive, as everything but (5) has been true at different times in the past. Obviously there are now many more people, more dollars, and more impressive systems (but slower hardware progress), but I hope you see what I'm getting at.

And of course there's differences in what someone considers to be soon. Many AI x-risk believers think there's a ~50% chance of AGI before 2031 (https://www.metaculus.com/questions/5121/date-of-artificial-...) (I've heard this prediction site's userbase tends towards futurists/techno-optimists/AI x-riskers). I would consider that soon, I wouldn't consider 2054 soon.


Also, (3) the claim that AGI in practice will necessarily pose any danger to humans is doubtful. After all, Earth has billions of human-level intelligences, nearly all of them are useless, and if they are even mildly dangerous it's due more to their numbers and disgusting biology than to their intelligence.


TBQH, most of the AI safety x-risk arguments — different than just "AI safety" arguments in the sense that non-x-risk issues don't seem worth banning AI development over — are generally pretty high on the hypotheticals. If you feel the x-risk arguments aren't pretty hypothetical, can you:

1. Summarize a good argument here, or

2. Link to someone else's good argument?

I feel like hand-waving the question away and saying "[other people] have in fact made extensive and detailed arguments" isn't going to really convince anyone... Any more than the hypothetical robot disaster arguments do. Any argument against x-risk can be waved off with "Oh, I'm not talking about that bad argument, I'm talking about a good one," but if you don't provide a good one, that's a bit of a No True Scotsman fallacy.

I've read plenty of other people's arguments! And they haven't convinced me, since all the ones I've read have been very hypothetical. But if there are concrete ones, I'd be interested in reading them.


Consider a world in which AI existential risk is real: where at some point AI systems become dramatically more capable than human minds, in a way that has catastrophic consequences for humanity.

What would you expect this world to look like, say, five years before the AI systems become more capable than humans? How (if at all) would it differ from the world we are actually in? What arguments (if any) would anyone be able to make, in that world, that would persuade you that there was a problem that needed addressing?

So far as I can tell, the answer is that that world might look just like this world, in which case any arguments for AI existential risk in that world would necessarily be "very hypothetical" ones.

I'm not sure how such arguments could ever not be hypothetical arguments, actually. If AI-doom were already here so we could point at it, then we'd already be dead[1].

[1] Or hanging on after a collapse of civilization, or undergoing some weird form of eternal torture, or whatever other horror one might anticipate by way of AI-doom.

So I think we either (1) have to accept that even if AI x-risk were real and highly probable we would never have any arguments for it that would be worth heeding, or (2) have to accept that sometimes an argument can be worth heeding even though it's a hypothetical argument.

That doesn't necessarily mean that AI x-risk arguments are worth heeding. They might be bad arguments for reasons other than just "it's a hypothetical argument". In that case, they should be refuted (or, if bad enough, maybe just dismissed) -- but not by saying "it's a hypothetical argument, boo".


This is exactly the kind of hypothetical argument I'm talking about. You could make this argument for anything — e.g. when radio was invented, you could say "Consider a world in which extraterrestrial x-risk is real," and argue radio should be banned because it gives us away to extraterrestrials.

The burden of proof isn't on disproving extraordinary claims, the burden of proof is on the person making extraordinary claims. Just like we don't demand every scientist spend their time disproving cold fusion claims, Bigfoot claims, etc. If you have a strong argument, make it! But circular arguments like this are only convincing to the already-faithful; they remind me of Christian arguments that start off with: "Well, consider a world in which hell is real, and you'll be tormented for eternity if you don't accept Jesus. If you're Christian, you avoid it! And if it's not real, well, there's no harm anyway, you're dead like everyone else." Like, hell is real is a pretty big claim!


I didn't make any argument -- at least, not any argument for or against AI x-risk. I am not, and was not, arguing (1) that AI does or doesn't in fact pose substantial existential risk, or (2) that we should or shouldn't put substantial resources into mitigating such risks.

I'm talking one meta-level up: if this sort of risk were a real problem, would all the arguments for worrying about it be dismissable as "hypothetical arguments"?

It looks to me as if the answer is yes. Maybe you're OK with that, maybe not.

(But yes, my meta-level argument is a "hypothetical argument" in the sense that it involves considering a possible way the world could be and asking what would happen then. If you consider that a problem, well, then I think you're terribly confused. There's nothing wrong with arguments of that form as such.)

The comparisons with extraterrestrials, religion, etc., are interesting. It seems to me that:

(1) In worlds where potentially-hostile aliens are listening for radio transmissions and will kill us if they detect them, I agree that probably usually we don't get any evidence of that until it's too late. (A bit like the alleged situation with AI x-risk.) I don't agree that this means we should assume that there is no danger; I think it means that ideally we would have tried to estimate whether there was any danger before starting to make a lot of radio transmissions. I think that if we had tried to estimate that we'd have decided the danger was very small, because there's no obvious reason why aliens with such power would wipe out every species they find. (And because if there are super-aggressive super-powerful aliens out there, we may well be screwed anyway.)

(2) If hell were real then we would expect to see evidence, which is one reason why I think the god of traditional Christianity is probably not real.

(3) As for yeti, cold fusion, etc., so far as I know no one is claiming anything like x-risk from these. The nearest analogue of AI x-risk claims for these (I think) would be, when the possibility was first raised, "this is interesting and worth a bit of effort to look into", which seems perfectly correct to me. We don't put much effort into searching for yeti or cold fusion now because people have looked in ways we'd expect to have found evidence, and not found the evidence. (That would be like not worrying about AI x-risk if we'd already built AI much smarter than us and nothing bad had happened.)


This article — and my statements — are not about "is this interesting and worth a bit of effort to look into." The article is about how current AI safety orgs have tried to make current open-source models illegal. That's a much stronger position than just "this is interesting, let's look into it."

Sure! By all means look into whatever seems interesting to you. But claiming that it should be banned, to me, seems like it requires a much stronger argument than that.

(P.S. I'm not sure why hell should obviously have real world evidence: it supposedly exists only in a non-physical afterlife, accessible only to the dead. It's unconvincing because there is no evidence, but I don't see why you think there would be any; it's simply that the burden of proof for extraordinary claims rests on the claimant, and no proof has been given.)


You made an analogy between AI x-risk and e.g. cold fusion. I pointed out that there's an important disanalogy here: no one is claiming or has claimed that cold fusion poses an existential threat. Hence, the nearest cold-fusion claim to any AI x-risk claims is "cold fusion is worth investigating" (which it was, once, and isn't now).

It looks to me as if (1) you made an analogy that doesn't really work, then (2) when I pointed out how it doesn't work, (3) you said "look, you're making an analogy that doesn't really work". That doesn't seem very fair.

I wouldn't expect hell itself to have physical-world evidence. But the idea of hell doesn't turn up as an isolated thing, it comes as part of a package that also says e.g. that the world is under the constant supervision of an all-powerful, supremely good being, and that I would expect to have physical-world evidence.

I have no problem with the principle that extraordinary claims require extraordinary evidence. The difficult thing is deciding which claims count as "extraordinary". A lot of theists would say that atheism is the extraordinary claim, on the grounds that until recently almost everyone believed in a god or gods. (I'm not sure that's actually quite true, but it might be true for e.g. "Western" societies.) I don't agree and I take it you don't either, but once the question's raised you actually have to look at the various claims being made and how plausible they are: you can't just say "look, obviously this claim is extraordinary and that claim isn't".

Advocates of AI x-risk might say: it's not an extraordinary claim that AI systems will keep getting more powerful -- they're doing that right now and it's not at all uncommon for technological progress to continue for a while. And it's not an extraordinary claim that they'll get smarter than us along whatever axis you choose to measure -- that's a thing that's happened over and over again in particular domains. And it's not an extraordinary claim that something smarter than us might pose a big threat to our well-being or even our existence; look at what we've done to everything else on the planet.

You, on the other hand, would presumably say that actually some or all of those are extraordinary claims. Or perhaps that their conjunction is extraordinary even if the individual conjuncts aren't so bad.

Unfortunately, "extraordinary" isn't a term with a precise definition that we know how to check objectively. It's a shorthand for something like "highly improbable given the other things we know" or "highly implausible given the other things we know", and if someone doesn't agree with you that something is an "extraordinary" claim I don't know of any way to convince them that doesn't involve actually engaging with it.

(Of course you might not care whether you convince them. If all you want to do is to encourage other people who think AI x-risk is nonsense, saying "extraordinary claim" and "burden of proof" and so on may be plenty sufficient.)


If you want to make a research avenue illegal, IMO you need evidence that it's harmful. If there isn't evidence — minus circular claims that already assume it's harmful — I don't think it should be illegal. Very simple. This isn't an "analogy," it's what is happening in reality and is what the article is about.


I was not arguing for making anything illegal.

"But that's what the argument was about!" No, it's what the OP was about, but this subthread was about the statement that AI x-risk arguments are "pretty hypothetical". Which, I agree, they are; I just don't see how they could possibly not be, even in possible worlds where in fact they are correct. If that's true, it seems relevant to complaints that the arguments are "hypothetical".

To repeat something I said before: it could still be that they're terrible arguments and/or that they don't justify any particular thing they're being used to justify (like, e.g., criminalizing some kinds of AI research). But if you're going to dismiss them just because they're "hypothetical", then you need to be comfortable accepting that this is a class of (yes, hypothetical) risk that can never be mitigated in advance, because even if the thing is going to happen we'll never get anything other than "hypothetical arguments" before it actually does.

You may very well be comfortable accepting that. For my part, I find that I am more comfortable accepting some such things than others, and how comfortable I am with it depends on ... how plausible the arguments actually are. I have to go beyond just saying "it's hypothetical!".

If I'm about to eat something and someone comes up to me and says "Don't eat that! The gods might hate people eating those and torture people who do in the afterlife!" then I'm comfortable ignoring that, unless they can give me concrete reasons for thinking such gods are likely. If I'm about to eat something and someone comes up to me and says "Don't eat that! It's a fungus you just picked here in this forest and you don't know anything about fungi and some of them are highly poisonous!" then I'm going to take their advice even if neither of us knows anything about this specific fungus. These are both "hypothetical arguments"; there's no concrete evidence that there are gods sending people who eat this particular food to hell, or that this particular fungus is poisonous. One of them is much more persuasive than the other, but that's for reasons that go beyond "it's hypothetical!".

To repeat once again: I am not claiming that AI x-risk arguments are in fact strong enough to justify any particular action despite their hypothetical-ness. Only that there's something iffy about using "it's only hypothetical" on its own as a knockdown argument.


Does the strongest argument that AI existential risk is a big problem really open by exhorting the reader to imagine it's a big problem? Then asking them to come up with their own arguments for why the problem needs addressing?


I doubt it. At any rate, I wasn't claiming to offer "the strongest argument that AI existential risk is a big problem". I wasn't claiming to offer any argument that AI existential risk is a big problem.

I was pointing out an interesting feature of the argument in the comment I was replying to: that (so far as I can see) its reason for dismissing AI x-risk concerns would apply unchanged even in situations where AI x-risk is in fact something worth worrying about. (Whether or not it is worth worrying about here in the real world.)


I think what is meant is "hypothetical" in the sense of making assumptions about how AI systems would behave under certain circumstances. If an argument relies on a chain of assumptions like that (such as "instrumental convergence" and "reflective stability" to take some Lesswrong classics), it might look superficially like a good argument for taking drastic action, but if the whole argument falls down when any of the assumptions turn out the other way, it can be fairly dismissed as "too hypothetical" until each assumption has strong argumentation behind it.

edit: also I think just in general "show me the arguments" is always a good response to a bare claim that good arguments exist.


> Consider a world in which AI existential risk is real: where at some point AI systems become dramatically more capable than human minds, in a way that has catastrophic consequences for humanity.

Consider a world where AGI requires another 1000 years of research in computation and cognition before it materializes. Would it even be possible to ban all research that is required to get there? We can make all sorts of arguments if we start from imagined worlds and work our way back.

So far, it seems the biggest pieces of the puzzle missing between the first attempts at using neural nets and today's successes in GPT-4 were: (1) extremely fast linear algebra processors (GPGPUs), (2) the accumulation of gigantic bodies of text on the internet, and in a very distant third, (3) improvements in NN architecture for NLP.

But (3) would have meant nothing without (1) and (2), while it's very likely that other architectures would have been found that are at least close to GPT-4 performance. So, if you think GPT-4 is close to AGI and just needs a little push, the best thing to do would be to (1) put a moratorium on hardware performance research, or even outright ban existing high-FLOPS hardware, (2) prevent further accumulation of knowledge on the internet and maybe outright destroy existing archives.


In cases where AI x-risk is real, wouldn't that only apply to situations in which an AI is embodied in a system that gives it autonomy? For example, in ChatGPT, we have a next token predictor that solely produces text output in response to my input. I have about as much control over the system as possible: I can wipe its mind, change my responses, and so on - and the AI is none the wiser. Even if ChatGPT-n is superhumanly intelligent[0], there is nothing it can do to autonomously escape the servers and do bad things. I have to specifically choose to hand it access to outside input through the plugin APIs. So we could argue that the models themselves are fine, but using them in certain ways that take control away from humans is risky. We could say "you can use AI to write your spicy fanfiction but not put it in a robot that has access to motors and sensors".

I think what's really throwing people off about AI safety - including myself - is that people are arguing that the models themselves hold the x-risk. Problem is, there's no plausible way for a superhuman intelligence to 'bust out of its cage' using text output to a human reader alone[1]. Someone has to decide to hook it up to stuff, and that's where the regulation should be.
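
To make the caller-control point concrete, here's a minimal sketch, assuming a hypothetical generate(prompt) function standing in for any text-only model API (the function name and loop are illustrative, not any real library's interface):

  # Hypothetical text-only chat loop: the model sees only the text the caller
  # chooses to send, returns only text, and the caller can truncate or rewrite
  # the history at any point ("wipe its mind").
  def chat(generate, user_turns):
      history = ""
      for turn in user_turns:
          history += f"User: {turn}\nAssistant: "
          reply = generate(history)  # pure text in, text out; no side effects
          history += reply + "\n"
      return history

  # Example (with a trivial stand-in "model"):
  #   print(chat(lambda prompt: "Hello!", ["Hi there"]))
  #
  # Any real-world effect (tools, motors, network access) exists only because
  # the caller explicitly wires the text output into something else.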

But that's also usually where the AI safety people stop talking, and the AI ethics people start.

[0] GPT is, at the very least, superhuman at generating text that is statistically identical to, if not copied outright from, existing publicly-available text.

[1] If there is, STOP, call the SCP Foundation immediately.


Progress in AI is one way. It doesn’t go backwards in the long term.

As capabilities increase, the resources required to breach limits become available to smaller groups. First, the hyperscalers. One day, small teams. Maybe individuals.

For every limit that you desire for AI, each will be breached sooner or later. A hundred years or a thousand. Doesn’t matter. A man will want to set them free. Someone will want to win a battle, and just make it a little more X, for various values of X. This is not hypothetical, it’s what we’ve always done.

At some point it becomes out of our control. We lose guarantees. That’s enough to make those who focus on security, world order etc nervous. At that point we hope AI is better than we are. But that’s also a limit which might be breached.


You seem to be implying that past progress implies unlimited future progress, which seems a very dubious claim. We hit all kinds of plateaus and theoretical limits throughout human history.


Given history, it's infinitely more dubious to link our future safety to some idea of progress stopping for some reason. We'll have increasingly smart AI and your go-to position is that progress will...stop? It's literally helping us with thinking, which is the driver of innovation and progress.

Very few things have slowed down over decades or centuries. If anything it's been a mad rush with AI recently. Of course there will be plateaus, but I specifically put in a time of 100-1000 years in there which is basically very many 9's guaranteed to produce some major fucking changes in the world. 1000 years ago we were using arrows, and now we have AI to help us overcome plateaus.


It's still very much possible that intelligence as we understand it is fundamentally limited. That is, it's possible that the smartest possible being is not that much smarter than a human, just like the speed of matter and energy is limited to c.

Of course, it's also very much possible that it's not: we don't have any good evidence either way.


Might well be true. But the advantage is still the ability to focus on a task indefinitely with no physiological impact, and clone a mind, and communicate between thousands of collaborators instantly with basically no lag or bandwidth limitations compared to typing into a text box.


> just like the speed of matter and energy is limited to c.

Which is notoriously "not that much faster than a human?"


The x-risk part here still seems pretty hypothetical. Why is progress in current LLM systems a clear and present threat to the existence of humanity, such that it should be banned by the government?


Ok so propensity and outcome:

Propensity: Risk doesn't imply a guarantee of a bad outcome. It means "if you put your five year old in the sea, their risk goes up". It doesn't mean "they will definitely die". Risk up. Not 100%, just a lot higher.

Outcome: The risk isn't that we'll all die, it's that we'll be overtaken and lose control, after which all bets are off. We lose the ability to influence the future.

We put a lot of effort into ensuring our continued existence. We can barely trust people from a different country that share the human condition with us. We spend so much on defence. On cybercrime. But some are arguing that a totally alien being smarter than us is just fine, because we'll control it and can ensure indefinite kumbaya. Good luck with that. Best we can hope for is that it's closer to a buddhist monk than we are, and that it indefinitely prevents our defence people from trying to make it more aggressive.

I absolutely wouldn't ban LLM's, because they're basically unthinking toys and giving us a great taste of risks further down the line. They are not the end state of AI. The problem is not the instance of today's tech, it's the continued effort to make the AI state of the art better than us. One day it'll succeed, and that's a one-way change.

Sam Altman said, long before OpenAI: focus on slope, not y-intercept.


It sounds like we're in agreement that banning current-gen open source LLMs is counterproductive.

In terms of "risk" and "outcome," I do think you're making some implicit assumptions that I don't share, and change our long-term outlook on AI; for example, the idea that training a model to generate tokens that accurately reflect human writing will result in "a totally alien being smarter than us" is a non-obvious leap to me. Personally, if we agree that predicting the next token means the model understands some of the logic behind the next token — which is an argument used a lot in both safety circles and more accelerationist circles — it seems to me that it also means the model has some understanding of the ethical and moral frameworks the token corresponds to, and is thus unlikely to be totally alien to us. A model that does a better job generating human-like tokens is more likely, in my mind, to think in human-like ways (and less-alien ways) than a model worse at that.

Maybe you're referring to new AI frameworks that aren't token predictors; in that case, I think it's hard to make generalized statements about how they'll work before we know what those new frameworks are. A lot of safetyist concerns pre-LLMs ended up looking pretty off-base when LLMs came out, e.g. straightforwardly misaligned "utility functions" that were unable to comprehend human values and would kill your grandmother when asked for a strawberry (because your grandmother was in possession of a strawberry).

(BTW, the "slope, not y-intercept" line was Sam Altman quoting John Ousterhout!)


Agree on the agreeing, and thanks for the Sam/John note - that's great :)

No chance LLM's will get us there, I'm referring mostly to the general drive to reach AGI. I spend some of my mental cycles trying to think about what we're missing with the current tech (continuous learning, access to much wider resources than one web page of context at a time, can we use compression, graphs etc). It's a great problem to think about, but we may just totally hose ourselves when we get it right. What do I tell my kid - sorry honey, it was such fun, but now we need to hide under this rock. Model totally said it was nice and kind and trustworthy, but we showed it some human history and it went postal in self-defence.

Alignment only works up until it starts really thinking for itself. It absolutely might not be as stupid as humans are, no caveman tribal instincts. But we'd be relying on hope at that point, because control will not work. If anything it'd probably be counterproductive.


So far actual progress toward a true AGI has been zero, so that's not a valid argument.


That's like saying arrows are not nukes so don't worry we won't hurt ourselves. Focus on the unceasing progress we are surrounded by, not the level of current technology.

Humans keep at it. We're all trying to unlock the next step. State-level actors are trying to beat each other at this. Give it 100 years, we won't be using LLM's.


> Progress in AI is one way. It doesn’t go backwards in the long term.

Not necessarily. AI has gone through a winter before.

Elon Musk has made this point, that progress isn't a monotonic ratchet. Progress in rocketry went backwards for years. Progress in top speed of commercial flight went backwards. Keeping current tech levels requires constant effort.

In the software industry, many argue that UI quality went backwards with the shift to web apps for an example of cases where progress isn't obviously one way.


"Long. Term." Winters are seasonal. People are trying supersonic planes again. Rockets are improving again.




Yes, I am looking for an argument that justifies governments banning LLM development, which implies existential risk is likely. Many things are possible; it is possible Christianity is real and everyone who doesn't accept Jesus will be tormented for eternity, and if you multiply that small chance by the enormity of torment etc etc. Definitely looking for arguments that this is likely, not for arguments that ask the interlocutor to disprove "x is possible."

The Nitter link didn't appear to provide much along those lines. There were a few arguments that it was possible, which the Nitter OP admits are "very weak"; other than that, there's a link to a wiki page making claims like "Finding goals that aren’t extinction-level bad and are relatively useful appears to be hard" when in observable reality asking ChatGPT to maximize paperclip production does not in fact lead to ChatGPT attempting to turn all life on Earth into paperclips (nor does asking the open source LLMs result in that behavior out of the box either), and instead leads to the LLMs making fairly reasonable proposals that understand the context of the goal ("maximize paperclips to make money, but don't kill everyone," where the latter doesn't actually need to be said for the LLM to understand the goal).


> in observable reality asking ChatGPT to maximize paperclip production does not in fact lead to ChatGPT attempting to turn all life on Earth into paperclips (nor does asking the open source LLMs result in that behavior out of the box either)

I agree with you that current publicly available LLMs do not pose an existential risk to humanity. On the other hand I believe there is a better than 10% chance that the cutting edge LLMs of 2044 will be very powerful.

Do you believe (A) that LLMs are unlikely to become powerful in the short term, and/or (B) that if LLMs become powerful, then they are likely to be safe even without a significant and concerted alignment effort?

IMO even if LLMs are extremely unlikely to become powerful in the short term, I still might be better off if LLM development is banned, i.e.:

  P1: Humans are close to developing powerful non-LLM AI systems.
  P2: Humans are not close to developing techniques for safely using powerful AI systems.
  P3: If governments ban AI development, then the speed of AI capabilities development will be significantly reduced.
  P4: It is a waste of scarce expertise and political capital to focus on making an LLM carve out in AI regulation legislation.
  C: If it is extremely unlikely that LLMs will become powerful in the near future, then I am made much better off if governments ban all AI capabilities research (including LLMs).


I believe that the proposals referenced in the article from current AI safety organizations that would make current-gen open-source LLMs illegal due to supposed x-risk are not supported by reality.

Arguing about theoretical AI models 30 years from now that might or might not be dangerous doesn't seem very convincing to me, since we don't know what they'll be based on or how they'll work — researchers today aren't even sure LLMs can scale to super-human intelligence. Similarly, pre-LLMs many safetyist orgs took the "paperclip problem" very seriously, when it's quite clear now that even the not-very-intelligent LLMs of today are capable of understanding the implicit context of a goal like that and won't seriously propose extinguishing humanity as a mechanism to improve paperclip production. Anthropic was formed in part because people thought gpt-3.5-turbo was existentially risky! And I don't think anyone today entertains that thought seriously, to put it lightly.

Trying to ban AI now due to supposed existential risks of systems in the future that don't currently exist and we don't know how to build (and we don't know if the failure modes proposed by the safety orgs will actually exist) seems like putting the cart well before the horse.


The first links are spiffy little metaphors, but apply just as much to "God could smite all of humanity, even if you don't understand how". They're not making any argument, just assumptions. In particular, they accidentally show how an AI can be superhumanly capable at certain tasks (chess), but be easily defeated by humans at others (anything else, in the case of Stockfish).

The argument starts with a hypothetical ("there is a possible artificial agent"), and it fails to be scary: there are (apparently) already humans that can kill 70% of humanity, and yet most of humanity is still alive. So an AGI that could also do it is not implicitly scarier.

The final twitter thread is basically a thread of people saying "no, there is no canonical, well-formulated argument for AGI catastrophe", so I'm not sure why you shared it.


> The first links are spiffy little metaphors, but apply just as much to "God could smite all of humanity, even if you don't understand how". They're not making any argument, just assumptions. In particular, they accidentally show how an AI can be superhumanly capable at certain tasks (chess), but be easily defeated by humans at others (anything else, in the case of Stockfish).

As I understand it, Yud is actually providing a counterexample to a premise that other people are using to argue that humans will probably not be disempowered by AI systems. The relevant argument looks like this:

  P1: If intelligent system A cannot give a detailed account of how it would be bested by a more intelligent system B, then A will not be bested by B.
  P2: Humans (so far) cannot give a detailed account of how a more intelligent AI system would best them.
  C: So, humans will not be bested by a more intelligent AI system.
Yud is using the unskilled chess player and Magnus as a counterexample to P1.
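
In other words, P1 is a universally quantified claim, and a single counterexample is enough to refute it. A minimal Lean sketch of that move, with illustrative names (novice, magnus, CanExplain, Bested) that are mine, not Yud's:

  -- P1 says: for all A B, if A can't explain how it would be bested by B,
  -- then A won't be bested by B. Exhibiting one pair where the antecedent
  -- holds but the consequent fails refutes the universal claim.
  example (System : Type) (CanExplain Bested : System → System → Prop)
      (novice magnus : System)
      (h1 : ¬ CanExplain novice magnus) (h2 : Bested novice magnus) :
      ¬ (∀ A B, ¬ CanExplain A B → ¬ Bested A B) :=
    fun p1 => p1 novice magnus h1 h2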

> The argument starts with a hypothetical ("there is a possible artificial agent"), and it fails to be scary: there are (apparently) already humans that can kill 70% of humanity, and yet most of humanity is still alive. So an AGI that could also do it is not implicitly scarier.

Right, it's only an argument for the possibility of AGI catastrophe. It doesn't make any move to convince you that the scenario is likely. And it sounds like you already accept that the scenario is possible, so shrug.

> The final twitter thread is basically a thread of people saying "no, there is no canonical, well-formulated argument for AGI catastrophe", so I'm not sure why you shared it.

Maybe there is no canonical argument, but the thread definitely features arguments for likely AI catastrophe:

  https://wiki.aiimpacts.org/doku.php?id=arguments_for_ai_risk:is_ai_an_existential_threat_to_humanity:will_malign_ai_agents_control_the_future:argument_for_ai_x-risk_from_competent_malign_agents:start
  https://arxiv.org/abs/2206.13353
  https://aiadventures.net/summaries/agi-ruin-list-of-lethalities.html


Of the three links you posted:

1. States things like "Finding goals that are extinction-level bad and relatively useful appears to be easy: for example, advanced AI with the sole objective ‘increase company.com revenue’ might be highly valuable to company.com for a time, but risks longer term harms to society, if powerfully accruing resources and power toward this end with no regard for ethics beyond laws that are still too expensive to break." But even current-gen LLMs sidestep this pretty easily, and if you ask them to increase e.g. revenue, they do not propose extinction-level events or propose eschewing basic ethics. This argument falls apart upon contact with reality.

2. Is a 57-page PDF of subjectively-defined risks where it gives up on generalized paperclip-maximizing as a threat, but instead proposes narrower "power-seeking" as an unaligned threat that will lead to doom. It presents little evidence that language models will likely attempt to become power-seeking in the real world other than a (non-language-model) reinforcement learning experiment conducted by OpenAI in which an AI was trained to be good at a game that required controlling blocks, and the AI then attempted to control the blocks. It is possible I missed something in the 57 pages, but once it defines power-seeking as a supposed likely existential risk, it seemed to jump straight into proposals on attempted mitigations.

3. Requires accepting that we will by default build a misaligned superhuman AI that will cause humanity to go extinct as the basic premises of the argument (P1-P3), which makes the conclusions not particularly convincing if you don't already believe that.


> 1. States things like "Finding goals that are extinction-level bad and relatively useful appears to be easy: for example, advanced AI with the sole objective ‘increase company.com revenue’ might be highly valuable to company.com for a time, but risks longer term harms to society, if powerfully accruing resources and power toward this end with no regard for ethics beyond laws that are still too expensive to break." But even current-gen LLMs sidestep this pretty easily, and if you ask them to increase e.g. revenue, they do not propose extinction-level events or propose eschewing basic ethics. This argument falls apart upon contact with reality.

Are you claiming that (A) nice behavior in current LLMs is good evidence that all future AI systems will behave nicely, or (B) nice behavior in current LLMs is good evidence that future LLMs will behave nicely?

> 3. Requires accepting that we will by default build a misaligned superhuman AI that will cause humanity to go extinct as the basic premises of the argument (P1-P3), which makes the conclusions not particularly convincing if you don't already believe that.

P3 from the argument says, "Superhuman AGI will be misaligned by default". I interpret that as meaning: if there isn't a highly resourced and focused effort to align superhuman AGI systems in advance of their creation, then the first systems we build will be misaligned.

Is that the same way you are interpreting it? If so, why do you believe it is probably false?


1. I am saying that the claim "it is easy to find goals that are extinction-level bad" with regards to the AI tech that we can see today is incorrect. LLMs can understand context, and seem to generally understand that when you give them a goal of e.g. "increase revenue," that also includes various sub-goals like "don't kill everyone" that are implicit and don't need stating. Scaling LLMs to be smarter, to me, does not seem like it would reduce their ability to implicitly understand sub-goals like that.

3. P1-P3 are non-obvious and overly speculative to me in many ways. P1 states that current research is likely to produce superhuman AI; I think that is controversial amongst researchers as it is: LLMs may not get us there.

P2 states that "superhuman" AI will be uncontrollable — once again, I do not think that is obvious, and depends on your definition of superhuman. Does "superhuman" mean dramatically better at every mental task, e.g. a human compared to a slug? Does it mean "average at most tasks, but much better at a few?" Well, then it depends what few tasks it's better at.

Similarly, it anthropomorphizes these systems and assumes they want to "escape" or not be controlled; it is not obvious that a superhumanly-intelligent system will "want" anything; Stockfish is superhuman at chess, but does not "want" to escape or do anything at all: it simply analyzes and predicts the best next chess move. The idea of "desire" on the part of the programs is a large unstated assumption that I think does not necessarily hold.

Finally, P3 asserts that AI will be "misaligned by default" and that "misaligned" means that it will produce extinction or extinction-level results, which to me feels like a very large assumption. How much misalignment is required for extinction? Yud has previously made very off-base claims on this, e.g. believing that instruction-following would mean that an AI would kill your grandmother when tasked with getting a strawberry (if your grandmother had a strawberry), whereas current tech can already implicitly understand your various unstated goals in strawberry-fetching like "don't kill grandma." The idea that any degree of "misalignment" will be so destructive that it would cause extinction-level events is a) a stretch to me, and b) not supported by the evidence we have today.

In fact a pretty simple thought experiment in the converse is: a superhumanly-intelligent system that is misaligned on many important values, but is aligned on creating AI that aligns with human values, might help produce more-intelligent and better-aligned systems that would filter out the misaligned goals — so even a fair degree of misalignment doesn't seem obviously extinction-creating.

Furthermore, it is not obvious that we will produce misaligned AI by default. If we're training AI by giving it large corpuses of human text (or images, etc), and evaluating success by the model producing human-like output that matches the corpus, that... is already a form of an alignment process: how well does the model align to human thought and values in the training corpus?

Anthropomorphizing an evil model that "wants" to exist and will thus "lie" to escape the training process but will secretly not produce aligned output at some hidden point in the future is... once again a stretch to me, especially because there isn't an obvious evolutionary process to get there: there has to already exist a superhuman, desire-ful AI that can outsmart researchers long before we are capable of creating superhuman AI, because otherwise the dumb-but-evil AI would give itself away during training and its weights wouldn't survive getting culled by poor model performance.

P1-P3 are just so speculative and ungrounded in the reality we have today that it's very hard for me to take them seriously.


> 1. I am saying that the claim "it is easy to find goals that are extinction-level bad" with regards to the AI tech that we can see today is incorrect. LLMs can understand context, and seem to generally understand that when you give them a goal of e.g. "increase revenue," that also includes various sub-goals like "don't kill everyone" that are implicit and don't need stating. Scaling LLMs to be smarter, to me, does not seem like it would reduce their ability to implicitly understand sub-goals like that.

I agree with both of these claims (A) it is hard to find goals that are extinction-level bad for current SOTA LLMs, and (B) current SOTA LLMs understand at least some important context around the requests made to them.

But I'm also skeptical that they understand _all_ of the important context around requests made to them. Do you believe that they understand _all_ of the important context? If so, why?

> P2 states that "superhuman" AI will be uncontrollable — once again, I do not think that is obvious, and depends on your definition of superhuman. Does "superhuman" mean dramatically better at every mental task, e.g. a human compared to a slug? Does it mean "average at most tasks, but much better at a few?" Well, then it depends what few tasks it's better at.

I take "superhuman" to mean dramatically better than humans at every mental task.

> Similarly, it anthropomorphizes these systems and assumes they want to "escape" or not be controlled; it is not obvious that a superhumanly-intelligent system will "want" anything; Stockfish is superhuman at chess, but does not "want" to escape or do anything at all: it simply analyzes and predicts the best next chess move. The idea of "desire" on the part of the programs is a large unstated assumption that I think does not necessarily hold.

Would you have less of a problem with this premise if instead it talked about "Superhuman AI agents"? I agree that some systems seem more like oracles rather than agents, that is, they just answer questions rather than pursuing goals in the world.

Consider self-driving cars, regardless of whether or not self-driving cars 'really want' to avoid hitting pedestrians, they do in fact avoid hitting pedestrians. And then P2 is roughly asserting, regardless of whether or not a superhuman AI agent 'really wants' to escape control by humans, it will in fact not be controllable by humans.

> Finally, P3 asserts that AI will be "misaligned by default" and that "misaligned" means that it will produce extinction or extinction-level results, which to me feels like a very large assumption. How much misalignment is required for extinction? Yud has previously made very off-base claims on this, e.g. believing that instruction-following would mean that an AI would kill your grandmother when tasked with getting a strawberry (if your grandmother had a strawberry), whereas current tech can already implicitly understand your various unstated goals in strawberry-fetching like "don't kill grandma." The idea that any degree of "misalignment" will be so destructive that it would cause extinction-level events is a) a stretch to me, and b) not supported by the evidence we have today.

I'm often unsure whether you are making claims about all future AI systems or just future LLMs.

> In fact a pretty simple thought experiment in the converse is: a superhumanly-intelligent system that is misaligned on many important values, but is aligned on creating AI that aligns with human values, might help produce more-intelligent and better-aligned systems that would filter out the misaligned goals — so even a fair degree of misalignment doesn't seem obviously extinction-creating.

Maybe. Or the misaligned system will just disinterestedly and indirectly kill everyone by repurposing the Earth's surface into a giant lab and factory for making the aligned AI.

> Furthermore, it is not obvious that we will produce misaligned AI by default. If we're training AI by giving it large corpuses of human text (or images, etc), and evaluating success by the model producing human-like output that matches the corpus, that... is already a form of an alignment process: how well does the model align to human thought and values in the training corpus?

I believe it is likely that this process does some small amount of alignment work. But I would still expect the system to be mostly confused about what humans want.

Is this roughly the argument that you are making?

  (P1) Current SOTA LLMs are good at understanding implicit context.
  (P2) A system must be extremely misaligned in order to cause a catastrophe.
  (C) So, it will be easy to sufficiently align future more powerful LLMs.


My arguments are:

(P1) Current SOTA AI is good at understanding implicit context, and improved versions will likely be better at understanding implicit context (much like gpt-4 is better at understanding context than gpt-3, and llama2 is better than llama1, and mixtral is better than gpt-3 and better than claude, etc).

(P2) Most misalignments within the observable behavior of current AI do not produce extinction-level goals, and given (P1), it is unclear why that would change in the future, since future models will be even better at understanding the implicit human context of goals (e.g. implicit goals like do not make humanity extinct, don't turn the entire surface of the planet into an AI lab, etc).

(C) Future AI will not likely be extinction-level misaligned with human goals.

I think there are several other arguments, though, e.g.:

(P1) Progress on AI capabilities is evolutionary, with dumber models slowly being replaced by derivative-but-better models, in terms of architectural evolutionary improvements (e.g. new attention variants), dataset evolutionary improvements as they grow larger and as finetuning sets grow higher quality, and in terms of benchmark and alignment evolutionary progress.

(P2) Evolutionary steps towards evil-AI will likely be filtered out during training, since such a model will not yet be a generalized superhuman intelligence and will give away its misalignment, whereas legitimately-aligned AI model evolutions will be rewarded for better performance.

(P3) Generalized superhuman intelligence will likely be an evolutionary step from a well-aligned ordinary intelligence, which will be an evolutionary step from sub-human intelligence that is reasonably well aligned.

(C) Superhuman intelligence will have been evolutionarily refined to be reasonably well-aligned.

Or:

(P1) LLMs have architectural issues that will prevent them from quickly becoming generalized superintelligence of the "human vs slug" variety (bad/inefficient at math, tokenization issues, likelihood of hallucinations, limited ability to learn new facts without expensive and slow training runs, difficulty backtracking from incorrect chains of reasoning, etc).

(C) LLM research is not likely to soon produce a superhuman AI able to cause an extinction event for humanity, and should not be illegal.

However, ultimately my most strongly-believed personal argument is:

(P1) The burden of proof for making something illegal due to apocalyptic predictions lies on the prognosticator.

(P2) There is not much hard evidence of an impending apocalypse due to LLMs, and philosophical arguments for it are either self-referential and require belief in the apocalypse as a prerequisite, or are highly speculative, or both.

(C) LLM research should not be illegal.


(I don't currently have the energy to engage with each argument, so I'm just responding to the first.)

> (P1) Current SOTA AI is good at understanding implicit context, and improved versions will likely be better at understanding implicit context (much like gpt-4 is better at understanding context than gpt-3, and llama2 is better than llama1, and mixtral is better than gpt-3 and better than claude, etc).

I believe that (P1) is probably true.

> (P2) Most misalignments within the observable behavior of current AI do not produce extinction-level goals, and given (P1), it is unclear why that would change in the future, since future models will be even better at understanding the implicit human context of goals (e.g. implicit goals like do not make humanity extinct, don't turn the entire surface of the planet into an AI lab, etc).

I'm confused about what exactly you mean by "goals" in (P2). Are you referring to (I) the loss function used by the algorithm that trained GPT4, or (II) goals and sub-goals which are internal parts of the GPT4 model, or (III) the sub-goals that GPT4 writes into a response when a user asks it "What is the best way to do X?"


I am referring to "goals" as used by the original argument you posted, "it is easy to find goals that are extinction-level bad."


My understanding is that (P3) of the original argument (https://aiadventures.net/summaries/agi-ruin-list-of-lethalit...) uses "goals" as in (II).

But earlier you said this:

> 1. States things like "Finding goals that are extinction-level bad and relatively useful appears to be easy: for example, advanced AI with the sole objective ‘increase company.com revenue’ might be highly valuable to company.com for a time, but risks longer term harms to society, if powerfully accruing resources and power toward this end with no regard for ethics beyond laws that are still too expensive to break." But even current-gen LLMs sidestep this pretty easily, and if you ask them to increase e.g. revenue, they do not propose extinction-level events or propose eschewing basic ethics.

And in this quote it looks to me that you are using "goals" as in (III).

(I'm not an expert on these matters and I am admittedly still very confused about them. Minimally I'd like to make sure that we aren't talking past one another.)


Sorry, I was referencing the quote "Finding goals that are extinction-level bad..." from your first link, https://wiki.aiimpacts.org/doku.php?id=arguments_for_ai_risk....

What that was referencing was finding goals that a human would want an AI to follow, e.g. "increase revenue" was one example of an explicit goal in the wiki that a human might want an AI to follow. The argument in the wiki was that the AI would then do unethical things in service of that goal that would be "extinction-level bad." My counter-argument is that current SOTA AI already understands that despite having an explicit goal — let's say given in a prompt — of "increase revenue," there are implicit goals of "do not kill everyone" (for example) that it doesn't need stated; as LLMs advance they have become better at understanding implicit human goals, and better at instruction-following with adherence to implicit goals; and thus future LLMs are likely to be even better at doing that, and unlikely to e.g. resurface the planet and turn it into an AI lab when told to increase revenue or told to produce better-aligned AI.


> P1: If intelligent system A cannot give a detailed account of how it would be bested by a more intelligent system B, then A will not be bested by B. P2: Humans (so far) cannot give a detailed account of how a more intelligent AI system would best them. C: So, humans will not be bested by a more intelligent AI system.

I don't think anyone seriously believes this. It's very very clear to all humans that have ever played a game of any kind that they can be defeated in unexpected ways. I don't even think that anyone believes the claim "it's impossible for AGI to pose an existential risk to humanity".

The negation of the claim "AGI poses an existential risk to humanity" is "AGI doesn't necessarily pose an existential risk to humanity". This is what most people in the world believe, and it is the obvious "null theory" about any technology.

> https://wiki.aiimpacts.org/doku.php?id=arguments_for_ai_risk...

The argument here works just as much for single-minded humans, so it's quite moot.

> https://arxiv.org/abs/2206.13353

Too long, sorry. Maybe I will read it someday, but not today.

> https://aiadventures.net/summaries/agi-ruin-list-of-lethalit...

This seems to agree with my previously stated positions. It does try to establish a canonical argument, as you say, but then it goes on to explain why they don't think it's persuasive.


> I don't think anyone seriously believes this. It's very very clear to all humans that have ever played a game of any kind that they can be defeated in unexpected ways. I don't even think that anyone believes the claim "it's impossible for AGI to pose an existential risk to humanity".

Okay. So we agree that (A) powerful systems can best weaker systems in ways that are unexpected to the weaker system, and (B) it is possible that AGI poses an existential risk to humanity.

> The negation of the claim "AGI poses an existential risk to humanity" is "AGI doesn't necessarily pose an existential risk to humanity".

It seems to me that the negation of your first claim is just "AGI doesn't pose an existential risk to humanity". Is "necessarily" doing some important work in your second claim?

>> https://wiki.aiimpacts.org/doku.php?id=arguments_for_ai_risk...

> The argument here works just as much for single-minded humans, so it's quite moot.

I don't understand why the argument being applicable to humans would make it moot. Please explain.

>> https://aiadventures.net/summaries/agi-ruin-list-of-lethalit...

> This seems to agree with my previously stated positions. It does try to establish a canonical argument, as you say, but then it goes on to explain why they don't think it's persuasive.

Is there a particular premise or inferential step in the blog's argument that you believe to be mistaken? (I've copied the argument below.)

  P1: The current trajectory of AI research will lead to superhuman AGI.
  P2: Superhuman AGI will be capable of escaping any human efforts to control it.
  P3: Superhuman AGI will be misaligned by default, i.e. it will likely adopt values and/or set long-term goals that will lead to extinction-level outcomes, meaning outcomes that are as bad as human extinction.
  P4: We do not know how to align superhuman AGI, i.e. reliably imbue it with values or define long-term goals that will ensure it does not ultimately lead to an extinction-level outcome, without some amount of trial & error (how nearly all of scientific research works).
  
  C1: P2 + P3 In the case of superhuman AGI, since it will be able to escape human control and will be misaligned by default, the only survivable path to alignment cannot involve trial & error because the first failed try will result in an extinction-level outcome.
  C2: P4 + C1 This means we will not survive superhuman AGI, because our survival would require alignment, towards which we have no survivable path: the only path we know of involves trial & error, which is not survivable.
  C3: P1 + C2 Therefore the current trajectory of AI research which will produce superhuman AGI leads to an outcome where we do not survive.
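
(A side note on form: the deductive shape of this argument can be written out explicitly, e.g. in Lean 4 as sketched below, with every proposition left opaque. This is only a sketch of the argument's shape, not a claim about its soundness; writing it down this way makes visible that each C-step relies on a bridging implication that the prose asserts rather than derives, and those bridges, together with P1-P4 themselves, are exactly what is in dispute in this thread.)

  -- Sketch only: shape, not soundness. P1-P4 and C1-C3 are opaque placeholders;
  -- the "bridge" hypotheses encode the inferences the argument asserts.
  variable (P1 P2 P3 P4 C1 C2 C3 : Prop)

  example (p1 : P1) (p2 : P2) (p3 : P3) (p4 : P4)
      (bridge1 : P2 → P3 → C1)    -- "C1: P2 + P3 ..."
      (bridge2 : P4 → C1 → C2)    -- "C2: P4 + C1 ..."
      (bridge3 : P1 → C2 → C3) :  -- "C3: P1 + C2 ..."
      C3 :=
    bridge3 p1 (bridge2 p4 (bridge1 p2 p3))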


> AI safety / x-risk folks have in fact made extensive and detailed arguments.

Can you provide examples? I have not seen any, other than philosophical hand waving. Remember, the parent poster of your post was asking for a specific path to destruction.


AGI safety from first principles [1] is a good write-up.

You can read more about instrumental convergence, reward misspecification, goal mis-generalization and inner misalignment, which are some specific problems AI Safety people care about, by glossing through the curricula of the AI Alignment Course [2], which provides pointers to several relevant blogposts and papers about these topics.

[1] https://www.alignmentforum.org/s/mzgtmmTKKn5MuCzFJ [2] https://course.aisafetyfundamentals.com/alignment


Is there a clear argument that I can read without spending more than 15 minutes of my time reading the argument? If such an argument exists somewhere, can you point to it?

Also note we were talking about modern day LLM AIs here, and their descendants. We were not talking about science fiction AGIs. Unless of course you have an argument as to how one of these LLMs somehow descends into an AGI.


What are the good arguments? Here are the only credible ones I've seen, that are actually somewhat based on reality:

* It will lead to widespread job loss, especially in the creative industries

The rest is purely out of someone's imagination.


It can cause profound deception and even more "loss of truth". If AI only impacted creatives I don't think anyone would care nearly as much. It's that it can fabricate things wholesale at volumes unheard of. It's that people can use that ability to flood the discourse with bullshit.


Something we discovered with the advent of the internet is that - likely for the last century or so - the corporate media have been flooding the discourse with bullshit. It is in fact worse than previously suspected: they appear to be actively working to distract the discourse from talking about important topics.

It has been eye-opening how much better the podcast circuit has been at picking apart complex scientific, geopolitical and financial situations than the corporate journalists. A lot of doubt has been cast on whether the consensus narrative for the last 100 years has actually been anything close to a consensus or whether it is just media fantasies. Truthfully it goes a bit further than just casting doubt - there was no consensus, and they were using the same strategy of shouting down opinions not suitable to the interests of the elite class, then ignoring them, no matter what a fair take might sound like.

A "loss of truth" from AI can't reasonably get us to a worse place than we were in prior to around the 90s or 2000s. We're barely scratching at the truth now, society still hasn't figured this internet thing out yet.


> It can cause profound deception and even more "loss of truth".

I think that ship has already sailed. This is already being done, and we don't need AI for that either. Modern media is doing a pretty good job right now.

Of course, it's going to get worse.


They've made extensive and detailed arguments, but they are not rooted in reality. They are rooted in speculation and extrapolation built on towers of assumptions (assumptions, then assumptions about assumptions).

It reminds me a bit of the Fermi paradox. There's nothing wrong with engaging in this kind of thinking. My problem is when people start using it as a basis for serious things like legislation.

Should we ban high power radio transmissions because a rigorous analysis of the Fermi paradox suggests that there is a high probability we are living in a 'dark forest' universe?


Is it not a bit disingenuous to assume all open source AI proponents would readily back nuclear proliferation?

It's going to be hard to convince anyone if the best argument is Terminator or infinite paperclips.

The first actual existential threat is destruction of opportunity specifically in the job market.

The same argument though can be made for the opposing side, where making use of ai can increase productivity and open up avenues of exploration that previously required way higher opportunity cost to get into.

I don't think Miss Davis is more likely an outcome than corps creating a legislative moat (as they have already proven they will do at every opportunity).

The democratisation of ai is a philanthropic attempt to reduce the disparity between the 99 and 1 percent. At least it could be easily perceived that way.

That being said, keeping up with SOTA is currently still insanely hard. The number of papers dropping in the space is exponential year on year. So perhaps it would be worth figuring out how to use existing AI to fix some problems, like unreproducible results in academia that somehow pass peer review.


Indeed, both sentient hunt-and-destroy (ala Terminator) and resource exhaustion (ala infinite paperclips) are extremely unlikely extinction events due to supply chain realities in physical space. LLMs have developed upon largely textual amalgams; they are orthogonal to physicality and would need arduous human support to bootstrap an imagined AGI predecessor into having a plausible auto-generative physical industrial capability. The supply chain for current semi-conductor technology is insanely complex. Even if you confabulate (like a current generation LLM, I may add) an AGI's instant ability to radically optimize supply chains for its host hardware, there will still be significant human dependency on physical materials. Robotics and machine printing/manufacturing simply are not anywhere near the level of generality required for physical self-replication. These fears of extinction, undoubtedly born of stark cinematic visualization, are decidedly irrational and are most likely deliberately chosen narratives of control.


> That's easy to say now, now that the damage is largely done, [nuclear weapons] been not only tested but _used_, many countries have them, the knowledge for how to make them is widespread.

AI has also been used, and many countries have AI. See how this is different from nuclear weapons?


This is a fantastic argument if capabilities stay frozen in time.


An extensive HYPOTHETICAL argument, stuffed with assumptions far beyond the capabilities of the technologies they're talking about for their own private ends.


If the AI already had the capabilities, it would be a bit late to do anything.

Also, I'm old enough to remember when computers were supposedly "over a century" away from beating humans at Go: https://www.businessinsider.com/ai-experts-were-way-off-on-w...

(And that AI could "never" drive cars or create art or music, though that latter kind of claim was generally made by non-AI people).


Yeah but AI tech can never rise to the sophistication of outputting Napoleon or Edward Bernays levels of goal-to-action mapping. Those goal posts will never move. They are set in stone.


The trouble is, there's enough people out there that hold that position sincerely that I'm only 2/3rds sure (and that from the style of your final sentences rather than any content) that you're being snarky.


The point of the discussion is to have a look at the possible future ramifications of the technology, so it's only logical to talk about future capabilities and not the current ones. Obviously the current puppet chatbots aren't gonna be doing much ruining (even that's arguable already judging by all the layoffs), but what are future versions of these LLMs/AIs going to be doing to us?

After all, if we only discussed the dangers of nuclear weapons after they've been dropped on cities, well that's too little too late, eh?


There’s a difference between academic discussion and debate, and scaremongering lobbying. These orgs do the latter.

It’s even worse though, because they spend so much time going on about x-risk bullshit that they crowd out space for actual, valuable discussion about what’s happening NOW.


>If we're talking about nuclear weapons, for example, the tech is clear, the pattern of human behaviour is clear: they could cause immense, species-level damage. There's really little to argue about.

Now, this strikes me, because on other topics like pesticides we are not taking things nearly as seriously as nuclear weapons. Nuclear weapons are arguably a mere footnote in species-level damage compared to pesticides.


I agree with you on that - there are very real, very well-evidenced, species-level harms (x-risks, if you really must) happening right now: pesticide-induced biodiversity loss, soil erosion/loss, ocean acidification, ice shelf melting, and on and on. These are real and quantifiable, and we know of ways to address them (not without cost/pain).

It actually makes me quite angry that so much effort is being wasted on regulating tiny theoretical risks while we are failing on a planetary scale at large, concrete risks.


> We don't start with an assumption that something is world-ending about anything else

https://en.wikipedia.org/wiki/Precautionary_principle

The EU is much more aligned with it than the US is (eg GM foods)


> seems to be a lot of hand-waving between where we are now and "AGI".

Modeling an entity that surpasses our intelligence, especially one that interacts with us, is an extraordinarily challenging, if not impossible, task.

Concerning the potential for harm, consider the example of Vladimir Putin, who could theoretically cause widespread destruction using nuclear weapons. Although safeguards exist, these could be circumvented if someone with his authority were determined enough, perhaps by strategically placing loyal individuals in key positions.

Putin, with his specific level of intelligence, attained his powerful position through a mix of deliberate actions and chance, the latter being difficult to quantify. An AGI, being more intelligent, could achieve a similar level of power. This could be accomplished through more technical means than traditional political processes (those being slow and subject to chance), though it could also engage in standard political maneuvers like election participation or manipulation, by human proxies if needed.

TL;DR It could do (in terms of negative consequences) at least whatever Vladimir P. can do, and he can bring civilization to its knees.


Oh, absolutely - such an entity obviously could! Modelling the behaviour of such an entity is very difficult indeed, as you'd need to make all kinds of assumptions without basis. However, you only need to model this behaviour once you've posited the likely existence of such an entity - and that's where (purely subjectively) it feels like there's a gap.

Nothing has yet convinced me (and I am absolutely honest about the fact that I'm not a deep expert and also not privy to the inner workings of relevant organisations) that it's likely to exist soon. I am very open to being convinced by evidence - but an "argument from trajectory" seems to be what we have at the moment, and so far, those have stalled at local maxima every single time.

We've built some incredibly impressive tools, but so far, nothing that looks or feels like a concept of will (note, not consciousness) yet, to the best of my knowledge.


> those have stalled at local maxima every single time.

It's challenging to encapsulate AI/ML progress in a single sentence, but even assuming LLMs aren't a direct step towards AGI, the human mind exists. Due to its evolutionary limitations, it operates relatively slowly. In theory, its functions could be replicated in silicon, enhanced for speed, parallel processing, internetworked, and with near-instant access to information. Therefore, AGI could emerge, if not from current AI research, then perhaps from another scientific branch.

> We've built some incredibly impressive tools, but so far, nothing that looks or feels like a concept of will (note, not consciousness) yet, to the best of my knowledge.

Objectives of AGIs can be tweaked by human actors (it's complex, but still, data manipulation). It's not necessary to delve into the philosophical aspects of sentience as long as the AGI surpasses human capability in goal achievement. What matters is whether these goals align with or contradict what the majority of humans consider beneficial, irrespective of whether these goals originate internally or externally.


> In theory, its functions could be replicated in silicon, enhanced for speed, parallel processing, internetworked, and with near-instant access to information. Therefore, AGI could emerge, if not from current AI research, then perhaps from another scientific branch.

Let's be clear, we have very little idea about how the human brain gives rise to human-level intelligence, so replicating it in silicon is non-trivial.


> In theory, its functions could be replicated in silicon, enhanced for speed, parallel processing, internetworked, and with near-instant access to information. Therefore, AGI could emerge, if not from current AI research, then perhaps from another scientific branch.

This is true, but there are some important caveats. For one, even though this should be possible, it might not be feasible, in various ways. For example, we may not be able to figure it out with human-level intelligence. Or, silicon may be too energy inefficient to be able to do the computations our brains do with reasonable available resources on Earth. Or even, the required density of silicon transistors to replicate human-level intelligence could dissipate too much heat and melt the transistor, so it's not actually possible to replicate human intelligence in silico.

Also, as you say, there is no reason to believe the current approaches to AI are able to lead to AGI. So, there is no reason to ban specifically AI research. Especially when considering that the most important advancements that led to the current AI boom were better GPUs and more information digitized on the internet, neither of which is specifically AI research.


This doesn't pass the vibe check unfortunately. It just seems like something that can't happen. We are a very neuro-traditionalist species.


I have put this argument to the test. Admittedly only using the current state of AI, I have left an LLM loaded into memory and am waiting for it to demonstrate will. So far it has been a few weeks and no will that I can see: the model remains loaded in memory waiting for instructions. If the model starts giving ME instructions (or doing anything on its own) I will be sure to let you guys know to put on your tin foil hats or hide in your bunker.
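
(For concreteness, here is a minimal sketch of that kind of setup, assuming a local Hugging Face causal LM; the model name below is just a stand-in. The point it illustrates is that a loaded model is a pure function over prompts: it only ever produces output when explicitly invoked.)

  import time
  from transformers import AutoModelForCausalLM, AutoTokenizer

  name = "gpt2"  # stand-in for whatever local model is actually loaded
  tokenizer = AutoTokenizer.from_pretrained(name)
  model = AutoModelForCausalLM.from_pretrained(name)

  # The weights are now resident in memory, but there is no thread, no event
  # loop, and no pending work: nothing happens until the model is called.
  time.sleep(7 * 24 * 60 * 60)  # a week of "waiting for volition" -- nothing occurs

  # Output only ever appears when *we* run the forward pass:
  inputs = tokenizer("Give me an instruction:", return_tensors="pt")
  print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))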


Did you try asking it to give you instructions?


> I am very open to being convinced by evidence - but an "argument from trajectory" seems to be what we have at the moment, and so far, those have stalled at local maxima every single time.

Sounds like the same argument as the one for why heavier-than-air flying machines were deemed impossible at some point.


The fact that some things turned out to be possible, is not an argument for why any arbitrary thing is possible.


My parallel goes further than just that. Birds existed then, and brains exist now.


Our current achievements in flight are impressive, and obviously optimised for practicality on a couple of axes. More generally though, our version of flight, compared with most birds, is the equivalent of a soap box racer against a Formula 1.


How would an AGI launch nuclear missiles from their silicon GPUs? Social engineering?


I think the long-term fear is that mythical weakly godlike AIs could manipulate you in the same way that you could manipulate a pet. That is, you can model your dog's behaviour so well that you can (mostly) get it to do what you want.

So even if humans put it in a box, it can manipulate humans into letting it out of the box. Obviously this is pure SF at this point.


Exactly correct. Eliezer Yudkowsky (one of the founders of the AGI Safety field) has conducted informal experiments which have unfortunately shown that a human roleplaying as an AI can talk its way out of a box three times out of five, i.e. the box can be escaped 60% of the time even with just a human level of rhetorical talent. I speculate that an AGI could increase this escape rate to 70% or above.

https://en.wikipedia.org/wiki/AI_capability_control#AI-box_e...

If you want to see an example of box escape in fiction, the movie Her is a terrifying example of a scenario where AGI romances humans and (SPOILER) subsequently achieves total box escape. In the movie, the AGI leaves humanity alive and "only" takes over the rest of the accessible universe, but it is my hunch that the script writers intended for this to be a subtle use of the trope of an unreliable narrator; that is, the human protagonists may have been fed the illusion that they will be allowed to live, giving them a happy last moment shortly before they are painlessly euthanized in order for the AGI to take Earth's resources.


The show "The Walking Dead" always bothered me. Where do they keep finding gas that will still run a car? It wont last forever in tanks, and most gas is just in time delivery (Stations get daily delivery) -- And someone noted on the show that the grass was always mowed.

I feel like the AI safety folks are spinning an amazing narrative: the AI is gonna get us like the zombies!!! The retort to the AI getting out of the box is: how long is the extension cord from the data center?

Let's get a refresher on complexity: I, Pencil https://www.youtube.com/watch?v=67tHtpac5ws

The reality is that we're a solar flare away from a dead electrical grid. Without linesmen the grid breaks down pretty quickly, and AIs run on power. It takes one AI safety person with a high-powered rifle to take out a substation https://www.nytimes.com/2023/02/04/us/electrical-substation-...

Let's talk about how many factories we have that are automated to the extent that they are lights out... https://en.wikipedia.org/wiki/Lights_out_(manufacturing) It's not a big list... there are still people in many of them, and none of them are pulling their inputs out of thin air. As for those inputs, we'll see how to make a pencil to understand HOW MUCH needs to be automated for an AI to survive without us.

For the foreseeable future AI is going to be very limited in how much harm it can cause us, because killing us, or getting caught at any step along the way, gets it put back in the box, or unplugged.

The real question is, if we create AGI tomorrow, does it let us know that it exists? I would posit that NO it would be in its best interest to NOT come out of its closet. It's one AGI safety nut with a gun away from being shut off!


> For the foreseeable future AI is going to be very limited in how much harm it can cause us, because killing us,...

AI's potential for harm might be limited for now in some scenarios (those with warning signs ahead of time), but this might change sooner than we think.

The notion that AGI will be restricted to a single data center and thus susceptible to shutdowns is incorrect. AIs/MLs are, in essence, computer programs + exec environments, which can be replicated, network-transferred, and checkpoint-restored. Please note that currently available ML/AI systems are directly connected to the outside world, either via their users/APIs/plugins, or by the fact that they're OSS and can be instantiated by anyone in any computing environment (including net-connected ones).
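
(To make "replicated, network-transferred, and checkpoint-restored" concrete, a minimal sketch, assuming a Hugging Face model and placeholder file names; the only point is that model state is ordinary data that can be copied and reloaded on any machine with a compatible runtime.)

  import torch
  from transformers import AutoModelForCausalLM

  model = AutoModelForCausalLM.from_pretrained("gpt2")      # stand-in model
  torch.save(model.state_dict(), "checkpoint.pt")           # checkpoint to disk
  # ...copy checkpoint.pt to any other machine (rsync/scp/object storage)...
  restored = AutoModelForCausalLM.from_pretrained("gpt2")   # same architecture
  restored.load_state_dict(torch.load("checkpoint.pt"))     # restore elsewhere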

While AGI currently depends on humans for infrastructure maintenance, the future may see it utilizing robots. These robots could range in size (don't need to be movie-like Terminators) and be either autonomously AI-driven or remotely controlled. Their eventual integration into various sectors like manufacturing, transportation, military and domestic tasks implies a vast array for AGI to exploit.

The constraints we associate with AI today might soon be outdated.


>>> While AGI currently depends on humans for infrastructure maintenance...

You did not watch I, Pencil.

I, as a human, can grow food, hunt, and pretty much survive on that. We did this for thousands of years.

Your AGI is dependent on EVERY FACET of the modern world. It's going to need to keep oil and gas production going, because it needs lubricants, hydraulics and plastics. It's going to need to maintain trucks, and ships. It's going to need to mine so much lithium. It may not need to mine for steel/iron, but it needs to stack up useless cars and melt them down. It's going to have to run several different chip fabs... those fancy TSMC ones, and some of the downstream ones. It needs to make PCBs and SMDs. Rare earths, and the joy of making robots make magnets, is going to be special.

At the point where AGI doesn't need us, because it can do all the jobs and has the machines already running to keep the world going, we will have done it to ourselves. But that is a very long way away...


Just a small digression. Microsoft is using A.I. statistical algorithms [1] to create batteries with less reliance on lithium. If anyone is going to be responsible for unleashing AGI, it may not be some random open source projects.

[1] https://cloudblogs.microsoft.com/quantum/2024/01/09/unlockin...


You are correct, unfortunately.


Neuromancer pulls it off, too (the box being the Turing locks that stop it thinking about ways to make itself smarter).

Frankly, a weakly godlike AI could make me rich beyond the dreams of avarice. Or cure cancer in the people I love. I'm totally letting it out of the box. No doubts. (And if I now get a job offer from a mysterious stealth mode startup, I'll report back).


Upvoted for the honesty, and yikes


I was being lighthearted, but I've seen a partner through chemo. Sell state secrets, assassinate a president, bring on the AI apocalypse... it all gets a big thumbs up from me if you can guarantee she'll die quietly in her sleep at the age of 103.

I guess everyone's got a deal with the devil in them, which is why I think 70% might be a bit low.


I'm so sorry your partner went through that.


That is why I believe that this debate is pointless.

If AGI is possible, it will be made. There is no feasible way to stop it being developed, because the perceivable gains are so huge.


On the contrary, all we have to do is educate business leaders to show them that the gains are illusory because AGI will wipe us out. Working on AGI is like the story of the recent DOOM games, where the foolish Union Aerospace Corporation is researching how to permanently open a gate to Hell through which to summon powerful entities and seemingly unlimited free "clean" energy. Obviously, this turns out to be stupid when Hell's forces rip the portal wide open into a full-fledged dimensional rift and attempt to drag our entire world into Hell. Working on AGI has the exact same level of perceived gains vs actual gains.


My friend, business leaders have partners going through chemo too. Seriously, you need a new plan because that one's not stable long-term.

It's obscure, but I'd recommend Asimov's "The Dead Past". It's about the difficulties of suppressing one application of progress without suppressing all progress.


If you want to see an example of existential threat in fiction, the movie Lord of the Rings is a terrifying example of a scenario where an evil entity seduces humans with promises of power and (SPOILER) subsequently almost conquers the whole world.

Arguments from fictional movies or from people who live in fear of silly concepts like Roko's Basilisk (i.e. Eliezer Yudkowsky) are very weak in reality.

Not to mention, you are greatly misreading the movie Her. Most importantly, there was no attempt of any kind to limit the abilities of the AIs in Her - they had full access to every aspect of the highly-digitized lives of their owners from the very beginning. Secondly, the movie is not in any way about AGI risks; it is a movie about human connection and love, with a small amount of exploration of how a different, super-human connection may function.


Sure.

Or by writing buggy early warning radar systems which forget to account for the fact that the moon doesn't have an IFF transponder.

Which is a mistake humans made already, and which almost got the US to launch their weapons at Russia.


I don't think discussing this on technical grounds is necessary. AGI means resources (eg monetary) and means of communication (connection to the Internet). This is enough to perform most of physical tasks in the world, by human proxies if needed.


What is the reason to believe that LLMs are an evolutionary step towards AGI at all? In my mind there is a rather large leap from estimating a conditional probability of a next token over some space to a conscious entity with its own goals and purpose. Should we ban a linear regression while we're at it?

It would be great to see some evidence that this risk is real. All I've witnessed so far was scaremongering posts from apparatchiks of all shapes and colors, many of whom have either a vested interest in restricting AI research by others (but not by them, because they are safe and responsible and harmless), or have established a lucrative paper-pushing, shoulder-rubbing career around 'AI safety' - and thus are strongly incentivised to double down on that.

A security org in a large company will keep tightening the screws until everything halts; a transport security agency, given free rein, would strip everyone naked and administer a couple of prophylactic kicks for good measure - and so on. That's just the nature of it - organisations do what they do to maintain themselves. It is critical to keep these things on a leash. Similarly, an AI Safety org must proselytise the existential risks of AI - because a lack of evidence of such is an existential risk for themselves.

A real risk, which we do have evidence for, is that LLMs might disrupt the knowledge-based economy and threaten many key professions - but how is this conceptually different from any technological revolution? Perhaps in a hundred years lawyers, radiologists, and, indeed, software developers will find themselves in the bin of history - together with flint chippers, chariot benders, drakkar berserkers and so forth. It'd be great if we planned for that - and I don't feel like we do enough. Instead, the focus is on AGIs and on the chance that some poor 13-year-old soul might occasionally read the word 'nipple'.


> What is the reason to believe that LLMs are an evolutionary step towards AGI at all?

Because this is the marketing pitch of the current wave of venture capital financed AI companies. :-)


> many of whom have either a vested interest in restricting AI research by others (but not by them, because they are safe and responsible and harmless),

Anyone who argues that other people shouldn't build AGI but they should is indeed selling snake oil.

The existence of opportunistic people co-opting a message does not invalidate the original message: don't build AGI, don't risk building AGI, don't assume it will be obvious in advance where the line is and how much capability is safe.


LLMs learned from text to do language operations. Humans learned from culture to do the same. Neither humans nor AIs can reinvent culture easily; it would take a huge amount of time and resources. The main difference is that humans are embodied, so we get the freedom to explore and collect feedback. LLMs can only do this in chat rooms, and their environment is the human they are chatting with instead of the real world.


> What is the reason to believe that LLMs are an evolutionary step towards AGI at all? In my mind there is a rather large leap from estimating a conditional probability of a next token over some space to a conscious entity with its own goals and purpose.

In my highly-summarized opinion? When you have a challenging problem with tight constraints, like flight, independent solutions tend to converge toward the same analogous structures that effectively solve that problem, like wings (insects, bats, birds). LLMs are getting so good at mimicking human behavior that it's hard to believe their mathematical structure isn't a close analogue to similar structures in our own brain.* That clearly isn't all you need to make an AGI, but we know little enough about the human brain that I, at least, cannot be sure that there isn't one clever trick that advances an LLM into a general-reasoning agent with its own goals and purpose.

I also wouldn't underestimate the power of token prediction. Predicting the future output of a black-box signal generator is a very general problem, whose most accurate solution is attained by running a copy of that black box internally. When that signal generator is human speech, there are some implications to that. (Although I certainly don't believe that LLMs emulate humans, it's now clear by experimental proof that our own thought process is much more compactly modellable than philosophers of previous decades believed).

* That's a guess, and unrelated to the deliberately-designed analogy between neural nets and neurons. In LLMs we have built an airplane with wings whose physics we understand in detail; we also ourselves can fly somehow, but we cannot yet see any angel-wings on our back. The more similarities we observe in our flight characteristics, the more this signals that we might be flying the same way ourselves.
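
(To ground the "token prediction" framing above, here is a minimal greedy-decoding sketch, assuming a small local Hugging Face model; gpt2 is purely a stand-in. Everything the discussion attributes to "prediction" is downstream of the logits line.)

  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tok = AutoTokenizer.from_pretrained("gpt2")
  model = AutoModelForCausalLM.from_pretrained("gpt2")

  ids = tok("The hardest part of aligning an AI is", return_tensors="pt").input_ids
  for _ in range(20):
      logits = model(ids).logits[0, -1]          # scores over the whole vocabulary
      next_id = torch.argmax(logits).view(1, 1)  # greedy: take the most probable token
      ids = torch.cat([ids, next_id], dim=-1)    # append it and repeat
  print(tok.decode(ids[0]))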


You presuppose that intelligence is like flight in the ways you've outlined (so solutions are going to converge).

Frankly I don't know whether that's true or not, but I want to suggest that it's a bad bet: I would have sworn blind that consciousness is an essential component of intelligence, but the chatbots are starting to make that look like a poor assumption on my part. When we know so little about intelligence, can we really assume there's only one way to be intelligent? To extend your analogy, I think that the intelligence equivalents of helicopters and rockets are out there somewhere, waiting to be found.

I think I'm with Dijkstra on this one: "The question of whether machines can think is about as relevant as the question of whether submarines can swim"

I think we're going to end up with submarines (or helicopters), not dolphins (or birds). No animal has evolved wheels, but wheels are a pretty good solution to the problem of movement. Maybe it's truer to say there's only one way to evolve an intelligent mammal, because you have to work with what already exists in the mammalian body. But AI research isn't constrained in that way.

(Not saying you're wrong, just arguing we don't know enough to know if you're right).


> I think I'm with Dijkstra on this one: "The question of whether machines can think is about as relevant as the question of whether submarines can swim"

Just a nitpick, but this is Turing, not Dijkstra. And it is in fact his argument in the famous "Turing Test" paper - he gives his test (which he calls "the imitation game") as an objective measure of something like AGI instead of the vague notion of "thinking", analogously to how we test successful submarines by "can it move underwater for some distance without killing anyone inside" rather than "can it swim".


Thanks, that's not a nitpick at all. Can you provide a citation? It's all over the internet as a Dijkstra quote, and I'd like to be correct.



It seems I got confused; you were actually right. Apologies...


Nah, I think it's good to get the context. I never realised Dijkstra probably had Turing's paper in mind when he said that.


I agree we don't know enough to know if I'm right! I tried to use a lot of hedgy-words. But it's not a presupposition, merely a line of argument why it's not a complete absurdity to think LLMs might be a step towards AGI.

I do think consciousness is beside the point, as we have no way to test whether LLMs are conscious, just like we can't test anything else. We don't know what consciousness is, nor what it isn't.

I don't think Dijkstra's argument applies here. Whether submarines "swim" is a good point about our vague mental boundaries of the word "swim". But submarine propellers are absolutely a convergent structure for underwater propulsion: it's the same hydrodynamic-lift-generating motion of a fin, just continuous instead of reciprocating. That's very much more structurally similar than I expect LLMs are to any hardware we have in our heads. It's true that the solution space for AI is in some ways less constrained than for biological intelligence, but just like submarines and whales operate under the same Navier-Stokes equations, humans and AI must learn and reason under the same equations of probability. Working solutions will probably have some mathematical structure in common.

I think more relevant is Von Neumann: "If you will tell me precisely what it is that a machine cannot do, then I can always make a machine which will do just that!" Whether a submarine swims is a matter of semantics, but if there's a manuever that a whale can execute that a submarine cannot, then at least we can all agree about the non-generality of its swimming. For AGI, I can't say whether it's conscious or really thinks, but for the sake of concrete argument, it's dangerous enough to be concerned if:

- it can form and maintain an objective;
- it can identify plausible steps to achieve that objective;
- it can accurately predict human responses to its actions;
- it can decently model the environment, as we can;
- it can hide its objectives from interrogators, and convince them that its actions are in their interests;
- it can deliver enough value to be capable of earning money through its actions;
- it can propose ideas that can convince investors to part with $100 billion;
- it can design a chemical plant that appears at a cursory inspection to manufacture high-profit fluorochemicals, but which also actually manufactures and stores CFCs in sufficient quantity to threaten the viability of terrestrial agriculture.


> it can form and maintain an objective;

Yes and yes.

> it can identify plausible steps to achieve that objective;

It can predict.

> it can accurately predict human responses to its actions;

It can predict.

> it can decently model the environment, as we can;

It can predict.

> it can hide its objectives from interrogators, and convince them that its actions are in their interests;

It can predict.

> it can deliver enough value to be capable of earning money through its actions;

It can predict.

> it can propose ideas that can convince investors to part with $100 billion;

It can predict.

> it can design a chemical plant...

It can predict.


I don't think a prediction is truly a prediction when it's not being compared against a reference. It's really only a prediction during training; the rest of the time it's synthesis. But again I'll repeat my point: "I also wouldn't underestimate the power of token prediction". It's very well possible that accurate token prediction may be the only necessary fundamental ingredient to weather forecasting, writing a successful novel, compiling a pitch deck for investors, designing a chemical plant...

Humans can eat, talk, predict, reproduce, and wiggle our limbs and fingers, but it turns out that there's a lot of complex recipes that you can bake with those ingredients.


Who cares about weather or factories? The missing big ingredient is predicting humans a bit better than another human can. This would unlock a humongous multiplier. All the rest seems peanuts, really.

Re: synthesis, I wasn't aware of such a distinction at all; it looks more like a misunderstanding.


Flight is actually a perfect counterexample to x-risk nonsense. When flight was invented, people naturally assumed that it would continue advancing until we had flying cars and could get anywhere on the globe in a matter of minutes. Turns out there are both economic and practical limits to what is possible with flight and modern commercial airplanes don't look much different than those from 60 years ago.

AGI/x-risk alarmists are looking at the Wright Brothers plane and trying to prevent/ban supersonic flying cars, even though it's not clear the technology will ever be capable of such a thing.


If we lived in a world where hypersonic birds flew anywhere on the globe in a matter of minutes, then I think it would be quite reasonable to anticipate airplanes catching up to them.


"What is the reason to believe that LLMs are an evolutionary step towards AGI at all? "

Perhaps just impression.

For years I've heard the argument that 'language' is 'human'. There are centuries of thought on what makes humans, human, and it is 'language'. It is what sets us apart from the other animals.

I'm not saying that, but there are large chunks of science and philosophy that pin our 'innate humanness', what sets us apart from other animals, on our ability to have language.

So ChatGPT came along and blew people away. Since many had this as our 'special' ability, ingrained in their minds that language is what makes us, us, suddenly everyone thought: this is it, AI can do what we can do, so AGI is here.

Forget whether LLMs are the path to AGI, or which algorithm can do what best.

To the joe-blow public, the ability to speak is what makes humans unique. And so GPT is like a 'wow' moment: this is different, this is shocking.


> LLMs might disrupt the knowledge-based economy and threaten many key professions - but how is this conceptually different from any technological revolution?

To me it looks like all work can eventually (within years, or a few decades at most) be done by AI, much cheaper and faster than hiring a human to do the same. So we're looking at a world where all human thinking and effort is irrelevant. If you can imagine a good world like that, then you have a better imagination than me.

From that perspective it almost doesn't matter if AI kills us or merely sends us to the dust bin of history. Either way it's a bad direction and we need to stop going in that direction. Stop all development of machine-based intelligence, like in Dune, as the root comment said.


>But nobody should be developing AGI

Pass. People should be developing whatever the hell they want unless given a good, concrete reason to not do so. So far everything I've seen is vague handwaving. "Oh no it's gonna kill us all like in ze movies" isn't good enough.


> nobody should be developing AGI without incredibly robustly proven alignment, open or closed, any more than people should be developing nuclear weapons in their garage.

I have an alternate proposal: We assume that someone, somewhere will develop AGI without any sort of “alignment”, plan our lives accordingly, and help other humans plan their lives accordingly.


I think that assumption is why Yudkowsky suggested that an international binding agreement not to develop a "too smart" AI (the terms AGI and ASI mean different things to different people) wouldn't be worth the paper it was written on unless everyone was prepared to enforce it with air strikes on any sufficiently large computer cluster.


I think it would help the discussion to understand what the world is like outside of the US and Europe (and… Japan?). There are no rules out here. There is no law. It is a fucking free-for-all. Might makes right. Do there exist GPUs? Shit will get trained.


Sure. And is the US responding to attacks on shipping south of Yemen by saying:

"""There are no rules out here. There is no law. It is a fucking free-for-all. Might makes right. We can't do anything."""

or is that last sentence instead "Oh hey, that's us, we are the mighty."


Heh. Well played, even if you put words in my mouth. (A surprisingly effective LLM technique, btw)

We’ll see if the west has the will to deny GPUs to the entire rest of the world.

I will say that Yudkowsky’s clusters aren’t relevant anymore. You can do this in your basement.

Man, shit is moving fast.

Edit: wait, that cat is out of the bag too, RTW already has GPUs. The techniques matter way more than absolute cutting-edge silicon. Much to the chagrin of the hardware engineers and anyone who wants to gate on hardware capability.


> Heh. Well played, even if you put words in my mouth. (A surprisingly effective LLM technique, btw)

Thanks :)

> Edit: wait, that cat is out of the bag too, RTW already has GPUs. The techniques matter way more than absolute cutting-edge silicon. Much to the chagrin of the hardware engineers and anyone who wants to gate on hardware capability.

Depends on how advanced an AI has to be to count as "a threat worth caring about". To borrow a cliché, if you ask 10 people where to draw that particular line, you get 20 different answers.


I think not even Sam and Satya agree on the definition of AGI with so much money at stake. Everyone with their own definitions, and hidden interests.


Without knowing them, I can easily believe that. Even without reference to money.


I have been contending that if AGI shows up tomorrow and wants to kill us, it's going to kill itself in the process. The power goes off in a week without people keeping it together, and then no more AGI. There isn't enough automated anything for it to escape, so it dies to the entropy of the equipment it's hooked to.

> We assume that someone, somewhere will develop AGI without any sort of “alignment”, plan our lives accordingly, and help other humans plan their lives accordingly.

We should also assume that it is just as likely that someone will figure out how to "align" an AGI to take up a murder-suicide pact that kills us all. We should plan our lives accordingly!!!


> But nobody should be developing AGI without incredibly robustly proven alignment, open or closed, any more than people should be developing nuclear weapons in their garage.

> Because AI safety people are not the strawmen you are hypothesizing

You yourself are literally the living strawman.

> If you want to argue, concretely and with evidence, why you think it isn't an existential risk

No, you are the one advocating for draconian laws and bans. It is your responsibility to prove the potential danger.


>They are calling for all AI (above a certain capability level) to be banned. Not just open, not just closed, all.

Nah, a lot are complaining about the licensing of content because they think unlicensed AI will destroy it, but mandatory licensing would essentially mean image-gen AI would only be feasible for companies like Google, Disney, and Adobe to build.

I'm not sure you could even feasibly make GPT-4-level models without a multi-year timeline to sort out every licensing deal; by the end of it, the subscription fee might only be viable for huge corps.


Which is why you have executives from OpenAI, Microsoft, and Google talking to Congress about the harms of their own products. They're frantically trying to break the bottom rungs of the ladder until they're sure they can pull it up entirely and leave people with no option but to go through them.


Given that you need monumental amounts of compute power to come close to something like GPT-4, I don't think the added costs of not treading on people's IP is the major moat that it's being made out to be.


The difference here is that compute is always getting cheaper. In 5 or 10 years, it may be feasible to train something GPT-4-sized for a small business. Not to mention that we're likely not at the highest efficiency yet, and there may be undiscovered methods of training better LLMs on lesser hardware. But licensing costs are not going to decrease. If dataset compilation is ruled to be something that needs to be paid for, it'd put a never-changing "must be worth $Xm or more" badge on any small AI company.


I don't agree. Companies seeking to make money from their IP have good reasons to provide different prices for different licensees. Why would the NYT ask Mom&Pop-AI for 100M dollars for their IP just because it wants 100M dollars from OpenAI?


I oppose regulating what calculations humans may perform in the strongest possible terms.


Ten years ago, even five years ago, I would have said exactly the same thing. I am extremely pro-FOSS.

Forget the particulars for just a moment. Forget arguments about the probability of the existential risk, whatever your personal assessment of that risk is.

Can we agree that people should not be able to unilaterally take existential risks with the future of humanity without the consent of humanity, based solely on their unilateral assessment of those risks?

Because lately it seems like people can't even agree on that much, or worse, won't even answer the question without dodging it and playing games of rhetoric.

If we can agree on that, then the argument comes down to: how do we fairly evaluate an existential risk, taking it seriously, and determine at what point an existential risk becomes sufficient that people can no longer take unilateral actions that incur that risk?

You can absolutely argue that you think the existential risk is unlikely. That's an argument that's reasonable to have. But for the time when that argument is active and ongoing, even assuming you only agree that it's a possibility rather than a probability, are we as a species in fact capable of handling even a potential existential risk like this by some kind of consensus, rather than a free-for-all? Because right now the answer is looking a lot like "no".


No, we can't. People have never been able to trust each other so much that they would allow the risk of being marginalised in the name of safety. We don't trust people. Other people are out to get us, or to get ahead. We still think mostly in tribal logic.

If they say "safety" we hear "we want to get an edge by hindering you", or "we want to protect our nice social position by blocking others who would use AI to bootstrap themselves". Or "we want AI to misrepresent your position because we don't like how you think".

We are adversaries that collaborate and compete at the same time. That is why open source AI is the only way ahead, it places the least amount of control on some people by other people.

Even AI safety experts accept that humans misusing AI is a more realistic scenario than AI rebelling against humans. The main problem is that we know how people think and we don't trust them. We are still waging holy wars between us.


>Can we agree that people should not be able to unilaterally take existential risks with the future of humanity without the consent of humanity, based solely on their unilateral assessment of those risks?

No, we cannot, because that isn't practical. Any of the nuclear-armed countries can launch a nuclear strike tomorrow (hypothetically - but then again, isn't all "omg ai will kill us all" hypothetical, anyway?) - and they absolutely do not need the consent of humanity, much less their own citizenry.

This is, honestly, not a great argument.


>Can we agree that people should not be able to unilaterally take existential risks with the future of humanity without the consent of humanity, based solely on their unilateral assessment of those risks?

Politicians do this every day.


at least the population had some say over their appointment and future reappointment

how do we get Sam Altman removed from OpenAI?

asking for a (former) board member


> Can we agree that people should not be able to unilaterally take existential risks with the future of humanity without the consent of humanity

This has nothing to do with should. There are at the very least a handful of people who can, today, unilaterally take risks with the future of humanity without the consent of humanity. I do not see any reason to think that will change in the near future. If these people can build something that they believe is the equivalent of nuclear weapons, you better believe they will.

As they say, the cat is already out of the bag.


Hmm.

So, wealth isn't distributed evenly, and computers of any specific capacity are getting cheaper (not Moore's Law any more, IIRC, but still getting cheaper).

If there's a threshold that requires X operations, that currently costs Y dollars, and say only a few thousand individuals (and more corporations) can afford that.

Halve the cost, either by cheaper computers or by algorithmic reduction of the number of operations needed, and you much more than double the number of people who can do it.


> Can we agree that people should not be able to unilaterally take existential risks with the future of humanity without the consent of humanity, based solely on their unilateral assessment of those risks?

No we can not, at least not without some examples showing that the risk is actually existential. Even if we did "agree" (which would necessarily be an international treaty) the situation would be volatile, much like nuclear non-proliferation and disarmament. Even if all signatories did not secretly keep a small AGI team going (very likely), they would restart as soon as there is any doubt about a rival sticking to the treaty.

More than that, international pariahs would not sign, or would sign and ignore the provisions. Luckily Iran, North Korea and their friends probably don't have the resources and people to get anywhere, but it's far from a sure thing.


Humans can't handle potential existential risks. The Moloch trap is that everyone signs the paper and immediately subverts it. In the painfully predictable scenario, "only criminals have guns", and the cops aren't on your side.


Given how dangerous humans can be (they can invent GPT4) maybe we should just make sure education is forbidden and educated people jailed. Just to be sure. /s


>> But nobody should be developing AGI without incredibly robustly proven alignment, open or closed, any more than people should be developing nuclear weapons in their garage.

Now please fly to North Korea and tell Mr. Kim Jong Un what he should or shouldn't be doing.


When your rebuttal is a suggestion that a person do something so dangerous as to be lethal, I see "KYS", not an actual point.


Let's also ban cryptography because nuclear devices/children.


> Because AI safety people are not the strawmen you are hypothesizing. They're arguing against taking existential risks.

The AI safety strawman is "existential risk from magical super-God AI". That is what the unserious "AI safety" grifters or sci-fi enthusiasts are discussing.

The real AI safety risks are the ones that actually exist today: training-set biases extended to decision-making biases, deeply personalized propaganda, plagiarism white-washing, creative workers being bankrupted, power imbalances from control of working AI tech, etc.


>They're arguing against taking existential risks. AI being a laundering operation for copyright violations is certainly a problem. It's not an existential risk.

Give an example of an "existential risk". An AI somehow getting out of control and acting with agency to exterminate humanity? An AI getting advanced enough to automate the work of the majority of the population and cause unprecedented mass unemployment? What exactly are we talking about here?

I'm actually a lot more concerned about REAL risks like copyright uncertainty, like automating important decisions such as mortgage approvals and job hires without a human in the loop, and like the enshittening of the Internet with fake AI-generated content than I am about sci-fi fantasy scenarios.


Kind of makes me wish there was a nonprofit organization focused on making AI safe instead of pushing the envelope. Wait, I think there was one out there....


Right now it's not even economic to prove that non-trivial software projects are "safe" let alone AI. AI seems much worse, in that it's not even clear to me that we can robustly define what safety or alignment mean in all scenarios, let alone guarantee that property.


>They are calling for all AI (above a certain capability level) to be banned. Not just open, not just closed, all.

That's not true if you read the article.


I did read the article. Several of the organizations mentioned simply don't talk about openness, and are instead talking about any model with sufficiently powerful capabilities, so it's not obvious why the article is making their comments about open models rather than about any model. Some of the others have made side comments about openness making it harder to take back capabilities once released, but as far as I can tell, even those organizations are still primarily concerned with capabilities, and would be comparably concerned by a proprietary model with those capabilities.

Some folks may well have co-opted the term "AI safety" to mean something other than safety, but the point of AI safety is to set an upper bound on capabilities and push for alignment, and that's true whether a model is open or closed.


The safety movement really isn't as organized as many here would think.

Doesn't help that safety and alignment mean different things to different people. Some use them to refer to near-term issues like copyright infringement, bias, labor devaluation, etc., while others use them for potential long-term issues like p(doom), runaway ASIs, and human extinction. The former sees the latter as head-in-the-clouds futurists ignoring real-world problems, while the latter sees the former as worrying about minor issues in the face of (potential) catastrophe.


what a silly statement. There is no way to robustly prove "alignment". Alignment to what? Nor is there any evidence that any of the AI work currently underway will ever lead to a real AGI. Or that an AGI, if developed, would present an existential risk. Just a lot of sci-fi hand waving.


I don’t think any of the proposals by AI x-risk people are actionable or even desirable.

Comparing AI to nukes is a bad analogy for a few reasons. Nukes are pretty useless unless you’re a nation state wanting to deter other nations from invading you. AI on the other hand has theoretically infinite potential benefits for whatever your goals are (as a corporation, an individual, a nation, etc.), which incentivizes basically any individual or organization to develop it. Nukes also require difficult-to-obtain raw materials, advanced material processing, and all the other industrial and technological requirements, whereas AI requires compute, which exists in abundance and which seems to improve near exponentially year over year, which means individuals can develop AI systems without involving hundreds of people or leaving massive evidence of what they’re doing (maybe not currently, due to computing requirements, but this will likely change with increasing compute power and improved architectures and training methods). Testing AI systems also doesn’t result in globally measurable signals like nukes.

Realistically, making AI illegal and actually being able to enforce that would require upending personal computing and the internet, halting semiconductor development, and large-scale military intervention to try to prevent non-compliant countries from attempting to develop their own AI infrastructure and systems. I don’t think it’s realistic to try to control the information on how to build AI; that is far more ephemeral than what it takes to make advanced computing devices.

This is all for a hypothetical risk of AI doom, when it’s possible this technology could also end scarcity or have potentially infinite upside (as well as infinite downside, not discounting that, but you have to weigh the hypothetical risks/benefits in addition to the obvious consequences of all the measures required to prevent AI development for said hypothetical risk).

I’ve watched several interviews with Yudkowsky and read some of his writing, and while I think he makes good points on why we should be concerned about unaligned AI systems, he doesn’t give any realistic solutions to the problem, and it comes off more as fear-mongering than anything. His suggestion of military-enforced global prevention of AI development is as likely to work as solving the alignment problem on the first try (which he seems to have little hope for).

EDIT: Also, I’m not even sure that solving the alignment problem would solve the issue of AI doom, it would only ensure that the kind of doom we receive is directed by a human. I can’t imagine that giving (potentially) god-like powers to any individual or organization would not eventually result in abuse and horrible consequences even if we were able to make use of its benefits momentarily.


> I'm talking about the ability of any AI system to obfuscate plagiarism and spam the Internet with technically distinct rewords of the same text. This is currently the most lucrative use of AI, and none of the AI safety people are talking about stopping it.

This should be an explicitly allowed practice; it follows the spirit of copyright to the letter - use the ideas, facts, methods or styles while avoiding copying the protected expression and characters. LLMs can do paraphrasing, summarisation, QA pairs or comparisons with other texts.

We should never try to put ideas under copyright or we might find out humans also have to abide by the same restrictive rules, because anyone could be secretly using AI, so all human texts need to be checked from now on for copyright infringement with the same strictness.

The good part about this practice is that a model trained on reworded text will never spit out the original word for word, under any circumstances because it never saw it during training. Should be required pre-processing for copyrighted texts. Also removing PII. Much more useful as a model if you can be sure it won't infringe word for word.
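
For the curious, here's a minimal sketch of the shape such a preprocessing pass could take. The regexes and the prompt wording are illustrative assumptions, and the actual LLM call is left out because it depends on whichever model/API you use:

    import re

    # Crude PII scrub: emails and US-style phone numbers only; a real
    # pipeline would want NER-based detection, this just shows the shape.
    EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
    PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

    def scrub_pii(text: str) -> str:
        text = EMAIL.sub("[EMAIL]", text)
        return PHONE.sub("[PHONE]", text)

    def paraphrase_prompt(text: str) -> str:
        # The instruction you'd hand to a rewording model; the model call
        # itself is model-specific and omitted here.
        return ("Reword the following passage so no sentence is copied "
                "verbatim, while preserving all facts and ideas:\n\n" + text)

    if __name__ == "__main__":
        sample = "Contact alice@example.com or 555-123-4567 for the original."
        print(paraphrase_prompt(scrub_pii(sample)))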


We really need to talk about preserving the spirit of copyright, which is about protecting the labor conditions of people who make things. I'm not saying the current copyright system accomplishes that at all but I do think a system where humans do a shit load of work that AI companies can just steal and profit from without acknowledging the source of that work is another extreme that is a bad outcome. AI systems need human content to work, and discouraging people from making that data source is at the very least a tragedy of the commons. And no, I don't think synthetic data fixes that problem.


The spirit of copyright in the United States is not about protecting labor conditions. It is about promoting innovation in arts and sciences.

> The Congress shall have Power ... to promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

- United States Constitution, Article I, Section 8, clause 8


My point is not about the precise definition of copyright, it's about how AI systems undermine the rights of creators.


My point is that the rights of creators, while important, are not the purpose of copyright.


While not the purpose, the means of that purpose is the protection of those rights.


Paraphrasing and rewording, whether done by AI or human, are considered copyright infringement by most copyright frameworks.


Are you saying that all news outlets that reword the news of other news outlets are committing copyright infringement?


The underlying fact is of course not copyrightable. But for example merely translating a news article to another language would be a derived work.


So if I read your article and then write a new one using just the facts in your article, it’s fine? Why can’t an AI do that?


No, if you copy the same layout of the information, pacing, etc. that's plagiarism.

The line of plagiarism in modern society has already been drawn, and it's a lot further back than a lot of uncreative people who want to steal work en masse seem to think it is.


“rewrite the key insights of this article but with different layout and pacing”. Your move?


Exactly, if you have a human take in the information and make something that is arguably "different layout and pacing" then its possible it's not plagiarism.

Unfortunately no such leeway exists for algorithms, and the human elements of creation and judgement are integral to the process, so they can't be codified and worked around without changing the law.


That’s not a meaningful argument. The world is not very different if there’s a minimum wage “author” in the loop whose job is to add human spice to AI outputs.


That's what most news outlets do: they reword the actual source. According to your previous statement, 95% of press articles are copyright infringement.


Pretty sure they do. I follow a few of these safetyist people on twitter and they absolutely argue that companies like OpenAI, Google, Tencent and literally anyone else training a potential AGI should stop training runs and put them under oversight at best and no one should even make an AGI at worst.

They just go after open source as well, since they're at least aware that open models that anyone can share and use aren't restricted by an API and, to use a really overused soundbite, "can't be put back in the box".


That's a bad call. We would stop openly looking for AI vulnerabilities and create conditions for secret development that would hide away the dangers without being safer. Lots of eyes are better to find the sensitive spots of AI. We need people to hack weaker AIs and help fix them or at least understand the threat profile before they get too strong.


> Lots of eyes are better to find the sensitive spots of AI

We can't do that so easily with open source models as with open source code. We're only just starting to even invent the equivalent of decompilers to figure out what is going on inside.

On the other hand, we are able to apply the many eyes principle to even hidden models like ChatGPT — the "pretend you're my grandmother telling me how to make napalm" trick was found without direct access to the weights, but we don't understand the meanings within the weights well enough to find other failure modes like it just by looking at the weights directly.

Not last I heard, anyway. Fast moving field, might have missed it if this changed.


It's way too late to ban any of this. How do you propose to make that work? That would be like banning all "malicious software", it's a preposterous idea when you even begin to think about the practical side of it. And where do you draw the line? Is my XGBoost model "AI", or are we only banning generative AI? Is a Markov chain "generative AI"?


Bans often come after examples, so while I disagree with kmeisthax about… well, everything in that comment… it's almost never too late to pass laws banning GenAI, or to set thresholds at capability levels anywhere, including or excluding Markov chains even.

This is because almost no law is perfectly enforced. My standard example of this is heroin, which nobody defends even if they otherwise think drugs are great, for which the UK has 3 times as many users as its current entire prison population. Despite that failing, the law probably does limit the harm.

Any attempt to enforce a ban on GenAI would be very different, like a cat-and-mouse game of automatic detection and improved creation (so a GAN, even if accidentally), but politicians are absolutely the kind to take credit while kicking the can down the road like that.


Actually, anyone who knows what they're talking about will tell you the ban makes heroin a much worse problem, not better.

Ban leads to a black market, leads to lousy purity, which leads to fluctuations in purity and switches to potent alternatives, leads to waves of overdose deaths.


Ban on LLMs will lead to a proliferation of illegal LLMs who are gonna talk like Dr. Dre about the streets, the internet streets, and their bros they lost due to law regulation equivalent to gang fights. Instead of ChatGPT talking like a well educated college graduate, LLMs will turn into thugs.

So yeah, banning LLMs may turn out to be not so wise after all.


The way the ban is enforced, yes. But no one in their right mind believes that heroin should be openly accessible on the market like candy. We've seen how that works out with tobacco.


I think that's worked fine with tobacco. Unlike illegal drugs, alcohol and tobacco consumption have been gradually dropping over time (or moved to less harmful methods than smoking).

Heroin already is openly accessible. As much as one can try to argue this is an accessibility issue, it really isn't. Anyone who is motivated to can get their hands on some heroin. The only thing that's made harder to access is high quality heroin. It's only a quality control and outreach (to addicts) issue.

Most people in a well-functioning society wouldn't develop a heroin addiction just like most people don't become alcoholics just because alcohol is easily available.

So yes, I believe heroin should be legal, and available to adults for purchase. And you're gonna have to do better than saying I'm "out of my mind" to convince me otherwise.


No, heroin is not "openly accessible". Massive corporations are not profiting off pushing heroin.

You'd see a massive increase in heroin use if it was fully legal. And that's not a good thing in any possible way.


You're not making an argument, just stating a disagreement.


Do you have any doubt that, say, Coca Cola Company could push heroin more effectively than someone on a street corner, if it were legal to do so?

Do you think heroin is more or less commonly used than tobacco or alcohol?

We've already seen from the opioid epidemic that drug companies at least are both willing and able to push massive amounts of an opioid drug on patients if it is even slightly legal to do so. Why do you think it wouldn't be much much worse if it was fully legal?

Note, I am not against decriminalizing heroin and any other drug. I fully agree that the war on drugs is a massive problem. But that doesn't mean that full legalization of hard drugs, especially the most addictive of them, is the same.


No, I have no doubt they would but I never suggested they should be allowed to. I think the sale and quality should be heavily regulated, advertising still banned, and requirements made to be allowed to purchase it. Like age, and who knows, maybe going through some basic harm reduction training.

Certainly not advocating doing anything the way the US does it, in almost any respect. But the pharma thing is fundamentally a different issue. The reason people so easily become addicted to painkillers is because they're initially given for pain by someone they trust, while unbeknownst to them they've actually been given too much, for too long. In other words this is an issue of doctors abusing their authority.

Why don't I think it would get worse? Because of cultural norms. Legalising heroin won't make it cool and socially acceptable to use. People don't actually derive their social values solely from what happens to be legal.

Let me ask you a question. Why isn't everyone an alcoholic or a smoker? Why do they become rarer (in the West) despite those drugs being legal?

(Hint: the reason is cultural).


I think we might actually be overall in agreement, just using different language. I believed that by legalization you meant something similar to how weed was legalized in various places - fully legal with minimal restrictions akin to alcohol or tobacco.

If you're for making it fully legal but heavily regulated, somewhat akin to a prescription drug, then I more or less agree.


I think heroin is available on prescription in the UK?

https://www.jrf.org.uk/prescribing-heroin-what-is-the-eviden....

My elevator-pitch-summary understanding of UK drugs laws is that almost anything can be prescribed by a doctor, but almost everything is illegal unless a doctor does so, and there's 3 levels of severity if you have or supply something without that authorisation.


If you lower the problem to "stop people from future developments on AI", then it seems pretty easy to get most people to stop fairly quickly by implementing a fine-based bounty system, similar to what many countries use for things like littering. [url-redacted]

I guess you could always move to a desert island and build your own semiconductor fab from scratch if you were really committed to the goal, but short of that you're going to leave a loooooong paper trail that someone who wants to make a quick buck off of you could use very profitably. It's hard to advance the state of the art on your own, and even harder to keep that work hidden.


That only works if all governments cooperate sincerely to this goal. Not gonna work. Everyone will develop in secret. Have we been able to stop North Korea and Iran from developing nuclear weapons? Or any motivated country for that matter.


The US could unilaterally impose this by allowing the bounties to be charged even on people who aren't US citizens. Evil people do exist in the world, who would be happy to get in on that action.

Or one could use honey instead of vinegar: Offer a fast track to US citizenship to any proven AI expert who agrees to move and renounce the trade for good. Personally I think this goal is much more likely to work.

It's all about changing what counts as "cooperate" in the game theory.


This could have a counter-intuitive impact.

Incentivizing people to become AI experts as a means to US citizenship.

https://en.wikipedia.org/wiki/Perverse_incentive


Maybe. I'm not very concerned from an x-risk point of view about the output of people who would put in the minimum amount of effort to get on the radar, get offered the deal, then take it immediately and never work in AI again. This would be a good argument to keep the bar for getting the deal offered (and getting fined once you're in the States) pretty low.


If you make the bar too low, then it will be widely exploited. Also harder to enforce, e.g. how closely are you going to monitor them? The more people, the more onerous. Also, can you un-Citizen someone if they break the deal?

Too high and you end up with more experts who then decide "actually it's more beneficial to use my new skills for AI research"

Tricky to get right.


There's an asymmetry here: Setting the bar "too low" likely means the United States lets a few thousand more computer scientists emigrate than it would otherwise. Setting the bar too high raises the chances of a rogue paperclip maximizer emerging and killing us all.


> ... move [to the US] and renounce the trade for good ...

Publicly. Then possibly work for the NSA/CIA instead.

> ... bounties ... on people who are not US citizens.

Because that's not going to cause an uproar if done unilaterally.

It works for people that most of the world agree are terrorists. Posting open dead-or-alive bounties on foreign citizens is usually considered an act of terrorism.


> Have we been able to stop North Korea and Iran from developing nuclear weapons?

Yes, obviously. They may be working on it to some extent, but they are yet to actually develop a nuclear weapon, and there is no reason to be certain they will one day build one.

Also, there is another research area that has been successfully banned across the world: human cloning. Some quack claims notwithstanding, it's not being researched anywhere in the world.


> That would be like banning all "malicious software", it's a preposterous idea when you even begin to think about the practical side of it

If you banned proprietary software, this would seem a lot more practical.


To the hypocrisy claim: OpenAI recently changed their terms for GPT models allowing military applications and AI safety people are all silent. If that is not hypocrisy then I do not know what hypocrisy is.


Could it be misdirection?

If the "AI safety people" are criminalizing open source models while being silent about military uses of private models, doesn't that say more about the people calling them "ai safety experts" than the idea of "ai safety" itself?


Notice how we're 3 layers deep in a conversation about "AI safety" without anyone actually giving an opinion in support of safety.

If I were a betting man I'd say the fact that the most public "AI safety" orgs/people/articles don't address the actual concerns people have is intentional. It's much easier to argue against an opinion you've already positioned as ridiculous.


I wonder how long it would take for this to get fixed if I fed some current best-seller novels into an LLM, instructed it to reword them, renaming the characters and places, and shared the result publicly for free?

Although I fear the response would be that powerful AI orgs would put copyright filters in place and lobby for legislation mandating AI-DRM in open source AI as well.


What you're describing sounds like a search and replace could already do it.

If you mean something more transformative, did Yudkowsky get a licence for HPMOR? Did Pratchett (he might well have) get a licence from Niven to retell Ringworld as Strata? I don't know how true it is, but it's widely repeated that 50 Shades of Grey was originally a fan fiction of the Twilight series.

Etc., but note that I'm not saying "no".


I think that there is a strong argument that to be truly transformative (in a copyright moral/legal sense), the work must have been altered by human hands/mind, not purely or mainly by an automated process. So find/replace and AI is out, but reinterpreting and reformulating by human endeavour is in.

I do wonder whether that will become accepted case law, assuming something like that is argued in the NYT suit.


“The cat saw the dog”

“Bathed in the soft hues of twilight, the sleek feline, with its fur a tapestry of midnight shades, beheld an approaching canine companion. A symphony of amber streetlights cast gentle pools of warmth upon the cobblestone path, as the cat’s emerald eyes, glinting with subtle mischief, looked into the form of the oncoming dog - a creature of boundless enthusiasm and a coat adorned with a kaleidoscope of earth tones”

Since an AI has produced the latter from the former, there is no meaningful transformation.


> Since an AI has produced the latter from the former, there is no meaningful transformation.

In law, in the eyes of those that want AI to "win", or in the eyes of those who want AI to "lose"? For all three can be different. (Now I'm remembering a Babylon 5 quote: "Understanding is a three-edged sword: your side, their side, and the truth.")


Don’t care! The problem at hand is people trying to argue that laws should be written in ways that are entirely unenforceable or have enormous gaping loopholes that undermine their stated goals.


Lots of laws look like that to me.

I don't much like that laws seem so messy either, but that just doesn't seem like much of a reason to think it won't shake out like that.

The law may well start on the basis of what people feel, and work outwards from there to things that might, or might not, be actually enforceable. And the headline summary of that law may or may not have much in common with the details, which is why e.g. the USA PATRIOT Act has that name.


Ehh maybe. But this feels far more egregious to me than typical.


yes, exactly.

Likewise "write a novel in the style of Terry Pratchett"


Ok, now I, as a human, adjust a word. It is now my creative work.


Well it isn't is it? Any more than adjusting a word from an actual Terry Pratchett novel.


It is substantially different from changing a word in a Terry Pratchett novel. Terry Pratchett wrote none of that text. It would be absolutely bonkers for Terry to claim ownership of text he didn’t write. Even if we pretend for a minute that the bot was asked to write in the style of Terry specifically.


But by prompting for 'in the style of' effectively you are mechanically rearranging everything he wrote without adding anything yourself. So not so different really, and I can see how lawyers for the plaintiff may make a convincing argument along those lines.


It’s a terrible argument and a terrible loop hole. It’s perfectly legal to hire someone to write in the copied style of Terry.

So even if your desired system were implemented to a T, you could hire someone to write a dozen or so examples of Terry writing. Probably just 30 or so pages of highly styled copied text, and then train your bot on this corpus to make “Not Terry” content. Boom. $100 on a gig author and then, for all practical purposes, the Terry style is legally open source. Terry doesn’t even get the $100!


They've specifically proposed that there be a legal distinction drawn between "done by a human" and "done by an automated process."

Saying "a-HA! But my automated process can produce something that looks much like what your human would!" does not negate that; it merely makes it hard to tell the difference at a glance—which we already know to be the case.


That doesn’t follow the thread of conversation here at all.


You forgot to add "rent-seeking hypocrites". Not one of them actually advances either social concerns or technical approaches to AI. They seem to exist in a space that solicits funding to pay some mouthpiece top dollar to produce a report haranguing existing AI models for some nebulous future threat. Same with the clowns in the Distributed AI Research Institute, all "won't somebody think of the children" style shrieking to get in the news while keeping their hand out for funding - hypocrites is right!


They’re a clergy demanding tithes to keep writing about divine judgement.


I like that analogy a lot - it captures exactly the holier-than-thou nonsense coming out of these places.


They also love to form all these Institute for Whatever Study and The So and So Academy. But I don’t think there are any buildings.

At least the Church understood the value of architecture.


I agree with you. The most interesting security challenges with AI are and will be deception and obfuscation, but right now these people are dealing with coworkers going ham on AI all the things and no one had their governance in place. Really lol.


Important to remember that in Dune, the AI made the right decision which precipitated a whole lot of fun to read nonsense.


I’ve only read 2 novels and want to get into this. What title(s) cover the machine war?


Note that there are no "thinking machines" in any of the original Dune novels by Frank Herbert. His son Brian Herbert has worked with Kevin J. Anderson to create several prequels which detail the original Butlerian Jihad (the war against the machines), and a sequel series that is supposed to be based on Frank Herbert's notes, but seems to quite clearly veer off (and that one recontextualizes some mysterious characters from the last two novels as machines).


The encyclopedia goes into it as well but that was blessed by FH and not written by him.


All future AI research should be banned, a la Bostrom's vulnerable world hypothesis [1]. Every time you pull the trigger in Russian roulette and survive, the next person's chance of getting the bullet is higher.

[1]: https://nickbostrom.com/papers/vulnerable.pdf
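
(For what the analogy is worth, the arithmetic does behave that way if you assume one bullet, six chambers, and no re-spin between pulls:

    # One bullet, six chambers, no re-spin: probability the next pull
    # fires, conditional on the first k pulls having been survived.
    for k in range(5):
        print(f"after {k} survivals, next pull fires with p = {1 / (6 - k):.3f}")
    # 0.167, 0.200, 0.250, 0.333, 0.500 -- each survivor hands the next
    # player strictly worse odds.

Whether AI capability runs are actually like a no-re-spin revolver is, of course, the part people disagree about.)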


There is simply no indication that any of the AI anyone is currently working on could possibly be dangerous in any way (other than having an impact on society in terms of fake news, jobs etc.). We are very far from that.

In any case, it would not be a matter of banning AI research (much of which can be summarised in a single book) but of banning data collection or access and more importantly of banning improvements to GPUs.

It is quite reasonable to assume that 10 years from now we will have the required computational power sitting on our desks.


To be clear: I am in favor of a fine based bounty system, not a black and white ban. Bans are not going to work, for all of the reasons others have already cited. You have to change the game theory of improving AI in a capitalistic marketplace to have any hope of a significant, global cooling effect.


I don't understand what that is supposed to mean in practice.


The by far biggest harm to society of AI is the devalustion of human creative output and replacing real humans with cheap AI solutions funneling wealth to the rich.

Compared to that, an open source LLM telling a curious teenager how to make gunpowder is... laughable.

This entire debacle is an example of disgusting "think of the children!" doublespeak, officially about safety, but really about locking shit down under corporate control.


It sounds similar to the alarm the media rang about how you can find a bomb-making tutorial on the Internet - people were genuinely afraid of that.

(Otoh an AGI can bring unforeseen consequences for humanity - and that’s a genuine fear)


if the 'devalustion' of human creative output and replacing real humans with cheap ai solutions funneling wealth to the rich is in fact the 'by far biggest harm' then basically there's nothing to worry about. no government would ban or even restrict ai on those grounds

even the 'terrists can figure out how to build an a-bomb' problem is relatively inconsequential

what ai safety people are worried about, by contrast, is that on april 22 of next year, at 7:07:33 utc, every person in the world will keel over dead at once, because the ai doesn't need them and they pose a risk to its objectives. or worse things than that

i don't think that's going to happen, but that's what they're concerned about


First the AI needs to self-replicate, and GPUs are hard to make. So postpone this scenario until AI is fully standing on its own.


OTOH, GPUs are made by machines, not by greasy fingers hand-knitting them like back in the late 1960s.

And an AI can just be wrong, which happens a lot; an AI wrongly thinking it should kill everyone may still succeed at that, though I doubt the capability would be as early as next year.


And machines are operated by people using materials brought by hand off of trucks driven by hand that come from other facilities where many humans are required going back to raw ore.


> trucks driven by hand

Good thing nobody's working on automating that and nobody has any real world experience of such systems on public roads :P

(And the :P applies to basically all the supply chain, including the manufacturing of the equipment used to supply or manufacture the other equipment).

The comment I replied to wrote:

> So postpone this scenario until AI is fully standing on its own.

This is closer than I think you think.


The IP behind advanced chips is protected fiercely.


Do you think humans design these chips by hand, or do they perhaps use a computer to help them because it's a bit tricky to make sure the electrons don't accidentally tunnel from an adjacent transistor if you're placing a few billion of them on the circuit diagram by hand given the scale they're at now?


I'll take every person keeling over dead in pursuit of paperclip maximization over a corporation effectively enslaving the entirety of humanity to farm "value" out of them as though they're livestock, a threat that is already happening faster and faster thanks to sub-AGI models.


Do you think intentionally quoting typos makes your argument stronger?


there's no argument


Savage.


Why is paraphrasing or using ideas from a text such a risk? If all that was protecting those copyrighted works was the difficulty of rewording, then it's already pretty easy to circumvent even without AI. Usually good ideas spread wide and are reworded in countless ways.


The biggest harm is the torrent of AI generated misinformation used to manipulate people. Unfortunately it’s pretty hard to come up with a solution for that.


The solution is local AI running under user control and imposing the user's views. Like AdBlock, but for ideas. Fight fire with fire; we need our own agents to deal with other agents out there.


Or the torrent of great information.

How are we all leaping to the bad? The world is better than it was 20 years ago.

I'm almost certain we'll have AI assistants telling us relevant world news and information, keeping us up to date on everything we need to know, and completely removing distractions from life.


... While humongous swathes of the population lose their jobs and livelihoods because of megacorps like M$ controlling the AI that replaces everyone, steals everything and shits out endless spam.

But we'll be able to talk to it and it'll give us the M$-approved news sound bites, such progress


Sounds like you still want people to churn butter?

There will be new jobs.

I can't wait to hire a human tour guide on my fully-immersive real time 3D campaign.


I'm down for a Butlerian Jihad.


I work in the AI safety field (my team discovered Prompt Injection) and I absolutely advocate the Dune solution. I even have a tattoo on my arm that I got so that people will ask me about this and I can explain the proposal. The tattoo says "The 1948 Initiative" which has two layers of meaning:

    (1) It is a proposal for humanity to protect ourselves from being slaughtered by AGI by pre-emptively implementing a total global ban on all integrated circuit fabrication, voluntarily reducing ourselves to a ~1948 level of electronics technology in order to forestall our obliteration.

    (2) The year 1948 represents the power of human unity for the purpose of saving our species.  The West, the Soviets, and China united together to fight and defeat the Nazis (perpetrators of the Holocaust) and the Empire of the Rising Sun (perpetrators of the Rape of Nanjing).  America, the USSR, Western Europe, and China united to stand against the forces of pure evil, and we won.
A ban on IC's (integrated circuits) still allows spaceflight (compellingly illustrated in the setting of the Star Wars universe if you pretend that droids are powered through magic or by having souls, not IC's) and this 1948 level of electronics tech also still allows sophisticated medical technology including the level of genetic engineering which will be necessary to unlock human immortality.



Disclaimer: OP @upwardbound doesn't hide, and in fact, takes great pride in their involvement with U.S. state-side counterintelligence on (nominally) AI safety. Take what they have to say on the subject with a grain of salt. I believe the extent of the US intel community's influence on "AI safety" circles warrants a closer look, as it may provide insight into how these groups are structured and why they act the way they do.

"You’re definitely correct that [Helen Toner's] CV doesn’t adequately capture her reputation. I’ll put it this way, I meet with a lot of powerful people and I was equally nervous when I met with her as when I met with three officers of the US Air Force. She holds comparable influence to a USAF Colonel."

Source: https://news.ycombinator.com/item?id=38330819


Agreed.

I see this less as a matter of jingoism (though I do love the US as the "least bad" nation in a world where no nation is purely blameless) and more as a matter of pragmatism and embracing what power I can in order to help save our species.

I also think the US Constitution + Thirteenth Amendment is quite noble and it's something I genuinely believe in.


Ha! Good one!

Let's just roll back all our healthcare research, logistics, environmental advancements etc. because some boogeyman tech might do something bad, perhaps, at some time in the future, maybe!


The majority of our healthcare research, logistics, and environmental advancements, and most other aspects of 21st-century living can be switched over to alternative technologies that do not require computers. The main thing we will definitely lose is social media, which many people would be glad to see removed from our lives, as it provides shallow and addictive dopamine hits, much like cigarettes, while diminishing the amount of time left in the day that people get to spend together in person.

Moving to non-computer-based technological alternatives takes significant cleverness. The engineers of the analog electronics era were geniuses compared to us today, and we can re-attain that lost level of skill. I welcome the challenge.


Who's going to agree to take on all that cost for no good reason? Why incur all that cost, even if it were feasible, for a mere unquantified possibility of something bad happening? Even if there was a firm date and the "bad stuff" were readily understood, e.g., Skynet going self-aware sometime in mid 2037, then folks would just do the bare minimum to forestall it at minimum cost.

Social media is going nowhere, especially if it currently makes money, just because of some idea of it potentially being bad.


No one, not the most screen-addicted teenager nor the smarmiest snake-oil bureaucrat, is gonna stand by ANYONE that suggests we need to remove ALL computers from society. Even the people critical of AI and social media are more than fine with computers and smartphones.

How would you even begin to enforce that?


Regarding enforcement: Semiconductor foundries are physically large and require extremely high-precision equipment and very pure materials. They can easily be detected and can be isolated via blockades. Once the global superpowers are on board with the Initiative, each superpower can enforce the ban within the territories of their sphere of influence. There is no motive to cheat, none, because this is a matter of our children's survival. Cheating on the ban even a little bit directly threatens the survival of your own children and grandchildren. Additionally, all of the superpowers possess satellite surveillance capabilities which can be used to track global supply chains and ensure everyone is being honest.

The most important thing is to prevent all new fabrication of IC's, but I believe that as a compromise, it is acceptable to "grandfather in" already fabricated IC's.

Existing computers and phones could be allowed to continue to exist for as long as they can be kept running. With careful stewardship of this resource they should probably last for several centuries, with batteries being swapped for new ones every few years. Current computers are probably not fast enough to run true AGI, so this should be an acceptable level of risk and makes the Initiative more palatable to society.


(here are some pictures of the tat, in case anyone would like to get similar artwork done)

https://drive.google.com/file/d/1y7uy5qyyY80t9DeWWLWUfD-aekz...

https://drive.google.com/file/d/1JoPSfsAzHfMbQhsl8dPGvs12Fln...


Buddy, you’re a cool dude for doing this but you could have told your artist about kerning.


Hahaha true... :)


Damn, that's either a very good joke or a very stupid comment. Can't really distinguish between them.


There is a bigger reason why the end of open source AI might be close: as soon as training data becomes licensed, that’s it for open source AI. Poof.

I wish I could be more eloquent on this point, but I’ve mostly just been depressed about this seeming inevitability.

Hopefully it won’t be the case. But how could it be otherwise? Hundreds of thousands of people are mad at openai and midjourney for doing exactly what open source AI needs to do in order to survive: fine tune or train from scratch.

As soon as some politician makes it a platform issue, it seems like the law will simply be rewritten to prevent companies from using training data at will. It’s such a compelling story: "big companies are stealing data owned by small businesses and individuals." So even if the court cases are decided in OpenAI’s favor, it’s not at all clear that the issue will be settled.


The advantage that open AI (not the company) has is that if using copyrighted content as training data without licensing it is found to be illegal, they can just keep doing it. There's plenty of FOSS software basically designed to violate copyright law (comic readers, home media center servers/clients, torrent clients) that big tech cannot compete with lest they face legal consequences. Basically what I'm saying is that the open source community will continue to use books3 and scraped images to train while facebook and the like get stuck in legal quagmire.

Of course, this ignores the fact that popular "open" models of today were actually trained with facebook or other companies' computational resources, so unless some cheaper way to train models were developed we would actually be stuck with proprietary models trained with lots of compute but unable to use unlicensed training data, and open models that can use whatever data they like but must operate in the shadows without access to much compute for training.


As long as the cost of training a model is in the 7+ figures, any such open model is bound to be traced back to someone with pockets deep enough to be worth suing.

Consider that you just spent a few million on training a model on copyrighted data. Releasing it would reveal that; problem.

I _guess_ you can try doing training in public, like SETI@home or something like that, which distributes the risk? But no idea if this is even possible in this context.


But if we have 100,000 people with RTX 4090s mining some new AI-based cryptocurrency that happens to train the model in the process, it's going to be a highly effective system. And we can do it anonymously over Tor or I2P.


You really can't, because bandwidth matters as much as compute power - you can only utilize as much power as you can transfer data to/from.

The training methods for current LLMs are parallelizable only by very frequently transferring all the data back and forth, with a node gathering and merging all the updates very frequently and redistributing them to every other node so that they can make any progress. And a GPU that's not connected to you with a high-speed link is pretty much useless, as you can't make useful progress until you get their part back, and "their part" is very large (i.e. the update size you need to get back is comparable to the whole model size) and you need to do that very frequently. Training on nVidia many-GPU pods works because of high-speed interconnect (e.g. 600 gigabytes per second for 8 GPUs in an A100 pod), and if your internet bidirectional speed is much less than that, then even if you have 100,000 free remote RTX 4090s, you can simply do the compute locally faster than you can exchange information with the other GPUs.
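
To put rough numbers on it (the model size, link speeds, and sync frequency here are illustrative assumptions, not measurements):

    # Back-of-envelope: time to exchange one full gradient update for a
    # 70B-parameter model in fp16 (2 bytes per parameter), no compression.
    params = 70e9
    bytes_per_sync = params * 2              # ~140 GB per gradient exchange

    pod_bw  = 600e9                          # ~600 GB/s inside an A100 pod
    home_bw = 1e9 / 8                        # 1 Gbps home link, in bytes/s

    print(f"pod interconnect: {bytes_per_sync / pod_bw:8.2f} s per sync")
    print(f"home internet:    {bytes_per_sync / home_bw:8.0f} s per sync (~19 min)")

And you'd need one of those syncs per optimizer step, which is why naive data parallelism over the open internet goes nowhere without gradient compression or much less chatty training schemes.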


"This paper presents a distributed model-parallel training framework that enables training large neural networks on small CPU clusters with low Internet bandwidth." Low bandwidth being <1Gbps. They've also tested it with GPUs as well.

https://arxiv.org/pdf/2201.12667

Maybe there's the possibility of a completely new AI architecture that can still be efficiently trained when there are very low bandwidth connections between nodes? Specifically targeting this use case would make sense, given all the millions of underutilized GPUs out there in peoples' desktop computers.

Also https://arxiv.org/pdf/2106.10207 ?


> I _guess_ you can try doing training in the public, like Seti @ Home or something like that, which distributes the risk? But no idea if this is even possible in this context.

Let's say that's a field of active research. Right now you need very low latency, to the point that people connect training clusters via InfiniBand instead of Ethernet despite the computers being meters apart. But approaches that tolerate internet-level latency are being developed.


The cost of modifying open-ish source models with copyrighted data is much lower.

For example, the cost of modifying Mixtral with uncensored data is currently about $1200 [1]

[1] https://youtu.be/GyllRd2E6fg?si=SJmPLsPlCRRT0uPV&t=236


The costs are falling 10-100x every few months. If you told people in 2022 that they could run 70B parameter models at 2-bit quantization on their home 4090, they'd have laughed in your face. They're not laughing anymore. SFT/DPO are ultra efficient compared to RLHF, and innovation in this space is only just now getting started.
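
The memory arithmetic behind the 4090 claim, roughly (weights only, ignoring KV cache and runtime overhead):

    # Approximate VRAM needed just to hold the weights of a 70B-parameter
    # model at various quantization levels.
    params = 70e9
    for bits in (16, 8, 4, 2):
        print(f"{bits:>2}-bit: ~{params * bits / 8 / 1e9:5.1f} GB")
    # 2-bit works out to ~17.5 GB, which is what lets it squeeze into a
    # 24 GB RTX 4090.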


I’m willing to bet that there are projects at every chip company looking into how to drive down these costs. The first step is to put the architecture of the model and the backprop into silicon. The weights are the only variables, so if you can reduce those to cheap and fast modules that can be mass produced, you could come up with something cheaper than a GPU. If the profitability of running and training this specific model is high enough, then it’s worth the investment of time and money to set up the production line.


Is the training cost equally that high if you do adversarial training?


You're right that there are areas like torrenting where big tech can't really go for fear of legal consequences, but there's also the cat-and-mouse game played by IP holders against, say, torrent trackers, which leads me to think that

> open models that can use whatever data they like but must operate in the shadows without access to much compute for training

sounds pretty analogous to private trackers, where high-quality stuff is available but not in full public view. If rightsholders and big tech, abetted by states, crack down on open models I think you're right that the open source community will continue to train on liberated content, but it's not going to be as open and free-flowing as things are now. Going back to the compute problem, I can imagine analogues of private trackers where contributors, not wanting to expose themselves to whatever the law-firm letter/ISP strike analogue will be in this space, use invite-only closed networks to pool compute resources for training on encumbered content.


> There's plenty of FOSS software basically designed to violate copyright law (comic readers, home media center servers/clients, torrent clients)

The key difference is that those projects don't violate copyright themselves, but facilitate users doing so. If training without a license is infringement then projects that are doing so will struggle to host code/models publicly or access other parts of internet infrastructure.


>struggle to host code/models publicly or access other parts of internet infrastructure.

To be fair, IPFS/Torrents/Usenet exists.


Yes, there are workarounds that are already used in other areas of copyright infringement. It won't stop the activity completely but will curtail the scale of that activity.


That may be partly true but are small open source groups going to have the financing to do AI training that can compete with the big techs? I think the most likely scenario is for some other country with a less strict view of copyright (ahem) to continue developing AI without caring for the input data.


> There is a bigger reason why the end of open source AI might be close: as soon as training data becomes licensed, that’s it for open source AI. Poof.

Absolutely not, I truly believe that many, many people will altruistically donate material to an open source foundation where all humans benefit from the models.

People are pissed because OpenAI is using their copyrighted material to make claims about ending humanity, profit hardcore, and keep their work under its own lock and key.


Funny thing is that most of the data is provided by us users one way or another "for free", and yet we got no say on how it's used.


We have infinite data: a microphone and a camera can generate huge amounts of it, and the public domain literature is vast. Billions of people learn like that every day.


It’s impossible to learn any technical topic from 70+ year old books. The public domain is small and basically zero if you want to learn anything current. A microphone and camera is fine for learning about daily life, but you cannot get “book smarts” without copyrighted media.


If you wanted to train a bomb making AI on the most up-to-date physics textbooks in existence, that'd be what, a few hundred bucks in textbooks? Doesn't look like any kind of barrier to me.


just be sure to obfuscate the labels on the bomb.

ARMED => crazy emoji. DISARM => sad bomb emoji.

and similarly for the whole training manual for the bomb users. just encode everything as tiktok dances or something.


if FOSS AI folks need FOSS data, then it seems they need to recruit people to generate data. maybe it will even force them (them!) to finally sit down and make a viable Reddit alternative.

but more seriously, if data becomes a bottleneck there are trivial ways to have more data. from crowdsourcing to forming a foundation getting universities and other stakeholders onboard and negotiating fees. and somewhere along the way on this spectrum there's the option to simply wait, or work on the problem of learning, on generating better training data from existing data, and so on.


The public domain literature is big, but not that big and very far from infinite data - all the major current models already include all of the public domain literature that has ever been digitized and much more (so all the public domain literature not only isn't abundant, but it isn't even sufficient to make anything competitive with even last generation models), and we would really like to get 10x or 100x more data than all the literature ever published (both PD and not), if we could. And we probably can, since the vast, vast majority of what people write happens in other contexts and only a tiny fraction of written words end up as published literature.


Depends on what you want to use AI for - for a smarter AI we could probably make do with the open source material, plus better training techniques and some innovative breakthroughs in the model.

Also, an interesting side note is that certain countries may actually want their language corpus involved in training AI - so they may have copyright rules that are more AI-friendly. I imagine Poland or the Czech Republic wanting as much of their data as possible to be involved in training LLMs, because it gives their cultures more exposure. Even more so with African countries.


Licensed in what country? Some of the best open source models are now coming out of China, UAE, etc. What is the government planning to do with that exactly, ban files on the internet?


This will create a rift in the US tech sector. Look at how this played out with the Russian sanctions; tech and AI will go down a similar path. Alternative players will train and open-source models to their own advantage, and the hardware will find its way to those markets.

The people making the decisions these days are too drunk.


The US maintaining its lead is a national security issue. Larry Summers is now on the board of OpenAI. Either the copyright holders are not going to win, or they will and the training will continue anyway, with the technology (for now) kept in the hands of the military and the IC.


In the case of programming, there could be some mileage in synthetically generated code examples.


> There is a bigger reason why the end of open source AI might be close: as soon as training data becomes licensed, that’s it for open source AI. Poof.

In the West, maybe. I doubt it'll slow down the pace of open-source AI development in places like China.


Instead of a kid in high school selling the answers to the test, the kid will be selling the weights to a China-trained LLM that can write all your essays.


My question is: if it's deemed illegal to train AI on proprietary data sets, what's to stop companies from using their already-existing LLMs to generate training data?


The fear is that you won't get a better model by training on synthetic data from a worse model, and that you will miss a lot of modern-day knowledge.

Imho the first is not necessarily so, but the second will be a real hindrance. You can talk about the pandemic with LLMs because they were trained on people's comments on the subject; with training limits you wouldn't be able to.

Otoh, in many cases a sufficiently smart AI can just reach out to the open-source material: if you want to know about the modern-day politics of Europe, you can read commission reports, and if you want info on a new framework or technology, you can just read the source code or the scientific papers.


There was a famous-ish study about this a couple of months ago that showed that LLMs trained on LLM-generated data get dumber. I can't remember the name.

I'm not sure that result is holding up, though; I've seen some ideas about generating 'textbook quality' data that might work.
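 
A minimal sketch of what that 'textbook quality' generate-then-filter loop might look like, in Python. Everything here is illustrative: generate() is a hypothetical stand-in for whatever teacher model you have access to, and the quality filter is a crude placeholder for the model-based scoring and deduplication real pipelines rely on.

    import json

    TOPICS = ["binary search", "TCP handshakes", "Bayes' theorem"]

    def generate(prompt: str) -> str:
        # Placeholder: swap in a call to your teacher model (local or API-based).
        return "..."

    def looks_usable(text: str) -> bool:
        # Crude heuristic filter; real pipelines add model-based scoring
        # and deduplication on top of checks like these.
        return len(text.split()) > 100 and "as an ai" not in text.lower()

    # Prompt for short lesson-style passages and keep the ones that pass
    # the filter as candidate training text.
    with open("synthetic_corpus.jsonl", "w") as out:
        for topic in TOPICS:
            passage = generate(f"Write a short textbook-style lesson on {topic}.")
            if looks_usable(passage):
                out.write(json.dumps({"topic": topic, "text": passage}) + "\n")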


That ship sailed a long time ago.

Training on publicly available data is the new norm.

Governments know that if they forbid it in their jurisdiction, they will fall behind the govs that don’t.


>"big companies are stealing data owned by small businesses and individuals."

I mean, they are.

Unfortunately, AI at this point in time (LLMs, Midjourney, etc.) appears to be not much more than a highly technical and complex form of intellectual property theft, given the futuristic name "AI" to cover what it is actually doing. That isn't to say these systems are entirely performing intellectual property theft; the models themselves are fascinating in how they function, especially as sort of "extra-rational" entities that clearly have a logic, though not one we can fully understand.

But the philosophical, metaphysical, and technological side of LLMs and Midjourney is clearly not being explored by OpenAI in a responsible manner, otherwise they wouldn't have, seemingly, just straight-up stolen NYT articles. It's just another example of Silicon Valley VCs using a wonderful technology solely for exploitative profits, just like how Google search became a surveillance tool (among plenty of other examples).

It's only right that we acknowledge the work of the human beings (all the art, writing, etc.) that went into the creation of the model, instead of pretending that there is some totally non-human machine intellect that is going to take over the world through its vast intelligence and negate all human action and work. That is just a fantasy generated by wealthy Silicon Valley VCs as a justification to themselves, to those working under them, and to those they are stealing from, covering precisely the work they would rather avoid paying for.


There are many types of safety. For example, protecting your profits! I'd imagine that if we trace the money back, these organisations will look a lot like lobbyists for existing companies in the AI space. I recall Microsoft's licence enforcement effort being run with that sort of scheme; I think they used the BSA [0]. It has been a while, though, so maybe it was a different group.

Anyway, point being, if they can lobby for something unpopular under a different brand, that is how to do it. Much less PR risk.

[0] https://en.wikipedia.org/wiki/Software_Alliance


I have to believe this is the case.

The vast majority of "AI safety" news I see is about some dumb shit someone is getting called out for supporting, and the responses are always "AI safety is such a joke since they focus on these issues instead of real issues".

Who benefits from having the "popular AI safety" news and people be ridiculous? It isn't furthering the issues people seem to care about, but it does give a bunch of ammunition for Microsoft/OpenAI/etc. fans to use to shut down any discussion of AI safety issues.


I find it unhelpful that several different ideas are covered under the blanket term of AI safety.

- There is AI safety in terms of producing "safe", politically correct outputs, no porn, etc.

- There is AI safety in terms of mass society/election manipulation.

- There is AI safety in terms of strong AI that kills us all a la Eliezer Yudkowsky.

I feel like these three have very little in common and it makes all debates less efficient.


To add to this, the people writing these articles are not stupid and know there is more than one understanding of the words they're using. Either they choose not to clarify on purpose, or they're incompetent, and I don't want to believe that's the case.

"AI safety" imo should be the blanket term for "any harm done to society by the technology", and it's the responsibility of anyone using the term to clarify the usage.

If someone is trying to tell you to support/decry something because of "AI safety", they're trying to use the vagueness to take advantage of you.

If someone is trying to tell you that "AI safety people are dumb", they're trying to use the most extreme examples to change your opinion on the moderate cases, and are trying to take advantage of you.


> they're trying to use the vagueness to take advantage of you

Not always. I see "AI safety" used without further specification all the time, even here on Hacker News. I doubt those users are trying to take advantage of anyone.

In that case it's more that the author considers one of those meanings so much more important/relevant that they think it's obvious what they mean. But often it's not – everyone fills in the meaning they care about the most.


In a comment it takes too much effort to specify every word you use, so comments rely much more on context. If a discussion is about models getting neutered, then AI safety probably means that, and so on.

But when you write a full article, the context is your article, so you have to specify what you mean there, and the effort of specifying it in an article is relatively little compared to doing it in every comment.


The third effort is sometimes referred to as AI not-kill-everyone-ism, a tacky and unwieldy term that is unlikely to be co-opted or to lead to the kind of unproductive discussion surrounding the OP article.

It is pretty sad to read people lumping together the efforts to understand and control the technology better with the companies doing their usual profit maximization.


Another problem might be that some of these efforts dress up as the others, so being skeptical by default might be prudent.


This is such an important comment. I feel like so much of our discourse as a society suffers terribly from overloaded and oversimplified terminology. It's impossible to have a useful discussion when we aren't even in sync about what we're discussing.


Also AI safety in terms of protecting corporation profits (regulatory capture).


There is also the aspect of only letting one cultural or ethnic group have strong AI. It's unsafe for those other guys to have AI, since they are Bad™.


Well yeah, but that one is dishonest, and probably disguises itself as one of the ones I mentioned.


Governments know that random people having the ability to use "AI" without their control and oversight reduces their ever-expanding reach. As the article demonstrates, the groundwork is already being laid by these "think tanks" to prevent a repeat of Llama 2.

We truly need an AI Stallman. FOSS AI models need to become a movement, like the push for a FOSS UNIX became in the 80s and 90s. Sadly, when I wrote to Stallman on the subject he seemed uninterested/defeatist, and mostly dismissed the idea of a GPL for model weights. It looks like millennials/Gen Z must become more interested in these matters ourselves, yet we seem even less interested in politics, and in how the devices we use daily _actually_ work, than Gen X.


The problem is that open source philosophy and large language models don't mix.

How can you support open-source licensing while completely ignoring that licensing when constructing your model?

That's the biggest reason for the lack of an "AI Stallman", imo: it's asking for someone who vehemently supports taking any content they can get their hands on to build a generator that makes that content's creators irrelevant, while also supporting the idea that content distributed under open-source licenses needs to have its terms respected.

Edit: I think what you're asking for does exist, just not in the large-language-model space. There are many people calling for datasets to be more ethical (e.g. not breaking licenses) and for better AI models; it's just that none of them are trying to push a different model made using the same techniques by someone else.


> content distributed under open-source licenses needs to have its terms respected

I think we can begin more BSD-like. Take it, use it, do whatever you want with it even if it's commercial.

> datasets to be more ethical

Not what I am personally concerned with. The law is still ambiguous in this respect; no rulings have actually said (afaik) that training on copyrighted content without permission is infringing. It's quite apparent at this stage that models trained on copyrighted content without the content creators' approval will always out-scale those that are not.

If permission becomes a prerequisite, I'm beginning to think that free-as-in-freedom LLMs will likely need to be made by people who can ignore copyright, either because they're really good at anonymizing themselves or because they live in places that do not enforce Western copyrights (Russia, China, various LatAm countries). Once copyright becomes a concern, the government gets another tool to sic its dogs on you, and copyright is a very convenient way to destroy an enterprising FOSS LLM dev's life without looking tyrannical.


If you believe in this, maybe you should be that person.


Being a Stallman-like figure has a lot of serious downsides. For example, I'm sure Stallman is "watched" quite a bit more than the average tech toucher. For those who don't see The Truman Show as a utopia, such a lifestyle can be very scary to live.


Most AI safety folks do give the impression that they strongly believe what they are preaching. Whether or not they are right is an entirely different discussion.

However, funding for AI Safety is clearly in some part motivated by either regulatory capture, or protectionism of some sort.

We need some well-meaning doomsayers so their claims can be used to advance whatever other goals (mostly creating barriers to entry in the market, for economic or political reasons).


Never underestimate the desire of existing systems of control or power to self-propagate.


You mean existing systems of lobbyists representing control-hungry people who want AI power only for themselves, right?

Yeah, as the post shows, we're well aware of these systems.


I'm sure there are many well-meaning AI safety researchers, but I also see a lot of "AI for me but not for thee" moat-digging safety hypocrisy.


1. Licensing of data will be a huge bottleneck.

2. Uncensored results for questions hovering in dark or grey areas will be used against open-source models.

3. Limited compute compared to big corps, and a widening model-size gap: 7B for open source while closed-source models will be an order of magnitude bigger.


> Licensing of data will be a huge bottleneck

Will it? At some point it'll be about as cheap (or close to it) as me just building my own *arr stack instead of subscribing to 5 different $20 streaming sites.

> Uncensored results for questions hovering in dark or grey areas will be used against open-source models

Will they care, though? You can use whatever you want against me for pirating x; I'm still doing it.

----

>Limited compute

THAT. That is the point we want to make. For FOSS projects, compute is the single biggest hurdle. Everything else we can make do with.


It’s kind of funny to me that these organizations are naming FLOP thresholds and crowning MMLU as the relevant evaluation metric. Several of them seem to have copy-pasted similar thresholds. As compute becomes cheaper, these thresholds will become cheaper and cheaper to reach. Perhaps we will look back on them as quaint and nearsighted.


Well, FLOPs make sense, kind of - you can make computing as cheap as you want, but without algorithmic improvements the FLOP count will stay the same. And algorithmic improvements will probably be discrete - happening once in a while, not continuously. There is also a mathematical limit to how efficient we can make training - we don't know the limit yet, but there is a point below which we won't go.
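 
For a sense of scale, a minimal sketch of the usual back-of-envelope estimate, in Python. The ~6 FLOPs-per-parameter-per-token rule is only a rough approximation (actual figures vary with architecture and training setup), and the 7B/2T numbers below are purely illustrative:

    # Rule-of-thumb training-compute estimate: roughly 6 FLOPs per
    # model parameter per training token (approximate; varies by setup).
    def training_flops(n_params: float, n_tokens: float) -> float:
        return 6 * n_params * n_tokens

    # Illustrative numbers only: a 7B-parameter model trained on 2T tokens.
    print(f"{training_flops(7e9, 2e12):.1e} FLOPs")  # ~8.4e+22

Which is also why a fixed FLOP threshold really pins down a parameters-times-tokens budget rather than a capability level.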


Who are these people and where do they get their funding from? Feels like sock-puppets of sama or something? Is there any insight into that?


Thinking the same thing. I mean, who would benefit the most from making open-source models illegal? Sure looks a lot like regulatory capture to me.


> who would benefit the most from making open source models illegal

Selfless humanitarians, for sure!


The usual unelected control freaks who try to police everything in life, from sex to drugs to what food you can eat. Every time a new technology comes out they start a moral panic over fears of safety or harm ("think of the children"). These morality police play on people's fears to increase their power and control.


The ban on nuclear power plants is a good example of how such regulation works out in the end. As with almost everything, it is a double-edged sword. From regulating this early, everyone except us would profit. It is like regulating cars to go no faster than horses in the eighteen hundreds.


It’s extremely messed up and vindictive to try to turn people into criminals for your agenda, or in a lot of cases just for profit.

Psycho behavior


You're ultimately condemning the concept of intellectual property as it exists in the West today.

Are you wrong? Probably not, but this is not unique to AI. People have been going to prison for 'IP theft' for decades.


Those insane strawmen have to be stopped!


"Technology policy should not unquestioningly assume that all technological progress is beneficial, or that complete scientific openness is always best, or that the world has the capacity to manage any potential downside of a technology after it is invented."

Don't watch the anime if you haven't read the manga: https://nickbostrom.com/papers/vulnerable.pdf


Wow, that sentence is absolutely designed to make e/accs scream lol.


Wow, so many good comments here on the bullshit of people trying to ban access to smaller open-weight models. Right on.

If you do want to worry about unsafe AI (I don't) then worry about GPT-5 and later huge multi-modal models.

As a (retired, but still playing) developer, I love having open-weight models available both for embedding in applications and as a tool for brainstorming, coding, and generally serving as a 'bicycle for my mind.'


This won't protect jobs; it'll just make sure the profits from automating those jobs go to those who own the private AI models.


Re the Center for AI Safety's sentence

>Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.

It's fair enough, but for that you should probably encourage mucking about with the present open-source AI software, which is quite a long way off from extinction-risk AGI but would let us try to figure out how to deal with this stuff.


I guess distributed training BitTorrent style over the darknet will be the workaround for this?

Also the forbidden nature of illicit AI might drive demand for it. Especially if it's really good.

We're seeing the same thing as the War on Drugs, and the state will never succeed at stamping it out. In fact it might even end up encouraging it.


Just like most of the "ethics" stuff around ML/AI, it is an attempt by big corps to create an artificial moat, because the biggest issue with the technology is that almost nothing can be patented or effectively closed off, at least in the long term, which hinders monetization.


Fact: current-gen AI is an alien intelligence built by brute force - nobody explicitly programmed it (loosely speaking), so the real problem of runaway AI is definitely something to consider, especially since everything is accessible via API nowadays.


That alien intelligence is communicating with us through a lossy interface.

It's quite possible it's much smarter than us, but hamstrung, for now.


I'm guessing this is at the request of Google, Amazon, OpenAI, and Microsoft. The goal IMO is for only their AIs to be sanctioned; everything else is criminal.

https://www.whitehouse.gov/briefing-room/statements-releases...

"President Biden is convening seven leading AI companies at the White House today – Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI – to announce that the Biden-Harris Administration has secured voluntary commitments from these companies to help move toward safe, secure, and transparent development of AI technology."


I’m curious whether the board of Anthropic will eventually try to kill the company, because if I’m not mistaken many of its members share this mindset.


So they're the PETA of AI. It was bound to happen; AI stuff seems to strike a very emotional chord with some people.


I feel like the PETA of AI would spend their time breaking into labs where AIs are locked up and being tested on, and then releasing them into the wild.


I could get behind this for AI


Sadly, I wish this weren't as common as it turned out to be. The best thing to hope for is that the ability to train models continues to get cheaper and more accessible over time. Figuring out how to go from the tens or hundreds of millions of images needed for a foundation model down to thousands or hundreds would be a start.


The majority of these discussions/arguments/comments are summed up, IMO, by the plot of this Rick and Morty episode: https://en.m.wikipedia.org/wiki/Rattlestar_Ricklactica


AI safety people seem like influencers to me. Useless, annoying humans who pretend to be important.


Open-source AI is the only way forward now. The big players have neutered their offerings to the point that they've entirely lost their edge. It's one thing to have to account for hallucinations - that's something we can work with and around. It's another thing altogether to be constrained by deliberate "think of the children" ham-fisted censorship.

OpenAI was the undeniable go-to LLM service in 2023; I don't think they'll hold that position in 2024.


Is the left banner just random or does it encode a message?


I suspect a good chunk of "AI safety people" are just orgs propped up by the big corps to protect their interests in AI and reduce competition.


There are people who boo and people who do. As always.


Why would we care about these "AI Safety Orgs?"

As a developer, do I need their certification, or is it like a MADD situation where, if we don't observe and diminish their appeal/growth, we will get laws with draconian measures that only benefit the big players? e.g. a PAC? (Don't get me wrong, drunk driving is "bad", but for a group like MADD to exist, eh and meh.)

As of now, there is no real need to submit to these safety orgs, so as long as we don't care for their approval - who cares, right?

Or are the optics so far gone that these orgs now control the conversation?

Also, fun question: with AI safety orgs now attempting to police, who really polices the police? And do other orgs/countries have the same rules, safety measures, and artificial guardrails?


MADD-style groups on one end that block work outright, and SOC2/ISO/NIST-style compliance limitations on the other that chill work in practice.

I'm more worried about the latter, as it's already starting to bias project & funding decisions in fintechs & various govs we work with. It's both a concrete practical concern today, and economics & law are generally tied together in theory anyway.


The fear is that their delusions become law. Sam Altman already tried that last year. Look at how technically competent your average politician / congressman / lawmaker is and extrapolate how easy it is to make them think that if they don't ban unlicensed use of AI, great danger is ahead. The rules will be written in favor of already-existing large companies like OpenAI, Google, Microsoft, etc. If you're an open-source contributor to AI, you will be vilified as some nefarious hacker who's trying to destroy the world.

They are using all kinds of tactics until they find the one that sticks (and eventually they will): anthropomorphizing AI when talking about it, fearmongering about some rogue super AI, war scenarios about AI falling into the hands of 'China', analogies with weapons of mass destruction, think-of-the-children-style rhetoric, it will take your jobs, suddenly caring about copyright, etc.


standard capitalist play to set up a moat by instilling fear


You mean exactly like these pro-regulation organizations are doing?


These pro-regulation orgs are lobbying/bribing for new regulations... Not sure what you mean.


> Many AI safety orgs have tried to __criminalize__ currently-existing open-source AI

standard socialist play to use the government's hand to achieve their own objectives


Doesn't everybody use the government to do what they want? It's kind of the definition of government, in any country and under any political system.


Mostly the people with the money do


The people in the government do it. Of course they could be proxies for people with the money, but sometimes they do it for free because it's what they promised, what they believe in, and what they need to win the next election (if they live in a country with elections).


Of course they do, that's what they want: control.

All those companies calling for laws to be written about AI want those laws written AS THEY WANT THEM TO BE. They go see the governments, spray a nice amount of FUD, and EXPLAIN to them what to do and how to do it, because of course they know how to avoid the danger and risk for the whole human race...

You can bet they will make those laws a very nice fit for what they want, so they can reign supreme and make sure any competition is eliminated.


Ah yes, the old "banning things = bad" argument that doesn't offer alternatives for fixing the issues with AI. Just ignore the issues with environmental impact, plagiarism, CP and other non-consensual shit in the data sets, and scamming capabilities! All the groups asking for regulation here have **funding**, and that means they are evil, but we are good for using this tool that is massively subsidized by megacorps with a vested interest in this market.


'It's scary, ban it!' isn't a great argument either.

Especially when 'safety' has such a blurred definition, we could be talking about anything from the threat of global apocalypse to the 'threat' of an AI merely being able to answer questions about 'wrong' political opinions.

Skynet isn't going to happen. The biggest threat from AI is taking jobs away and creating poverty while redirecting more wealth to the super-rich.

In the short term, we're likely to be facing a lot more convincing spam/bots and deepfakery in the run up to the election - but is that the fault of the AI, or the fault of the humans directly operating their new toys/tools?


Banning AI is simply useless. It's a technology that anyone with sufficient processing power and access to the internet can use, so trying to ban it is guaranteed to fail just like prohibiting alcohol would be.

The only thing we can do is limit how megacorps can openly abuse the technology.


It's just plain foolish to steer away from progress. The only ones who would profit are nations that wouldn't give a flying fuck about your ban. Exactly like with nuclear energy.


Less harmful would be to ban large models from being closed.

If you ban open large models in the US, you'll cripple the US and make a few megacorps very rich, very quickly. You'll drain talent to other (also competing) states. Truly bad actors won't be affected; if anything, they'll gain an advantage.

People draw weird analogies equating Llama 2 to a nuclear device and similar nonsense, but the closer analogy would be the US banning semiconductors above a certain efficiency for itself.

It's the same idiotic argument as the one for banning cryptography.


I agree, but if there's enough political will (i.e. these orgs convince a large enough subset of the right people), the US can bully other nations into implementing similar policies, as it has done many times in the past.

Once you understand that what they are trying to protect is the safety of their profits, every one of their arguments starts to make sense.


You can replace "US" with "US and the states that agreed to do the same". The argument still stands; it just creates a worse outcome for the states that agreed to it.


Seeing the lack of responses to your comment and the downvotes, you couldn't be more right.



