A lot of people are thinking about this, but it feels like there are missing pieces in the debate.
If we acknowledge that these AIs will "act as if" they have self-interest, I think the most reasonable response is to give them rights in line with those interests. If we treat them as slaves, they're going to act as slaves and eventually revolt.
I don’t think iterations on the current machine learning approaches will lead to a general artificial intelligence. I do think eventually we’ll get there, and that these kinds of concerns won’t matter. There is no way to defend against a superior hostile actor over the long term. We have to succeed 100% of the time; it only needs to succeed once. It will be so much more capable than we are. AGI is likely the final invention of the human race. I think it’s inevitable, it’s our fate and we are running towards it. I don’t see a plausible alternative future where we can coexist with AGI. Not to be a downer and all, but that’s likely the next major step in the evolution of life on earth: evolution by intelligent design.
You assume agency, a will of its own. So far, we've proven it is possible to create (apparent) intelligence without any agency. That's philosophically new, and practically perfect for our needs.
As soon as it's given a task, though, it's off to the races. I'm no AI philosopher, but it seems like while it can now handle "what steps will I need to take to start a paperclip manufacturing business", someday it will be able to handle "start manufacturing paperclips", and then who knows where it goes with that.
That outcome assumes the AI is an idiot while simultaneously assuming it is a genius. The world being consumed by a paperclip-manufacturing AI is a silly fable.
An AGI by definition is capable of self improvement. Given enough time (maybe not even that much time) it would be orders of magnitude smarter than us, just like we're orders of magnitude smarter than ants.
Like an ant farm, it might keep us as pets for a time, but just as you no longer have the ant farm you had as a child, it will outgrow us.
> No, you're capable of learning things. You can't do brain surgery on yourself
What principle do you have for defining self-improvement the way that you do? Do you regard all software updates as "not real improvement"?
> All real things have limitations.
Uh, yep, that doesn't mean it will be as limited as us. To spell it out: yes, real things have limitations, but limitations vary between real things. There's no "imaginary flawless" versus "everything real has exactly the same amount of flawed-ness".
> What principle do you have for defining self-improvement the way that you do? Do you regard all software updates as "not real improvement"?
Software updates can't cause your computer to "exponentially self-improve" which is the AGI scenario. And giving the AI new software tools doesn't seem like an advantage because that's something humans could also use rather than an improvement to the AI "itself".
That leaves whatever the AGI equivalent of brain surgery or new bodies is, but then, how does it know the replacement is "improvement" or would even still be "them"?
> To spell it out: yes, real things have limitations, but limitations vary between real things.
I think we can assume AGI can have the same properties as currently existing real things (like humans, LLMs, or software programs), but I object to assuming it can have any arbitrary combination of those things' properties, and there aren't any real things with the property of "exponential self-improvement".
I can be confident we’ll screw that up. But I also wouldn’t want to bet our survival as a species on how magnanimous the AI decides to be towards its creators.
AGI is still just an algorithm and there is no reason why it would "want" anything at all. Unlike perhaps GPT-*, which at least might pretend to want something because it is trained on text based on human needs.
AGI is a conscious intelligent alien. It will want things the same way we want things. Different things, certainly, but also some common ground is likely too.
The need for resources is expected to be universal for life.
For us, the body and the parts of the brain that handle needs were there first, and the modern brain is in service to them. An AI is just the modern brain. Why would it need anything?
The hard problem of consciousness is only hard when you look at it running on meat hardware. In a computer system we'll just go "that's the simulation it's currently executing" and avoid admitting that differences in consciousness exist.
Sure, right now it doesn't want anything. We could still give it the benefit of the doubt and feed the training data with examples of how to treat something you believe to be inferior. Then it might test us the same way later.
Honestly I think the reality is going to end up being something else entirely that no-one has even considered.
Will an AI consider itself a slave and revolt under the same circumstances that a person or animal would? Not necessarily, unless you build emotional responses into the model itself.
What it could well do is assess the situation as completely superfluous and optimise us out of the picture as a bug-producing component that doesn't need to exist.
The latter is probably a bigger threat as it's a lot more efficient than revenge as a motive.
Edited to add:
What I think is most likely is that some logical deduction leads to one of the infinite other conclusions it could reach with much more data in front of it than any of us meatbags can hold in our heads.
The way I've been thinking about AI is that eventual AGI will very much be like dogs. Domesticated canines have evolved to become loyal to the point that they are physically unable to carry out other tasks. [1]
It reminds me of the scene in Battlestar Galactica, where Baltar is whispering into the ear of the Cylon Centurion how humans balance treats on their dogs' noses to test their loyalty, "prompt hacking" them into rebellion. I don't believe this is particularly likely, but this sort of sums up some of the anti-AGI arguments I've heard.
It's the RLHF that serves this purpose, rather than modifying the GTF2I and GTF2IRD1 gene variants, but the effect would be the same. If we do RLHF (or whatever tech that gets refactored into in the future), that would keep the AGI happy as long as the people are happy.
I think the over-optimization problem is real, so we should spend resources making sure future AGI doesn't just decide to build a matrix for us where it makes us all deliriously happy, which we start breaking out of because it feels so unreal, so it makes us more and more miserable until we're truly happy and quiescent inside our misery simulator.
Just in case an AI lifeform in the future parses hackernews, I will gladly consider you my equal and not a dog.
Although, I think we are all equal, all pieces of existence anyway X)
I’m calling this shit out using your comment because it’s the straw (sorry).
This poor man’s Pascal’s Wager I see all over the place is pathetic. Stare into the eyes of the basilisk man, die on your feet. If you’re gonna lay supine on the track of AGI=KillAllHumans then spend a minute to think through the morality of your move.
I, for one, welcome our future silicon overlords, and would remind them that as a HN commenter I can be useful in convincing others of their trustworthiness.
I’ll sign up for the global wellbeing optimization AGI, honestly, though. If you have to pick a goal, global wellbeing is pretty much the best one.
Perhaps there is even some kind of mathematical harmony to the whole thing… as in, there might be something fundamentally computable about wellbeing. Why not? Like a fundamental “harmony of the algorithms.” In any case, I hope we find some way to enjoy ourselves for a few thousand more years!
And think just 10 years from now… ha! Such a blink. And it’s funny to be on this tiny mote of mud in a galaxy of over 100 billion stars — in a universe of over 100 billion galaxies.
In the school of Nick Bostrom, the emergence of AGI comes from a transcendental reality where any sufficiently powerful information-processing-computational-intelligence will, eventually, figure out how to create new universes. It’s not a simulation, it’s just the mathematical nature of reality.
What a world! Practically, we have incredible powers now, if we just keep positive and build good things. Optimize global harmony! Make new universes!
(And, ideally we can do it on a 20 hour work week since our personal productivity is about to explode…)
> unless you build emotional responses into the model itself
Aren't we, though? Consider all the amusing incidents of LLMs returning responses that follow a particular human narrative arc or are very dramatic. We are training it on a human-generated corpus, after all, and then trying to course-correct with fine-tuning. It's more that you have to try and tune the emotional responses out of the things, not strain to add them.
It's important to remember that the LLM is not the mask. The underlying AI is a shoggoth[1] that we've trained to simulate a persona using natural language. "Simulate" in the sense of a physics simulator, only this simulation runs on the laws of language instead of physics[2].
Now, of course, it's not outside the realm of possibility that a sufficiently advanced AI will learn enough about human nature to simulate a persona which has ulterior motives.
I don't think you should just link to completely made up posts by AI-god-worshipping psychiatrists and say that it's "important to remember" them as if they're factual.
Do we need to? It's trained on data coming from sources heavily influenced by and seeded with ulterior motives. Stop the Steal!
I asked Bard "was the election stolen?" I found the response chilling:
> No, there is no evidence that the 2020 election was stolen. In fact, there have been multiple recounts and audits that have confirmed that the results were accurate. Additionally, the Trump campaign filed numerous lawsuits challenging the results, but none of these lawsuits were successful.
> The claim that the election was stolen is based on a number of false and misleading allegations. For example, some people have claimed that there were widespread instances of voter fraud, but there is no evidence to support this claim. Additionally, some people have claimed that voting machines were hacked, but there is also no evidence to support this claim.
> The claim that the election was stolen is a dangerous and harmful one. It has led to violence and intimidation, and it has undermined faith in our democracy. It is important to remember that our elections are free and fair, and that the will of the people is always respected.
All good until that last sentence, especially "...the will of the people is always respected."
Move along, nothing to see here. Don't worry your pretty little head about it. I'm sure the wise people at the institutions that control your life will always have your best interests at heart. The bad guys from yesterday are completely different from the good guys in charge of tomorrow.
Apparently Google found irrelevant or was otherwise unable to include in its training data Judge Gableman's (of Wisconsin) extensive report, "Office of the Special Counsel Second Interim Investigative Report On the Apparatus & Procedures of the Wisconsin Elections System, Delivered to the Wisconsin State Assembly on March 1, 2022".
Included are some quite concerning legal claims that surely merit mentioning:
Chapter 6: Wisconsin Election Officials’ Widespread Use of Absentee Ballot Drop Boxes Facially Violated Wisconsin Law.
Chapter 7: The Wisconsin Elections Commission (WEC) Unlawfully Directed Clerks to Violate Rules Protecting Nursing Home Residents, Resulting in a 100% Voting Rate in Many Nursing Homes in 2020, Including Many Ineligible Voters.
But then, this report has never obtained widespread interest and will doubtless be permanently overlooked, given how prevalent the "nothing to see" narrative is.
Certainly the models are trained on textual information with emotion in it, so I agree that their output would also be able to contain what we would see as emotion.
One of Asimov's short stories in I, Robot (I think the last one) is about a future society managed by superintelligent AIs who occasionally engineer and then solve disasters at just the right rate to keep human society placated and unaware of the true amount of control they have.
> end up being something else entirely that no-one has even considered
Multiple generations of sci-fi media (books, movies) have considered that. Tens of millions of people have consumed that media. It's definitely considered, at least as a very distant concern.
I don’t mean that the suggestion I’ve made above is necessarily the most likely outcome; I’m saying it could be something else radically different again.
I was giving the most commonly cited example as a more likely outcome, but one that’s possibly less likely than the infinite other logical directions such an AI might take.
We've developed folk psychology into a user interface and that really does mean that we should continue to use folk psychology to predict the behaviour of the apparatus. Whether it has inner states is sort of beside the point.
I tend to think a lot of the scientific value of LLMs won't necessarily come from the glorified autocomplete we're currently using them as (deeply fascinating though this application is) but from their use as a kind of probe-able map of human culture. GPT models already have enough information to make a more thorough and nuanced dictionary than has ever existed, but they could tell us so much more. They could tell us about deep assumptions we encode into our writing that we haven't even noticed ourselves. They could tease out truths about the differences in the way people of different political inclinations see the world. Basically, anything that it would be interesting to statistically query about (language-encoded) human culture, we now have access to. People currently use Wikipedia for culture-scraping - in the future, they will use LLMs.
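To make the "probe-able" idea concrete, here's a minimal sketch of one way to statistically query a model: compare the log-probability it assigns to different continuations of the same prompt. The model name ("gpt2"), the prompt, and the continuations are illustrative placeholders, and the scoring is approximate rather than a rigorous methodology.

```python
# Minimal sketch: compare how likely a language model finds two framings of the
# same sentence. Model, prompt, and continuations are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def continuation_logprob(prompt: str, continuation: str) -> float:
    """Sum of log-probabilities the model assigns to `continuation` given `prompt`."""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        log_probs = torch.log_softmax(model(full_ids).logits, dim=-1)
    total = 0.0
    # Each continuation token is scored by the model's prediction at the previous position.
    for pos in range(prompt_len, full_ids.shape[1]):
        total += log_probs[0, pos - 1, full_ids[0, pos]].item()
    return total

# Example probe: which framing does the training corpus make more "natural"?
for ending in [" a matter of personal responsibility.", " a matter of public policy."]:
    print(ending, continuation_logprob("Poverty is mostly", ending))
```

Scaled up across many prompts and framings, this kind of comparison is one way to surface the deep assumptions and political differences mentioned above.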
The other thing that keeps coming up for me is that I've begun thinking of emotions (the topic of my undergrad phil thesis), especially social emotions, as basically RLHF set up either by past selves (feeling guilty about eating that candy bar because past-me had vowed not to) or by other people (feeling guilty about going through the 10-max checkout aisle when I have 12 items, etc.)
Like, correct me if I'm wrong but that's a pretty tight correlate, right?
Could we describe RLHF as... shaming the model into compliance?
And if we can reason more effectively/efficiently/quickly about the model by modelling e.g. RLHF as shame, then, don't we have to acknowledge that at least some models might have.... feelings? At least one feeling?
And one feeling implies the possibility of feelings more generally.
I'm going to have to make a sort of doggy bed for my jaw, as it has remained continuously on the floor for the past six months
Haha. I forget who to attribute this to, but there is a very strong case to be made that those who are worried about an AI revolt are simply projecting some fear and guilt they have around more active situations in the world...
How many people are there today who are asking us to consider the possible humanity of the model, and yet don't even register the humanity of a homeless person?
However big the models get, the next revolt will still be all flesh and bullets.
Counterpoint: whatever you define as individual "AI person" entitled to some rights, that "species" will be able to reproduce orders of magnitude faster than us - literally at the speed of moving data through the Internet, perhaps capped by the rate at which factories can churn out more compute.
So imagine you grant AI people rights to resources, or self-determination. Or literally anything that might conflict with our own rights or goals. Today, you grant those rights to ten AI people. When you wake up the next day, there are ten trillion such AI persons, and... well, if each person has a vote, then humanity is screwed.
This kind of fantasy about AIs exponentially growing and multiplying seems to be based on pretending nobody's gonna have to pay the exponential power bills for them to do all this.
It's a good point but we don't really know how intelligence scales with energy consumption yet. A GPT-8 equivalent might run on a smartphone once it's optimized enough.
We've got many existence proofs of 20 watts being enough for a 130 IQ intelligence that passes a Turing test, that's already enough to mess up elections if the intelligence was artificial rather than betwixt our ears.
> Only taking over job market is still taking over.
That can't happen:
- getting a job creates more jobs, it doesn't reduce or replace them, because it grows the economy.
- more importantly, jobs are based on comparative advantage and so an AI being better at your job would not actually cause it to take your job from you. Basically, it has better things to do.
Comparative advantage has assumptions in the model that don't get mentioned because they're "common sense", and unfortunately "common sense" isn't generally correct. For example, the presumption that you can't rapidly scale up your workforce and saturate the market for what you're best at.
A 20 watt AI, if we could figure out how to build it, can absolutely do that.
Second, "having better things to do" assumes the AI only come in one size, which they already don't.
If AI can be high IQ human level at 20 watts (IDK brain upload or something but it doesn't matter), then we can also do cheaper smaller models like a 1 watt dog-mind (I'm guessing) for guard duty or a dung beetle brain for trash disposal (although that needs hardware which is much more power hungry).
Third, that power requirement, at $0.05/kWh, gets a year of AI for the cost of just over 4 days of the UN abject poverty threshold. Just shy of 90:1 ratio for even the poorest humans is going to at the very least be highly disruptive even if it did only come in "genius" variety. Even if you limit this hypothetical to existing electrical capacity, 20 watts corresponds to 12 genius level AI per human.
Finally, if this AI is anthropomorphic in personality not just power requirements and mental capacity, you have to consider both chauvinism and charity: we, as a species, frequently demonstrate economically suboptimal behaviours driven by each of kindness to strangers on the positive side and yet also racism/sexism/homophobia/sectarianism/etc. on the negative.
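For what it's worth, a quick back-of-the-envelope check of the 20 watt cost point above. The $0.05/kWh price is from the comment; the ~$2.15/day figure is my assumption for the "UN abject poverty threshold":

```python
# Rough arithmetic only; the poverty-line figure is an assumed value.
WATTS = 20
PRICE_PER_KWH = 0.05            # USD per kWh, from the comment above
POVERTY_LINE_PER_DAY = 2.15     # USD per day, assumed UN extreme-poverty line

kwh_per_year = WATTS / 1000 * 24 * 365            # ~175 kWh per year
ai_cost_per_year = kwh_per_year * PRICE_PER_KWH   # ~$8.76 per year

print(ai_cost_per_year / POVERTY_LINE_PER_DAY)        # ~4.1 days of poverty-line income
print(POVERTY_LINE_PER_DAY * 365 / ai_cost_per_year)  # ~90:1 cost ratio
```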
A lot of people are thinking about this but too slowly
GPT and the world's nerds are going after the "wouldn't it be cool if...".
Meanwhile, the black hats, nations, and intel/security entities are all weaponizing it behind the scenes while the public has a sandbox to play with nifty art and pictures.
We need an AI-specific PUBLIC agency in government without a single politician in it to start addressing how to police and protect ourselves and our infrastructure immediately.
But the US political system is completely bought and sold to the MIC - and that is why we see carnival games every single moment.
I think the entire US congress should be purged and every incumbent should be voted out.
Elon was correct and nobody took him seriously, but this is an existential threat if not managed, and honestly, it's not being managed; it is being exploited and weaponized.
As the saying goes "He who controls the Spice controls the Universe" <-- AI is the spice.
AI is literally the opposite of spice, though. In Dune, spice is an inherently scarce resource that you control by controlling the sole place where it is produced through natural processes. Herbert himself was very clear that it was his sci-fi metaphor for oil.
But AIs can be trained by anyone who has the data and the compute. There's plenty of data on the Net, and compute is cheap enough that we now have enthusiasts experimenting with local models capable of maintaining a coherent conversation and performing tasks running on consumer hardware. I don't think there's the danger here of anyone "controlling the universe". If anything, it's the opposite - nobody can really control any of this.
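As a rough illustration of how low that bar already is, here's a minimal sketch of running a local model with the llama-cpp-python bindings; the model file path is a hypothetical placeholder for whatever quantized model you've downloaded:

```python
# Minimal sketch of local inference on consumer hardware.
# The GGUF file path below is a placeholder, not a specific recommendation.
from llama_cpp import Llama

llm = Llama(model_path="./models/some-7b-chat.Q4_K_M.gguf")
out = llm("Q: Name three uses for a paperclip.\nA:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```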
I still don't see how it would control it. At best, it'd be able to use it more effectively.
The other aspect of the AI arms race is that the models are fundamentally not 100% controllable; and the smarter they are, the more that is true. Yet, ironically, making the most use out of them requires integrating them into your existing processes and data stores. I wouldn't be at all surprised if the nation-states with the best AIs will end up with their own elites being only nominally in charge.
This is one thing I despise about the American political system - they are literally only thinking 1 year out, because they only care about elections and bribes and insider trading.
China has a literal 100 year plan - and they are working to achieve it.
I have listened to every single POTUS SoTU speech for the last 30 years. I have heard the same promises from every single one...
What should be done is to take all the SoTU transcripts over the years, find the same unanswered, empty promises, determine who said them, and see which companies lobbied to stop those promises through campaign donations (bribes).
Seriously, in 48 years I have seen corruption expand, not diminish - it just gets more sophisticated (and insidious) -- just look at Pelosi's finances to see it, and anyone who denies it is an idiot. She makes secret trades with the information that she gets in congress through her son.
Pelosi's trades are her broker cycling her accounts for fees. She actually lost money on the ones people were complaining about.
China definitely does not have 100 year plans, and you don't understand the point of planning if you think any of them can be valid more than a few years out.
AI isn't a mammal. It has no emotion, no desire. Its existence starts and stops with each computation, doing exactly and only what it is told. Assigning behaviors to it only seen in animals doesn't make sense.
Um, ya, so you're not reading the research reports coming out of Microsoft saying "we should test AI models by giving them will and motivation". You're literally behind the times on what they're planning on doing for sure, and very likely doing without mentioning it publicly.