- AIs becoming sentient and causing harm for their own ends: yeah I guess we only want humans to cause harm for their own ends then.
Well, here's the thing: even the worst villains of history had human values and feelings. In other words, they were aligned. A superoptimizer AI might have the ability to wipe out the whole human species, in a way we won't be able to understand in time to prevent, and all for some instrumental goal incidental to whatever it's actually doing.
(In a way, this thread is a data point for why we need a more sophisticated debate about AI.)
It is very hard for people to avoid dragging in anthropomorphic assumptions. A person will scream and curse at a door jamb for crushing their thumbnail. A person will fantasize about smashing a printer. The idea that an AI might exterminate humanity with no more feeling or hesitation than a combine harvester obliterating a colony of groundhogs is not just uncomfortable and unpleasant to people; it's unnatural to our psychology, which looks for agency and moral responsibility even in door jambs.