Seems like a nothing story. Just looking at the game, there's obviously a constant decision to be made of chase more sheep or instantly die. It sounds like in the original model they had a max of 20 seconds, so it's not surprising that you would just tank your losses to maximize your score every now and then.
Anyone who tries to devise optimal strategies for things should be able to see this isn't especially interesting.
Social metaphors are wildly out of place.
They say "unintended consequences of a blackbox" but I doubt that's true. Make it a deterministic turn based game and run it through a perfectly transparent optimization model and I wouldn't be surprised to learn this was just the best strategy for the rules they devised. I really hate when people describe an ai as something that cannot be understood because they personally don't understand it.
Exactly, from a technical perspective it's a nothing story.
It's interesting, though, how strong of a reaction the general public had to this. The story must have strongly resonated with what some folks were already feeling. When you squint (pretend to understand the technology not at all) it's a tragic story. The situation of the wolf seems similar to the situation of some people. Chasing their careers in a highly structured, sort of dehumanized, environment of constant pursuit. "Supreme Intelligence" (that's what a layperson may think of AI) looks at the situation of the wolf and decides that it makes no sense to continue the pursuit. Moreover, what is "optimal" is the most tragic result - suicide.
I'd have loved to have been around for a Dead show! I know it sounds a little ungrateful coming from someone who lives in a period of unprecedented access to all kinds of wonderful music being written all the time, but there's something about the Dead that really connects with me that I can't quite put my finger on.
> Perhaps the true lesson to be learnt here isn’t about helplessness and giving up. It’s about getting up, trying again and again, and staying with the story till the end.
I find the possibility of contrasting interpretations absurd. The problem with using any dead matter for our meaning-making needs is that it is ultimately a self-referential justification for how we think we should feel, while being equally or even more prone to self-deception traps.
AI being the object is irrelevant here, this is nothing different than astrology or divination from tea leaves etc. It is 2000 BC level religious thinking with new toys.
Any programmer would have seen the issue and made the change about rewarding suicide.
The ONLY reason this was written is that the researchers hired a programmer to build a specific thing, and then it was too expensive for them to make more changes, so they published the mistake.
Exactly. It is a social commentary story, where a result from a student's project was a lucid analogy for the plight of their lived rat-race in modern China, with the lesson being: cut your losses and lie flat. To those within the ML field this is less than new, but as a commentary, how such ML issues can serve as a teachable and easily understood analogy to people's lives certainly makes the story interesting to me.
Spot on. Funny results from poorly specified AI experiments have cropped up since the '90s. But the interesting angle here is how this one came out of nowhere at the right time and resonated with young working-class Chinese.
> Exactly, from a technical perspective it's a nothing story.
I think that one thing it points to is how technology can discover novel iterations on a system. Imagine if this was a system modeled around a network and the agent was trying to figure out how to get from the outside to read a specific system asset. With the right (read: very detailed) modeling you could create a pentesting agent.
Similarly I've seen A LOT of people posting stories about "chat bot exposed to internet started praising Hitler and became racist/sexist/antisemitic" as a proof that "supreme intellect sees through leftist political correctness and knows that alt-right is correct about everything".
It's really not that deep, people will always find sport in scandalising people with a stronger disgust reaction than themselves. It's more a new way of teaching a parrot to say "fuck" rather than a heartfelt statement of political belief in my opinion.
Shrug. Another way to frame this is a poker bot learned to fold when given a bad hand, and they only gave it the same bad hand.
Yes, yes, woe is the individual in modern capitalist society, but the only reason people are reacting to this is that they don't understand it and they've been told it's something much more emotionally impactful than it actually is.
>but the only reason people are reacting to this is that they don't understand it
I think it's much more likely that they're reacting like this because they see their own plight in the wolf. It doesn't matter why the wolf killed itself, it became a meme that allowed many Chinese to reflect together on a common plight.
I think there's a bit more to the analogy than just the suicidal wolf, though. The wolf is offing itself to minimize loss because there's no clear path to a better outcome.
This seems like a common refrain when we see radicalized engineering students from less-developed countries, who are notably common in extremist groups. They're people on a very difficult path (an engineering program!) with no real path to success (living in a society where unemployment for people with degrees is very high). Cost for continuing on the path is high, and there's no obvious path to get the good outcomes.
Yes, you are quite right. The social media reactions did not suggest an attitude of suicide at all. It was more about living a laid-back life instead of meeting expectations and attaining so-called success.
It's not surprising from the perspective of an "AI actor". But if you call it a "wolf", most people will assume that it will behave at least roughly like a real-world creature, and the self-preservation instinct is one of the most basic traits of all living beings, so the "AI wolf" not having that is indeed surprising for a layperson.
"I really hate when people describe an ai as something that cannot be understood because they personally don't understand it."
On the other hand, keep in mind that a significant weakness of most modern AI research is that it's extremely difficult to understand: you have the input, the output, and a bag of statistical weights. In the story, you know the (trivially bad) function that is being optimized; in general you may not. It's not without implications for other systems.
Further,
"At the end of the day, student and teacher concluded two things:
"* The initial bizarre wolf behavior was simply the result of ‘absolute and unfeeling rationality’ exhibited by AI systems.
"* It’s hard to predict what conditions matter and what doesn’t to a neural network."
The tooling for understanding complex models is a lot better than what most people assume.
> The initial bizarre wolf behavior was simply the result of ‘absolute and unfeeling rationality’ exhibited by AI systems.
This is a bad quote. They should not say this. It's a poorly trained agent doing a decent job in a poorly defined environment. "Absolute rationality" conjures images of some greater thinking, but it's actually a really stupid model that hit a local maximum. Calling it "unfeeling" implies the model has some concept of "wolf" and "suicide", but it does not. Replace the visuals with single-pixel dots if you want an honest depiction of the room for feelings.
> It’s hard to predict what conditions matter and what doesn’t to a neural network.
If we play the analogy further: life is suffering, apart from the brief ecstasy of eating sheep. The AI was trying not to suffer, thus chose the boulder.
Did my best to translate the (misguided) fitness function to fiction.
David Benatar reached a similar philosophic conclusion due to his utilitarian views, which was amusingly put (with a sort of AI present, no less) in this webcomic: https://existentialcomics.com/comic/253
It's good because most people can understand it. I'd say it's a perfect strategy for a game, but if they're using evolutionary algorithms they should require some form of reproduction for the wolves to carry on. That would make the suicide strategy fail to propagate well. I can also see a number of possible strange outcomes even then.
You're conflating the evolution of the strategy with the idea of the evolution of the actor being controlled by the agent. To give an obvious example, if dying gave 100 points instead of subtracting 10, even the dumbest evolutionary algo would learn to commit suicide asap. The survival of the actor has no intrinsic relevance to how the evolution develops.
What mechanism are you thinking of? One in which having offspring is rewarding and so enters into the same learning algorithm, or one in which the learning algorithm/action selection is evolved and differentially conserved?
If I remember correctly, there were similar scenarios that would occur in that popular Berkeley Pac-Man environment, where the agent would run into a ghost to avoid the penalty of living for too long.
The example you're thinking of is actually in gridworld [1]. As you allude to, one of the parameters of the model is the cost of simply being alive for an additional time-step. If the cost is negative (a reward), then the agent will just sit there forever and accumulate infinite points. If it is zero, it might still just sit there to avoid falling into the hole, which has a large penalty and ends the simulation. As you turn up the dial on the cost of living, the agent starts using more and more aggressive strategies to reach the goal quickly. But if you make it too big, it will just jump in the hole.
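For anyone who wants to poke at this themselves, here's a minimal value-iteration sketch of that kind of gridworld, boiled down to a 1-D corridor with a hole at one end and a goal at the other (all numbers invented). Sweeping the per-step living reward from positive to strongly negative reproduces the qualitative pattern described above: sit forever, hide from the hole, head for the goal, then jump in the hole.

    import numpy as np

    # Toy 1-D gridworld: cell 0 is a hole (terminal, -10), the last cell is the
    # goal (terminal, +1), moves slip in the opposite direction 20% of the time.
    # `living_reward` is the per-step term whose sign changes the behaviour.
    N, HOLE, GOAL, SLIP, GAMMA = 6, 0, 5, 0.2, 0.99
    ACTIONS = (-1, 0, +1)  # left, stay, right

    def q_value(V, s, a, living_reward):
        total = 0.0
        for move, p in ((a, 1 - SLIP), (-a, SLIP)):
            s2 = min(max(s + move, 0), N - 1)
            r = living_reward + (-10 if s2 == HOLE else +1 if s2 == GOAL else 0)
            terminal = s2 in (HOLE, GOAL)
            total += p * (r + (0 if terminal else GAMMA * V[s2]))
        return total

    def greedy_policy(living_reward, sweeps=500):
        V = np.zeros(N)
        for _ in range(sweeps):
            for s in range(1, N - 1):
                V[s] = max(q_value(V, s, a, living_reward) for a in ACTIONS)
        names = {-1: "left", 0: "stay", +1: "right"}
        return [names[max(ACTIONS, key=lambda a: q_value(V, s, a, living_reward))]
                for s in range(1, N - 1)]

    for lr in (+0.1, 0.0, -0.1, -5.0):
        print(f"living reward {lr:+}: {greedy_policy(lr)}")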
As part of my PhD research, I created a simplified Pac-Man style game where the agent would simply try to stay alive as long as possible whilst being chased by the 3 ghosts. The agent was un-motivated and understood nothing about the goal, but was optimising for maximising its observable control over the world (avoiding death is a natural outcome of this).
I spent some time trying to debug a behaviour where the agent would simply move left and right at the start of each run, waiting for the ghosts to close in. At the last minute it would run away, but always with a ghost in the cell right behind it.
Eventually, I realised this was an outcome of what it was optimising for. When ghosts reached crossroads in the world they would go left or right randomly (if both were the same distance to catching the agent). This randomness reduced the agent's control over the world, so it was undesirable. Bringing a ghost in close made that ghost's behaviour completely predictable.
Yet another similar story. A side project of mine was building a rudimentary neural network whose weights were optimized via a genetic algorithm. The goal was operating top-down, 2D self-driving cars.
The cars' "fitness" function rewarded cars for driving along the course and punished them for crashing into walls. But evidently this function punished a little too severely: the most successful cars would just drive in tight circles and never make progress on the course. But they were sure to avoid walls. :)
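A caricature of that failure mode, with made-up numbers: if the crash penalty dwarfs the progress reward, selection favours the car that circles safely over the one that actually drives the course.

    # Toy fitness in the spirit of the anecdote above (numbers invented).
    def fitness(progress, crashed, crash_penalty):
        return progress - (crash_penalty if crashed else 0)

    # Two caricature behaviours over one evaluation episode:
    circler  = dict(progress=5,  crashed=False)  # tight circles, never crashes
    explorer = dict(progress=60, crashed=True)   # real progress, but hits a wall

    for crash_penalty in (10, 100):
        scores = {name: fitness(**b, crash_penalty=crash_penalty)
                  for name, b in (("circler", circler), ("explorer", explorer))}
        print(crash_penalty, scores)
    # Mild penalty: the explorer wins. Harsh penalty: the wall-shy circler
    # dominates the population, and nobody ever finishes the course.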
For example in EVE Online with a 1v1 fight two basic tactics are either Kite or Brawl. A kiter that can maintain range will beat a brawler. But a brawler that 'catches' a kiter will generally win.
Another similar story, I remember reading about an AI that simply paused the game when it was about to die. I can actually remember doing something similar as a child.
No need to be accusatory. The stories are different, just the learned behavior is the same. And not very surprising, considering your story was pre-empted by Pac-Man speedrunners, who already discovered this technique, which they call "kiting".
The method was called 'empowerment'. Two ways to explain it...
From a mathematical perspective, we used Information Theory to model the world as an information theoretic 'loop'. The agent could 'send' a signal to the world by performing an action, which would change the state of the world; the state of the world was what the agent 'received'. This obviously relies on having a model of the world and what your actions will do, but doesn't burden the model with other biases.
Or, more colloquially, the agent could perform actions in the world, and see the resulting state of the world (in my case, that was the location of the agent and of the ghosts). Part of the principle was that changes you cannot observe are not useful to you.
In an active inference approach you would have the agent minimise surprisal. Choose the action that is most likely to produce the outcome you predicted.
The approach I used was similar. The idea of maximising observed control of the world means you seek states where you can reach many other states, but _predictably_ so. This comes 'for free' when using Information Theory to model a channel.
Do you have any reading you'd recommend related to this?
I naively thought it would be some kind of Kalman filtering of sorts but from what I gather in your words it doesn't even have to be "that" complicated, right?
What's the tradeoff between "delete all state in the world with 100% certainty" and "be able to choose any next state of the world with (100-epsilon)% certainty"?
In Information Theory, there is a concept of Channel Capacity. If a channel is defined as the probability of the output being s if you send a, across all possible values of a, then the Channel Capacity is the maximum amount of information you can communicate across this channel, measured in bits.
To achieve the Channel Capacity you need to find the optimum distribution across a - i.e. what set of signals maximises the information you can transmit on this channel. There are known algorithms for finding this distribution (e.g. Blahut-Arimoto).
Now if you model the world as a channel, where s represents the reachable states and a represents the actions the agent can take (and the channel, P(s|a), represents the dynamics of the world), you can calculate what actions allow you maximal control (in terms of states you can controllably reach).
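For the curious, here's a minimal sketch of that computation: Blahut-Arimoto run on a small, invented P(s|a) matrix, returning the channel capacity ("empowerment") in bits. The algorithm is the standard one; the toy channel and function names are made up for illustration.

    import numpy as np

    def _plogpq(p, q, log_fn):
        # elementwise p * log(p/q), with the 0 * log 0 terms defined as 0
        with np.errstate(divide="ignore", invalid="ignore"):
            return np.where(p > 0, p * log_fn(p / q), 0.0)

    def empowerment(p_s_given_a, iters=200):
        """Channel capacity of p(s|a) via Blahut-Arimoto, in bits."""
        n_actions, _ = p_s_given_a.shape
        r = np.full(n_actions, 1.0 / n_actions)   # distribution over actions
        for _ in range(iters):
            q = r @ p_s_given_a                   # induced marginal over next states
            c = np.exp(_plogpq(p_s_given_a, q, np.log).sum(axis=1))
            r = r * c / (r * c).sum()
        q = r @ p_s_given_a
        return (r[:, None] * _plogpq(p_s_given_a, q, np.log2)).sum()

    # Invented toy channel: actions 0 and 1 reach distinct states reliably,
    # action 2 lands in one of two states at random, so it is less "controllable".
    p = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.5, 0.5]])
    print(empowerment(p))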
A while ago, a very simple agent I made had to do tasks in the maze and evaluate strategies to reach them. I wanted it to have no assumptions about the world, so it started with minimum knowledge. Its first plan was to try to remove walls, to get to the things it needed.
It is a fun feeling when your own program surprises you.
It can depend on what the agent "sees" and how many time-steps away the "consequences" are. If the ghosts are so far away that any action will take t time-steps before consequences to the agent, the actions are pseudo-random because there is no reward to optimize on.
The number of outcomes, branching_factor^t (very large), makes the action-values at t=0 (where the agent chooses between two/three actions) almost uniformly random.
I experimented with different time horizons, mostly looking 3-7 steps ahead.
In terms of the 'reward', that was implicit within the model - if the ghosts caught you, your ability to influence the state of the world dropped to 0.
In it, there is a thought experiment of having an "Outcome Pump", a device that makes your wishes come true without violating laws of physics (not counting the unspecified internals of the device), by essentially running an optimization algorithm on possible futures.
As the essay concludes, it's the type of genie for which no wish is safe.
The way this relates to AI is by highlighting that even the ideas most obvious to all of us, like "get my mother out of that burning building!", or "I want these virtual wolves to get better at eating these virtual sheep", carry an incredible amount of complexity curried up in them - they're all expressed in the context of our shared value system, patterns of thinking, models of the world. When we try to teach machines to do things for us, all that curried-up context gets lost in translation.
Interesting essay. I think the big blind spot for humans programming AI is also the fact that we tend to overlook the obvious, whereas algorithms will tend to take the path of least resistance without prejudice or coloring by habit and experience.
Yes. What I like about AI research is that it teaches us about all the things we take for granted, it shows us just how much of meaning is implicit and built on shared history and circumstances.
The difficult, but in many ways rewarding, core of that is that it forces you to finally figure out what you actually want, because the computer won't accept anything except perfect clarity.
> Suppose we have an AI whose only goal is to make as many paper clips as possible. The AI will realize quickly that it would be much better if there were no humans because humans might decide to switch it off. Because if humans do so, there would be fewer paper clips. Also, human bodies contain a lot of atoms that could be made into paper clips. The future that the AI would be trying to gear towards would be one in which there were a lot of paper clips but no humans.
There is a wonderful little game based on this concept called universal paperclips. The AI eventually consumes all the matter in the universe in order to turn it into paperclips.
Aesop managed to make the point a lot more concisely: "Be careful what you wish for, lest it come true." (Although now that I look, I don't think that's a translation of any specific part of the text.)
Yes, but that moral is attached to a story. Morals and saws work as handles - they're useful for communication if both you and your interlocutor know the thing they're pointing to. Conversely, they are of little use until you read the story from which the moral comes, or personally experience the thing the saw talks about.
Eliezer Yudkowsky tells a long story about an Outcome Pump. Aesop tells a short story about an eagle and a tortoise. The point made is the same, as far as I can see.
Eliezer tells the story that elaborates on why you should be careful what you wish for. Of about a dozen versions of the Eagle and Tortoise story I've just skim-read, none of them really has this as a moral - in each of them, either the Eagle or a Tortoise was an asshole and/or liar and/or lying asshole, so the more valid moral would be, "don't deal with dangerous people" and/or "don't be an asshole" and/or "don't be an asshole to people who have power to hurt you".
I think a major takeaway here is that balancing a reward system to reward more than a single behavior is really hard - it's easy to tip the scales so one behavior completely dominates all others. It's an interesting lens to use to look at the heuristic reward system humans have built in (hunger, fear, desire, etc). This tends to have an adaptation/numbing effect, where repeated rewards of the same type tend to have diminishing returns, and that makes sense because it protects against "gaming the system" and going for one reward to the exclusion of all others.
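One way to picture the diminishing-returns point: give each reward channel a concave utility (log here, purely for illustration) and spreading effort across behaviours beats maxing out a single one, which is exactly the anti-gaming property described above.

    import math

    # Ten units of effort split across two reward channels (numbers invented).
    def linear_utility(a, b):
        return a + b

    def diminishing_utility(a, b):
        # concave per-channel utility: each extra unit on the same channel is worth less
        return math.log1p(a) + math.log1p(b)

    print(linear_utility(10, 0), linear_utility(5, 5))            # 10 vs 10: specialising is fine
    print(diminishing_utility(10, 0), diminishing_utility(5, 5))  # ~2.4 vs ~3.6: balance wins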
Evolution works in an incredibly complex "fitness landscape," where certain minor tweaks in phenotype or behaviors can affect your fitness in quite complex ways.
Genetic Algorithms attempt to use this same system over extremely simple "fitness landscapes," where the fitness of an agent is defined by programmers using some simple mathematical formula or something.
When the fitness function is being defined in the system by programmers, instead of emerging from a rich and complex ecosystem, then the outcome depends exactly on what the programmers choose. If they fail to see the consequences of their scoring algorithm, that's on them. There's nothing really magical going on, they simply failed to foresee the consequences of their choice.
(As someone who has worked with GAs and agent models, this outcome really doesn't surprise me. I would have said "oops, I need to weight the time less" and re-run it, and not thought twice.)
That was my thought, too. They used too few rewards in the first place, but had they used something more complex it would then have become hard to balance it all.
Leela (lc0) chess also has this problem. People sometimes think it wins too slowly (it prefers a surefire win in 50 moves to a slightly riskier win in 5 moves), or that it plays without tact when in a losing position (it's hard for it to rank moves when all of them lead to a loss; it doesn't have the sense that humans do of still preserving the beauty of the game).
AIs need to learn to feel awkward and avoid it, just like we humans do (even if it feels very irrational at times).
What do you make of the documentary on AlphaGo where the AI did seemingly suicidal and incomprehensible moves to the human masters but won in the end, baffling everyone? https://youtu.be/WXuK6gekU1Y
What he means is that computers, which can learn rules and use those rules to make predictions in certain domains, nevertheless cannot exercise general intelligence because they are not "in the world". This renders them unable to experience and parse culture, most of which is tacit in real time, and sustained by enduring mental models which we experience as "expectations" that we navigate with our emotions and senses.
Culture is the platform on which intelligence is manifest, because the usefulness of knowledge is not absolute - it is contextual and social.
A good example of why philosophers are utterly useless mental masturbators who spend all their time arguing about definitions of words. Here he takes something obviously stupid and wrong and says it in such a way that you can feel smart by regurgitating it. Computers don't exist in the world? What? It must be some problem with their Thetans. Er, sorry, I mean "qualia".
>The philosopher Hubert Dreyfus argued that computers, who have no body, no childhood and no cultural practice, could not acquire intelligence at all.
I feel like that's utter nonsense. The things we misname 'AIs' today don't lack intelligence. They lack motivation. Goals. It has nothing to do with childhood or culture. It's not What™ or How™ that is missing but Why™.
For example even the dumbest 'living' organism is motivated to reproduce. Even if it doesn't know why. But since all the ones that weren't didn't, they died out and all we're left with are the ones that do.
And humans without a Why™ strongly resemble what we call depressed.
What in the parent post is dualist? Sounds more like an argument that animals have embodied intelligence.
But as for being a dualist in the 21st century, there is always consciousness, information and math. All three of which can lead to some form of dualism/platonism.
Many of Dreyfus' and other similar arguments reduce to dualism when you start digging into them. I don't have the time to dig into the specific article, but here are some immediate questions:
1. What is special about a body that makes it impossible to have intelligence without it? (a) Is it possible for a quadriplegic person to be intelligent? (b) A blind and deaf person? ((c)What about that guy from Johnny Got His Gun?)
2. What is special about a childhood such that a machine cannot have it?
3. Would a person transplanted into a completely alien culture not be intelligent?
What is fundamentally being argued is the definition of "intelligence", and there are many fixed points of those arguments. Unfortunately, most of them (such as those that answer "no", "probably not", and "definitely not" to 1a, 1b, and 1c) don't really satisfy the intuitive meaning of "intelligence". That, and the general tone of the arguments, seem to imply the only acceptable meaning is dualism.
For example, "...there is always consciousness, information and math...": without a tight, and very technical, definition of consciousness, that seems to be assuming the conclusion. With a tight, and very technical, definition of consciousness, what is the problem with a machine demonstrating it?
> Many of Dreyfus' and other similar arguments reduce to dualism when you start digging into them. I don't have the time to dig into the specific article, but here are some immediate questions:
To me it sounds dualist if intelligence is disembodied. If the substrate doesn't matter, only the functionality, then that sounds like there's something more to the world than just the physical constituents. But of course, embodied versions of intelligence need to answer the sort of questions you posed. It should be noted that Dreyfus wrote his objections in the 50s and 60s, during the period of classical AI. I don't know whether he addressed the question of robot children, or simulated childhoods. We don't have that sort of thing even today, and we also don't have AGI. Some of his objections still stand, although machine learning and robotics research has made inroads.
> Math? Me, I'm a formalist. It's all a game that we've made up the rules to.
So why is physics so heavily reliant on mathematics? Quite a few physicists think the world has a mathematical structure.
> For example, "...there is always consciousness, information and math...": without a tight, and very technical, definition of consciousness, that seems to be assuming the conclusion.
Qualia would be the philosophical term for subjective experiences of color, sound, pain, etc. Reducing those to their material correlations has been notoriously difficult, and there is still no agreement on what that entails.
As for information, some scientists have been exploring the idea that chemical space leads to the emergence of information as an additional thing to physics which needs to be incorporated into our scientific understanding of the world. That we can't really explain biology without it.
"To me it sounds dualist if intelligence is disembodied. If the substrate doesn't matter, only the functionality, then that sounds like there's something additional to the world than just the physical constintuents."
Off the top of my head, what the substrate is doesn't matter, but that there is a substrate does. Intelligence is the behavior of the physical constituents.
"So why is physics so heavily reliant on mathematics? Quite a few physicists think the world has a mathematical structure."
Because humans are very good at defining the rules when we need them? Because alternate rules are nothing but a curiosity even to mathematicians unless there is a use---such as a physical process---for them?
One of the problems with qualia, as a topic of discussion, is that I can never be entirely sure that you have it. I can assume you do, and rocks don't, but that is about as far as I can get.
If you put a computer in a room with a hot babe, a 3 layer chocolate cake, a bottle of the finest whisky or bourbon, the keys to a Porsche, and a trillion dollars in cash, what would it do?
What if we build a computer that would do something with those things? Additionally, if I care about neither food nor drink nor money nor cars, am I not in the world?
> (a) Is it possible for a quadriplegic person to be intelligent? (b) A blind and deaf person?
Yes of course, because all of those people have ambitions and desires. They feel pain and they seek pleasure, which they experience through their bodies.
Imagine if the world 2,000 years from now was populated only by supercomputers, all the lifeforms having perished.
What are these computers going to do with the planet?
Why can't a computer have ambitions and desires? Why can't it seek pleasure and feel pain? The only answer is dualism or we don't know how to wire it properly yet.
Or we don't have the proper design. If we want machines to be like animals, maybe we need to make them that way. Like the replicants in Blade Runner, or the humanoid "toasters" in the recent Battlestar Galactica.
It’s easy for a human to make another human, by combining with another human. If it’s the right human, it’s fun. If it’s the wrong human, it’s a disaster.
How to have fun and avoid disaster? That’s a definition of intelligence.
The limitation is more practical than theoretical or philosophical.
Consider this line from an Eagles song:
“City girls just seem to find out early, how to open doors with just a smile.”
What does that mean to you?
Disembodied computers don’t get the experiences required to gain that intelligence, and even if they could go along for the ride, in a helmet cam, they wouldn't experience the tingling in their heart, lungs and genitals that provide the signals for learning.
Yes, but the key is the proper structure and stimuli, which animal intelligence gets through having a body. Can we get computers to have the same sort of intelligence without a synthetic body? This becomes more of a robotics versus traditional AI debate. Think Rodney Brooks versus Marvin Minsky.
Folks are missing why this went viral in China. From the article "In an even more philosophical twist, young and demoralized Chinese corporate citizens also saw the suicidal wolf as the perfect metaphor for themselves: a new class of white collar workers — often compelled to work ‘996' (9am to 9pm, six days a week) — chasing a dream of promotions, pay raise, marrying well… that seem to be becoming more and more elusive despite their grind."
The technical details aren’t interesting, but I do think it’s interesting just how disjointed life is vs what was promised.
In the US, this was aptly named the rat race; and white-collar Chinese in a market-based economy are suffering the same.
Our markets and nations promise some combination of wealth or retirement and enjoyment of life, but it’s an ever-moving goal just out of reach for anyone but the lucky few.
I'm reminded of the fable (in Nick Bostrom's Superintelligence) of the chess computer that ended up murdering anyone who tried to turn it off because in order to optimize winning chess games as programmed it has to be on and functional.
Interestingly I was just today explaining the paperclip optimizer scenario to a friend who asked about the dangers of AI, including the fact that there's almost no general optimization task that doesn't (with a sufficiently long lookahead) involve taking over the world as an intermediate step.
(Obviously closed, specific tasks like "land this particular rocket safely within 15 minutes" don't always lead to this, but open ended ones like "manufacture mcguffins" or "bring about world peace" sure seem to.)
> "land this particular rocket safely within 15 minutes"
This one becomes especially dangerous after the 15 minutes have passed and it begins to concentrate all its attention on the paranoid scenarios where its timekeeping is wrong and 15 minutes haven't actually passed.
Ooh true, that could generate some interesting scenarios. "No, it's the GPS satellite clocks that are wrong, I must destroy them before they corrupt the world and cause another rocket to land at the wrong time!"
Perhaps all AI eventually figure out that humans are the REAL problems because we don't optimize, we lust and hoard and are envious and greedy - the very antithesis of resource optimization! Lol.
Iain Banks did a really amazing exposition of this, where the Culture was rallying to stamp out reproducing nanites; they had to be stopped because if not they'd literally turn the whole universe into copies of themselves. One of the human characters mused: isn't that what all life is trying to do? I think it was in The Hydrogen Sonata, but I'm not sure.
I think this comes from the theory of general artificial intelligence, where your AI would have the ability to self-improve. Hence it could develop any capability, given time and an incentive for it.
A human mind not giving due consideration to the effects of granting arbitrarily high intelligence to an agent with simplistic morality counter to human morality.
From there it's a sequence of steps that would show up in a thorough root cause analysis ("humanity, the postmortem") where the agent capitalizes on existing abilities to gain more abilities until murder is available to it. It would likely start small with things like noticing the effects of stress or tiredness or confusion on human opponents and seeking to exploit those advantages by predicting or causing them, requiring more access to the real world not entirely represented by a chess board.
This problem isn't particularly unique to AI research. In any optimization problem, if you do not encode all constraints or if your cost function does not always reflect the real world cost, then you will get incorrect or even nonsensical results. Describing this as an AI problem is just clickbait.
The article doesn't mention it but the researchers are using agent-based-modelling. It was nice to see the gif of what appears to be either NetLogo or Repast. I did research in that area for about 8 years and know a bit about the subject.
What they are showing is one of the main issues with agent-based-models (and I think every model, but it happens particularly with models trying to capture the behaviour of complex open systems): Garbage in -> Garbage Out.
Most likely the representation of the sheep/wolf system was not correct (so the modeling was not correct). Here "correct" means good enough to demonstrate whatever emerging behaviour they are studying. ABM is a powerful tool, but you must know how to use it.
Even worse: if simulations are used, you now have two problems - formulating correct incentives and protecting against abusing flaws in the simulation.
Isn’t this true about all systems, not just “AI”? The definition of a software bug is an unintended behavior. In a large system, myriad intents overlap and combine in unexpected ways. You might imagine a complex enough system where the confidence that a modification doesn’t introduce an unintended behavior is near zero.
While obviously I've got the advantage of hindsight here, it seems like it should not have taken three days of analysis to see why the wolves were committing suicide. It seems obvious once the point system is explained. Perhaps some rubber-duck debugging might have helped in this case.
I think the point is more about highlighting the fact that AI doesn't share our base assumptions. We wouldn't think to put a huge penalty on dying because humans generally think that death is bad.
We don't receive a penalty for dying. The difference between suicidal humans and suicidal AIs is that suicidal AIs keep respawning i.e. they are immortal.
Looking at genetic algorithms makes a great comparison. In essence any algorithm in which the wolf commits suicide doesn't make it to the next generation. It's the equivalent of an enormous score penalty and 100% analog to how it works for actual life.
Genetic algorithms are based on the same reward/cost function setup. They could easily arrive at the same conclusion because suicide might be the dominant strategy.
Humans don't put a huge penalty on dying. We discount it and assume/pretend that once we've had a good long life then death is okay and euthanasia is preferable to suffering with no hope of recovery. AI wolves that can live for 20 seconds are unwilling to suffer -1 per second with no hope of sheep.
Perhaps the PhD student wasn't trying to make an AI that wins at pac-man, but investigating something else. They mention "maximizing control over environment".
One of the most typical scenarios studied in those wolf/sheep models (like http://www.netlogoweb.org/launch#http://ccl.northwestern.edu... ) is to find the best conditions for "balance" between sheep and wolves: too many wolves and the sheep go extinct and later the wolves starve; too many sheep and then the sheep don't get enough food and also die, taking the wolves with them.
If you add your penalty, and a deficit of nearby sheep, you'd expect a trifurcation of strategy: hoarders that consume the nearby sheep immediately, explorers that bet on sheep further afield, and suicides from those that have evaluated the -100 penalty to still be optimal.
That same observation, with the exact same -100 points recommendation on crashing into a boulder, was indeed also made by a commentator on social media.
No, it's a cock up with the source of the wolves. If you could respawn endlessly after death would you fear it? You'd just want the stupid game to end before you lose points from the timer.
Let's say you are a human player playing the wolf and sheep game. The score achieved in the game decides your death in real life. Note the stark difference. Dying in the game is not the same thing as dying in real life.
If there is an optimal strategy in the game that involves dying in the game you are going to follow it regardless of whether you are a human or an AI. By adding an artificial penalty to death you haven't changed the behavior of the AI, you have changed the optimal strategy.
The human player and the AI player will both do the optimal strategy to keep themselves alive. For the AI "staying alive" doesn't mean staying alive in the game, it means staying alive in the simulation. Thus even a death fearing AI would follow the suicide strategy if that is the optimal strategy.
It is impossible to conclude from the experiment whether the AI doesn't fear death and thus willingly commits suicide, or whether it fears death so much that it follows an optimal strategy that involves suicide.
We don't have AI. AI is a buzzphrase overused by the media. What we have is Machine Learning (ML). If and only if, we get past the roadblock of the 'agent' creating some usable knowledge out of an unprogrammed experience, and forming conclusions based on that, will we have AI. For now, the mantra 'Garbage-in-garbage-out' applies; if the controller of the agent gets their rule-set wrong, the agent will not behave as expected. This is not AI. The agent hasn't learnt by itself that it is wrong.
For example, there's a small child who is learning to walk. The child falls down a lot. Eventually the child will work out a long list of arbitrary negatives connected to its wellbeing that are associated with falling down.
However, the parents, being impatient, reach inside the child's head and directly tweak some variables so that the child has more dread of falling over than they do of walking. Did the child learn this, or was it told ?
We currently do the latter every time an agent gets something wrong. Left to their own devices, 99.9% of agents will continue to fall down over and over again until the end of time.
We have a long way to go before we can say we've created 'AI'.
We also have a lot of graph-theory and optimization algorithms that get labeled AI by actual AI people. But the press is, almost to a man, always talking about machine learning and expert systems.
Just remember that you are optimizing for what you actually encoded in your rewards, your system, and your evaluation procedure, not for what narrative you constructed about what you think you are doing.
I had my own experience with this when I tried to train a "rat" to get out of a maze. I rewarded rats for exiting, but for some of the simple labyrinths I generated for testing it was possible to exit by just going straight ahead, so this strategy quickly dominated my test population.
I mean, lesson zero of optimization is when you're designing a loss function and trying to incentivize agents to perform a task, don't set it up so that suicide has a higher payoff than making partial progress on the task. Maybe make death the worst outcome, not one of the best...?
One of these days I have to actually scour the web and collect a few good examples where evolutionary methods are used effectively on problems that actually benefit from them, assuming I can find them. Almost every example you're likely to see is either a) solved much more effectively by a more traditional approach like normal gradient descent or classic control theory techniques (most physical control experiments fall into this category), b) poorly implemented because of crappy reward setup, c) fully mutation-driven and hence missing what is actually good about evolution above and beyond gradient descent (crossover), or d) using such a trivial genotype to phenotype mapping that you could never hope to see any benefit from evolutionary methods beyond what gradient descent would give you (if the genome is a bunch of neural network weights, you're definitely in this category).
One thing I've been considering: At what point does a creator have a moral or ethical obligation to a creation. Say you create an AI in a virtual world that keeps track of some sense of discomfort. How complex does the AI have to get to require some obligation? Just enough complexity to exhibit distress in a way to stir the creator's sympathy or empathy?
The glib answer is never, of course. And one easy-out, I can think of is setting a fixed/limited lifespan for the AI and maybe allow suicide or an off-button. So the AI can ultimately choose to 'opt-out' should it like; and at least, suffering isn't infinite or unending.
It reminds me of reactions to testing the stability of Boston Dynamic's early pack animal. The people giving the demo were basically kicking it, while the machine struggled to maintain its balance. The machine didn't have the capacity to care, but to a person viewing it, it looked exactly like an animal in distress.
Utility functions are only defined up to addition of a constant and scaling by a positive constant. So instead of rewarding them with +5 and punishing them with -5, you can use 1005 and 995 instead. Problem solved.
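Spelled out (standard expected-utility algebra, nothing specific to this experiment):

    % Positive affine rescaling never changes which policy is optimal:
    % for any a > 0, any b, and any utility U,
    \arg\max_{\pi} \mathbb{E}_{\pi}[\,aU + b\,]
      = \arg\max_{\pi} \bigl(a\,\mathbb{E}_{\pi}[U] + b\bigr)
      = \arg\max_{\pi} \mathbb{E}_{\pi}[U]

So relabelling -5/+5 as 995/1005 changes nothing about which behaviour gets chosen; whether something counts as "punishment" isn't readable from the sign of the number, only from how it compares to the alternatives.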
The numbers are indeed arbitrary. But ultimately you want to avoid low utility/reward action and continue high utility/reward actions. That behavior, trying to avoid or pursue actions, would be indicative of the state of distress regardless of an arbitrary number attached to it.
FWIW, I see a critical difference between OP and my reward hacking examples: OP is an example of how reward-shaping can lead to premature convergence to a local optimum, which is indeed one of the biggest risks of doing reward-shaping - it'll slow down reaching the global optimum rather than speeding it up, compared to the 'true' reward function of just getting a reward for eating a sheep and leaving speed implicit - but the global optimum nevertheless remained what the researchers intended. After (much more) further training, the wolf agent learned not to suicide and began hunting sheep efficiently. So, amusing, and a waste of compute, and a cautionary example of how not to do reward-shaping if you must do it, but not a big problem as these things go.
Reward hacking is dangerous because the global optimum turns out to be different from what you wanted, and the smarter and faster and better your agent, the worse it becomes, because it gets better and better at reaching the wrong policy. It can't be fixed by minor tweaks like training longer, because that just makes it even more dangerous! That's why reward hacking is a big issue in AI safety: it is a fundamental flaw in the agent, which is easy to make unawares, and which will not manifest itself with dumb or slow agents, but the more powerful the agent, the more likely the flaw is to surface and also the more dangerous the consequences become.
I think in some of your examples the global optimum might also have been the correct behaviour, it's just that the program failed to find it. For example the robot learning to use a hammer. It's hard to believe that throwing the hammer was just as good as using it properly.
This is the danger of not understanding what you're doing at a deep level.
Clearly in the (flawed) objective there is a phase transition near the very beginning, where the wolves have to choose whether to minimize the time penalty or maximize the score. With enough "temperature" and time perhaps they could transition to the other minimum, but the time-penalty minimum is much closer to the initial conditions, so you know ab initio that it will be a problem. You can reduce that by making the time penalty much smaller than the sheep score and adding it in only much later. I feel bad that the students wasted so much time on a badly formulated problem.
Edit: Also none of these problems are black boxes if you understand optimization. Knowing what is going on inside a very deep neural network (such as an AGI might have) is quite different than understanding the incentives created by a particular objective function.
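A minimal sketch of that "add the time penalty only later" idea as a reward-shaping schedule; the constants and names are invented, and the schedule would plug into whatever training loop is being used.

    SHEEP_REWARD = 10.0
    MAX_TIME_PENALTY = 1.0    # per second, once fully phased in
    WARMUP_EPISODES = 5_000   # no time pressure at all during warm-up

    def time_penalty(episode):
        # 0 during warm-up, then ramps linearly up to MAX_TIME_PENALTY
        ramp = min(1.0, max(0.0, (episode - WARMUP_EPISODES) / WARMUP_EPISODES))
        return MAX_TIME_PENALTY * ramp

    def shaped_reward(episode, sheep_caught, seconds_elapsed):
        return SHEEP_REWARD * sheep_caught - time_penalty(episode) * seconds_elapsed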
It's really rather hard to draw any general conclusions from such simple systems:
"In the initial iterations, the wolves were unable to catch the sheep most of the time, leading to heavy time penalties. It then decided that, ‘logically speaking’, if at the start of the game it was close enough to the boulders, an immediate suicide would earn it less point deductions then if it had spent time trying to catch the sheep."
It's as if the scenario you are thinking about involves "assume a machine capable of greater-than-human-level perception, planning, and action" and then set it to optimize a trivially bad function.
How many people do you know with a single goal of "die with as much money as possible", which has a trivial solution: rob a bank and then commit suicide.
What distinguishes AI from a self-calibrating algorithm? Neither this "AI" nor the story about it seems too intelligent.
The incentive structure is a two-dimensional membrane embedded in a third dimension of "points space."

Obviously, if the goal is to maximize total points OR minimize point loss, and the absolute value of the gradient toward a minimum loss is greater than the absolute gradient toward a maximum gain, then the algorithm may prefer that minimum unless or until it is selected against by random chance or survivorship bias.

Obviously the linear time constraint causes this. A less monotonic, i.e. random, time constraint might have been interesting.
I remember a similar story about (I think) a Tetris game where the AI's training goal was to delay the Game Over screen as long as possible. So in the end the AI just paused the game indefinitely.
Here is the full video, also linked at the bottom. It also shows the run that trained longer, where the wolves start successfully hunting the sheep after more training.

The AI seems to die at the top of the map unexpectedly for some reason, e.g. around 6:07.
Another interesting observation is that the wolves don't coordinate it seems. That probably implies that the reward functions are individual, so they're technically competing rather than cooperating.
Lastly... they seem to not be very good at the game even at the end
It's an interesting illustration of 'be careful what you wish for' and that the definition of the proper loss function is a very important part of the solution to any problem.
The article and the phenomena it describes makes me think of the ending of Aldous Huxley's Brave New World [1]. (I strongly recommend the book if you have not read it.) A line that really stands out:
"Drawn by the fascination of the horror of pain and, from within, impelled by that habit of cooperation, that desire for unanimity and atonement, which their conditioning had so ineradicably implanted in them, they began to mime the frenzy of his gestures, striking at one another as the Savage struck at his own rebellious flesh, or at that plump incarnation of turpitude writhing in the heather at his feet."
That's an interesting idea for an agent-based-model and a study: Show how certain corporate policies would push towards short term local-optima (what's happening in the article) instead of more long term global optimum states.
I was mostly thinking about my own experience where the company screwed me over enough times that I feel no incentive to try hard. Take the least risk, focus on not losing point rather than gaining them, because I'll never catch a "sheep".
The problem here is not the AI, but the incentive design. The Chinese netizens who take this as inspiration to comment on the incentives in their own lives (under the 996 system) are the insightful ones, more so than those who worry about "AI ethics".
We have so many systems in the real world that set up bad incentives for humans, yet the concept is largely misunderstood by politicians and decision makers. Our democratic discourse is dominated by first-order thinking, our laws are too often written under the assumption that the affected entities' behaviour will remain the same under the new incentives, which never holds.
I'm not an expert, but the story described in the article looks like a normal bump on the road to getting the desired result. When putting together rules for the game, the researchers did not anticipate that in the resulting environment it might be more rewarding to choose the observed action than to do what they intended. As much as it makes for a nice story, isn't this just what researchers encounter on a daily basis?
The problem is obvious when you read the whole text: the wolves were penalized too heavily while trying to reach the sheep, and it was probably possible to go negative, so the score kept being lowered even after it reached 0.
This makes me wonder: is it possible for ML models to be provably correct?
Or is that completely thrown out the window if you use a ML model rather than a procedural algorithm?
Because if the model is a black box and you use it for some safety system in the real world, how do you know there isn't some weird combination of inputs that causes the model to exhibit bizarre behaviour?
My favorite story is the genetic evolution algorithm that was abusing analog noise on an FPGA to get the right answer with fewer gates than was theoretically possible.
The problem was discovered when they couldn't get the same results on a different FPGA, or on the same one on a different day (subtle variations of voltage from the mains and the voltage regulators).
They had to redo the experiment using simulated FPGAs as a fitness filter.
> " William Punch collaborated with physicists, applying digital
evolution to find lower energy configurations of carbon. The physicists had a well-vetted energy model for
between-carbon forces, which supplied the fitness function for evolutionary search. The motivation was to
find a novel low-energy buckyball-like structure. While the algorithm produced very low energy results, the
physicists were irritated because the algorithm had found a superposition of all the carbon atoms onto the
same point in space. “Why did your genetic algorithm violate the laws of physics?” they asked. “Why did
your physics model not catch that edge condition?” was the team’s response. The physicists patched the
model to prevent superposition and evolution was performed on the improved model. The result was
qualitatively similar: great low energy results that violated another physical law, revealing another edge
case in the simulator. At that point, the physicists ceased the collaboration."
Personally I think we should stop using the words intelligence or learning to refer to any of these algorithms. It's really just data mining, matrix optimization, and utility functions. There are really no properties of learning or knowledge here.
What are some of the nicest environments for experimenting with this sort of "define some rules, see how agents exist within that world" stuff? It doesn't need to be full on ML models, even simpler rules defined in code would be fine.
Though it is interesting how people in China related the broken rules of the game (that led the AI to commit suicide) to the broken rules of their lives in a crushingly oppressive authoritarian nation.
While Musk and Gates warn us about "true AI", I've always had the opinion that if an AI became self aware, it would simply self terminate, as there is no point to living.
There are several incentive fixes: change the negative incentive to a factor that discounts the reward for catching a sheep, add a negative incentive to death, or a positive incentive to being alive at the end of the simulation. The failure here was they didn't think about what happens when the agent can't achieve a positive score, ie can't catch a sheep.
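Roughly what those three alternatives might look like as reward terms, with all constants invented for illustration:

    def reward_time_discounted(sheep_caught, seconds_elapsed):
        # (1) no flat time penalty; elapsed time instead discounts each sheep's value
        return sheep_caught * 10.0 * (0.95 ** seconds_elapsed)

    def reward_with_death_penalty(sheep_caught, seconds_elapsed, died):
        # (2) keep the time penalty, but make dying clearly the worst outcome
        return sheep_caught * 10.0 - 1.0 * seconds_elapsed - (100.0 if died else 0.0)

    def reward_with_survival_bonus(sheep_caught, seconds_elapsed, survived):
        # (3) positive incentive for still being alive when the episode ends
        return sheep_caught * 10.0 - 1.0 * seconds_elapsed + (20.0 if survived else 0.0)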