AlphaStar: Mastering the Real-Time Strategy Game StarCraft II (deepmind.com)
805 points by zawerf on Jan 24, 2019 | 453 comments



This is really impressive; I didn't expect StarCraft to be played this well by a machine-learning-based AI. I'm excited to read the paper when it comes out!

That said, I'm not sure I agree that it was winning mainly due to better decision making. For context, I've been ranked in the top 0.1% of players and beaten pros in Starcraft 2, and also work as a machine learning engineer.

The stalker micro in particular looked to be above what's physically possible, especially in the game against Mana where they were fighting in many places at once on the map. Human players have attempted the mass stalker strategy against immortals before, but haven't been able to make it work. The decisions in these fights aren't "interesting"--human players know what they're supposed to do, but can't physically execute the actions to do it.

While it has similar APM to SC2 pros, its actions are probably far more efficient and accurate, so I don't think that limit alone is enough. For example, human players have difficulty macroing while they attack because it takes valuable time to switch context, but the AI didn't appear to suffer from that and was extremely aggressive in many games.


In the mass stalker battles, the AI's APM exceeded 1000 a few times, and no doubt most of those actions were precisely targeted, whereas a human doing 500 APM micro is obviously going to be far more imprecise.

I think a far more interesting limitation would be to cap APM at 150 or so, or to artificially limit action precision with some sort of virtual mouse that reduced accuracy as APM increased.
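Something like this sketch is what I have in mind - all parameter values are made up, just to show the shape of the penalty:

    import random

    def noisy_click(target_x, target_y, recent_apm,
                    base_sigma=1.0, sigma_per_apm=0.02):
        """Perturb an intended click with Gaussian noise whose spread
        grows with recent APM, so bursting to 1000 APM costs precision
        the way it would for a human hand. (Hypothetical model.)"""
        sigma = base_sigma + sigma_per_apm * recent_apm
        return (random.gauss(target_x, sigma),
                random.gauss(target_y, sigma))

    # At 150 APM a click lands within a few pixels of the target;
    # at 1000 APM it can easily miss by 20+ pixels.
    print(noisy_click(100.0, 200.0, recent_apm=1000))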


>I think a far more interesting limitation would be to cap APM at 150 or so, or to artificially limit action precision with some sort of virtual mouse that reduced accuracy as APM increased.

IIRC OpenAI limits the reaction time to ~200ms when playing Dota 2. AI employing better strategies than humans will always be more interesting than AI that can out-click humans.


Even the 200ms reaction time seemed overly slanted towards the AI. I don't think that is even the actual reaction time of top pros. In the matches the AI played, the human player would teleport in from complete invisibility and try to use an instant-cast spell, and the AI would have already teleported out. Theoretically it may have been constrained to a 200ms reaction time, but in practice the AI was playing at a superhuman level. Even with that advantage in fights, the human team still demolished the AI. Oh well, lots of things to learn still.


Another advantage was that the AI is just reading the game state through an API; it doesn't have to look at the screen. The game can be difficult to watch from a pro's perspective since they have to constantly click around the map to see what's happening, but the AI has perfect knowledge of everything it is capable of seeing, all without having to physically move a mouse to click on the screen.


If you watch the 11th game, where the pro player wins (the prior games were a 10-0 shutout by AlphaStar), the AI actually lost because they rebuilt the agent to use the same forced camera perspective as the human - so there is absolute truth to this being a compelling advantage. It was able to micro multiple units in disparate areas by having far better spatial awareness; when they took that advantage away, it seemed more even.


I don't know if we can absolutely claim that the limited viewport was the deciding factor in the 11th game, but it did seem to me that the Alphastar agent's blink stalker micro was somewhat compromised in that game compared to the seemingly superhuman blink micro in previous games.


You can see the alphastar perspective of that 11th game here: https://youtu.be/H3MCb4W7-kM?t=5195

It struggles with camera placement like real players :) And it uses popular divert-attention tactics, which shows it understands that part of the game - for example, it sends oracles to the mineral line at the same time as it attacks in front. Previous versions didn't do that, because they were trained playing vs a cheating AI - so there's no point diverting the attention of something that has instant access to any unit on the map :)

It also struggles to defend against adept harass because it has "tunnel vision" - it controls its oracle instead of defending probes at home. Mana actually managed his attention budget a lot better (this is a crucial pro-player skill in starcraft - harass is effective because it trades little of your attention for a lot of the enemy's attention, a skill that becomes irrelevant when the opponent doesn't really have "attention" and can perceive and interact with all units on the map at once like previous versions of alphastar).

This one is much more human, and much lower level. In my opinion it lost its unfair advantage, so its mistakes are revealed. Previously it was never behind and never had to react to the human player's strategy - it rarely even scouted, because what's the point - it wanted to build mass stalkers anyway.


Yeah, that's actually a huge point that I didn't even consider. Regardless of whether the AI itself is playing with a limited viewport, the fact that its opponent has a limited viewport opens up the opportunity to learn attention diversion tactics during the training process, which would otherwise be impossible.


What happens if a human tries to use the API with a custom UI of the human's own choosing? Such a UI might not exist yet, but are there ideas for more efficient UIs that could be built?


Yes, I am curious about this too. What happens if the human has a giant TV screen and can see the whole map at once?

Or, what if we slow down the game, so that the human can actually pause the game each second and consider what to do next? That's basically what the computer is allowed to do.


It would be massively faster, I think.

Upgrade building 3 becomes available when you have enough resources.

A separate tab for insufficient resources gives you an overview of what you need to finish a, b, c.

A red alert appears when an enemy is spotted. You can click nearby units to attack, or trigger an FSM with the attack strategy.

A finished building will automatically be placed near the town center.

Idle workers can search for resources.

A wall is suggested based on your current buildings; you can set a margin of e.g. 20 meters.

The question is how much programming the custom UI would need (and how deep) to make it a lot more efficient.


Stalker unit AI could be microing perfectly for you...


Giant tv:

Macro-wise, it would be like an unwieldy minimap, which already exists so people can get a sense of where the enemy is moving. With a giant screen, information is not focused on a small area, so you are limited by your FOV. A minimap showing unit strength in terms of armor, HP, or shields, as well as placement, would be the ideal information.

Micro-wise, it would be like sitting in front of a giant text display looking at a whole book. You still have to focus on a small section to read it.


> Or, what if we slow down the game, so that the human can actually pause the game each second and consider what to do next. That's basically what the computer is allowed to do

While this would make it more fair, it would just make the micro game more similar to chess or go. I don't think humans would necessarily win in the end.


That's a good insight and yes, humans would probably be overpowered eventually. However, this is just the consequence of the fact that all games are similar if you remove external limitations such as reaction time (or, alternatively, produce a more efficient "being" which is not as subject to these limitations as some other).

Starcraft is like chess in some sense. The largest fundamental difference is that it isn't a perfect information game.


Tbh starcraft and dota shouldn't really be the test games atm; turn-based strategies (or rather, grand strategies) would be the far more appropriate evolution after chess and go, since we're clearly more interested in AI macro than micro, and too much of the learning process goes into trying to push the AI beyond micro-oriented thinking (probably many rounds of the AI tournament are lost simply because one AI found a new micro trick to abuse).

But ofc, there's no TBS or grand strategy currently out there with a real tournament scene, so you can't really count on the devs implementing an AI API, or even a properly balanced / bug-free game (far more user testing goes into SC2/Dota 2 than, say, Civ, simply by virtue of their player bases).


Yes but a turn based game drastically reduces the action space compared to a real time game, something the DeepMind folks pointed out as a particularly interesting problem they wanted to tackle.


>a turn based game drastically reduces the action space compared to a real time game,

That's the primary benefit imo. The bigger action space is largely composed of non-strategic elements, at least in the sense of long-term strategies, e.g. micro and mini-skirmish tactics, which I don't think are as interesting. Ofc it's clearly a conflict of interest, but my feeling was the most interesting aspect of Go/Chess is the AI making unintuitive discoveries that benefit the long term. The human-collective machine is pretty good on its own at finding the shorter-term strategies; I don't think AI will make much significant impact in that space.

As a medium to study upcoming real-world applications (e.g. cars), RTS makes sense; but as a medium to study AI beating humans, TBS is more appropriate (the ability to explore large search spaces is far more interesting/potentially impactful). Studying both would be ideal ofc, but in a pick-one situation, TBS is better imo. But only RTSes are even really viable atm, which is disappointing.


Well, are we wanting to test the computer's ability to strategize/plan, or its ability to out-click humans?

The former is an interesting AI challenge/achievement, the latter is a space in which computers are already known to outperform humans.


Even allowing players to zoom out would give huge advantages; that's why, no matter the screen size, you have to play at the same zoom. There was a bug at one point that allowed players to play multiplayer zoomed out, and it was forbidden to use it in competitive games.


How about having multiple humans control the same faction, so one can focus on building, two on a couple of battle groups, another on scouting, etc.? Then they don't have to context switch nearly so much.


They actually have this game mode built in, it's called Archon mode.


Aha, nice, thanks. Let's see, two players per side... not a huge number, but probably a big step up from one. It looks like people aren't playing it much; some suggest it's because it requires a partner.

I would like to see a setup akin to that of Ender Wiggin, with one commander overseeing and recommending overall strategy, and, say, five others managing different areas or groups. That seems like the way to get the best human performance, and might be enough to beat the AIs—at least to nullify chunks of their advantage.


Yeah, put an eye tracker on a pro and you'll see that the eyes are constantly changing their focus point. If you can watch the entire scene with the same precision without the need to focus on it, you're already at a nice advantage.

As an aside, a few pro gamers prefer to play on windowed mode for exactly this reason.


Just supporting this. I remember the uproar when 800x600 resolution was removed as an option in SC2 around 2012[1].

[1] https://eu.battle.net/forums/en/sc2/topic/6201040181


> the uproar

I'm not saying you're wrong, but 6 posts with no profanity is hardly "uproar" by blizzard forums standards.


Is the bit about reading the game through an API true? Earlier iterations of this same RL-based agent that played Atari games would read just raw pixels, not an API.


Yes, it's true. A special interface, PySC2, was created for the AI. Also, it's not only that the AI doesn't need to parse information presented on much more limited screen real estate, but also that the AI doesn't have to use a controller that has physical constraints. So the AI has access to this superhuman controller and can decide to click on one screen extreme and then another within 200ms.
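For the curious: the interface is DeepMind's open-source PySC2, and a minimal scripted agent looks roughly like this (adapted from the library's own examples):

    from pysc2.agents import base_agent
    from pysc2.lib import actions

    class IdleAgent(base_agent.BaseAgent):
        """Receives structured observations (feature layers, unit data)
        every step and returns a function call directly - no pixel
        parsing, no mouse movement, no physical constraints."""
        def step(self, obs):
            super(IdleAgent, self).step(obs)
            return actions.FunctionCall(actions.FUNCTIONS.no_op.id, [])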


Any game that is specifically going out of its way to support these AIs will naturally do it through an API, though I'm only aware of Dota 2 and SC2 (SC:BW also does, through a community-modified client that serves the API, iirc). For ad-hoc games, e.g. Atari, pixel parsing is the natural result, but no one would intentionally set it up like that.


The game is difficult to watch, but does anyone honestly believe that an AI is going to have a difficult time parsing the scene if it is trained to do so? That to me just seems like a question of resources. We're pretty good at image recognition and segmentation now, and that's without the unlimited amounts of training data one could generate when using a controlled game environment with a limited range of possible animations and effects. This is why I find the prospect of the AI agent having to parse the screen entirely uninteresting.


For real-life applications, parsing the "scene" would have an impact, as it could only convey imperfect, partially retained information. In the game of StarCraft the information is perfect once fog of war has been removed; this, together with unlimited attention (camera viewport), helps action potential and macro planning. No player is ever going to be able to consider precise strategy across the whole map perfectly in their mind. If DeepMind wanted to mimic human limitations faithfully, they would have to provide imperfect information to AlphaStar, e.g. when providing the locations of objects, sample from a probability distribution that represents the location imperfectly, making that distribution wider the longer the AI's attention wanders from the object, both spatially and temporally. Of course, the usefulness of these limitations is purely to model maximum theoretical human mental capacity, and its use case could be to help explore strategies that work for actual humans.
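A sketch of that idea - Gaussian positional noise whose spread grows with the time since the AI last "attended" to the unit (all parameters invented for illustration):

    import random

    def observed_position(true_x, true_y, seconds_since_attended,
                          sigma_per_second=0.5):
        """Blur a unit's reported position in proportion to how long
        the agent's attention has been elsewhere, mimicking a human's
        fading memory of off-screen units. (Hypothetical model.)"""
        sigma = sigma_per_second * seconds_since_attended
        return (random.gauss(true_x, sigma),
                random.gauss(true_y, sigma))

    # A unit attended 0.1s ago is reported almost exactly; one ignored
    # for 30s comes back blurred by ~15 map units either way.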


There is another potential use: given these limitations, an AI might be able to learn to be better strategically, which could translate to an even greater advantage once the limitations were removed later on.


windowing the focus perhaps, yes, but I'd assume it's the opposite and the focus is applied more freely.


You talk about a static image, but navigating the camera requires strategy, attention, and adds to the focus. If you take that away, it's just a turbo charged pen-and-paper RPG with a time limit on rounds.

They could train against the API while reinforcing the AI to predict the state from vision. But with limited APM it would be pretty difficult for the AI to keep track of everything. And, potentially, it would still not be the same as a human looking at it. I'm not sure whether human attention is a particularly bad example of efficient resource allocation; I'm very biased to think it is still the gold standard. But the fact that DeepMind didn't focus on this implies they did not find it interesting enough, and/or found it too difficult.

Anyhow, (visual) exploration is a step up from mere image recognition


But on the other hand, an AI that beats humans using brute force, in a game where it makes a ton of difference, isn't very fair either.


> using brute force

"Brute force" in AI context is usually reserved for traversal of the entire search space. I think "superhuman micromanagement" is a better term. And before AlphaStar superhuman micro wasn't insurmountable obstacle for human players.


The funny thing is that once we're talking about the real world, which will come, that incentive actually reverses.

At that point the name of the game will be maximizing the advantage the body/infrastructure provides the AI, not minimizing it.

Weird.


Yes, since DeepMind chose SC2 for having the right characteristics for mapping to the real world, i.e. imperfect information and real-time response, they should have had at least one run without any speed governors. And maybe another with the CPU limited to some level we might find in an embedded system of the near future.


It's the same principle as a baseball player putting extra weights on the bat in practice.


I recently watched a TED talk explaining how human perception has a lag of about a third of a second. Pro players might be better, but after noticing something they still need to take an action.


My experience is that to beat 300ms requires there to be no conscious thought in the loop. It has to be muscle memory guided by higher level intent. It's like how the gunslinger waiting to shoot hits first, it's reflex instead of decision.


Getting sub-200ms on something like this benchmark is fairly easy [1]. While waiting for a color to change is different than processing a game like Dota 2 or SC2, a 200ms limit isn't too unreasonable to me.

I would love to see these AIs get handicapped even more like a full second and really force them to out think humans.

[1] https://www.humanbenchmark.com/tests/reactiontime


I think OpenAI's bot would have been beaten by lots of humans, but they decided to train it with 5 unlimited, invulnerable couriers (until the TI showmatches, in which it was beaten easily).


The only way to truly have a fair fight would be to accurately model the limits of human capacities. How fast can humans move the mouse and at what accuracy? How fast can they type keyboard commands? How fast can they move their eyes? You could study those limits in a sports lab with high speed cameras, etc.

A simpler model would be to limit the bot to, say, one action per 250ms, introduce a slight delay in its reaction time, require it to move the camera to gain detailed information and take further actions, and have camera movements count as actions.
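A minimal sketch of that simpler model - one queue, a fixed reaction delay, a 250ms per-action cooldown, camera moves included (parameter values are placeholders):

    import collections

    class HumanLimiter:
        """Gate an agent's outputs: every decision is delayed by a fixed
        reaction time, and at most one action (camera moves included)
        is released per 250ms of game time. (Hypothetical model.)"""
        def __init__(self, reaction_delay=0.2, action_interval=0.25):
            self.reaction_delay = reaction_delay
            self.action_interval = action_interval
            self.queue = collections.deque()
            self.next_allowed = 0.0

        def submit(self, game_time, action):
            # Actions only become eligible after the reaction delay.
            self.queue.append((game_time + self.reaction_delay, action))

        def poll(self, game_time):
            # Release at most one eligible action per interval.
            if (self.queue and game_time >= self.queue[0][0]
                    and game_time >= self.next_allowed):
                self.next_allowed = game_time + self.action_interval
                return self.queue.popleft()[1]
            return None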


Here's a graph of AlphaStar's APM versus a professional player's: https://i.imgur.com/TXeLkQK.png Evidently AlphaStar also has a similar Economy of Attention (where the player focuses) to a professional player, at around 30 screens per minute. Additionally, AlphaStar's reaction time is around 350ms, a significant disadvantage over a pro.

The skepticism in this thread is absolutely justified but I think it's important to note the lengths to which DeepMind has gone to address and assuage the fears of superhuman mechanical skills being employed in these games.


I watched all of the event live and I feel that that graph is deceptive. If a game is 15 minutes and has 3 main battles lasting 15 seconds each, and you use 100 average APM on non-battle time and 1000 APM during battles, your average APM will be 145 but you obviously have a superhuman advantage.
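Working through the numbers in that example:

    game_s = 15 * 60              # a 15-minute game
    battle_s = 3 * 15             # three 15-second battles at 1000 APM
    calm_s = game_s - battle_s    # the rest at 100 APM

    total_actions = battle_s * (1000 / 60) + calm_s * (100 / 60)
    print(total_actions / (game_s / 60))  # 145.0 average APM, despite 1000 APM bursts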

This is compounded by the fact that almost all of AlphaStar’s actions are “useful” whereas a significant amount of the human actions are spammy.

You will typically see a human select a group of units, and fast-click a location in the general direction they want the units to move (to get them started moving that way), and then keep clicking to continuously update the destination to a more precise location. Every click counts as an action. An AI can be perfectly precise and “clicks” the right place the first time.


TLO seems to have a longer tail than AlphaStar in that graph though, so doesn't that imply that TLO peaked at an even higher APM, presumably during battles?

Fair point about humans needing minor adjustments though. Another comment also mentioned a bug in the APM measurement: https://news.ycombinator.com/item?id=18994350


TLO is a Zerg player, so he probably makes a lot more errors when playing Protoss. Also, every top player estimates when to do a sequence of actions and spams it a few times to maximize the chance of execution, whereas AlphaStar only has to do it once.


Yes, could that bug be the reason for the AI getting to 1000 APM?


Hm, it would be interesting to force the AI to issue commands through a "filter", where it can only execute orders with human-level precision, and something similar for its input.


This graph is incredibly deceptive and I'm kind of upset they posted it. There are about 10-15 seconds of gametime where APM is incredibly important, and the AI boosted to 1000+ APM during those periods. During lulls it cruised at ~30 APM.

Meanwhile humans are literally spamming keys to keep their physical fingers loose and ready - they're not performing anything close to 400 useful APM on a regular basis (or in TLO's case - 1500 ... He kept walking his units straight into death while spamming keys).


How can it do 1000 APM if its reaction time is 350ms? (That would be at most ~170/minute.)


I believe you are conflating latency and throughput. It might take AlphaStar 350ms to perceive a threat, but once perceived, it might issue many commands at high speed to respond.


Latency != Bandwidth


How many of those 500 actions are actually useful? I haven't watched competitive StarCraft games for years, but back when I did, rates were more like 300 APM, and even then the players basically spam-clicked the background or selected random units non-stop and were probably only doing 50-100 effective actions.


> How many of those 500 actions are actually useful?

Exactly, a human doing 500 APM during intense moments is going to be way different than an AI bursting 1000 APM with pixel-precision during the most crucial moment in a game.

TLO spent a ton of time at >1000 APM and walked his army directly into enemy shots all the time. MaNa had much better control at ~400 APM. So APM is really irrelevant to control - for humans.

I suspect the AI, on the other hand, makes each action precise & count for something.

This graph, which I think was supposed to show that the AI was being "human", IMO is pretty damning. We saw the APM spike to >1000 during a critical moment and we saw the APM at <30 during lulls, so we know it uses its APM at important moments, presumably with important pixel-precise actions.

https://deepmind.com/blog/alphastar-mastering-real-time-stra...


I suspect that once the AI becomes good enough it will be able to beat human players using a much lower total APM than human players. We're not quite there yet, but it just needs a little bit of time.

As a hopefully illustrative comparison, you could give any top player a day of play time per move against the top Chess AI being given a minute of play time per move and the AI will still win. That's how much better the AIs are than humans now. There's no reason in principle this won't be possible with StarCraft AI too.


The biggest issue with allowing the ai to have high APM is that it will inevitably learn optimal strategies that depend on that high APM, eg stalkers can take on far more immortals than we normally expect, and the AI will learn it this way, because the high APM allows a new stalker strategy (or rather, empowers an old one greatly) while not affecting immortals significantly. This also naturally means the AI leagues see a different game balance than the human leagues, leading to strategy divergence.

And then when you drop the APM limit, suddenly all the learned optimal ai strategies start falling apart, and the whole thing has to be relearned.

More annoyingly, there’s not much for human players to learn from innovative ai strategies that are based on inhuman accuracy of play (because we couldn’t possibly execute it).


What they're improving at right now isn't any specific AI model, it's how to train the AI models. It's meta-machine learning. I don't doubt that they can quickly train up a new model under different constraints now that they know how best to train up said models. It's not like they throw away all progress once they change some constraints; far from it.


I'm sure we'll get there too, I just think it's a little deceptive how they've measured the APM at the moment.

StarCraft is more random than chess, so I do think it's possible humans will always be able to take occasional games off of fairly constrained AIs just based off blind luck in picking counter builds, it will be interesting to see what % that is.


Such high actions per minute does not seem fun to me, and possibly a repetitive strain injury waiting to happen.


The 1000 APM thing is because of a bug in how APM is calculated in StarCraft II. There is a hotkey to assign all your units to a new control group while also deleting them from all other control groups, which TLO uses extensively, and while it is just one key combination to press, it records as one action per selected unit. The real APM of pro players averages 250-400 and peaks at 600-700.


> a repetitive strain injury waiting to happen.

Yes, I have one from it, and I wasn't even playing at that high a level (I averaged less than 100 APM). I understand that it's a common problem.


Was Starcraft the only/main game that you played?


Yes, basically the only one for several years at that point. A few hours here and there of other games, but nothing at all substantial.


I already had some RSI, but playing SC2 made it a lot worse (I stopped playing when I got to plat as a Zerg because it required enough APM to hurt).


It is why I stopped playing SC, and I was never any good anyways. Still fun, but it just hurt real bad.


I stopped playing SC competitively because it's too stressful, both physically and mentally. Hitting 300 APM continuously in a game for up to 60 minutes at a time makes your hands go numb, and the adrenaline rush makes you want to go running afterwards. With games like LoL/Dota you at least have a chance to take a break after a gank/farming/team wipe. With StarCraft, every decision has a significantly higher compounding effect.


Hell I never played it competitively. I had to stop playing it even casually because it physically hurt.


Wait until you hear about stringed musical instruments? :)


From what I understand, the most common string instrument problems are with shoulders/neck/back, due to sitting for long periods of time with poor posture.

Most music should be playable without excessive risk of serious injury to arms / wrists / hands, but from what I understand very high notes on e.g. the violin are hard to play without using an over-flexed wrist, which is definitely a problem if playing music requiring such a position for long stretches of time, or many rapid switches between high and low notes.

Some of the string players with most risk are novices who have not been taught proper technique.

For professional PC game players, the design of the standard computer keyboard and furniture is absolutely terrible from an RSI perspective (worse than any common musical instrument, and without any of the design requirements of acoustic instruments as an excuse), and it is shocking to me that there has not been more effort to get more ergonomic equipment into players’ hands. The way game players typically use a computer keyboard is generally more dangerous than the way typists or e.g. programmers do. As someone who spent a few years thinking about computer keyboard design, I can think of at least a dozen straight-forward and fairly obvious changes that could be made to a standard computer keyboard to make it more efficient and less risky for game players. There is a lot of low-hanging fruit here.

Whether or not the equipment is changed, the most important single thing when using a computer keyboard (or any hand tool for that matter) is to avoid more than slight wrist flexion or extension, especially while doing work with the fingers. Excessive pronation and ulnar deviation of the wrist are also quite bad. Watching pro players, many of them have their wrists in an extremely awkward position while doing fast repetitive finger motions for hours per day without breaks, which is a guaranteed recipe for RSI.


Well I have heard of them, also looked up TLO mentioned above, he actually did get RSI and had to take months off.

"Liquid regretfully announces that Dario “TLO” Wünsch will be unable to play for the next few months due to the Carpal Tunnel Syndrome he experiences in both hands. He will however continue to be involved with E-Sports even as he takes a break from gaming to give his wrists time to heal. Sadly, this means that he will not be attending Dreamhack Summer or the Homestory Cup III as a player."


> artificially limit action precision with some sort of virtual mouse that reduced accuracy as APM increased

I like the idea of having action noise that's linearly related to APM


There would be an entire new dimension of decision making, in addition to good macro, where you have to prioritize actions. Will be interesting to see.


I said so before, but is it really that different from controlling a unit that can also only do one thing at a time? The agent controls itself just like another unit, with a constraint on the APM available to control other units. On the one hand, these APM caps add a new parameter, if the constraint is implemented naively. On the other hand, if there are viable strategies against ultra-high-APM opponents, then the constraint is really just limiting the dimensions of the decision space, and to good effect, finding viable strategies that take less effort. Hence such things are called "hyperparameters" (I think that's technically something different, but you know what I mean). Likewise, the game isn't so fast as to need 100 screen switches per second, if good planning allows batching and bursting actions.


I understand the spirit of the proposal, but that would be like limiting a computer to adding at most two numbers per second. It's OK if we want an interesting contest against humans, but it wouldn't be a fair estimate of a computer's math capability. It's also not the point of using computers to do math instead of a room full of accountants. I'm OK with the AI going as fast as it can and playing superhuman strategies because it can be that fast. After all, we'll not limit AIs' output rate when we let them manage a country's power grid.


The purpose of limiting speed isn't to make an interesting contest, it is to accurately compare the "math" instead of the speed the math is done at.

It isn't surprising that it's fast; the surprising part is that it can make human-like decisions. The only way to compare whether its thinking is human-like is to restrain it from "brute forcing" the contest through speed.

The model has likely learned that the faster it does things the better the outcome. What it needs to be measured on is strategy.


But isn't the competency of a StarCraft player also measured by his/her speed?

In that context, you can't really measure strategy without accounting for timing/speed because a lot of tactics and strategies only become viable once the player has the required speed to actually realize them aka "micro".


Exactly, and due to superhuman micro, the AI has cornered itself into learning a small subset of the strategy space. It’s not good at strategy because it’s optimized itself for just getting into micro-handled situations.

It's not good at strategizing with all the options available to it given its micro ability; it has "one" strategy that leveraged the micro as much as it could, and when given a strategic challenge by MaNa, it didn't know what to do.


Yes, but the ultimate goal is to make an AI as "smart" as, or "smarter" than, a human. That's why they keep making AIs play against human players in Chess, Go, etc. It's not to prove computers are faster than humans; it's to prove computers can be smart like humans.

They want to make an AI that can teach new ideas to humans. New strategies that human bodies are physically capable of executing, but no human was "smart enough" to think of yet. An example is when the AI built a high number of probes at the start. That's "smart".

The only way to train an AI to be able to come up with new ideas is to force it to be "slow". Otherwise, it will just always take the easiest way to win, which is to out-micro. There is nothing interesting about a game like that; it only shows the AI is fast, but it won't be clear that it's "smart".


That's exactly why it's so important to try and constrain the system to as close to human parameters as possible. You can't compare strategic prowess if the two players are playing at a completely different level. It'd be the same as saying MaNa is better than say, Maru (who has just won 3 GSL Code S's in a row), because he has stronger strategies against ~30th percentile players. It makes no sense.


Speed is only interesting as part of fair human competition. It's trivial for the AI to win with speed, and it doesn't have to be remotely smart about it. Serral (the dominant world #1) was easily beaten by 3 far weaker humans controlling one opponent - it wasn't even close. It's just stupid to even claim victory in those situations.

Making an AI that wins by outsmarting humans, on the other hand, is what we are all interested in.


That would be right if the AI and the human player had the same opportunities for micro.

They don't, because the AI doesn't use physical objects to move stuff in the game. The AI just "thinks" that this stalker should blink, and it blinks. A human player has to deal with the inertia of his hand and his mouse.

If you want a fair competition of micro - make a robot that watches the screen through its camera, moves the mouse, and presses keys to play starcraft.

Then the bandwidth of the interface is the same for both players, and we can compare their micro.


You don't really need a real robot; instead, assign some "time cost" to various actions, which depends on spatial distance, the type of action, and whether it is a different action than the previous one. Humans are really fast when, for example, splitting a group of units, but performing multiple different actions on different areas of the screen, or even multiple screens, takes a lot longer. You don't need to fully emulate human behaviour, but getting somewhat close would really show how strong the AI is tactically and strategically without superhuman micromanagement.
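One natural candidate for that time cost is a Fitts's-law-style model, where pointing time grows with distance and shrinks with target size. A sketch - the coefficients are placeholders, not measured values:

    import math

    def action_time_cost(distance_px, target_width_px,
                         a=0.1, b=0.15, switch_penalty=0.2,
                         changed_action_type=False):
        """Fitts's-law-style estimate of how long a human needs for a
        pointing action: t = a + b * log2(distance / width + 1),
        plus a penalty for switching to a different kind of action."""
        cost = a + b * math.log2(distance_px / target_width_px + 1)
        if changed_action_type:
            cost += switch_penalty
        return cost

    print(action_time_cost(50, 30))    # ~0.31s: short flick to a big target
    print(action_time_cost(1500, 10, changed_action_type=True))  # ~1.39s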


I'll try to make my point clearer.

If we want to measure strategy, I agree with you, and out of curiosity we might do it. But the goal is winning, so is strategy important as long as it wins? The AI can take every shortcut it finds, IMHO. People do take shortcuts.

Cars and planes bring us across the world exactly because they don't walk like people and don't fly like birds. Wheels, fixed wings and turbofans are shortcuts and we're happy with them. We can build walking and wing flapping robots but they have different goals than what we need in our daily transportation activities.


The problem with starcraft is that interface overhead is a significant part of the game. The AI doesn't have to cope with that - every click is perfect, and moving the mouse from one edge of the screen to the other takes no time.

If you want to make it fair - place an AI-steered robot in front of the screen, and make it record the screen with a camera, and actually move the mouse and press the keys.

Then I can agree it's fair :)

But then of course AI would be incredibly bad.

Right now the advantage doesn't come from faster thinking, but from the much higher bandwidth and precision that the AI has when controlling the game. It's anything but fair.

With chess it's not a problem, because interface overhead is negligible.


Those are different engineering problems. I'm pretty sure that they could eventually build a pixel-perfect camera and a fast, pixel-perfect robot mouse. They'll be at least as good as human eyes and hands, probably better. That done, they'll keep winning.

It's surely interesting technology with positive impacts in a lot of areas but is it that the important part of the experiment? Humans need keyboards and mice to interface with computers, computers don't (lucky them.)

Sorry to insist on that analogy, but it looks to me as if my car should be able to fit my shoes and walk before I admit that it goes to another city quicker than me walking.


No, these are not "just" engineering problems.

When you're trying to individually blink 30 stalkers at the perfect time, when they have almost 1 HP - latency is everything.

A camera has latency. Depending on various factors, it takes whole milliseconds of exposure for a camera to gather enough light to register a clear image frame. The human eye works on a different basis, but also isn't instant. You cannot cut that in software, and a human player cannot train to lower it. But the AI doesn't need to do any of it - it gets the image provided as a memory buffer.

Image recognition has latency (both in the brain and in a computer) - even stuff as simple as recognizing where the computer screen is, as opposed to the background, takes time. The AI doesn't need to do it.

Muscles (engines in robot hands) have latency.

Mice and hands have inertia and can't be moved instantly - they have to be accelerated and stopped, and even if you have an optimal algorithm that is 100% accurate - it takes time.

It's not only hard to implement, it's also physically IMPOSSIBLE to do without introducing significant delays.

An AI that is controlling the UI directly doesn't have to deal with most of these tasks, so it has a huge advantage in a game like starcraft. It's not that the AI is so much better; it's that the AI is doing high-frequency trading while the human player is sending buy/sell requests by telefax. By the time your request is processed, the other guy has had the opportunity to do 10 different things.

If you want to focus on the part of the job that is doable now - sure, go ahead. But then don't abuse the unfair advantages you have and announce you "won". It's a very low threshold to win in starcraft when your opponent effectively has 100 times your lag.

I'm sure someday we will have an AI that can beat a human player in starcraft without abusing this advantage. And I'm pretty sure the fastest way there isn't to put a real robot in front of a screen, but to limit the interface bandwidth of the AI to a level similar to that of human players.

> Sorry to insist on that analogy, but it looks to me as if my car should be able to fit my shoes and walk before I admit that it goes to another city quicker than me walking.

Let's remove the roads that we made specifically for cars and speak about this again :) Will your car move you through an untamed wilderness quicker than your legs? Possibly. Or not at all.

If I walk into a bullet train, slowly walk inside it, and walk out of it at the end of the route I will be even faster than the fastest car. Is it fair to say I'm faster than a car? After all it's not my fault the car doesn't fit inside that bullet train :)

We need to compare apples to apples, and comparing AI that doesn't need to deal with half the sources of latency with a human player that does, in a game where latency is very important - just isn't fair.


If you don't put any limits on the AI, it's not Starcraft any more.

You could make an AI which tries to hack the human computer to force a leave. That would also constitute a "win". Or one which hacks its own computer and displays "You win" immediately. Or one which tries to kill the human player, if we want to be really dramatic about it.


Chess and Go both limit computers to one move per human move, and they’re still very interesting games for AI. You’ll always have limitations. When you’re playing a game, the limitations are largely arbitrary, and you choose them to make the game better achieve whatever goal you’re after.


You are right, but the point here is to force it to win by pure decision making. Having an AI play a game was always about challenging ourselves to improve our understanding of intelligence. Limiting APM is just another way to force us to come up with new ideas.


So, in some sense this is a limitation of starcraft. The goal of this project is presumably to have the AI play a game of high strategic depth. However, with sufficiently good micro, certain strategies that have low "macro depth" become unbeatable. So it's true the AI would win, but it plays in ways that do not expand our understanding of SC strategy; it is simply using a strategy that is simple to understand and impossible for a human to execute. Think of an aimbot in a shooting game: a human can try to play smart and attack from unusual angles / lay traps / set up crossfires, but if the AI can simply get instant headshots, the AI can run straight to the objective and win. It would be a winning play, and humans understand why it would be a winning play (boringly so), but it is outside of human execution.


But it's important to be clear about what's being measured. If the AI can take and successfully win engagements that no human could because of their superior micro, it's not necessarily winning via superior strategy (as is claimed).


But still, if you want to measure that, then play a turn-based game. If I could micro as well as the pros, I'd be pretty damn good too.

Hooking it up to a camera looking at a screen and a robot arm with a mouse would be more fair though.

Edit: OK, they did have a camera version, but I still want a robot arm.


In the showmatch where they made the computer look at a regular screen to control the game, the stalker micro was much less impressive - and MaNa won.


For now. Give them another month. This is like AlphaGo vs Fan Hui all over again -- people knocked that accomplishment at the time because he was just a master, not one of the top players in the world. Well, not much later, AlphaGo beat Lee Sedol, the best player in the world.

The ceiling here is going to be incredibly high, much higher than the level of play that people are capable of, even when restricted to a single window.


This doesn't nullify the observations that people are making here.

Part of the difficulty here is describing what a 'fair' match might be. Specifically, I think fairness has to do with a goal many people have for AI: to improve human play. The strategies in Chess or Go that were employed could conceivably be used by human players. There aren't any hard restrictions preventing humans from learning from that play, even if the AI is entirely superior.

It would follow that a 'fair' SCII match would employ strategies that humans could implement. Making extra workers, for instance, might be a real lesson from AlphaStar play. The insane stalker micro, however, could never be done by a human.

From this perspective, I think the important takeaways were:

* The AI leaned heavily on super-human stalker micro.

* The AI had some strategic blind-spots, namely the immortal harass.

* The APM comparison isn't terribly meaningful; a lot of human APM is spammy/twitchy button presses that doesn't do all that much, whereas the AI can presumably make each action count. There were also AlphaStar APM spikes that likely go along with the stalker-micro issue.


None of this really matters though. The AI is improving every day through training. Give it another few months of development and it'll be able to trounce humans under any "fair" set of handicaps you can think of, like limiting average and max APM throughout the game. We saw the same pattern with AlphaGo. There's no reason whatsoever to suppose that humans are fundamentally better at this game than an AI can be.

When AlphaGo first won, people said it wasn't fair because it was running on a whole cluster of computers. Well, within not much time at all, it was good enough to run on a single computer and still beat top humans. We are dealing with exponential progress here. The writing is on the wall.


It's tempting to assume the AI will just keep getting better and better, but that's not guaranteed, and I was happy to see that the Deepmind folks in the video clearly acknowledged this. In the game that MaNa won, it's possible that he did so by finding a strategy the AI agent had never encountered before, causing it to respond with nonsense (e.g. not building a Phoenix and pulling its entire stalker army back to deal with warp prism harassment). In a game with a strategy space as large as SC2's, it's possible that an AI will never be able to saturate the space of viable strategies, and it will always be possible to find edge cases that the AI has no idea how to handle.


The point isn't that the AI won't improve or win with those conditions; I agree it likely will, and soon. The point is that the conditions of the match matter and that this one missed the mark.

It absolutely does matter whether the AI can use obviously super-human techniques, because then it's not nearly as interesting for human observers. I'd much rather watch an AI that was a strategic genius that won despite being hamstrung in terms of micro/techniques.

> There's no reason whatsoever to suppose that humans are fundamentally better at this game than an AI can be.

Who's claiming this?


Lee Sedol was not the best player anymore at that time (not saying it wasn't an impressive/important achievement, but overstating it doesn't help either - the "beat best human players part" came later in 2017).


Lee Sedol was still top 5, certainly no worse than top 10 at the time. Granted, he wasn't the best and most dominant, but the difference from the top was tiny.


I don't understand who's downvoting you, this is accurate. While AlphaGo/Zero improved quickly to superhuman play, we are just in this thread comparing timelines, so that is relevant.


What kind of evidence is going into this analogical reasoning? Do we extrapolate similarly for other things? We went to the Moon in the 1960s. Was Mars a month, or a year, or a decade away? Then we sent robots to Mars. Have we yet sent any robots to Alpha Centauri?

Different problems have different difficulties. Solving simple problems quickly doesn't mean we'd also be able to just as easily solve the hard problems. Often the comparably simpler problems have the best reward/effort ratio and thus make quick progress, which doesn't need to be the case for hard problems.


Going to the Moon is a completely different endeavor than making an AI better at a game that it's already quite good at. This is a red herring.

If you had bet against AIs reaching parity with top human players in any previous game, whether it be Checkers, Chess, Go, etc., you'd have lost. I see no reason why StarCraft II should be any different.

We can reconvene in the comments here a year from now and see where AlphaStar is then.


It's not the accomplishment that people knock. It's the spin, the inaccurate article titles and the hype.


It doesn't seem like hype to me -- it seems like a genuine, significant accomplishment. Sure, they might not be able to beat the best pro players consistently right now, but I suspect that is right around the corner. Would you rather they stay completely mum until they've reached that goal too? And why? I'd rather know now, and then be able to follow along as it gets better and beats higher and higher-ranked players.


Just FYI, Lee Sedol wasn't the best player in the world at that time (nor is he now). AlphaGo went ahead and beat the actual #1, Ke Jie, 3-0.


The AI lost because it completely messed up the response to the immortal drop, nothing to do with micro.


This was my read as well. It seems that MaNa simply found a strategy that the AI had not encountered. Having never trained against it, the AI produced nonsense results. The commentators noted that the obvious response was to build a Phoenix and just completely shut down the harassment. The situation is similar to AlphaGo vs Lee Sedol, match 4.

One of the hardest parts about these kinds of human-vs-AI exhibitions is making sure the AI has explored the full possibility space, so that it can handle all situations. The techniques at play lack the ability to perceive a completely new situation and formulate a good response. (Though anyone who's lost to cheese in games they later learned easy counters for knows that humans, while better than state-of-the-art AI, aren't perfect here either.)


Mana got himself in the same situation where he was surrounded by stalkers on multiple sides, but this time the micro wasn’t so crazy that he couldn’t manage it, and he was able to take on one group at a time. The immortal drop, while unanswered, was not really that effectual.


But it was answered: AlphaStar pulled a huge stalker army that was about to hit MaNa's base all the way back home to (attempt to) answer the drop, repeatedly. If you have more complexity to your army but fewer army units, as MaNa did, a delay like that is how you win the game.


It's funny, because this works against the standard AI too.


That's what I said on Lobsters. They were always good at builds, micro, etc. The one thing they couldn't do was judge human intent, especially if they were being misled (especially time-wasting). I was waiting for one of the players to try to screw with its head to see what it did. Mana exposed two gaps: the back-and-forth thing, and that it ignored the observers giving up constant strategy information. Then he got the first win.

Now, the questions are how many more such glitches will show up and can they eliminate them with better algorithms?


And against human players up to Masters 3 or so :) When you're still using the all-army hotkey, defending with a small and precise group isn't happening.


That time, the AI didn't really even try to engage. In fact, the end of the match was marked by an entirely absent group of stalkers while the natural was being engaged.

It's likely safer to say the AI was confused in general at that point, possibly related to the camera change, but we didn't really get to see the quality of stalker micro that game.


"possibly related to the camera change, but we didn’t really get to see the quality of stalker micro that game "

In software, changes in assumptions can break whatever depended on them. There could be many assumptions in its neural net centered on full visibility. They should probably retrain it, in whole or in part, from scratch with the camera change in place from the beginning, to see what happens. Then the constraint will be firmly encoded into the strategies over time.


They mentioned that they retrained after the camera change and it was equivalent to the AIs that beat Mana 5-0 by their metrics.


The immortal drop let him keep AlphaStar occupied while he built up a critical mass of immortals (it becomes harder and harder to effectively micro stalkers against immortals as numbers go up, probably even for an AI), then let him put AlphaStar in an awkward position when it was camping near where the warp prism was hiding.


The results are obviously impressive, but even then there is a lot of work to do as far as learning efficiency goes:

"The AlphaStar league was run for 14 days, using 16 TPUs for each agent. During training, each agent experienced up to 200 years of real-time StarCraft play. "

MaNa probably played less than 2-3 years of Starcraft in his whole life (by that I mean 24hr x 365d x 3), and was learning with a much less focused/rigorous methodology.
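To put numbers on that gap, using the figures above:

    human_years = 3            # MaNa's (generous) lifetime play, 24/7-equivalent
    agent_years = 200          # per-agent experience during the league
    wall_clock_days = 14       # real training duration

    print(agent_years / human_years)             # ~67x more play experience
    print(agent_years * 365 / wall_clock_days)   # ~5200x faster than real time, per agent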


Another way to think about it is that a human brain is mostly doing transfer-learning, on top of a 99%-baked deep net that was wired up during foetal development from our DNA, where that DNA-persisted model has "seen" hundreds of millions of years of training data.

Humans don't have to learn to process, recognize, and classify objects in visual sense-data, for example. We can do that from the moment we're born, because we already have hundreds of precisely-tuned "layers" laying around in our brains for doing just that. We just need to transfer-learn the relevant classes.
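In ML terms it's like fine-tuning a pretrained vision model: freeze the general-purpose layers and train only a new head for the task at hand. A rough sketch with torchvision, purely as an analogy:

    import torch.nn as nn
    import torchvision.models as models

    # The pretrained backbone stands in for the "baked-in" visual layers.
    backbone = models.resnet18(pretrained=True)
    for param in backbone.parameters():
        param.requires_grad = False  # general vision is frozen, not relearned

    # Only a small task-specific head is trained ("the relevant classes").
    backbone.fc = nn.Linear(backbone.fc.in_features, 10)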


This is a widely underappreciated fact when it comes to comparing the 'training experience' of humans versus bots. And it extends far beyond processing 'sense data'. A human likely has some level of understanding of how the game works based on experience from other games they have played and from 'real life' - we know almost instinctively that 'high ground' is likely to give a combat advantage without having to test it in-game.


Not only that, humans (and many other eusocial species) have an instinctual intuitional understanding of many aspects of game theory.

For example, humans, even from infancy, prefer games where it is possible to punish cheating (i.e. take revenge upon cheaters) to games where it is not. This isn't just "we're animals that have evolved to enact tit-for-tat strategies [by e.g. injustice triggering rage] because they lead to cooperation which leads to egalitarian utility"; this is actual analysis—instantaneous, intuitive analysis—of a system of rules, to notice, in advance of ever being slighted, whether you'll be likely to end up in an "unjust" social situation if you agree to the given ruleset. There is an "accelerated co-processor" of high-level abstract game-theoretic information—and layers to extract that information from sense-data—that ship as part-and-parcel of the human brain model. We never need to learn how to judge unfairness, any more than we need to learn how to see.


And perhaps worth noting that the great apes we evolved alongside have the same kind of outrage to unfair trades.


"humans, even from infancy, prefer games where it is possible to punish cheating...this is actual analysis—instantaneous, intuitive analysis—of a system of rules, to notice, in advance of ever being slighted"

[Citation needed]


All of our knowledge of how to play games and so on has come from our current lifetime. We do not have a "genetic memory" that gives us learnings from cavemen or some other such nonsense. Our DNA contains instructions on how to grow a human; it's not a mega hard drive with millions of years of collective memory.

If a 19-year-old is good at Starcraft, he's good at Starcraft because he spent two or three years playing a shitload of Starcraft, and we are much more efficient at learning higher-level strategies than AIs are. These AI agents need to try damn near every possibility to adjust their weightings for various actions. Humans understand pretty much the first time something goes wrong: oh, better not do that, or similar things, again.

It's incredibly impressive that a given human can become GM level at Starcraft within a few years, while taking an AI to that level takes 200 years of training, as well as inhuman reaction time, perfect micro/clicking, etc. It shows how amazing our learning skills are.


We may not have "genetic memory" but a ton of human capabilities are baked in at the DNA level. Sure, we need to practice in order to specialise those abilities for particular tasks, but that's more of a calibration phase on a fantastically capable machine, rather than a construction phase.

Totally agree with how impressive humans are, though. In fact, one of the most amazing things to me about robotics is finding out how close to global optimal some humans can actually get.


The GP is underselling the fact that in their years as a pro player, a human thinks through many more games and may even dream of them. I certainly went to bed after a lengthy session with images of the game still in front of me. Although that might be more about micro, the macro skills are somewhat transferable from other "games" - RTSes simulate economies, amongst other things, after all.

GP's claim of a "99%-baked deep net that was wired up during foetal development from our DNA" is also unfounded, if not completely overblown. I am far from a student of biology, much less an expert, but intelligence is still seen as an emergent property. The real kicker might be that organizing thoughts might be a "game" of itself, one that is learned in development and constantly exercised. Talk about self-play.

I recently read a similar question about an "inherent mathematical language", i.e. capability, and the given opinion was that there is no consensus, except perhaps for basic addition, which I guess concerns vision, i.e. seeing a set of things and knowing the count is +++++. That works only up to around +++++++ items at best, according to findings.


Perhaps a nit, but still fascinating: the human visual cortex finishes developing after birth. A newborn can't really distinguish between objects. The ability to differentiate, focus on and track objects is developed over the course of several months.


True. Humans are pretty unique in that regard, though; pretty much no other animal is like that. It's easier to understand human neonatal development if you just consider all humans to be born premature. (It'd be really interesting to know whether that's literally true - whether keeping a human baby in the womb for an extra few months would actually result in the same stages of mental development as occur in a regular baby of that age who has been sensing and interacting with the world.)


I've read somewhere that we are basically born prematurely (as you said) because if we waited any longer then our enlarged head sizes would make delivery quite possibly fatal.


My brother was born a week or so after his due date; they induced labor for him for exactly this reason. Perhaps unsurprisingly, his head circumference was literally off the charts.


Maybe off-topic, but that's one side of the coin, and I suppose the other is that being exposed to more sensory input accelerates development, or even makes it possible (at higher levels of cognition). If this weren't the case, why wouldn't we just be bigger and carry longer? Is size, vis-à-vis megafauna, really that suboptimal for any reason more significant than being hunted by human hunters? I would almost say that longer pre-natal development would be suboptimal, because we'd become either bored or supersmart, but in any case superegoistic for lack of nurture.

Calling it premature is ironic, if we reach nominal maturity only after 10 or more years as far as fertility is concerned. The equivalent in AI would be the procreation of a neural net: perhaps after exploiting a bug in the game, breaking out to rewrite a better version of itself, or colluding with itself in self-play. Yes, this is going off-topic.


> why wouldn't we just be bigger and carry longer?

The consensus in the evolutionary-anthropology community is that our hips (pelvic bones) have to be the size they are, in proportion to the rest of us, to make us able to walk upright. "Building bigger" doesn't really work, for the same reason that you can't make a giant robot—if you scale humans up, the pelvis would need to be made out of something stronger than bone to support the additional load.

The same is not as true, though, if you just make the person wider—because then you spread the same load over "more pelvis." (This is just a personal unfounded hunch of mine, but I think some human subgroups—e.g. midwestern Americans—who are at the genetic limits of baby head size, and who avoid C-sections, are currently selecting toward bigger-boned-ness.)

> I would almost say that longer pre-natal development was suboptimal, because we'd either become bored, or supersmart, but anyhow superegoistic for lack of nurture.

Keep in mind that we wouldn't be conscious for any of it. The development stage that "wakes you up" to the outside world would just occur later on, as occurs in animals with longer gestation periods (e.g. elephants, with a gestation period of 18-22 months.) This would give things like your ocular layers longer to finish developing, without really having an impact on the parts of your brain that learn stuff or think stuff.


Hypothesis:

Being born “prematurely” might allow for more flexible brain wiring. Adapting better to an environment quite distinct from ancient ones we had evolved in is possibly one of our key cognitive advantages compared to other animals.


Is there evidence for this? My mental model has been that DNA encodes more along the lines of hyperparameters: amount of gray matter vs white matter, locations of brain regions and folds, etc, but the connections between neurons, and their weights, were all learned. There isn't that much information you can stuff into DNA, after all.


Connections between neurons, the synapses, are encoded. So much so that they are given individual names. This is a fun one to read about to get an idea:

https://en.wikipedia.org/wiki/Calyx_of_Held


> Humans don't have to learn to process, recognize, and classify objects in visual sense-data

Do you have a citation for this? It doesn’t jibe with my understanding of development. For example, animals born paralyzed are blind: https://io9.gizmodo.com/the-seriously-creepy-two-kitten-expe...


The human genome isn't even a gigabyte of data. That's less than a byte per neuron, and a big chunk of that data actually has to go into "how to make a kidney cell" and "which way to route veins". So while some basics have to be hard-coded, it can't be remotely close to "99% transfer from ancestors".
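
Rough arithmetic, for anyone who wants to check (both figures below are ballpark assumptions, not exact numbers):

    # ~3.2e9 base pairs at 2 bits each, ~8.6e10 neurons (rough figures)
    base_pairs = 3.2e9
    genome_bytes = base_pairs * 2 / 8      # 2 bits per base pair
    neurons = 8.6e10

    print(genome_bytes / 1e9)              # ~0.8 GB, an uncompressed upper bound
    print(genome_bytes / neurons)          # ~0.009 bytes per neuron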


That's not how any of this works. We do not have "millions of years" of information encoded into DNA; DNA doesn't store that much data. In fact, it's only about 1.6 gigabytes! And most of that information is basically a ruleset for building the proteins which become our body.

All the stuff we've learned about games and so on has come from our current lifetime. I don't have caveman memory for how to fight a tiger.


I said "deep net" for a reason. A DNN model almost always turns out to be far, far smaller than the training data that was used to create it.

For one example: any smartphone's face-recognition feature. Each such feature is a DNN which took millions of hours of face data to train... but the resultant model fits on an ASIC.

Our DNA doesn't directly encode such a model, but it encodes a particular morphogenetic chemical gradient, and a set of proteins, that go together to make specialized neural "organs" (like your substantia nigra, your basal ganglia, or your suprachiasmatic nucleus) which serve the same function for your brain that access to a pre-trained "black box" DNN model would serve for an untrained NN attempting transfer learning.


Our DNA is NOT a trained deep net, nor is it a deep net, period. Our DNA is a string of nucleotides which encodes proteins, which in turn carry out the tasks needed to create and operate all the structures of the brain and body.

The "training" of our deep net happens during our lifetime. We are not born with a trained deep net, so the analogy that we are somehow born with a highly capable deep net encoded into 1.6GB of DNA makes no sense.

Can you imagine how capable a human being would be if it were born into a world with no other humans or learning sources? Imagine a newborn baby born into a world with some accessible food/water close by, so it wouldn't die from lack of nutrition or wild animals, but crucially without any other humans. It would be utterly fucking useless; no language or reading means no way of assimilating new knowledge. That baby would end up being a totally incapable human, regardless of the DNA or structure of its brain.

As far as we currently understand, if infants aren't exposed to language and communication at a very young age, they are either incapable or severely stunted in terms of communication for the rest of their life.

My point is that we are very much dependent on the learning we get from the point of birth ONWARDS. We get our amazing capacity to learn from the structure of our brain and body, but we'd be absolutely incapable idiots without other people to teach us, our books, language, etc. We understand "games" and game theory from playing games with other kids; we're not born with "game theory" encoded into our DNA, as one other commenter seemed to think. The same goes for language learning and everything else.

Anyway, the point of this whole debate was that it's incredibly impressive that humans can learn to play a game as complex as SC2 in a tiny fraction of the time it takes a cluster of GPUs using a huge amount of energy and resources. Don't forget that we also have to use a physical body to control our actions in the game, which adds a whole other level of complexity, since we have to understand how to manipulate a mouse and keyboard, whereas the AI acts directly on the game, like a human with a neural link. The other kicker is that if you changed just one aspect, like picking a new map neither player had seen, the AI would be sent hurtling back to square one, whereas the human would only be partially affected. This series of demos only makes me more impressed: even with the huge resources available to Google, they can just about beat a human, and even then only after 200 years of training time and various other artificial advantages.


You are willfully missing the point. Animals have instincts. The complexity of humans does not make them an exception to this rule. There are in fact large amounts of brain function that are baked in at birth (or developed in a predictable timeline after birth -- humans are basically born premature). Humans are able to instinctively perform behaviors which are not taught, although the majority of critical behaviors in humans are socially learned. Feral children (like Genie) are functioning organisms with complex behaviors. They're just defective humans because humans rely on a distributed learning system called culture in order to do the work that biology cannot.

You are insisting that because humans do not have instincts at a certain level of abstraction (playing video games), no part of these instinctive brain functions plays a role in the development of skill at Starcraft. This is wrong. Abstract reasoning is not simply learned; it is HONED by experience and neural development. An AI has to do an enormous amount of work to replicate functions that humans can already do. This is the basic vision problem in AI that stumped researchers in the 60s, who thought tasks like visual recognition and spatial rotation would be trivial because they are trivial to evolved organisms.

You're relying on some kind of mental model where brains are just masses of neurons that form all of their connections and complexity after birth. This is ultimately a political idea, and it's wrong. No neuroscientist believes this. Brains have pre-defined areas (with fuzzy borders) and many behaviors do come baked into the template. Complex behaviors like language do not, perhaps, although even there, the underlying functionality that permits language is an evolved trait (which is why other animals can't learn language). Research the FOXP2 gene, as just an obvious example.

Edit: Your post contains "structures of the brain". What exactly do you think the structures of the brain are, if not evolved modular solutions to complex problems? Your visual center is somewhat trained after birth, but it already exists. The same goes for speech, motor control, and all of the other unconscious or semi-conscious processes that all humans (and other animals as appropriate) share.


One macro technique used by AlphaStar agents that is not used by human pros is building extra workers beyond currently exploitable capacity.

This gives them reserves for when they're attacked and some workers are killed. They can also ramp up mining at a new base quickly by moving the extra workers there.

Apparently the benefits outweigh the costs for these workers for AlphaStar. It will be interesting to see if some pros decide to adopt the technique and if it improves human performance as well.

Disclaimer: I do not have much Starcraft experience.


Workers mine 40 minerals per minute and cost 50, taking... 15 seconds to build? I forget. Workers beyond 24 provide zero benefit (better to send them to the natural).

Let's say you make 4 extra at a cost of 200 minerals and then lose 4 workers to harassment. You are out 200 minerals in both cases, but the workers in the prebuilt case will mine an extra... 100 minerals? (40 + 30 + 20 + 10).

This doesn't take chrono boost into account, though. I don't know; the gain is marginal, and the opportunity cost is having a smaller army (2 zealots, for example).

Please correct my numbers if I've made a mistake; I forget build times and haven't played since HotS.
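
Here's the arithmetic as a quick sketch, using the same assumed numbers (40 minerals/min per worker, 15 s build time, replacements built one at a time from a single nexus):

    # Minerals lost to mining downtime when 4 workers die and must be
    # rebuilt sequentially, vs. having prebuilt spares already mining.
    RATE = 40 / 60        # minerals per second per worker (assumed)
    BUILD_TIME = 15       # seconds per worker (assumed)

    downtime_loss = sum(RATE * BUILD_TIME * (i + 1) for i in range(4))
    print(downtime_loss)  # 100.0 -- matches the 40 + 30 + 20 + 10 estimate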


The numbers you cite are close enough that your estimations are good to work with (12 seconds to build, closer to 60 minerals at full efficiency but down to 40 for probes #17-24, etc)

The extra workers aspect was the most interesting decision-based adjustment AlphaStar made to conventional pro-level wisdom about "standard" play. There are several factors in play, which I trust the AI weighed (and more) and tested over many games for their long-term benefit to winning:

- every 8 probes you build requires a pylon as well, for a total cost of 500 minerals

- workers are safer from harassment and pressure in the main than in an unoccupied natural (long-distance mining)

- when your expansion completes, having 4 workers vs 8 workers vs 16 workers has a potentially huge impact on the immediate spike in income

- what you mention -- the prebuilt workers dampen the impact of most worker harassment down to purely the resource cost of the lost workers.

My guess is that well-executed harassment by opponents in practice games put AlphaStar in very limited situations with a crippled economy it couldn't fight its way out of, so this became a catch-all harassment "counter" -- it's ok if you kill a few probes, at least it won't throw off my economy completely and I can still continue my overall game plan.

After that, I think the next most important aspect was planning ahead for a bigger income spike when the expansion completed, without waiting to build out another 16 workers after the nexus was ready.
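
A toy model of that income spike (all numbers here are assumptions: 60 minerals/min per worker, 12 s build time, one nexus building sequentially, chrono boost and transfers from the main ignored):

    # Minerals mined in the first minute after an expansion finishes,
    # with 16 prebuilt workers transferred vs. 16 built from scratch.
    RATE = 60 / 60    # minerals per second per worker (assumed)
    BUILD = 12        # seconds per worker (assumed)
    HORIZON = 60      # first minute only

    def mined(prebuilt, n=16):
        total = 0.0
        for i in range(n):
            start = 0 if prebuilt else (i + 1) * BUILD
            total += max(HORIZON - start, 0) * RATE
        return total

    print(mined(True))   # 960.0 minerals
    print(mined(False))  # 120.0 -- new workers trickle in too slowly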


Yeah, it looked like Mana was copying this behavior somewhat in the live game.


I bet there's a sweet spot in-between that will come out of this, like saturating your natural to 24 workers minutes before expanding.


Yeah that stalker micro really showcases a particular advantage leveraged by the AI.

I'd love to watch the results of constraining the AI so instead of seeing the whole map at once it has to pan around the same way a human would to get updated information on each battle. Counting those "info-gathering" window pans against the actions tally might yield slightly fairer APM metrics. (EDIT: Turns out they built a new agent for game 11 to do just that)

One of my biggest beefs with strategy games of this genre arose around the time sprites went 3D and player viewports got smaller (presumably to showcase all the cosmetic detail, and because it became harder to distinguish visuals when zoomed out farther). I always feel too constrained in modern games - like I can't see enough of the map at once. In my opinion that "full size viewport" gives the engine a multitasking edge the player doesn't share (beyond the human cognitive overhead from context switching you already pointed out).

On the other hand, I find it fascinating that our AIs have become strong enough at our games that we're having to handicap them to avoid players crying foul that they're not fair.


I agree. Most RTS games feel constrained because of the limited viewport. Supreme Commander has a nice feature where you can zoom all the way out at any time.


And a very important part to SupCom's zoom feature is that at a certain zoom level it switches to a rich visual overlay of unit icons and pending/queued orders.


I would agree with that. If you take a look at the exhibition match replay, there are some cases where it makes objectively suboptimal decisions. We couldn't see this during the live stream, but the double immortal warp prism caused AlphaStar to bring back its entire army from across the map, when a few units at home would have been enough to defend. It even kept trying to blink its stalkers to a place from which the warp prism couldn't be reached. Perhaps this version with the limited viewport hadn't been trained on enough games?


Also worth noting that it starts by imitation learning from pros. I'd be curious to see if the macro can be learned without imitation; a much harder challenge. Also, playing with full visibility as was mostly the case in the demonstration is quite lame...


I'll bet you that AlphaStarZero comes out in a year and just learns from scratch.


I'll take you up on that bet; they started with a version that tried to learn from scratch they seemed to have scrapped that approach.


I bet the very early internal versions of AlphaGo learned from scratch and didn't work very well either.


Correct. They started with pure self-play and it didn't work at all.


It wasn't full visibility - Alpha had fog of war. It just saw the whole map at the same time.


That's still a large advantage that humans don't have access to. Not just in the "pitiful humans can't take advantage of such a large viewing area" sense, but literally the game will not let human players zoom out that far.


Also, I wonder how it handles invisible units. As a human player, you can see the shimmer if you look closely. Can it see that, or are they just totally invisible to it?


Presumably completely invisible, as it was looking at raw unit stats rather than the visuals.


I wonder if that would let you win with something like mass dark templar, with phoenixes to snipe observers. You could run right past it, and it could never anticipate you.

Or better yet, imagine zerg where you can burrow every unit.


It would be the same as with a human player: as soon as you do something with those invisible units, or imply that you have them (e.g. a DT shrine), that's sufficient to conclude invisibility is in play and that the appropriate tools should be used. It's not like you can do anything about dark templars even if you see the shimmer, if you have no detection, beyond body blocking.

Regardless, the article describes cheesing as the common tactic in early iterations, with economic play being learned later — one of the described cheeses is DT rushes, which the AI apparently learned to deal with, so it should have some understanding of invisible units (alternatively, it learned to ignore the DTs and base trade or something).

I don't think the shimmer is useful enough for its absence to be a significant loss in these prospective AIs' quests for world (SC2) domination.


If you learn, why not learn from the best, the pros? These people have already spent years figuring out what works and what doesn't. Why not draw from that pool of knowledge rather than spend extra time going through the same motions?


Because then you don't know whether the AI learned by experimentation or by mimicking. To draw an analogy, imagine the difference between somebody reading and following an algorithm to solve a Rubik's cube, as opposed to somebody being handed a Rubik's cube and experimenting. If expert-level strategies can be reproduced without being explicitly shown to the person/AI, then it means something is going right in your methodology.


Two reasons I can think of:

An AI trained from human strategy might end up more limited than one that could learn from scratch. It could be stuck in a local maximum of play and be unable to escape.

An AI technique that requires a large dataset of pro play to learn will be much more limited in terms of applying it to other games.


It seems like, in some cases at least, it didn't have to move the camera (it had direct interfaces). In some of the stalker micro battles (especially in game 3 or 4?), the fighting was larger than the screen space -- it would not have been possible to micro that well if your control interface limited what you could control or where you could place units.


This is a great point, and something that seems a bit lost in the discussion:

In StarCraft 2, the game IS the interface. That is to say, the developers have constructed the game in such a way as to be difficult to control; and human mastery of the interface is a large percentage of the game. Strategy in the game is important, of course -- but this is not chess, where human beings are not limited by the interface of the game. In StarCraft, you are intentionally given a limited interface to monitor and control a gigantic game while under incredibly tight time controls.

And I should also note that Blizzard is extremely reluctant to add features that make it easier to control the game. I have a friend who works on the StarCraft 2 team. We talked at length about a feature he designed and proposed to make a specific aspect of the game friendlier towards players. It was turned down for exactly the reasoning above -- the game is the interface. Making the game easier to control disrupts the entire experience; a StarCraft 2 that is easier to control is no longer StarCraft 2.


That would actually be an interesting thing for someone at Blizzard to do: take two similarly skilled high-level players and compare the win/loss rate across two 7-game matches, with each player getting one match with a 10% larger view size, to see what the impact is.

Essentially try to quantify the advantage of increased view area.


Yup, exactly. To add onto this, for people less familiar, there's a non-stupid reason for this: economy of attention.

Attention/APM is often called the "third resource" (after minerals and gas), spending it wisely when you have several areas at any given time that could use attention is part of the strategic and tactical decisionmaking. For example, usually in a battle you wanna be paying most attention to the fight rather than your base, but sometimes it's actually better to jump out back to your base to increase production or economy, and knowing which situation is which can be challenging.

Obviously, if you make the game mechanics too easy to control (letting the computer do more of the work), then this part of the game becomes less interesting, because you don't have to weigh trade-offs as much anymore.


Are there any bolt-on augmentation interfaces that utilize the same API the bots use to allow players to more effectively enter their intent?


It's a question of whether "played with human-level latency and precision" should be part of the rules of the game we are making the AI play.

I would say yes, because StarCraft was very clearly balanced for human players. We already saw some indication that, when played with superhuman micro, mass blink stalkers is a stronger strategy than when humans are in control. Without the active intervention of game balancing, RTS metas tend to devolve into "mass one or two units", which is what happened to every Command & Conquer game (and why SC is a respected eSport while C&C is not).

I suspect this will happen whenever you have agents playing with parameters the game wasn't balanced for. The strategic landscape will shrivel up, and the game will cease to captivate us.


APM is one thing. I am curious what would happen if it could only see a limited view (as in the last game with MaNa, which it lost to him) and had physical click dynamics (i.e. clicking + Gaussian noise as an action, instead of giving direct commands). That way there would be misclicks, preventing this super-efficient stalker micro.
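
Something like this virtual mouse, say (a sketch; the noise model and all constants are made up):

    import random

    # Click accuracy degrades as the action rate climbs, so 1000+ APM
    # micro would come with serious misclicks.
    def noisy_click(x, y, current_apm, base_sigma=2.0):
        sigma = base_sigma * (1 + current_apm / 200)
        return (x + random.gauss(0, sigma),
                y + random.gauss(0, sigma))

    print(noisy_click(640, 360, current_apm=100))   # lands near the target
    print(noisy_click(640, 360, current_apm=1000))  # scattered widely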


Also, these wins are not achieved using the same inputs humans receive (i.e. the on-screen image) and the outputs humans are allowed. They instead use the PySC2 API, which has much more flexibility, perfect information, and none of the constraints of limited screen real estate and pixels. The article claims they have another version in training that uses on-screen information only, but I still don't know if the AI is allowed to bypass the physical constraints of the controller. If the AI has access to a superhuman controller, you will see the AI performing superhuman actions, as many commenters here have described.


Perfect information is a bit of a stretch. There was still fog of war. The AI just played as if the portion of the map visible and actionable at any point in time was the whole map. They retrained with a restriction to a given locus of attention that can change, akin to a screen the player is looking at and acting on.


The final game in the video has this limitation. It does affect the performance of the agent.


This is exactly what I think. I'd like to see how AlphaStar reacts to a cannon rush, or other weird build orders where you need to be "smart" to counter them, rather than wins based on insane, non-human micro.


This is how it responds to cannon rush. : ) https://www.youtube.com/watch?v=vYdWQjTWTFM


Isn't the point of a cannon rush to build the first cannons where they can't be seen?


Surprisingly not. The trick is usually to build pylons (or other cannons) such that they protect the cannons from being attacked by probes. Building them out of sight is usually too slow as a rush.

Still, he didn't do that either.


In the beginning of SCII I only saw people trying to hide it. But I guess the strategy evolved, interesting.


Sometimes you may see a photon cannon used to deny an enemy's natural expansion to try to gain an economic advantage. Depending on the map and matchup, it may also complicate the enemy's early attempts at scouting and aggression.

Typically, you don't see more than 1-2 photon cannons, because you don't usually want to "over-invest" and lose whatever advantage you gain.


This is not how a player would do a cannon rush; it needs to be hidden, or at the edge of the opponent's view.


That's inaccurate. The best cannon rushers generally build them visibly, but not just anywhere. Look at someone like QuasarPrintf, a player who keeps a fairly high rank on an account that literally only cannon rushes (there is no anonymity, no pretense about what's going to happen): he wins despite people knowing what's coming, and despite putting the cannons mostly well within view of opponents on a lot of maps.

Printf is part of a fairly small group of cannon rushers who don't see it as just another cheese, because what generally defines a cheese strat is that it can be easily countered if you know it's coming; not so with their cannon rushes.

Now, with that said, Printf (or any other "I always cannon rush" player) isn't winning tournaments, but that's partly because not many players decide to stake their development on any one strat like that, and if they do, it'll likely be one that's deemed more legitimate by the community.


AlphaStar makes up for its slightly subpar macro with REALLY good micro. Thus, more micro-heavy counters like cheeses are unlikely to beat it.


The strategies might be subpar, but the economy sure isn’t. It consistently had better economy.


It was very good at microing its macro.


I am really impressed it learned when to pull probes in that game against Mana where the AI was pressured at its natural.

It was also extremely active with the stalkers, deciding to split them in three and not let Mana cross the map with his immortals.


> For context, I've been ranked in the top 0.1% of players and beaten pros in Starcraft 2, and also work as a machine learning engineer.

What's that hireability like?


What was your SC2 alias? I played at a similar level as you.


Mana tried to outblink an AI?

Damn I really need to watch these games :)


Totally. What would be interesting to see is a low APM bot that still beats human players. A lot of that macro was unbeatable.


And also, latency is lower


In a nutshell, AI micro was flawless, makes up for suboptimal macro?


The macro seemed fine -- AlphaStar usually had more workers than the human opponent, in every game, and was producing more army. The suboptimality seemed to be in army composition (blink stalkers) and strategic decision making (pulling all of a superior army back home to defend a single warp prism drop).


> While they have similar APM to SC2 pros

Wasn't the APM closer to half that of the pros?

https://storage.googleapis.com/deepmind-live-cms/images/SCII...


This is super deceiving and I'm kind of upset they posted this image, knowing it would mislead people not familiar with the game. The AI sits around during lulls at <30 APM - meanwhile MaNa and TLO were literally spamming keys to keep their fingers warm, not actually doing anything.

During the fights, in the critical moments when MaNa would top out at ~600 humanly imprecise APM (that's 10 inputs per second), the AI would jump to over 1000 - we don't know exactly what it was doing, but it was presumably pixel-precise. Meanwhile, the physical inertia of the mouse is a challenge for humans at that speed - imagine trying to click five totally different places with perfect precision in a single second.


Do you know why TLO's APM is sometimes so large? Did he actually peak at 2000, or is he using a repeater or something like that?


APM gets inflated by counting what is really a single action as multiple separate actions. For example, a Zerg player may want to turn larvae into 30 zerglings; they do this by pressing one button and holding it down, while the UI registers a separate action for each larva transformed.

By comparison selecting a single stalker, and having it jump to a new location is much more effort, but counts as fewer actions.


A thread from a few years back about TLO’s APM: https://www.reddit.com/r/starcraft/comments/4pnbv8/tlo_somet...


A huge part of a human's APM is meaningless spam, for example right-clicking the same unit multiple times to attack it, or setting the same waypoint thousands of times in the early game when there's nothing to do. The computer might be at double the human's effective APM, if only we had a credible way to measure that.
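
One crude way to measure it (purely my own heuristic, not anything DeepMind published): drop any action that repeats the previous command within a short window, then count what's left.

    # "Effective" APM: ignore rapid repeats of the same command, i.e.
    # spam like re-right-clicking a target or warm-up key mashing.
    def effective_apm(actions, minutes, window=0.5):
        kept, last = 0, None
        for t, cmd in actions:          # (timestamp in seconds, command)
            if last and cmd == last[1] and t - last[0] < window:
                last = (t, cmd)         # looks like spam: skip it
                continue
            kept += 1
            last = (t, cmd)
        return kept / minutes

    log = [(0.0, "rally"), (0.1, "rally"), (0.2, "rally"), (1.5, "attack")]
    print(effective_apm(log, minutes=1.0))   # 2.0, not 4.0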


There's a tail which shows that a small number of AlphaStar minutes had > 1k actions.


AlphaStar interacted with the StarCraft game engine directly via its raw interface, meaning that it could observe the attributes of its own and its opponent’s visible units on the map directly, without having to move the camera - effectively playing with a zoomed out view of the game

Additionally, and subsequent to the matches, we developed a second version of AlphaStar. Like human players, this version of AlphaStar chooses when and where to move the camera, its perception is restricted to on-screen information, and action locations are restricted to its viewable region.

I was really curious whether they would attempt moving the camera like a human. Sounds like it's still a work in progress, but very exciting! Even this isn't enough to make it fully like a human player, as I believe it is still getting numerical values for unit properties rather than having to infer them from the pixels on the screen. But it seems possible to fix that, likely at the cost of drastically increasing the training time.

The benefit of using pixels, of course, would be that the agent would become fully general. It would probably immediately work on Command & Conquer, for instance, while the current version would require deep integration with the game engine first. But I think the training time would be impractically long.


The live game that was just played was against this version of AlphaStar. Mana did win, but it was by exploiting some poor defense against drops and hard countering the stalkers he knew AlphaStar favours. The AI still looked very good and the developers claimed that this version of AlphaStar wasn't significantly weaker than the versions which didn't have to use the camera.


You aren't kidding about the stalkers. Check out the bar chart at the bottom of the page:

https://deepmind.com/blog/alphastar-mastering-real-time-stra...

I guess it makes sense that the AI would favor such a micro-heavy unit. I imagine it would be a nightmare to deal with perfect blinking.


Dealing with perfect blinking is basically impossible, since you can blink back your units right before they die. Stalkers are balanced around the fact that HUMANS have limits to how well they can micro.


While the "skill cap" on blink stalkers is extremely high, there are many hard counters that can stop even perfect blink micro. MaNa won because he went for one of these. Immortals are the perfect hard counter to stalkers because

- cost-for-cost, they are more efficient in a faceoff (resources)

- immortals are space-efficient dps (damage per second) in a battle. In a given battle, an army of 4 immortals is far more likely to all be in range of an enemy and doing damage than an army of 8 stalkers bumping against each other trying to get to the priority target

- immortal shots do not have projectiles, but are instant. No matter how perfect your stalker control, once an immortal targets a stalker, it is guaranteed to take 30+% of its hitpoints in damage.

The last point is very important. Once MaNa had 3+ immortals, even with perfect blink micro, a little bit of target fire and timing micro on MaNa's part allowed him to slaughter the stalker army one stalker per volley, while it takes them longer to clean up the immortals (especially with shield battery support).

Another thing glossed over in this discussion -- AlphaStar did more than classic blink micro. It did a very technical maneuver (the casters briefly allude to it) of triggering the barrier on one immortal with a single laser, then focusing all fire on an immortal whose barrier was already down from a previous iteration of this tactic, and then walking away until the barrier has worn off (while blink-microing weakened stalkers). Repeat. This is a detail of increasing the efficiency of trading stalkers with immortals that humans don't often even think about, let alone execute (because good blink control is often more impactful). That AlphaStar came up with this shows that it's not just about perfect execution of micro, but also perfect understanding of micro.
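
To put rough numbers on the instant-shot point (unit stats assumed from memory, so correct me if they're off: a stalker has 80 HP + 80 shields, an immortal hits armored targets for 20 + 30, armor ignored for simplicity):

    import math

    stalker_pool = 80 + 80     # hit points + shields (assumed)
    immortal_shot = 20 + 30    # base + bonus vs. armored (assumed)

    print(immortal_shot / stalker_pool)             # 0.3125 -> ~31% per hit
    print(math.ceil(stalker_pool / immortal_shot))  # 4 unavoidable shots to kill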


I'm also excited to see the future of this bot when they demonstrate a terran AI with near-perfect marine/stim/medivac micro.


Perfect micro bots don't excite me much, because they've existed all along, and it's not an AI task.


There was a "perfect zergling micro vs siege tanks" bot some time ago that would micro lings away from the one that was being fired at by the tanks, thereby negating all the splash damage. The effect was insanely powerful.

But as you say, showing that a bot can have perfect micro is not very interesting. Of course a computer can have better control of well defined tasks like moving a unit away just before it dies, especially doing so for many different units concurrently. What is interesting is the wider strategy and how the computer deals with imperfect information.


Here’s that perfect zergling video: https://youtu.be/IKVFZ28ybQs


The interesting part to me is that, as far as I understand, the AI figured out this strategy by itself, basically deciding that it would be a good way for it to win games, rather than being specifically programmed to do it. That's actually pretty cool!

Other than that, I agree, and am also much more interested in what happens when you have a more level playing field (using camera movement rather than API, limiting reaction times and CPM, etc). I look forward to future matches where this happens.


I think there is some debate about what the neural net did and what was hardcoded. So far, all Starcraft AIs have consisted of hardcoded intelligent micro ruled by a neural net that picks one out of fewer than 100 possible hardcoded choices. And things like "expand", "scout", "group units", and "micro" are hardcoded outside of the neural net - part of the API, in fact. When the researchers said they only used 15 TPUs for 14 days on an LSTM, it makes me think they really narrowed down the search space of the neural net and hardcoded a lot of the micro, or at least trained separate micro nets.


Not really. The version which learned from scratch was scrapped as it didn’t work at all. This version learned by observing pros. So it didn’t learn by itself, it imitated and perfected pro players.


It was not programmed to do that, but all these tactics were in the seed replays from which the agent started its learning. So it didn't actually figure out the move _by itself_; it only found it useful.


I'm scared. Medivacs only healing the front line and perfect stimming only the backline will be SOOO broken.


I'm curious, would the AI be able to see cloaked units? In SC1 you could see them (I think SC2 is the same), but it was very difficult. How does the 'raw' interface expose that subtlety?


This is actually a great question. Like what does it mean for a unit to be cloaked?

If humans can, under ideal circumstances, see cloaked units... Maybe the only mechanic that shows up (like for bots or an API) is the inability to be targeted using an attack command (i.e. you can still be hit with splash damage from ground targeting)


My understanding is that the AI sees things via an API the game exposes, so presumably cloaked units are completely invisible to it until they're revealed.


Not sure but I think in the video they say the AI does not see cloaked units.


yeah I was disappointed to discover it worked this way.

don't get me wrong, it's a major accomplishment in AI regardless, but it's a significant advantage and it would be easier for me to appreciate the AI's skill if I didn't have to keep reminding myself that it can see the whole map at once. it's such an information advantage.


Actually, I would say this might be the strength of AI from another perspective: the ability to observe and monitor global information without losing attention. Or in other words, attend to the whole picture from get go without being overwhelmed.

While it is an unfair advantage in competitive gaming, in more realistic settings there is no requirement that an AI have only 2 eyes. It can have as many as it can handle, while humans can't scale the same way.


While that would be amazing if true, I'm pretty sure if you take away the stalker blink micro AlphaStar loses hands down to humans. This isn't taking away from Deepmind's victory at all, but I think micro was what made the AI come out ahead in this one. In many of the games, Mana had much better macro only to lose to blink stalkers.


You play the game as it's written. Come back with another version of StarCraft that isn't so micro-intensive and we can see how the AI does on that.

Chess and Go don't have any form of micro and AIs are nevertheless dominant there.

I'd say, give AI development another year and I wouldn't expect there to be any kind of game, in any genre, that humans can beat AIs at. Whether it's Chess, Go, other classical board games, Civilization, MOBAs, RTSes, FPSs, etc.


> Chess and Go don't have any form of micro and AIs are nevertheless dominant there.

Yes, but chess and go have a tiny problem space compared to something like Starcraft. People want to see an AI win because it’s smart, not because it’s a computer capable of things impossible for humans. If the goal was perfect micro they could write computer programs to do that 10 years ago.


Then maybe we need a better game than StarCraft to test this on? Some kind of RTS that's less micro-heavy, perhaps? Maybe even an RTS where you can't give orders to individual units at all, like the Total War series? You can't fault the AI for winning at the game because of the way the game itself works.

Even if you limit the AI to max human APM, it's still going to dominate in these micro-heavy battles because it's going to make every one of its actions count.


> Even if you limit the AI to max human APM, it's still going to dominate in these micro-heavy battles because it's going to make every one of its actions count.

right, and we saw that with the incredible precision with stalker blink micro. There are many ways you could make it more comparable to humans. They have already tried that by even giving it an APM.

> You can't fault the AI for winning at the game because of the way the game itself works.

But it does make the victory feel hollow when it wins using a "skill" that is unrelated to AI (having crazy-high APM with perfect precision because it's a computer). Micro bots have been around for decades, and they are really good. The whole point of this exercise is to build better AI, not to prove that computers are faster than humans.

It would be like robots beating humans at soccer because they shoot the ball out of a cannon at 1000 km/h. They win, but not by having the skills that we are trying to develop.


I just can't help but feel that nothing AI does will ever be good enough according to this mindset, i.e. true "intelligence" is by definition things that computers cannot do.

Beating the world champion in Chess was, at one point, considered an impossible achievement for computers. Now it's considered so routine it doesn't even count as AI according to many. And in a few months when AlphaStar is beating top human players without having to use APM or viewport advantages, what will the next goalposts be?


The point is, it's like being impressed by a calculator because it can multiply two massive numbers faster than we can... no shit, that's the whole reason we use computers, because they calculate faster than we can...

There's nothing impressive in coding something that can execute something far faster than a human, or be so accurate and beat a human. There were Quake 3 bots that could wreck any human alive 10 years ago because they react in milliseconds and shoot you in the head perfectly. So what? It's obvious a computer can do that. It's like being surprised that a bullet beats a human in a fight, that's by design.

I would be impressed if a computer learned from scratch without knowing anything about the game beforehand, about the controls, or anything else, with ordinary human limitations. Using vision processors to look at a screen to see the inputs and controlling a physical mouse and keyboard. That would be impressive. But watching a computer do perfect blink micro at 1500apm is just underwhelming, since that isn't new tech, you could hand code that without deep nets.


> The point is, it's like being impressed by a calculator because it can multiply two massive numbers faster than we can

Yeah, exactly. And when calculators first came out, people were very impressed by them. They upended entire industries and made new things possible that had simply never been possible before with manual calculation. When you're pooh-poohing the entire computational revolution you might want to take a step back and reconsider your viewpoint. It only seems not impressive now because we were born in a world where electronic calculation is commonplace and thus taken for granted.

If you don't find this achievement impressive, then go look at some turn-based game where reaction time is eliminated entirely that computers still dominate at, like Chess or Go. The AIs are coming. Or give it a few months and they'll come back with a version hard-limited to half the APM of the human players and it'll still dominate. It's clear which way the winds are blowing on this. People who bet against the continued progress of game-playing AIs invariably lose.

Go read the comments here for this exact same discussion: https://news.ycombinator.com/item?id=10981679


> Or give it a few months and they'll come back with a version hard-limited to half the APM of the human players and it'll still dominate.

And this is exactly what is being argued here. Let's see that in particular, not a demonstration that computers are faster than humans. Of course they are. Whoever argued that, ever? This has been known and envisioned even before calculators were invented.

What people here are arguing with you for is that we want human-level limitations of the controls for the AI so it can clearly win by better strategy.

Isn't that the goal here?


> I just can't help but feel that nothing AI does will ever be good enough

It can be good enough in a certain problem space, such as chess. But unlike chess or go, which are purely mental games, Starcraft has a large physical component (vision, APM, reaction time). That can make it hard to determine when it has "mastered" this RTS. Like you said, it may be a few more months (years?) before AlphaStar masters Starcraft on the "mental" level. The physical component is trivial for a computer, so mastering that is not much of a milestone.


Depending on how you define Chess, seeing the pieces and physically moving them is part of it as well. Chess-playing AIs haven't been required to have robot components because that's not the interesting part of the challenge of Chess. I'd argue the same is true of StarCraft, even more so, given that it's an innately computer-based game in a way that Chess is not. It seems arbitrary to require the presence of an electronic-to-physical bridge in the form of a robot only to then operate physical-to-electronic bridges in the form of a keyboard and mouse. Just let it run via the input devices directly. Give it some years and humans will be able to do this too.

In other words, this isn't an interesting handicap to apply.


> It seems arbitrary to require the presence of an electronic-to-physical bridge in the form of a robot only to then operate physical-to-electronic bridges in the form of a keyboard and mouse.

It's not at all arbitrary. SC2 match is won by a combination of reflexes and physical quickness with which the actions are executed, and strategy.

The whole point is to even the playing field in the area of the physical limitations so that only the strategy part is the difference. You know, the "Artificial INTELLIGENCE" part?


As I said before, you could just integrate the intelligent-micro part of the AI into the game for humans to control.

For game design, the problem is that the border with macro is not a straight line but fuzzy, so how far do you go?

For SC2 and this specific bot the problem doesn't arise, if the AI merely controls a strategy over hard-coded tactics.


"Yes but X has a tiny problem space compared to something like Y. People want to see an AI win because it's smart, not because it crunches numbers."

1980: X = Tic-tac-toe, Y = Checkers

1990: X = Checkers, Y = Chess

2000: X = Chess, Y = Go

2019: X = Go, Y = StarCraft

2030: X = Any video game, Y = ???


Is an AI that wins at Starcraft only because it has crazy-high APM really going to help get to the next X? We could have built that 10 years ago. All it proves is that computers have faster reflexes than humans. That won't help them become the problem solvers of the future.


You seem to forget the way it learned to play every part of the game (not just micro fights). That is, not by having any developer code any rules, but simply by "looking" and "playing".

That's the great accomplishment and nothing like that could have been done 10 years ago.


What makes this interesting is whether they can make a computer program better at Starcraft strategy than a human. How they did that is irrelevant. If having developers code rules makes a better AI than deep learning does, then the former is the more impressive solution. What they did is a great accomplishment, and the AI they created is amazing, but I feel like the faster-than-humanly-possible micro makes the accomplishment hollow, because that is really nothing new.


> How they did that is irrelevant.

Emphatically not.

If they beat human performance in this (non-AI-building) field by humans painstakingly coding rules for specific situations, then that's cool I guess but not groundbreaking, because the solution doesn't generalise.

If they beat human performance in a field heretofore intractable by software by throwing the basic rules and a ton of compute at an algorithm and then waiting for six weeks while the algorithm figures the rest out by itself, then that absolutely is qualitatively different.

The reason being, of course, that if they can find an algorithm that works like this across a wide enough problem space then eventually they'll find an algorithm which will work on the question of "build a better algorithm." After which, as we know, all bets are off.


If you think the how is irrelevant, you are completely missing the point of this exercise. Maybe to you only the result matters, but for every other task, and for humanity, the how matters. Just imagine next taking on a different game, like one of the Anno series. If developers did it by hand, you'd need 50 devs sitting there for probably a couple of months, figuring out the best rules and their sequence and putting them in. That's about $20 million just to get a similar AI for the next game. Compare that to downloading all available replays, needing maybe 2-3 data scientists to get the data into shape, and renting some compute in the Google cloud, and you get the same or a better result for probably half a million dollars.

Watching and learning from data alone is why modern machine learning is considered a revolution and a novelty. Buying compute time in the cloud is, compared to devs hand-coding rules, dirt cheap, and the results are often better.

DeepMind is not working on this problem for the benefit of gamers or the Starcraft community. Making the perfect bot is not the aim; tackling the next hurdle, the next hardest problem in machine learning, is. All on the way to getting better at generalizing the learning algorithms.


Speed of play is a fundamentally important gameplay mechanic of any real-time game. One of the main reasons the pros are better than amateurs at these types of game is because they play and react faster.

And yes, of course computers are much better at doing things more quickly than humans. It's not even remotely close for us. The AIs are clearly better. It's not cheating either; they are legitimately better at it than us.

It sounds like you're simply objecting to pitting people up against computers in real-time games entirely.


So all they really proved is that computers are faster than humans. I knew that before this started.

The DeepMind team knows the challenge isn't to beat humans at Starcraft. That is trivially easy with the advantages you mentioned. The challenge is to be better at strategy than a human. That is why they tried to add artificial rules giving the AI physical limitations similar to a human's (emulated mouse, rate-limited actions, emulated screen and visibility). There have been micro AI bots for years that could outperform any human. They knew they weren't just trying to build another micro bot, because if they were, it wouldn't be much of an accomplishment.


> The Deepmind team knows the challenge isn’t to beat humans at Starcraft. That is trivially easy with the advantages you mentioned.

It's not trivially easy at all. No one had come close before. It took an entire team of ML experts at Google to pull it off. These hard-coded micro bots you're referring to didn't holistically play the entire game and win at it. They're more akin to an aimbot in FPSes, not a self-learning general game-playing AI.

This is yet another in a long string of impressive AI achievements being minimized through moving the goalposts. It's facile and it's boring.


>It's not cheating either; they are legitimately better at it than us.

This is not 100% true; the AI still skips the mechanical part (it doesn't have a mouse, keyboard, and hands) in this particular case. That part alone can introduce insane amounts of additional complexity and would keep the AI from being pixel-precise.


The APM of AlphaStar was about half that of the professional player in this match.

Check out: https://youtu.be/cUTMhmVh1qs?t=3189


But when it counts, such as during micro-heavy battles, it's much faster and more precise than a human.


yup. you could have 200 APM, but as long as your clicks and button presses are perfect, you are going to win against someone with 800 who is super imprecise.

blink stalkers are basically perfect for an AI because of the precision with which it can blink them around.


I sure hope so. Then I could get a 4X AI that was worth a damn.


except Scrabble


I assume you’re joking, but just in case you aren’t, Scrabble bots have outperformed top humans for 20 years with little more than a basic Monte Carlo tree search.


They haven't; I'm a tournament Scrabble player, and the best program beats the best players at most 50% of the time.


In the TLO matchup, the AI won with an army of disruptors and unupgraded stalkers; of course, TLO wasn't playing his best (in terms of micro, or on his main race), but it was still doing well with a micro-lacking unit (outside of blowing up its own army repeatedly).


Agreed. The micro was just too perfect to match. Can you imagine it with something like ravagers or reaper grenades?


You'll likely be happy to hear that this has been (is being) addressed.

I watched the live broadcast of this announcement, where they did a recap of all 10 previous matches (against TLO and Mana) and talked about this concern. During the announcement they presented a new model that could not see the whole map and had to use camera movement to focus properly. The DeepMind team said it took somewhat longer to train, but they were able to achieve the same levels of performance according to their metrics and play-testing against previous versions.

However...

They did a live match vs LiquidMana (the 6th match against Mana) using this latest version (with camera movement), and LiquidMana won! He was able to repeatedly do hit-and-run immortal drop harassment in AlphaStar's base, forcing it to bring troops back to defend, causing it to fall behind in production and supply over time, and ultimately to lose a major battle.


It sounds to me like, although it could see the whole map at once, the fog of war was still applied. So the bot really just got as much information as the minimap would normally give a human player.

> it could observe the attributes of its own and its opponent’s _visible units_ on the map directly


No, not true. Just had an extended argument with a friend over this. Here are some of my arguments against what you're saying:

1. While it's true that a human player could see everything the AI is seeing, the human player has to spend time and clicks to go see those things, whereas the AI sees it all simultaneously without having to use any actions or spend any time to take it in.

2. Emphasis on the computer seeing it all simultaneously. The computer can see the state of two health bars on opposite sides of the map at the same time, or 100 healthbars in a hundred places at a time. A human cannot do that, and even trying to move the view around fast enough to do so would render it impossible to actually do anything else.

3. If it's true that seeing more at once is not advantageous, then it must also be true that seeing less at once is not disadvantageous. So by that reasoning, a player playing on a 1 inch x 1 inch screen would be at no disadvantage, since they're getting the same amount of information as long as they move the screen around enough! Reductio ad absurdum: a player with a 1 pixel x 1 pixel screen has no disadvantage either, because they have access to the same information as long as they move around quickly enough. It quickly becomes evident that smaller screens inhibit your knowledge of the game state, and therefore larger screens improve it.


One thing they said early on in the ~2 hour video I was watching was that, while AlphaStar had access to the full data of everything within its fog of war, it seemed to partition its access to it in a way similar to a human checking different screens, and did so about ~30 (or was it 37?) times per minute.

This might be why changing to having to observe only one screenful at a time (rather than the zoomed out view) didn't seem to have as large an effect.


This is why a lot of competitive games have rightly decided not to support ultrawide monitors. Being able to observe more of the game map simultaneously is a huge advantage. The only fair way to support them would be to cripple the player, by cutting off the top and bottom of the viewable range, not by extending the left and right range.


> whereas the AI sees it all simultaneously without having to use any actions or spend any time to take it in.

Starcraft is a single-threaded game, so I would think that the AI ultimately still has to enumerate through each visible unit one-by-one to collect their information. Why is that so much different than enumerating through each visible screen and then enumerating through each unit on that screen? Either way, the AI could do it much faster than a human, whether it had to click through the screens manually or not. How would it be possible to eliminate this advantage? It seems to me that it's just part of the nature of AI.


You can eliminate that advantage by letting the AI only see the unit information for things on screen, like they did in the last game.


No, that doesn't eliminate the advantage -- that's what I'm trying to say. Even if you make the AI move the screen around manually and only let it enumerate units that are on-screen, that's still going to take roughly as long as just enumerating through all the units on the map in one go. It's just a matter of executing "foreach all_units" versus "foreach screens { foreach units_on_screen }". In either case a computer could do that much faster than a human.

Let me put it the opposite way: If you gave the human player a real time list of every visible unit on the map and all of their information, such that they didn't have to move the screen around manually and could see everything at a glance just like AlphaStar can, would that take the advantage away from AlphaStar? No, it wouldn't because AlphaStar could still go through all that data much faster than any human ever could -- no matter how it's formatted or what you have to do to access it. To AlphaStar, checking all the visible screens is just as much work as scrolling through a list of units.


I get what you're saying. But screen movement is rate-limited (meaning you can't loop through all possible screen positions in 1ms), so you have to actively choose where you want to focus, just like a human player. Think of it more like calls to a web server than "foreach screens".


Can't you click on the minimap to move the camera instantly anywhere on the map?

EDIT: I guess you would still have to wait for the next frame to get rendered, which could add up. True, that does change things a bit, but of course a computer could still do that way faster than a human.


They noted that the agent used around 30 viewport changes per minute, about the same as human players.


This sounds like a real advantage in the AI's favor though: It can focus its attention on a lot more things simultaneously. It's not just a UI difference; the AI is actually better at this, like how a pocket calculator is actually better at division than people. This latter bit we just accept; we don't defend humans by saying the calculator is cheating because it isn't writing out the calculation by hand.

Similarly, robots are physically stronger than people at any given task you can think of. That's a real advantage they have.


It is certainly a real advantage, but I think the argument is that it's not as interesting as an AI that could win on the strength of better decision-making, or the innovation of novel strategies, etc.


AI wins on the strength of better decision-making and novel strategies in Chess and Go, though. I have no doubt we'll see this in RTSes in the near future as well. For now we may not be quite there yet, as this is simply the first time it's beaten a pro player in any way. Compare with the AlphaGo match vs Fan Hui. A year later and it was dominant over all pro players.


> AI wins on the strength of better decision-making and novel strategies in Chess and Go, though. I have no doubt we'll see this in RTSes in the near future as well.

Yes, likely! I wasn't doubting it's possible or even likely. Only that seeing an AI do flawless 1000 APM stalker micro and macroing perfectly, while pretty cool, is not as exciting as seeing an AI use a novel strategy (edit: especially one that a human could theoretically execute)


I'm guessing that while there's a delay for decision-making, there's no delay between when it decides to move the camera somewhere else and when it does move the camera (direct API access), whereas humans need to move the mouse or hit a key, which is going to take at least 50-100ms during which they're not doing anything else.

When they were talking about delay they were talking about delay between new information -> deciding/acting, which I think obscures the fact that humans have to do new information -> deciding -> acting, where acting takes non-zero time.


{{Delivered in the voice of the female British lady who would narrate Beyond 2000 series of shows - or the Modern Marvels narrator, to your mental predilection}}

After just decades in development, it is clear that the endeavors of those research scientists have finally borne fruit. And today it's in the form of:

Intent based modeling, augmented with AI, which provides the reality we see today in both gaming and weapons systems.

The user, who must be human, is provided a range of inputs based on the desired outcome of the interaction with the systems and the real world.

What results, is truly remarkable.

A human is capable of multi-dimensional abstract thought, in a way that a computer cannot. As such - their intent is wired over to a swarm of objects with the most amazing capabilities.

A user can direct a swarm of either virtual bots or physical drones to accomplish a task. She can also combine the efforts of both physical and virtual for even greater effect.

Here we see a swarm of bots who are thwarted by a physical barrier.

The human driver can then instruct her virtual AI bots to attack the security system of the building to allow her drones passage.

But she does this through merely the intent for the portal to be open. The bots do the rest.

All the while the user is updated with real-time information on how the operation is progressing.

So, in the future, you may soon see just such technology applied to your kitchen or living room, where bots will cater to your every waking need - and sometimes your non-waking needs as well.


APM is a really really misleading metric to use here. Most starcraft pros spam keys to keep themselves loose and ready when the time comes. Even at the start of a game, you'd often see players with 500 apm warming up the fingers.

Here, there is the laughable graph of the computer's APM over time. The key point is that during the mass battles that won the game, the APM spiked to >1000. And if you look closely in slow motion, there was perfect split targeting. A human player wouldn't be able to select exactly the right number of stalkers to hit an enemy unit without wasting a surplus shot -- they can with a few units, but not with mass stalkers. This efficiency is just beyond humans. The APM here indicates much more effective use of each action than a typical human manages.

This is super impressive as an achievement, but this is clearly not a smarter AI -- more so an AI like the one in the video a while ago where zerglings could perfectly micro against siege tanks to avoid splash damage. It is clearly better than humans in certain ways, but not smarter.


Interested to hear what others here think.

The APM metric includes all clicks a player makes. AlphaStar's APM is lower than a typical pro's, but that does not mean it is making fewer actions. All pros try to keep a high "tempo" by constantly clicking the screen even when they are not making any actions. e.g. Instead of sending a unit to a specific location with one click, they will click 5-6 times while dragging the mouse to that spot. The theory is that keeping a high tempo allows you to make more useful actions overall.

Unless AlphaStar's average and maximum APM are 3-4x lower than a pro's, rather than just a 2x lower average and the same maximum, I do NOT believe that this is a fair test of the AI's strategic decision-making ability.


I also wanted to note that the millisecond response time performed by AlphaStar also seems an unfair test.

While the average was around 350ms, this number is skewed by a significant long tail extending up to 1300ms for some actions. The most common response is sub-200ms and the third most common response is 65ms-100ms. The best pros can consistently hit close to 200ms but I am not sure if they can even hit below 100ms, let alone 65ms.


In the starcraft scene we have terms to describe what you're saying: APM and EAPM (effective APM). Historically the fastest players were around ~230 EAPM, and that was considered godlike.


Here is an example of what you do without artificial limitations on actions. https://youtu.be/3PLplRDSgpo Just goes to show how important those restrictions are for competitive gameplay against AI.


This is amazing. Thanks for sharing.


Trying to solve imperfect-information games is interesting.

But real-time games are not, since computers have an edge in boring fields (APM etc.).

I feel the real challenge for AI will be games that are imperfect-information, turn-based, and cooperative (teamplay-based).

Unfortunately there are currently no games AFAIK with these characteristics...

Something like a 2D turn-based Counter-Strike or Dota/LoL would really be the right format of game to solve.


I think some card games meet your criteria. Specifically, Bridge, Hearts, and Auction Pitch (aka Setback) are all complex, turn-based, imperfect information games with some amount of cooperation. I’d love to see DeepMind take on one of these games.


Competitive Pokemon with teambuilding would be something fresh I don't think existing techniques have figured out yet.

Not super prestigious, but IMO still a good intermediate goal.


Civilization?


AlphaCiv would be interesting as hell, but I fear that there isn't historical data for Alpha to consume. I would expect it would have a similar result to AlphaStar though-- it beats the human player before the end game.


Perhaps it should have other targets than simply beating the opponent. Like getting good statistics.


This is really impressive. Even though the camera hack / uncapped APM limits (only the average was capped) made this version slightly unfair since the ability of an AI to micro insanely well with stalkers is basically unbeatable, I feel confident based on this performance that Deepmind will release a superhuman AI with lower EPM than humans very soon from now.


I think it's "a lot" unfair. Waiting anxiously for the next version which will carry more limitations.


While this is super impressive we should not forget: it was just a single matchup (1/9 -- adapting requires a lot, especially across the three races) on a single map (1/N, granted in SC2 the maps are not as diverse as in the original StarCraft) using a single (outdated) patch. Humans are still masters of knowledge transfer, playing an unknown map or a lesser-known matchup with greater precision. Once an AI reaches a level where knowledge is transferred more efficiently than we are capable of, I will start getting worried -- not beforehand.


I wonder how useful ai will be in balancing games in the future. Games with more than one race and multiple upgrade paths seems like a nightmare to ensure that things are even for players that are equally skilled.

It would be interesting to have a game that would auto-balance itself, especially if you wanted to add extra content without having to worry about throwing everything off.


That's a cool idea but many games already struggle to find a fair balance point at all ranges of human skill, so expanding that range to include AI could just make things more difficult. Maybe there's a way to force the AI to replicate bad players, but I'm not sure what learning objective you would give it to achieve that.


I don't follow game balancing too closely, but I (naively) assume there's some reasonable analytical solution for tuning. For instance, if you're Blizzard and have logs of all the match data, you could regress race attack/defense/movement/etc. stats against win rates to find a potential equivalence point.

I suppose there are lots of interactions, but after finding "well, Zerg beats Protoss X% of the time", you could balance by messing slightly with resources or a blanket buff.
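
Something like this sketch of the regression idea (hypothetical features, synthetic data standing in for real match logs):

    # Hedged sketch: regress match outcomes on per-match stat deltas
    # between the two races to spot which knobs correlate with win rate.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 4))            # deltas: attack, armor, speed, cost
    true_w = np.array([0.8, 0.3, 0.5, -0.2])  # unknown in practice
    y = (X @ true_w + rng.normal(size=1000) > 0).astype(int)  # 1 = race A won

    model = LogisticRegression().fit(X, y)
    print(model.coef_)  # large coefficients suggest stats worth nerfing/buffing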


Starcraft is generally balanced by adjusting the map pool rather than the races.


I think achieving suboptimal play can be pretty straightforward — keep the parameters the same, but simply limit the amount of learning. Some tweaks might be necessary here and there, but as long as you have basic gameplay down but not perfected, it should work pretty well.


Total War Warhammer is a very interesting case for that. Many, many races, units, items etc. -- much more than StarCraft. And yet the balancing seems quite good! There are obviously tier 1, tier 2, tier 3 races, but even these tiers are disputed, and tier 1 races rarely win tournaments.


I would argue that without a proper competitive scene (which Total War Warhammer does not have) it is impossible to measure if it is actually balanced or not.


Fun observation on the possible reason why they chose Protoss from the publicity standpoint:

Zerg — Too icky for some in the general public

Terran — Too human-like. Might clearly trigger the imagery of human-robot wars.


imho, the AI had an easier time playing as Protoss because of blink stalkers.

You could probably perfectly micro marines + marauders + medivacs with stim, but it's a lot harder than blink.


Because that is what we need to teach AI to do, build bases, extract resources, and build units to go out and kill everything else on the map :-)

It is an impressive result, it seems pretty clear to me that as a force multiplier for developing decision tree software this technique works faster and more effectively than the waterfall techniques, and it gets better post release. But beyond the game theoretic applications I am still looking for an application where it reliably creates a better back end code generator for a new architecture faster than a person can.


I think the first ever applications of a more generic AI will be in corporate and military anyway, so yeah, the AI will build bases, extract resources and send security units to kill everything else on the map indeed.


Can anyone who has a full context compare this to OpenAI's work on Dota 2? Which is more impressive, both in how far along it is and the relative game difficulty?


They’re probably comparable and at similar levels of success. Talking about which game offers more entropy isn’t a good metric yet because neither AI seems to be trying to utilize the depth available to them.

E.g. the Dota game had 5 AIs working as a team. That feels like it should demonstrate additional competence, but it's not really clear that they worked as a team.

The Dota game allowed greater variation in starting conditions, but it's not apparent that the AI adapts its strategy to this well (e.g. hero choice).

Both of them are capable of creating a basic strat and excelling at micro. Neither appears to have great depth of strategy.


This feels more impressive since in SC2 you can pilot many units while in Dota you only (generally) have one. And DeepMind is playing (albeit a slightly older) stock StarCraft II while OpenAI had to play using a very weird ruleset (invulnerable couriers).

Milestones that we have yet to see from DeepMind:

- generalize on maps

- generalize on races

- zero-style training


Starcraft II pros are probably much further from perfect play than Dota2 pros (simply because the game is harder). As a result, having perfect micro is a much larger advantage in Starcraft than it is in Dota 2.


Dota is a much harder game strategically, in Starcraft it's easier to compensate with mechanical skill. You can see this in the results. The Dota ai played a very dumbed down version of the game and still got trashed by the pros.


How did they make those data visualizations? (specifically, the visually-skewed ridge plots). That's a nice approach for those types of plots since they can get cluttered without the perceptual skew.


https://seaborn.pydata.org/examples/kde_ridgeplot.html

I would try with a Seaborn facet grid? I think they’ve got something custom, but this should get close (be aware this specific example is a kde so it will normalize total area)

On a second glance this won’t get the 3D horizontal offset.
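
Something like this hand-rolled matplotlib version might get closer (synthetic data; the "skew" is just a per-row x/y shift):

    # Minimal ridge-plot sketch: each row's density is shifted right and
    # down a little, which produces the perceptual skew.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(0)
    rows = {f"day {i}": rng.normal(loc=i, scale=1.0, size=200) for i in range(5)}

    fig, ax = plt.subplots(figsize=(6, 4))
    xs = np.linspace(-4, 9, 400)
    for i, (label, samples) in enumerate(rows.items()):
        density = gaussian_kde(samples)(xs)
        x_off, y_off = 0.4 * i, -0.8 * i       # the "3D" offsets
        ax.fill_between(xs + x_off, y_off, y_off + density, alpha=0.8, zorder=-i)
        ax.text(xs[0] + x_off, y_off, label, va="bottom")
    ax.set_yticks([])
    plt.show()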


A lot of the viz styles seem close to ggplot2 which is why I though it was made using ggridges (https://cran.r-project.org/web/packages/ggridges/vignettes/i...) but I'm not sure if you can do a perceptual skew in ggplot2.


It wouldn't surprise me if they generated graphs and then handed them off to a design team to trace and make pretty.


Does anyone know if this opens the door to using these techniques for poker (given that they've now show success on games of imperfect information)?

Thus far the solutions to poker have involved solving the game tree through raw computational power and clever methods of information collapsing:

http://science.sciencemag.org/content/347/6218/145

But it seems the techniques used here might be both far more efficient, as well as able to better exploit human players (as opposed to playing a pure game theoretically optimal strategy, which is what they do now).


So, your link is about Cepheus. It's important to understand that Cepheus isn't AI; it's an (asymptotically close to) optimal strategy for Limit Heads Up ("Limit" means you don't need to choose bet sizes -- they are fixed -- which makes the problem much simpler), and since poker is a game of probabilities, the strategy is probabilistic too.

i.e. this is like when somebody solves Tic-Tac-Toe: there isn't anything interesting going on inside the machine. The insight was purely mathematical -- there are one or more optimal ways to play this game, and this is one of them.

You can literally look at the strategy right now, you can do it while playing against a program that plays by this strategy. But you won't beat it by doing that, that's the point of this optimal strategy, the best you can do is play this or some other equivalent optimal strategy back, in a home game you'll just pass chips back and forth forever.

In contrast, poker AI is a thing, and at _No Limit_ Heads Up the state of the art, named Libratus, is clearly better than humans. It is nothing like Cepheus; there is no fixed strategy.

For "Full Ring", the game of Poker as you've probably seen it played, which has more than half a dozen independent players, AI would be very challenging, not least because if humans realise they're at a disadvantage it would be essentially impossible to prevent them from colluding with other humans to get an advantage, even to some extent unconsciously.


Yes, I understood that.

I was essentially asking if the techniques used by DeepMind could be leveraged to create a much more powerful version of AIs like Libratus, or to become very strong at games too big to solve for GTO solutions, such as full ring.

> For "Full Ring", the game of Poker as you've probably seen it played, which has more than half a dozen independent players, AI would be very challenging

This has already been done. There was an AI called Sonia which played both HU and full ring at expert human level or beyond, and was not based on GTO solutions. It created a model of every player which updated in real time, and exploited them the way expert humans do.

I'm just curious if DeepMind would be able to achieve similar or better results.


> not least because if humans realise they're at a disadvantage it would be essentially impossible to prevent them from colluding with other humans to get an advantage, even to some extent unconsciously

Indeed, versus a single opponent you can play an "optimal" (e.g. Nash equilibrium) strategy and basically win/solve the game. But in a multiplayer setting (in poker and similar games), you rely on your ability to predict opponents. In the former you can essentially assume your opponent is as smart as possible ('rational' in game theory language), and if it is not rational it will necessarily do worse. In the latter, however, by assuming every player is rational (e.g. playing N.E. strategies) you lack the ability to exploit weak players who are decidedly suboptimal, predict their moves, and capitalize. Thus winning is conditioned as much on knowledge of your opponents -- their skill and style -- as on your own power to play probabilistically 'optimal' moves.

A good simple example is Rock-Paper-Scissors: an N.E. strategy just plays randomly -- it clearly cannot lose in expectation, even against the most skilled players -- but it also cannot win against weak, predictable opponents. So e.g. in a tournament setting it would lose out to exploitative players, depending on the tournament structure (in particular if the stages are not all pairwise elimination).
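
A toy simulation of that point (my own illustration):

    # Uniform-random RPS is unexploitable but also fails to exploit a
    # biased opponent, while a best response cleans up.
    import random

    MOVES = ["rock", "paper", "scissors"]
    BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

    def score(a, b):
        if a == b:
            return 0
        return 1 if BEATS[a] == b else -1

    def biased_opponent():
        # A weak player who throws rock 60% of the time.
        return random.choices(MOVES, weights=[0.6, 0.2, 0.2])[0]

    n = 100_000
    ne = sum(score(random.choice(MOVES), biased_opponent()) for _ in range(n))
    br = sum(score("paper", biased_opponent()) for _ in range(n))
    print(f"Nash (uniform random) EV per game: {ne / n:+.3f}")  # ~0
    print(f"Best response (always paper) EV:   {br / n:+.3f}")  # ~+0.4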

I find it a fascinating contrast to traditional game theory, and sort of conventional view of games, that there are "good players" and "bad players" in the sense of their execution being superior in an absolute sense, independent of opponent. In reality success in a variety of games (and real world scenarios) is won by tailoring your strategy to a particular opponent, using information from outside the game, and specifically predicting his plays vs. utilizing universal strategies (again this is particularly relevant in non-pairwise tournaments, non-zero sum games, etc).

In real life, almost always the "game" is non-zero sum, multiplayer, with non-rational players, etc., even in a sports setting. Roger Federer might play slack and save his body in the early stages of a tournament vs a known weaker player, and give it all vs his most fearsome opponents.

And also finally to quote "The Art of Strategy" (introductory game theory classic), 'There's always a bigger game.' (in real life it's the ultimately the entire Universe, and no one is quite sure of the Rules :p )


And to add to that: what if DeepMind could get a poker history of the players at the table to create a poker profile of each player it is playing against? Having a percentage for a player's likelihood of folding and bluffing could keep it at an advantage over a purely game-theoretic approach. Maybe a certain player is more likely to bluff 5 hours into a game, based on player history analysis. Going further, imagine if DeepMind could get access to every player's history outside of poker (social media, purchasing records, medical records, etc.) and create a behavioral profile of each player.

Game theory, poker profile, behavior profile, etc could give it a serious advantage over human players in addition to its advantage of never getting tired, frustrated, etc.


DeepMind is not necessary for that, and it also probably wouldn't work.

If you have player history you can construct perfect models for their playing habits and even update it as time goes on like a multi armed bandit problem. This can be simple probability heuristics and an optimization function.

The problem with it is that the machine becomes biased toward past results. Top human players are good because they adapt. I would put my money on the humans.
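
For what it's worth, the kind of model meant above can be very simple -- e.g. a running Beta-Bernoulli estimate of one opponent's fold frequency (a toy sketch, all names made up):

    # Track one opponent's fold rate with a Beta(1, 1) prior and update
    # it after every observed hand.
    class FoldModel:
        def __init__(self, prior_folds=1, prior_calls=1):
            self.folds = prior_folds
            self.calls = prior_calls

        def update(self, folded):
            if folded:
                self.folds += 1
            else:
                self.calls += 1

        @property
        def fold_prob(self):
            return self.folds / (self.folds + self.calls)

    m = FoldModel()
    for folded in [True, True, False, True]:   # observed hands
        m.update(folded)
    print(f"Estimated fold probability: {m.fold_prob:.2f}")  # 0.67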


Probably not. The AI demonstrated today was shown to be exploited with a strategy that wasn’t especially devious. It’s good at core mechanics but not strategically clever. I would bet on the optimal strategy over this one.


I think the bigger point in the future will not be AI beating known gambling activities (gambling in the sense that you put in money and have a random outcome about whether you get money back; let's not have a discussion about poker being gambling vs. competitive sport). The bigger point will be AI creating gambling activities that are even less resistible than our existing options. There might be digital drugs.


Very exciting results. However, I'm a bit confused by this graph [0], could anyone explain how I'm supposed to interpret it?

0: https://deepmind.com/blog/alphastar-mastering-real-time-stra...


I was confused too; it has a weird format that I think hurts comprehension rather than aids it by making you think you're not looking straight at the data.

Ignore the fact that the "Training Days" axis is drawn diagonally. The system is creating about 40 agents per day; by the end of day 14 it's made 610 or so. The graph shows, for any given time of training (vertical axis, going down), what is the distribution of trained agents that it's chosen in order to be unexploitable (you wouldn't want to choose rock all the time in rock-paper-scissors, for example). So, for example, at the end of day 14, it's using a selection of agents with numbers 595 through 610 or so, which means they've all been created within the last day.


I think it's to help illustrate the time dimension as going forward rather than something that goes up and down. And also to not measure the hills against some global X-axis. It is confusing.


Impressive, but there's a couple of things I'd like to see them try one day.

Plug AlphaStar into a robot that physically interacts with a keyboard and mouse to control the game. This "robot" should only have what's relevant for playing the game and emulates a human i.e. a camera that looks at a screen (this is the only knowledge it has of the game), and two arms & hands with five digits that control the mouse and keyboard. Then limit its APM to the best a human can realistically do.

The other thing I want to see them address is the virtual training time. 200 years of StarCraft is insane. LiquidMana has been playing for ~20 years, and of course he hasn't played the game 24/7. Let's pretend he has played StarCraft like it's a full-time job since he was 5: 8 hours a day, 5 days a week, for 20 years. That's ~42,000 hours of playing StarCraft.

Develop an A.I. that is only trained for that many hours of virtual game time.

If they can create an A.I. with those requirements, that can defeat top-level players, I will be completely blown away.


Human pro gamers have several advantages that mean they should need slightly (but probably not that much) less game time.

Transfer learning from the rest of life: games are designed to be understandable to humans via familiar concepts, which AIs don't start out knowing.

Discussion with other players. Mana benefits immensely from everyone else's 20k hours of SC2 as well.

Selection bias: there are many, many people who try SC2, and only the people who are naturally good at it succeed. So in some sense we need to count the rejects' training hours as well.

I would like to see advances on training AI using less data. I just wanted to comment that the comparison in number of hours isn't quite fair.


Those are really good points.


There seems to be an impulse to deny the AI progress that people see in front of them. When IBM Watson won on Jeopardy, everyone claimed it was cheating because, after all, "everyone knows the answers, so Jeopardy is really only about who presses the button first". But what about the fact that a computer could know the answers at all, and so quickly? Many people didn't think it was possible, and as soon as it happens they attack the speed of the button press as if knowing the answer wasn't the hard part.

Anyone with this point of view does not seem to understand what is being shown here. This is not just "this ai beat this one player in this one match". It's an entire system of techniques for machine learning. The AI was not hard coded to learn those micro steps, like all other AIs have been. To ignore those things is to miss the point entirely.


I see your point and partially agree. But I think part of the problem with these kinds of stunts (for lack of a better word) is it's hard to tell the difference between actual scientific advancements and benefits of throwing hundreds of years of GPU time at it.


Exactly. It's not like the AI of the past, which was essentially a flowchart plus heuristics. Now, it's almost "experiential": it tries paths and essentially sees which ones work and which don't, and "learns" from it. It's similar to Dr Strange in Avengers using the Time Stone to try out possible futures to see which ones work.


It's amazing what they have achieved so far, but I wonder how it would stack up against Serral, who draws a lot of his wins from supremely judging engagements and continuously gaining small advantages over the course of a game. This is a skill even most other professional players only have at a much lower level.


Yes, a best of 9 (?) match against Serral or Maru needs to happen.


But only if they get material to prepare with


If they don't cap the apm at 300 or so, this means nothing. I am not impressed with super speed, only decision making matters. Deepmind, you can do better.


AlphaZero was set up against Stockfish similarly on unequal footing.


I think there's a lot more to limit than average APMs and reaction time, in order to have human-like capabilities. For instance:

- the context switching cost of multitasking

- the precision/speed tradeoff: pro players adjust mouse speed according to what they want. Selecting exactly the units wanted is very difficult; AlphaStar seemed to do it perfectly. (A toy sketch of such a limiter is below.)

- the timing precision of an action

With restrictions such as these, I'm honestly confident that AlphaStar won't beat human players anytime soon (once the human players adopt the interesting findings of the AI).

I'm mostly interested in how it will change the meta. Players are often biased by their experience, and regularly the meta shifts entirely without any significant balance update, simply by players finding original strategies.
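
Going back to the precision/speed tradeoff, here's a toy sketch of what such a "virtual mouse" limiter could look like (parameters entirely made up):

    # Hypothetical "virtual mouse": the higher the recent action rate,
    # the noisier each click lands.
    import random

    def noisy_click(target_x, target_y, recent_apm, base_sigma=2.0):
        """Return a click position whose error grows with recent APM."""
        sigma = base_sigma * (1.0 + recent_apm / 300.0)
        return (random.gauss(target_x, sigma), random.gauss(target_y, sigma))

    print(noisy_click(100, 200, recent_apm=150))   # lands close to target
    print(noisy_click(100, 200, recent_apm=1200))  # wild misclicks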


Many comments here are about how the AI information advantage (seeing the whole map at once sans fog of war except the last game; seeing exact unit stats like health etc) leads to higher APM-value, whether APM itself is higher or each Action is more meaningful, and discussing different ways to nerf it to bring it down to a human level.

I'm more interested in the limits that an AI could be pushed to vs humans, and if humans can't match the AI's APM, just add more humans until they can. E.g. 1v7 would allow humans to manage multiple disparate flanks at once just like an AI, and still leave someone to free to manage macro play etc that suffers when a human focuses on micro.


200 years of blink stalker meta... Overcoming the local minimum trap is still a thing.


funnily enough, human players were also obsessed with blink stalkers for years ;)


I didn't catch if they mentioned this in the interview, but what would have happened if they let AlphaStar play against other races, not Protoss only, would it be completely lost, unable to achieve anything?


The current version would flounder because it has never seen the other races. But that is not a fundamental limitation. All it would take to fix it is more training time.


Or they trained all the combinations but it was best at Protoss vs Protoss, so they publicized those results. But given the reaction to the first AlphaZero announcement and the later follow-up paper (summary: they cheated a bit but it's still incredibly strong) I would give them the benefit of the doubt here.


Good question. What's important, I think, is to remember that this is an early iteration, and regardless of what it can do now, they will continue to improve it and in a year or two or three (probably less) it will be able to play totally unfettered in all conditions.


"Next we'll demonstrate AlphaStar controlling real military units in Syria..." :-/


How I would summarize the development of AlphaStar and Mana's strategy over the series:

1. AS studies games played by humans to learn what they do.

2. AS takes advantage of high-APM, high-precision blink stalker micro to defeat immortals (something no human can cognitively/mechanically accomplish).

3. Mana realizes he cannot play vs AS as if he were playing vs a human.

4. Mana discovers an AI exploit using the warp prism + immortals to force AS's army back, keeping his own base safe. This is a specific counter-AI strategy, not something that would have worked vs a human player. AS does not know how to properly react because it has not seen any replays of humans going up against an AI by exploiting it.

5. Mana gets enough breathing room to build up a large enough force to win the game.

In short, Mana won because he "solved the problem" of how to exploit this particular AI.

This is actually not a new strategy -- several years back, the stock SC2 AI would do the same thing: pull back when you attacked its base. I could win vs AI using the exact same trick that Mana used. Blizzard has since updated the AI not to fall for this trick.

The real test of DeepMind's learning abilities is thus: if AlphaStar had seen replays of that exploit vs AI, along with all other replays of humans vs AI over the years, would Mana still have been able to win?


I wonder how far AI can progress. Can for example five robots, built to have insane reaction times and agility, outperform and safely subdue an entire army of 1 million humans equipped with the latest melee weapons?

Or how about a tiny flying robot with millisecond reaction times in a room with a crowd of people trying to catch it, being able to eg buzz and tag everyone without being so much as touched?

How about one robot which builds replicas or otherwise organizes some massive “real time strategy growth” across a city, versus the entire city and its police and army being called in? With the robot swarms coordinating distributed superhuman strategies and timing? I feel that, whatever we are discovering now, aliens who make it to Earth will have had hundreds of years advantage on that. It seems Monte Carlo Tree Search is the best we’ve got against the aliens, for now.

Think Ender’s game but for real. It will require something like that.

Anyone also remember the narration in Edge of Tomorrow? There the aliens had the additional advantage of resetting the clock to the beginning thus nullifying any win, but really with trillions of games being played by a league against itself, that rare “restart” seems quaint versus what we have here.

Or over eons perhaps long term strategy with mining asteroids with Von Neumann probes competing.


These developments sound wonderful, but all I can think about is how this advanced AI is going to be used to try to control me (mostly my spending and consumption).


And now they are superhuman with regards to "Fog of war"!

A truly crushing next step would be to make "World in Flames" into a computer game, and have Alpha-Star-Zero, which is coming, to become the best it can on that.

http://www.matrixgames.com/products/296/details/World.In.Fla...

If they do that, every single military on the planet should crap themselves, including the US, NATO, Russia, China, and the EU. It means 20 years or less until robots are able to operate at every level in an army, from soldier to general, and to comprehensively defeat the best humans at every one of those levels. The first nation to the battlefield with this wins the world. One ironic part is that an excellent application is non-battlefield warfare: you don't have to drive a tank to conquer the world; perhaps you can buy a factory or make a deal. The corporate interface here should also be compelling.

Perhaps Alphabet will finally be able to use this to make non-advertising profits.


You are comparing formal systems, like Starcraft, where the entire game world is quantized, with the real world where the games are not quantized.

Not only is the game of war not “written down” in digital form, there is far too much data to ever write down. Military strategists have been trying to model war forever. But no one can even agree on what the “game pieces” are, let alone list all of the Deus Ex Machina that might show up. An AI that is better than any human commander at tank warfare would have been just as dead as any human army when the opponent shows up with an H-bomb.

But being good at tank warfare also isn't anything like being good at a tank game. To be good at tank warfare you need to be good at building factories. Pouring concrete in swamps. Convincing grannies to buy fewer cans of beans and eat more squash. Figuring out when your workers really need to go home and sleep. Part of the game is just realizing those things are even things you might want to strategize around.

And the nature of competition in real life is that as players gain advantages, opponents copy their skills and those advantages cease to work. Then you have to find new advantages in some aspect of the game that has never been documented before.

In a sense, high level real world competition is more like MAKING games than playing them. It’s about designing a competitive landscape where your opponent won’t be able to design their way out of it. Something AIs haven’t, to my knowledge, even begun to be able to think about.


When DeepMind announced they were working on SC2, I expected it to be good, but not this good. Compare this to the top bots in SSCAIT (a BW AI tournament) and it's already so far ahead of anything that's existed up to this point. I'm eagerly looking forward to them playing all maps and matchups, which I suspect should be relatively easy to extend to. They've still got a long road ahead, but this makes me think they can do it.

However, they keep harping on the APM being similar to a top human's, and it's just not. Maybe the average for a whole game is, but during fights it was bursting over 1500 APM with perfect execution. This wasn't just spam-clicking corrosive bile or something like a human might do to get that high, but truly coordinated targeting and unit movement. This has led it to use pathological unit compositions that wouldn't be effective in human play (like pure stalkers beating immortals). If they set a hard max APM of something like 600, I think it would develop more useful strategies.


https://www.wired.co.uk/article/deepmind-starcraft-results-a...

"In a final match against Komincz, streamed live on Twitch, the DeepMind team used a new agent that, unlike those in the other matches, could only see by moving the focus of the in-game camera. As a result, the AI was forced to move its units in a more focussed, human-like way. Despite dominating the game early on, AlphaStar lost. In all the games contested to date, the score is: AlphaStar 10 - Humans 1."

I'm not quite sure I follow the test then. How was DeepMind making decisions before? In a less-human-like way?


If I understood correctly. In the first 10 matches AlphaStar can see the whole map in detail (not just the mini map) at the same time. The humans can only see their camera's view. In the 11th match the AlphaStar can only see via a camera view as well. It has to move the camera the way a human would to see other parts of the map in detail. The 11th game was a fairer test.

This, along with the documentary about the human Go players vs the DeepMind AI, seems to indicate we are entering an interesting time. The humans start out cocky, then gradually concede the high ground of whatever skill is supposed to be our exclusive domain. I think we will make it through to some new way of living with very capable machines, but to transition there we are going to have some rough years. Just watching the StarCraft players' fingers flying at the end of the movie makes my carpal tunnel wrists ache. It reminds me of the folklore about "John Henry" trying to keep up with the rail-laying machine. TL/DR: In the story he beats it for a while, then falls over dead from the Herculean effort required to do so.


Allowing the machine to have a far-more informed view of the game than a human seems like these tests are not realistic battles. The human should have the same information about the game as the machine. I would certainly hope a machine with more information than the human could beat a human. Perhaps I am not understanding the situation?


What are the games that, so far, still look like they will be too difficult for ML to play them at the highest level? I know Go was held to be this kind of game for a long time and is now close to being dominated by AI. Magic: the gathering perhaps?


I don't think there are any left. No Limit Hold-em would be my next choice after SC2 but the bots are doing fine there too.


Non-zero-sum games seem like a logical next step.


Any (reasonably) popular games that are non zero sum?


Now, waiting for AlphaStar with Has (the cheesiest professional Protoss player, see: https://www.youtube.com/watch?v=AgxOV1L3Jl4).


I think Deepmind definitively showed the agents are learning high-level play, which was great to see. I didn't really pay attention to AlphaGo, but did skim the AlphaZero paper, and I'm not really left with any doubts about how good RL/LSTMs can get against other AI or humans given enough time to train.

That said, it's an open question whether given all the constraints of the last live match and what was mentioned during the talk (that newer strategies keep getting discovered by humans and agents), whether humans could even win 50% of the time against an agent.


I'd love for this to be released for Company of Heroes (Relic). The game has terrain that affects engagements and vehicle movement, multi-directional cover, individual model health in squad units, territory-control-based resource income, a tick-down win condition, and much more that makes it feel more tactical than SC overall.

It's got a bit of randomness where engagements are not just pure deterministic math inputs so I guess that's why it never gained pro league attention but it would be hell of an interesting challenge for AI because of it.


I don't mean to be negative, but ML and even conventional computing are starting to make me tired. I'm always wondering what it will be next that ML can do better than humans. What is next up for automation? Will this be the one to send a shockwave through an industry or the economy? I feel like I need to constantly watch and keep track of the progress that's being made, and I'm starting to get tired from having to re-think life again and again.

For example, Google has published voice synthesis samples -- voices generated from text -- that are indistinguishable from real human speech. It hasn't been perfected yet, but I think most people would agree that we basically now live in a world where voice recordings can't be automatically trusted the way they used to be. It completely changes the way you think about and navigate the world. It will open up a universe of new schemes, methods of fraud, etc. that we will have to adapt to.

Then there are deepfakes. There are limitations, and the results aren't perfect, but it's very early days. Again I would say that the consensus among us is that we now live in a world where video evidence is basically no longer intrinsically trustworthy in the way that it used to be.

I practically grew up inside a computer, but I am now sensing that as ML fills in, it's going to be a very uncomfortable ride for me personally -- and I don't understand how it couldn't be for anyone else. And what about when AGI comes? Just curious to see if anyone else shares my experience with this.


This is from 2011 https://www.youtube.com/watch?v=IKVFZ28ybQs

I will take this with a grain of salt.


First Go, now Starcraft 2...I fully expect archery or women's golf to be conquered next. Someone at deepmind must have an axe to grind with Korea ;)


This is more about historical AI challenges. When Chess was beaten, people realized, painfully and happily, that they couldn't beat Go with the same approach. Therefore it became the new Mount Everest.

And StarCraft is a competition target because SC1 was easily adaptable for AI hobby coders and competitions -- a little like the RoboCup world championship for robot builders. It's part of the domain culture, I guess.


How does outcome prediction work?

It sounded like that was based partly on supply (army size). But the AI only knows its own army size, not the human's.

So how can the AI's outcome prediction be so accurate? In game 3 against Mana, the AI's outcome prediction changed from a 60% win to a 99% win before the AI decided to go up the ramp. It had no way to know how much army the human had up that ramp.


In imperfect information games (like SC2) the outcome prediction implicitly takes into account the unknown. Given what it has observed and what it has not observed, it is essentially comparing the present state to similar situations from the past.

You can see this in the replay-- at seemingly random times, its outcome prediction jumps up, even when it hasn't had any interaction with its opponent. But that's precisely why it's going up-- it notices that its opponent has not executed a faster rush or cheese that it's unprepared for, hasn't expanded early, and the scout has not been destroyed.

Similarly, after having won a fight, the worst thing that could happen is that a bigger force emerges from your opponent's base, destroying your army and giving them a chance to rebuild. When that does not happen, you know that your opponent probably doesn't have such an army (because otherwise they are falling behind in resources due to being bottled up). Either way, after a certain point it's worthwhile to press on to cause more damage, because you're now far enough ahead in resources that you will win regardless, even if they manage to repel that particular attack.


Yes, but it seems impossible to manually assign a score to all of those different situations -- there are too many of them.

Is the outcome prediction score, itself, also produced by AI training?

It has to be, right? Because it's clearly not just calculating the outcome based on the army sizes of the AI and the human. There must be some indirect way it's calculating the outcome prediction.

I mean, the hard part is accurate outcome prediction. Once you have that, it's easy to train an AI by just throwing CPUs at the problem and making the AI play a crazy number of games.
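
Presumably yes -- this kind of win-probability readout is usually a learned value function trained on the outcomes of finished games. A minimal sketch (my own illustration in PyTorch, not DeepMind's actual architecture):

    # A "value head" that maps a learned game-state embedding to a win
    # probability, trained with binary cross-entropy on final outcomes.
    import torch
    import torch.nn as nn

    value_head = nn.Sequential(
        nn.Linear(128, 64),  # 128 = size of some state embedding (invented)
        nn.ReLU(),
        nn.Linear(64, 1),    # single logit -> win probability via sigmoid
    )
    opt = torch.optim.Adam(value_head.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()

    # Fake batch: state embeddings plus whether those games were won.
    states = torch.randn(32, 128)
    won = torch.randint(0, 2, (32, 1)).float()

    opt.zero_grad()
    loss = loss_fn(value_head(states), won)
    loss.backward()
    opt.step()

    win_prob = torch.sigmoid(value_head(states[:1]))  # the on-screen number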


The gameplay was really interesting -- I wonder if we'll start seeing over-saturation of the main prior to the first expand in pro games?


There are three things this seems to do. And what's changed is the perceived value of these.

1) Increased mining rate. Although there's a prescribed "maximum", as I understand it adding more workers, up to some limit, does yield more minerals beyond the prescribed point.

2) Buffer against harassment. If you are over-saturated and lose two probes, your rate of income isn't affected.

3) Bootstrapping an expansion. All of the excess probes can be moved over to the new expo.


Here's exactly how the mining mechanics work in sc2.

1) only one worker can mine from a mineral patch or gas geyser at a time

2) workers phase through each other while mining, so they don't lose time pathfinding around fellow workers.

3) every base has 4 "close" mineral patches and 4 "far" mineral patches. The close mineral patches are mined at full efficiency by only 2 workers; as soon as one finishes mining, the other has already returned and starts mining immediately. For the far patches, when one worker finishes mining, the second worker is only halfway to the minerals. Sticking a third worker on a far patch can cover that little mining gap, but the benefit is small (like ~30%).

4) therefore for efficient mining you want 16 workers on minerals. For "perfect" mining, you want 20 workers on minerals with the extra 4 on far patches only.

5) usually a triple-stacked close patch will automatically lose the excess worker to a far patch, but sometimes you need to "worker stack" manually to achieve this. You can see MaNa worker stacking manually at the beginning of his showmatch when he doesn't have anything else he could be doing with his excess apm. 24 workers on minerals guarantees maximum income because 4 workers will never fit on a patch; they'll start automatically reassigning themselves to different patches until they find a less crowded one.

6) more than 24 workers on minerals provides exactly 0 extra income. Still good to have extra workers though so you're not losing mining time when you build buildings, get harassed, etc.

7) every base has 2 vespene geysers, and 95% of the time these are in a position where 3 workers will saturate them perfectly. There are a couple of maps (or at least used to be... might be fixed now) where you can eke out the tiniest bit of extra gas by having 4 workers on a geyser.

8) every serious sc2 player already knows all of these mechanics

I personally think Artosis was overstating the innovation of DeepMind's worker over-saturation. I agree with something Rotty said earlier in the series; sometimes you make a suboptimal move and it works, but that doesn't mean it was the "correct" play. It just means that you papered over a flaw with stronger execution somewhere else.

I'm still very impressed with these games, just not with regards to the AI's build orders.


I read that AlphaStar might be able to play this strategy because it is really confident in its defense skills. For humans, that might be taking a risk that is too big. That said, I want to try it out anyway :)

Edit: I recall that Rottie or someone said that there were 1-2 rare situations in HOTS where pros played like this.


Right, I get that it AlphaStar has decided this is sensible, I just wonder if we'll start seeing pros adopt something similar?

I'm also more generally curious as to what the cutoff point at which adding an additional probe ceases having any value.


My vague recollection is that from 16 to 19 workers, each of the 17th/18th/19th workers still harvests >50% of what a sub-16 worker does; from 20 to 24 it's more like <25%; and the 25th worker onwards contributes 0.
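
Turning that recollection into a toy income model (my own numbers, purely illustrative):

    # Approximate minerals/min for one base as a function of worker count,
    # using the rough marginal values recalled above.
    def mining_rate(workers, per_worker=60.0):
        rate = 0.0
        for w in range(1, workers + 1):
            if w <= 16:
                rate += per_worker          # full efficiency
            elif w <= 19:
                rate += 0.5 * per_worker    # covering the far-patch gap
            elif w <= 24:
                rate += 0.25 * per_worker   # small stacking gains
            # the 25th worker onward adds nothing
        return rate

    for n in (16, 20, 24, 30):
        print(n, mining_rate(n))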


I think this was one of the most interesting aspects of seeing AlphaStar play.

MaNa already started to use oversaturation when he played live game against the AI. I'd bet soon this will be the new meta, and everyone will play this way


I was always wondering why this is not so common in SC2... I mean, in StarCraft it was / is well known, and having good worker(-building) control is an essential skill (for Zerg even more than for the other races, due to the constant decision of where to spend the larva).


MaNa himself seemed to do it in the exhibition game.


This is really sweet. Any odds someone at DeepMind might be working on Paradox Interactive titles like Europa Universalis?


Do you know if there's a good API for Paradox games? Off the top of my head I have seen extensible mods for games, but I assumed they were all based on the predefined Clausewitz Engine.

That said, HOI4 could really use some AI love. Right now, I think the AI has a lot of hard coded values (when invasion should happen vs not). One of the potentially beautiful things about modern AI is that if you can clearly define your objectives, it's possible to find a parsimonious analytical solution to solve that problem. I have immense hope that these gaming AI systems will transform the way we game in the future.


"Along with the original title, it is among the biggest and most successful games of all time, with players competing in esports tournaments for more than 20 years."

...

Dear Mister Language Person: I am curious about the expression, "Part of this complete breakfast." The way it comes up is, my 5-year-old will be watching TV cartoon shows in the morning, and they'll show a commercial for a children's compressed breakfast compound such as "Froot Loops" or "Lucky Charms, " and they always show it sitting on a table next to a some actual food such as eggs, and the announcer always says: "Part of this complete breakfast." Don't they really mean, "Adjacent to this complete breakfast, " or "On the same table as this complete breakfast"? And couldn't they make essentially the same claim if, instead of Froot Loops, they put a can of shaving cream there, or a dead bat?

Answer: Yes.


Here’s the question I have. Will it consistently beat the top player over and over?

I see so much brittleness in AI such as this. Humans are much less prone to “bugs”. In an evolutionary adversarial environment, the human brain invariably comes out on top.


In the very long run it's hard to see how we could compete with arbitrary scalability and lightning-fast operations. For present and imminent situations, I think Louis Rosenberg had an exceedingly important point in

"Artificial Intelligence Isn’t Just About Intelligence, but Manipulating Humanity: AlphaGo's true accomplishment isn't learning to play Go, but learning to play (and manipulate) us." (https://futurism.com/artificial-intelligence-isnt-intelligen...)

, around the last AlphaFoo.

"Imagine a flying saucer lands in Time Square and an alien steps out carrying the game of Go. He walks up the nearest person and says the classic line – “Take me to your best player.” Now, let’s assume that the alien spent years studying how humans play Go, watching replays of every major match.

If that was the situation, it would seem Humanity was being set up for an unfair challenge.

After all, the alien had the opportunity to thoroughly prepare for playing humans, while the humans had no opportunity to prepare for playing aliens. The humans would likely lose. And that’s exactly what happened last month when an “alien intelligence” named AlphaGo played the human Go master, Lee Sedol. The human lost in 4 out of 5 games. But, if we look at the big picture, it wasn’t a fair match."


That's not fair. Humans are incredibly fallible. Humans tire, demand food, need to sleep, get bored. An AI just needs a graphics card and electricity.


I'd be interested to see its hierarchical strategy and planning, especially across such a long timespan. Does anyone have any good references for similar hierarchical planning work (Feudal Networks, etc.) to look at?


It doesn't use hierarchical approaches at all, apparently. Just flat Impala with LSTMs and CNNs every tick.
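
For anyone curious what "flat" means there, a bare-bones sketch (shapes and sizes invented; this is the general CNN-into-LSTM-into-policy shape, not DeepMind's actual network):

    # Flat (non-hierarchical) agent: spatial observation in, one action
    # distribution out per tick, with an LSTM carrying memory across ticks.
    import torch
    import torch.nn as nn

    class FlatAgent(nn.Module):
        def __init__(self, n_actions=100):
            super().__init__()
            self.cnn = nn.Sequential(
                nn.Conv2d(8, 32, 3, stride=2), nn.ReLU(),   # (8, 32, 32) in
                nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),  # -> (64, 7, 7)
                nn.Flatten(),
            )
            self.lstm = nn.LSTM(64 * 7 * 7, 256, batch_first=True)
            self.policy = nn.Linear(256, n_actions)

        def forward(self, obs, state=None):
            # obs: (batch, time, channels, height, width)
            b, t = obs.shape[:2]
            feats = self.cnn(obs.flatten(0, 1)).view(b, t, -1)
            out, state = self.lstm(feats, state)
            return self.policy(out), state  # one set of logits per tick

    agent = FlatAgent()
    logits, state = agent(torch.randn(1, 4, 8, 32, 32))  # 4 ticks of input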


You can download the replays here: https://deepmind.com/research/alphastar-resources/


Every year or so we get another huge advance... Well, more accurately, something comes along to benchmark the state of AI research against a human activity.

Then come the HN comments.

For Alpha Go:

Oh this is impressive but can't generalize. Wake me up when it doesn't have to have information precoded/doesn't learn from human players

For alpha go 0:

So this is cool but not amazing because they're all perfect information games.

And now here we have this, and people who haven't even watched the presentation are yammering about stuff like the APM, or how this isn't impressive because of ... Well, something?

If I believed in a terminator scenario, I would point out that a robot, too, will presumably have higher APM than all of us desk warriors.

It really feels like there are just some people who are religiously attached to being the only known intelligent entities on the planet, to the point where, when presented with evidence that hey, this is actually a thing, they will stick their fingers in their ears and shout about how unreal/unfair it all is.

I invite you to have a look back at the other announcement threads from Deepmind and OpenAI. I decided not to directly quote people as I don't want this to get personal, but I couldn't just sit here and watch the same old story play out again without at least mentioning that these same people have been wrong, and wrong, and wrong, and will presumably continue to be wrong.

Exponential curves are unintuitive.


It's extremely impressive, but at the same time it's still an interesting and fair question to say, "can we make an AI that beats humans while playing in a 'human like' fashion?", or alternatively, could we make an AI that would win if we put it inside a human body and made it play through those physical input restrictions?

(I do agree though that HN is way too negative overall. Partly it's just because, negative comments are more interesting. "Hey, this is cool!" doesn't have too much content. Nitpicks give you something to talk about.)

(Still, I've imagined a site where commenting is explicitly broken into two columns: Positive comments and critical comments. Then you could self-select to bask in the positivity for a bit before diving into the nitpicks.)


> "can we make an AI that beats humans while playing in a 'human like' fashion?"

Yes, now.

From the article:

“I was impressed to see AlphaStar pull off advanced moves and different strategies across almost every game, using a very human style of gameplay I wouldn’t have expected,” he said. “I’ve realised how much my gameplay relies on forcing mistakes and being able to exploit human reactions, so this has put the game in a whole new light for me. We’re all excited to see what comes next.”


It was performing over 1000 (presumably pixel-perfect) meaningful actions a minute during fights. That's superhuman.


I was impressed with DeepMind's work on Go and Chess. It seemed to truly grasp the strategy and "understand" the games better than any human, which is in stark contrast to previous engines that relied on brute-force tactical brilliance.

AlphaStar plays like I'd expect a computer to play. It makes some very stupid decisions that suggest a lack of real "understanding" or strategic thinking, such as its wacky unit compositions, the packs of five observers moving around together, and its miserable response to the immortal drop in the last game. It won through superhuman multitasking and micro, which is where I'd expect it to shine.

AlphaStar is, in some respects, awful at SC2.


We tend to think in hierarchical terms: tactics, strategy, and the like. It allows us to quickly create imperfect viable solutions. AlphaStar is probably lacking such distinction and sees the game as a sequence of actions. So it takes a lot more time for it to learn behavior patterns that we call strategy.

I think after training for maybe 20 thousand in-game years it will have decent strategy, while still lacking a real "understanding" of what a strategy is.

We probably will not see humanlike learning speed and adaptability before development of methods for learning hierarchical representations.


It’s more so that the purpose of these tests is to see if the computer can strategize well. The result seems to be... not really, not yet at least.

To take a terminator situation, sure the robot will be a lot better at gun fighting, but if we can kite it with trivial attacks to make it walk into an obvious trap (not really that different from what happens in Starcraft today) then it’s not so threatening.

Nobody denies that machines are better at small tasks that are founded on reaction time, precise inputs, and wide span of attention. But the thing we’re trying to actually get them better at is strategic planning.


What does it matter, at this point?

The proof is in the pudding, and we are making more and more pudding everyday. Instead of caring about naysayers, we need to be working under the assumption AI is here to stay and rapidly expanding in scope, and we need to build the social and political structures to be able to handle it.


The saying is actually "The proof of the pudding is in the eating". I don't quite understand what "the proof is in the pudding" is supposed to mean and how that relates to making a lot of pudding ^^.


>we need to build the social and political structures to be able to handle it.

This would be a massive waste of resources, depending on how far you misinterpret the nature of the "AI that's here to stay". There is little evidence to suggest we're approaching a general-purpose, "true" AI: the kind of superintelligent, creative, potentially world-ending and self-improving thinking machine that brings us to the singularity. It's much fairer to characterize current technologies as algorithms that excel in certain fields, with distinct limitations that we're still exploring and have some idea, though not a perfect one, of where those limits lie.

Functionally, they just find probability matrices for a certain sequence of actions, and can search the problem space much faster than we could before, by simulating the event indefinitely. And they come with the issue that anything that can't be simulated well and quickly can't be "AI'd", as well as the common issue of catastrophically failing due to not actually understanding the object in total (e.g. changing a picture of a cat into an ostrich by editing a few key pixels). And afaik, they've shown no ability to "change the problem", a key component of creativity (if you can't find a good answer to something, consider changing the question; our "AI"s do not).
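
(A toy cartoon of that "probability matrix" view, in Python; the states, actions, and update rule here are invented for illustration, not anyone's actual system:)

    import math, random

    # A policy maps a state to a distribution over actions; "training"
    # just nudges the distribution toward actions that paid off in
    # simulation. No model of *why* an action worked is stored anywhere.
    ACTIONS = ["attack", "retreat", "expand"]
    scores = {"early_game": [0.0, 0.0, 0.0]}

    def softmax(xs):
        exps = [math.exp(x) for x in xs]
        return [e / sum(exps) for e in exps]

    def update(state, action, reward, lr=0.1):
        scores[state][ACTIONS.index(action)] += lr * reward

    def act(state):
        return random.choices(ACTIONS, weights=softmax(scores[state]))[0]

    update("early_game", "expand", reward=1.0)   # simulated outcome
    print(act("early_game"))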

And this is naturally why they excel at games (almost by definition a repeatable simulation) that typically have a very well-defined question. But at the same time, we don't expect current AI to be capable of carrying its "strategies" forward to the next update of StarCraft (the problem changes) without re-searching a lot of the problem space (there exist algorithms for transferring training to future networks; I don't know how much progress they've made), because they don't really have strategies in the first place, or a real model of how things interact (they struggle to predict new interactions without simulating them, or rather, "experiencing" them).

Which is also why it's difficult to imagine AIs will ever truly be driving cars around with the current tech; rather, they'll likely succeed as an awkward combination of neural networks, expert systems, heuristics and safeguards. We'd naturally expect most sci-fi use cases of AI to be the same, e.g. political decision making. And they'll be limited to the extent that we can render simulations.

And if we pretend these distinctly limited algorithms are in fact the predecessors to our post-singularity successors, simply because they're able to do a few things we weren't really expecting (just as computers have proved to be a whole lot more capable than the '60s general population thought, but far less than what '60s sci-fi thought), we'll do a whole lot of work for quite a bit of nothing.

The fact that it's called AI doesn't mean we're quickly approaching Star Trek's Data. It didn't mean it in the last few AI hype cycles either.


I've heard what you said more pithily put as "true AI is anything we can't make a computer do right now".


When your algorithm needs to be changed to generalize, it does not generalize.

If people stopped moving the goalposts and spreading baseless hype, actual AI research would be much farther along right now.


I don't think the algorithm is changing. The infrastructure, engineering, practice regimens and environments, the training data, the selection of metrics for success, are changing. One can say broadly they are part of the algorithm, but these are learnings for curators of AI that can easily be transferred to other spaces to "generalize" as well.


Note that this approach is very dissimilar to AlphaGo or AlphaGo Zero. So it's true that they did not generalize...


Man, I can imagine the militaries the world over are rubbing their hands in anticipation when watching this. Imagine a perfectly orchestrated swarm of UAVs dealing damage like this (potentially to another swarm).


Today, 1/1 games without the camera hack were won by the human.

This is disappointing, but their PR is a success. Mainstream media is already saying "AI has beaten StarCraft". Another lie for the AI bubble I guess.


It's not "games", as it was a single game, and it came with the big caveat that they didn't have much time to actually prepare and train this one as much. The one that won against Mana was trained for 2 weeks, versus this one that only had one week of training.

While the AI did have an advantage in terms of micro/camera view, it was still capable of decent decision-making and independently came up with a bunch of interesting strategies. At the end of the day, that's really the goal of the research, not whether it uses the right amount of APM or uses the camera properly. Those are just artificial restrictions put in place to make it look fair and entertaining.


I would love DeepMind to put that AI on the ladder as they discussed during the last Blizzcon.

The APM and camera restriction is not for entertainment, it's to develop intelligence rather than 1500+ APM. The interesting part of StarCraft II is decision making and the meta of the opponent, and we didn't see that today.

Remember OpenAI's Dota 2 bot that was beaten by a lot of players after a single day. I want to see if AlphaStar can really adapt to the real world, and therefore the meta, the real challenge of StarCraft.


I honestly do not think meta or strategy will be interesting in SC2. (Obviously as a player it will be, but from an AI standpoint, not.) AlphaGo already showed us that it can handle strategy well; a good AI in SC2 will simply scout the minimal amount needed to prepare the perfect responses.


It's certainly interesting, but reminds me of DeepBlue playing Jeopardy against humans while having the questions fed to it electronically. Half the challenge of the game is buzzing in first. For humans, it requires reading/listening, potentially making a judgement that you'll be able to answer the question, and buzzing in before even hearing the whole thing. Same thing for StarCraft. If I could map out my moves in advance and feed them to the API with precise timing, I think I could likely beat a lot of pros; all the strategies mentioned in the article are well known. The dexterity and timing are a huge part of the challenge.


> reminds me of DeepBlue playing Jeopardy

Apologies for the nitpicky correction, but that was Watson.


They mention that the latency of the neural network, from input to action, is around 360ms, which is on the lower end of professional level.

It can do context switches between micro/macro very quickly, which is where its strength lies, and it did some very impressive micro in the later matches against MaNa, but it didn't win solely off the back of "computers are fast".


I feel like this was a lost opportunity to play it centaur style, letting a human choose and play an overall economic strategy (macro) and let the AI do the combat (micro).


It would be interesting to see a match of [human controlling macro while computer handled micro] vs [computer controlling macro and micro]. I haven't played the game.


OK, now are we getting a DeepMind vs. OpenAI competition?


I'm curious to see a StarCraft II mod where you can assign advanced micro behaviors to groups of units.


As an SC1-only player, I hope someday there will be an SC1 AI player.


Hey, but isn't TLO a ZERG player?


Warmup opponent:)

They did the same with AlphaGo back then. First get some "good" player in there to see if they are on the right track. Then get a better player in there. Finally, prepare for a real showdown. In this case it would be ShowTime, Neeb or even some Korean pro (e.g., Stats). Maybe it's time to switch matchups and make it PvZ: get Serral and then let's see if the current SC2 champion is good enough to beat this AI.


He is; he played off-race. That's why they invited MaNa, to let the AI play against a very strong Protoss.


The first artificial Korean.


So... I'm curious how long it will be before we can apply this to real life?

At this point it seems like it'd be fairly complicated, but you could build a solid simulator for battles. Then direct humans and/or robots around the battlefield as necessary to win a battle.

Upload a virtual map utilizing some point clouds, estimate densities, start with estimating enemy combatants, add some scoring metrics penalizing civilian deaths, and probably a lot of other stuff. Run real-life scenarios in training environments, and bam.

The premise seems to be there.


There are a lot more variables that need to be added for an "AI general":

- There needs to be a system for simulating real-world battles. (Since we need to iterate the AI, after all.) In WW2 the WATU was a good simulation of German submarine vs. Allied convoy battles, but I imagine that ground battles are messier. Link for background: https://www.youtube.com/watch?v=fVet82IUAqQ

- Autonomous directions. If a unit loses contact with the AI, what orders should they follow?

- Need to react quickly to changes in the effectiveness of weapons. If army B has a surface-to-air missile with an 80% hit rate, rather than the estimated 40% hit rate, the AI needs to adapt (a sketch of one way to model this follows the list).

- Different armies have different tolerances for casualties, both military and civilian.
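
(For the hit-rate point above: a minimal sketch of one standard way to model it, a Bayesian Beta-Binomial update; all the numbers here are made up:)

    # Prior belief: ~40% hit rate, encoded as Beta(4, 6).
    alpha, beta = 4.0, 6.0

    def observe(hits, misses):
        # Each observed engagement updates the belief directly.
        global alpha, beta
        alpha += hits
        beta += misses

    def estimated_hit_rate():
        return alpha / (alpha + beta)

    observe(hits=8, misses=2)      # field reports: 8 of 10 missiles hit
    print(estimated_hit_rate())    # 0.6: belief shifting from 0.40 toward 0.80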

I suspect you're getting downvoted because people don't like the idea of military-general AI. I don't really love it either, but it's going to happen. Hopefully we can encourage its programmers to include the Geneva convention rules for war.


I would really like to see this for Age of Empires II. I think AOE has far more races and is a far more complex game (although I'm biased because I haven't played SC2 as much as AOEII).


I have played both games and am a fan of both. Starcraft is definitely more complex than AoE for AI development, and that's why the researchers must have chosen it. The complexity for an AI depends on how many potential decisions you can make at any point in time. Here are a few reasons why:

1) Starcraft races have completely different build trees and different advantages. This has a large cascading effect from early decisions in the game.

2) Starcraft has much more micro potential than AOE. There are many units with sort of super powers: Stalkers can teleport and infestors can take control of enemy units. In AOE, you can only issue attack commands to most units.

3) Variety of units. Since Starcraft also has air units, which may or may not be able to attack ground units, you have more options within a race to create a unique army/air force combination.

4) The Starcraft map terrain is hierarchical whereas AOE happens on a flat map. There are interesting locations on the map where a smaller army may be able to defeat a larger army based on positioning.


AoE maps are not flat and units have an attack bonus when uphill (and defense penalty when downhill). Top players will place castles on hills for example and micro their units so that they are more elevated than their opponents.

AoE has monks which can take control of enemy units.

The fact that Starcraft has more micro potential should make AI development easier, not harder. Micromanagement is a relatively mechanical task that is time-consuming for humans but which a computer should excel at.


Micro is often written off as a mechanical task, but that's only true of expert systems (like micro bots).

There is a lot of tactical nuance and understanding involved in micro; for a neural network/learning algorithm to understand and optimize it is actually just as impressive as macro.

Knowing it's optimal to blink when your shield runs out, before taking hull damage? To dance your weakened stalker back to bait your opponent into overextending? To focus fire but not overkill? To wait for a projectile to be mid-air before blinking or picking up with a transport? There's so much more to micro than precision and APM.

To your other points, SC maps have lots of ramps and cliffs and high-ground/low-ground impact (no bonuses, but vision constraints and battle surface area and choke points)

AI development is not easier because of micro potential; it means the AI needs to be prepared to deal with a wider range of effectiveness for any given unit or set of units, and they won't always scale linearly in impact or between different players' playstyles.
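
(To make concrete that micro is decision-making rather than clicking speed, here's a rough sketch of two of the heuristics above written as explicit rules; the Unit fields are hypothetical, and a learned agent has to discover rules like these implicitly:)

    from dataclasses import dataclass

    @dataclass
    class Unit:                    # hypothetical simplified unit state
        shield: float
        hull: float
        blink_ready: bool = False

    def should_blink(u, under_fire):
        # Blink once shields are gone but before taking hull damage:
        # shields regenerate, hull (mostly) doesn't.
        return u.blink_ready and u.shield <= 0 and under_fire

    def pick_focus_target(targets, committed):
        # Focus-fire the weakest enemy, skipping any target that already
        # has enough incoming damage assigned to die (avoid overkill).
        # committed[id(t)] = damage already assigned to t this volley.
        live = [t for t in targets
                if t.shield + t.hull > committed.get(id(t), 0.0)]
        return min(live, key=lambda t: t.shield + t.hull) if live else None

    enemies = [Unit(0, 20), Unit(50, 80)]
    print(pick_focus_target(enemies, committed={id(enemies[0]): 25.0}))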


what do you mean exactly by "The Starcraft map terrain is hierarchical"? do you mean there are more choke points?


There is the concept of height. Units reveal the fog of war around them up to a certain distance unless they are a ground unit and the tile is higher than them. This leads to things like marching your units up a ramp, and not seeing that there's a ton of enemy units in there until you're right in the middle of them.

Also, if units are walking through a valley, units on the high ground can shoot the ones on the low ground but not vice versa.
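
(A rough sketch of how that vision rule might be expressed; the function and its parameters are made up, and the real engine's rule is more involved:)

    # SC2-style high-ground vision: a ground unit cannot see tiles
    # strictly higher than the one it stands on.
    def can_see(unit_pos, unit_height, is_flyer,
                tile_pos, tile_height, sight_range):
        dx = tile_pos[0] - unit_pos[0]
        dy = tile_pos[1] - unit_pos[1]
        if dx * dx + dy * dy > sight_range ** 2:
            return False           # outside sight radius
        if not is_flyer and tile_height > unit_height:
            return False           # ground units can't see up a cliff
        return True

    # A marine at the bottom of a ramp can't reveal the high ground:
    print(can_see((0, 0), 0, False, (3, 0), 1, sight_range=9))  # False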


I haven't played AOE for ages, but aren't the races only different in a few details like 2 units and some passive effects? In StarCraft races are vastly different.

I can't really judge AOE's complexity, but from what do you draw your conclusion? Just based on the number of rules?


Each civ (race) has about 2 different bonuses (e.g. Indian villagers work faster) and in addition a unique unit. Competitive players now play with about 20 civs. Each civ also has a "team bonus" (e.g. Teams with a Spanish player get 33% more gold per trade trip, teams with a Persian player get upgraded buildings ). In addition to that, each civ has a different tech tree, as well as starting resources or -- in rare cases -- starting units.


I'm sure that once the dust settles, we will have an AlphaZero version of AlphaStar as well.


Same, totally agreed. I'm a really experienced competitive AoE2 player (not pro, just a serious casual), and I think the long-term planning inherent to AoE is far more complex than SC. Not to mention the procedurally generated maps that are different every game-- it's far more nuanced in my view and requires a truly vast amount of knowledge to make good decisions. There are so many different valid approaches to any given situation.


They really should have used Brood War for DeepMind. SCII is way too volatile. Way more gimmicks are available, which will make it difficult for an AI to come close to a human player.


As much as I love brood war, great micro over a large number of units with wide awareness (things which appear to be easy to AlphaStar) would be ridiculously overpowered, even with completely average strategy. Brood War is an awesome game because it's constantly asking too much of its players at all times. I can't imagine games against an AI which has far more attention and micro to give would be too interesting.


They are planning on limiting the APM of the AI anyway. With BW, you can focus on an AI learning basic strategy with incomplete information. With SC2, you mix in a whole assortment of crazy abilities, like force fields and recall, that the AI has to figure out. It's going to take much longer for an AI to anticipate when its opponent is going to force field the ramp and warp into the main.


> They are planning on limiting the APM of the AI anyway.

They did that in these games. (Or at least, it didn't abuse absurdly high APM.)

It still had insane micro a) because it had a FoV which basically extended to the combined FoV of all of its units[1] rather than having to move a screen-size FoV, and b) when it micros it never really "misclicks" like a human would do under pressure.

(This was most obvious in how it could micro against MaNa's army on 3 fronts in game 4 and how it was able to basically perfectly drain the Immortal barriers in the game where MaNa actually should have been able to defend against a human mass stalker build ~100% of the time.)

One thing they weren't clear on was how it could tell how much health, etc. each enemy unit had -- did it have to spend an "action" (like a player would have to click) to do that? If so, then that's even more insane in terms of micro ability.

Anyway, disagree about BW. Perfect micro in BW is possibly even more devastating than in SC2, IMO, because there are all these weird glitches that you can do -- the best players can do them some of the time, but no players can do it perfectly all of the time.

[1] This was not the case for the final game which it lost.


>It still had insane micro a) because it had a FoV which basically extended to the combined FoV of all of its units[1] rather than having to move a screen-size FoV, and b) when it micros it never really "misclicks" like a human would do under pressure.

This is an oversight that I imagine they will eventually fix as well. Doesn't make sense to allow the AI to do this, because the focus is on the AI understanding the game.

>Anyway, disagree about BW. Perfect micro in BW is possibly even more devastating than in SC2, IMO, because there are all these weird glitches that you can do -- the best players can do them some of the time, but no players can do it perfectly all of the time.

I didn't mention micro specifically, I mentioned abilities. The perfect micro issue can again be solved by limiting the AI's APM. They may be able to execute micro tricks, but not constantly.
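
(One plausible way to enforce such a cap is a token bucket, which bounds sustained APM while still allowing short bursts; a hedged sketch with illustrative numbers, not anything DeepMind has described:)

    import time

    class ApmLimiter:
        # Sustained rate capped at `apm`; bursts capped at `burst` actions.
        def __init__(self, apm=150, burst=10):
            self.rate = apm / 60.0                  # tokens per second
            self.capacity = float(burst)
            self.tokens = float(burst)
            self.last = time.monotonic()

        def try_act(self):
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True                         # action allowed
            return False                            # over the cap: wait/drop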


Once machines exceed normal human play, further research does not focus on trying to make the machines play badly so that it's fun to face humans again. What ever would be the point of that?

Instead we just watch the machines versus other machines.

This is why we didn't see "AlphaZero plays chess versus grandmaster" games: they'd be dull. A0 wipes the floor with grandmasters because it's an AI and grandmasters aren't. Boring.

But lc0 and similar have been entering computer chess competitions with (a clone of) the Google Alpha Zero design. It does pretty well.

Just as with TAS in speedrunning, you get a synergy. On the one hand, the machines play a distinctly different game, perfect on its own terms: a TAS run never succeeds in a frame-perfect trick on the second or third try, always the first; the AI will never mis-blink a stalker to a pointless death. But human play continues, not against the machine but parallel to it, and learning from it. GoldenEye speedrunning was hugely influenced by TAS findings. Modern human chess is influenced by machine chess play styles.


Are you so sure? APM limits on the AI aren't necessary, which is a real insight FTA:

>In its games against TLO and MaNa, AlphaStar had an average APM of around 280, significantly lower than the professional players, although its actions may be more precise. This lower APM is, in part, because AlphaStar starts its training using replays and thus mimics the way humans play the game.

https://deepmind.com/blog/alphastar-mastering-real-time-stra...


AlphaStar had an average APM of 280, but during battles, its APM would spike up to ~1500.

It's like saying that my car has an average speed of 2mph. Sure, it does - but that's because it spends 23 hours a day sitting in a garage.
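
(The average-vs-burst distinction is easy to make precise: compute APM over a sliding window instead of over the whole game. A quick sketch with made-up timestamps:)

    def apm_stats(action_times, window=5.0):
        # action_times: sorted timestamps in seconds. Returns the
        # whole-game average APM and the peak APM over any `window`.
        if len(action_times) < 2:
            return 0.0, 0.0
        avg = 60.0 * len(action_times) / (action_times[-1] - action_times[0])
        peak, j = 0.0, 0
        for i, t in enumerate(action_times):
            while action_times[j] < t - window:
                j += 1
            peak = max(peak, 60.0 * (i - j + 1) / window)
        return avg, peak

    # 30 actions over ~41s, 10 of them packed into one second of battle:
    times = sorted([i * 2.0 for i in range(20)] +
                   [40 + i * 0.1 for i in range(10)])
    print(apm_stats(times))   # modest average, much higher peak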


Where does it say the APM spiked during battles? If it's spiking head-to-head against the human player, then that's bad, but how do you know it's not just herding workers and units individually?

I get the analogy, but based on the distribution, 75% of AlphaStar's APM is below the human's mean APM during gameplay. Following that analogy, both cars are on the racetrack, except one is applying acceleration with more precision, i.e. accelerating out of a turn rather than always accelerating. Capping acceleration wouldn't matter in this case.


The APM maxed out at 1500 at one point, far in excess of any human, especially considering many human actions are meaningless spam clicks. The AI could win based on inhuman micro with stalkers and blink. I would be far more interested in seeing the agent with a fully realistic camera and a fully realistic APM limit.


It does make a difference, because if the APM spike is during a head-on interaction with the human then it provides a competitive advantage, if it's randomly during another part of the match, then it doesn't provide a competitive advantage.

Where did they say the spike was during a battle?


I'm not really sure why you think Brood War is such a pure example of basic strategy without difficult to comprehend mechanics. Even in your example, recall is far crazier in Brood War. Arbiters can also create a version of a force field on a ramp that's uncounterable and lasts far longer. They're also harder to defend against because they have so much HP.


The fact that arbiters are stronger or weaker than the equivalent abilities in SCII isn't the issue. The issue is that getting the AI tech to the point where it can figure out that you have to sac a base (because sending your army there will waste time, only to have the attackers recall out) is going to take a lot longer. An AI can figure out that there is an arbiter on the map, and that means there is a potential for it to recall some units into your base. That's easy.

BW isn't the perfect example of pure strategy, but there are dramatically fewer game-changing abilities in it. Think cyclone/marauder cheeses, or 2-base warp gate all-ins, or ravagers sieging your wall. SCII in general makes attacks come quicker relative to scouting. Injects, chrono, and mules make tech and attacks come sooner compared to BW. Reliable scouting is still basically tier 2.


Isn't the difficulty of the task the point? If it was easy it wouldn't be interesting.


What's going to be uninteresting is when their AI can never get past bronze league because it can't figure out that its bunker is going to get force-fielded or that an observer is going to be used for a blink all-in.


You should watch these replays; the AI is already way more advanced than the situation you just described.


I had a feeling we'd see Brood War elitism in here. There are many, many gimmicks in BW. Every time I watch ASL and KSL, I see numerous quick stomps based on gimmicky builds, just like in SC2, Dota, and every other strategy game with fog of war and a collision of messy strategies.


Cheese isn't the same as gimmicks. I'm talking about abilities that change the game. There are more in sc2.


Thought the same - especially since afaik no AI has yet beaten a Korean pro.

A few months (maybe 1 year ago) the best AIs tried vs some "retired" pro-gamers (like Stork) - it was not even close.

Would have loved to see DeepMind trying to show what's possible here.


Why? They won ten out of eleven matches. Seems like they got pretty close to human players...


The games it won, it had an access to inputs human players physically can't do (controlling units off-screen without panning the camera first), so I'm hesitant to call the victories legitimate.

Edit: And reviewing the game it lost, the decision making was questionable at best.


The problem I see is that the AI is not better than the human; it's just 1000x faster at APM / macro. It could have a "shit" BO (build order) and still win because it's so much faster than a human. It's not really interesting tbh.


Wrong. Did you watch the presentation? Its APM was less than half that of its human opponent. If you aren't going to watch the presentation first, don't try to get involved in the discussion in such a derisive and dismissive way. "It's not really that interesting tbh"....wow. It's something that's never been algorithmically done before, no matter who tried. How can you write that off as "not interesting" as if it's totally insignificant?


Its APM maxed out at 1500, which is far more than any normal human.


That's instantaneous APM, not general APM over the course of the game. The human player's instantaneous APM went well past 1000 at certain moments during the course of the matches, too. All that means is a few clicks as fast as possible.


Yes, but individual perfect micro of 50 blink stalkers is beyond the capability of any pro, whereas I can spam-create Zerg units by holding down a key and have higher APM for a split-second, and I'm just a garbage diamond player.


You can't compare the two. No human can do 1000 meaningful actions per minute. Even when a human is doing 400 APM, probably 100 of those are meaningless spam clicks. All of AlphaStar's actions are meaningful actions.

So when AlphaStar is doing 1000-1500 APM, there's no way a human could do that or compete with that. This doesn't impress me because it's obvious if you create something that can do something way faster than a human it will win.


This guy didn't see the game where AlphaStar had 1500 APM...


I watched it, and it doesn't mean anything. They said it has lower APM than a human, but what does APM even mean for a computer? APM is just a way of measuring how fast you click on a mouse/keyboard, but decision making is not related to APM speed. My point is that it could click 10x slower than a human and still beat it at macro.

The real issue I have here is that the game can be won just by outplaying the other player at macro.

Edit:

They actually say exactly what I just said:

These results suggest that AlphaStar’s success against MaNa and TLO was in fact due to superior macro and micro-strategic decision-making, rather than superior click-rate, faster reaction times, or the raw interface.

It feels like brute force, not really AI; that's my point.

If a computer knows everything about the game (movement speed, projectile speed, etc.), how can you call it smart when it's just computing power to know when to engage and when not.


....but.....but...that's EXACTLY the challenge of RTS games.....no AI has been able to beat a human at macro until now. That's the whole point...

> was in fact due to superior macro and micro-strategic decision-making

That's exactly the crux of the accomplishment! How are you writing this off as not impressive?! Like, what exactly do you want the AI to be able to do if not win by superior strategic decision-making??

> how can you call it smart when it's just computing power to know when to engage and when not.

But it ISN'T just computing power, any more than AlphaGo was. If it were, then they would have already done this years ago, wouldn't they? Anyone could just rent some servers off AWS and do it. It isn't even remotely that simple.

It's a major work of algorithmic innovation, and you don't seem to understand that.


GP has it backwards. The AI won on micro, not macro.

Computer APM is not the same as human APM. Humans make numerous unnecessary actions, whereas computers don't. This was visible just from watching the games. The computer dominated on micro.

When the computer lost, it was because it failed to cope with a basic macro curveball.



