This is really impressive; I didn't expect StarCraft to be played this well by a machine-learning-based AI. I'm excited to read the paper when it comes out!
That said, I'm not sure I agree that it was winning mainly due to better decision making. For context, I've been ranked in the top 0.1% of players and beaten pros in Starcraft 2, and also work as a machine learning engineer.
The stalker micro in particular looked to be above what's physically possible, especially in the game against Mana where they were fighting in many places at once on the map. Human players have attempted the mass stalker strategy against immortals before, but haven't been able to make it work. The decisions in these fights aren't "interesting"--human players know what they're supposed to do, but can't physically make the actions to do it.
While they have similar APM to SC2 pros, it's probably far more efficient and accurate so I don't think that alone is enough. For example, human players have difficulty macroing while they attack because it takes valuable time to switch context, but the AI didn't appear to suffer from that and was extremely aggressive in many games.
In the mass stalker battles, the AI APM exceeded 1000 a few times, and no doubt most of that was precisely targeted, whereas a human doing 500 APM micro is obviously going to be far less precise.
I think a far more interesting limitation would be to cap APM at 150 or so, or to artificially limit action precision with some sort of virtual mouse that reduced accuracy as APM increased.
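A minimal sketch of the "virtual mouse" idea, in Python. Everything in it is an assumption for illustration: the function names, the Gaussian noise model, and the constants `base_sigma_px` and `sigma_per_apm` are hypothetical, not anything DeepMind has described.

```python
import random


def click_sigma(current_apm, base_sigma_px=1.0, sigma_per_apm=0.02):
    """Standard deviation of click error in pixels.

    Assumed model: error grows linearly with the current APM.
    Both constants are made-up illustration values.
    """
    return base_sigma_px + sigma_per_apm * max(current_apm, 0)


def noisy_click(target_x, target_y, current_apm, rng=None):
    """Return the position actually clicked, perturbed by APM-dependent noise."""
    if rng is None:
        rng = random.Random()
    sigma = click_sigma(current_apm)
    return rng.gauss(target_x, sigma), rng.gauss(target_y, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    for apm in (150, 500, 1000):
        x, y = noisy_click(400, 300, apm, rng)
        print(f"APM {apm}: sigma {click_sigma(apm):.1f}px, clicked at ({x:.1f}, {y:.1f})")
```

With these made-up constants, an agent bursting to 1000 APM would scatter its clicks roughly five times as widely as one at 150 APM, so spending actions faster would carry a real cost.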
>I think a far more interesting limitation would be to cap APM at 150 or so, or to artificially limit action precision with some sort of virtual mouse that reduced accuracy as APM increased.
IIRC OpenAI limits the reaction time to ~200ms when playing DoTA2. AI employing better strategies than humans will always be more interesting than AI that can out click humans.
Even the 200ms reaction time seemed overly slanted towards the AI. I don't think that is the actual reaction time of top pros. In the matches the AI played, the human player would teleport in from complete invisibility and try to use an instant-cast spell, and the AI would have already teleported out. Yes, the AI may theoretically have been constrained to a 200ms reaction time, but in practice it was playing at a superhuman level. Even with that advantage in fights, the human team still demolished the AI. Oh well, lots of things to learn still.
Another advantage was that the AI is just reading the game state through an API, it doesn't have to look on the screen. The game can be difficult to watch from a pro's perspective since they have to constantly click around the map to see what's happening, but the AI has perfect knowledge of everything it is capable of seeing, all without having to physically move a mouse to click on the screen.
If you watch the 11th game, where the pro player wins (the prior result was a 10-0 shutout by Alpha), the AI actually lost because they rebuilt the agent to use the same forced camera perspective as the human - so there is absolute truth to this being a compelling advantage. It was able to micro multiple units in disparate areas by having far better spatial awareness. When they took that advantage away it seemed more even.
I don't know if we can absolutely claim that the limited viewport was the deciding factor in the 11th game, but it did seem to me that the Alphastar agent's blink stalker micro was somewhat compromised in that game compared to the seemingly superhuman blink micro in previous games.
It struggles with camera placement like real players :) And it uses popular divert-attention tactics, which shows it understands that part of the game - for example when it sends oracles to the mineral line at the same time as it attacks in front. Previous versions didn't do that, because they were trained playing vs a cheating AI - so there was no point diverting the attention of something that has instant access to any unit on the map :)
It also struggles to defend against adept harass because it has "tunnel vision" - it controls its oracle instead of defending probes at home. Mana actually managed his attention budget a lot better (this is a crucial pro-player skill in starcraft - harass is effective because it trades little of your attention for a lot of attention of the enemy, a skill that becomes irrelevant when the opponent doesn't really have "attention" and can perceive and interact with all units on the map at once, like the previous version of alphastar).
This one is much more human, and much lower level. In my opinion it lost its unfair advantage, so the mistakes in its decision-making are revealed. Previously it was never behind and never had to react to the human player's strategy - it rarely even scouted because what's the point - it wanted to build mass stalkers anyway.
Yeah, that's actually a huge point that I didn't even consider. Regardless of whether the AI itself is playing with a limited viewport, the fact that its opponent has a limited viewport opens up the opportunity to learn attention diversion tactics during the training process, which would otherwise be impossible.
What happens if a human tries to use the API with a custom UI of the human's own choosing? Such a UI might not exist yet, but are there ideas for more efficient UIs that could be built?
Yes, I am curious about this too. What happens if the human has a giant TV screen that can show the whole map at once?
Or, what if we slow down the game, so that the human can actually pause the game each second and consider what to do next. That's basically what the computer is allowed to do
Macro-wise, it would be like an unwieldy minimap, which already exists so people can get a sense of where the enemy is moving. With a giant screen, information is not focused on a small area, so you are limited by your FOV. A minimap that shows unit strength in terms of armor, hp or shields, as well as placement, would be the ideal information.
Micro-wise, it would be like sitting in front of a giant text display looking at a whole book. You still have to focus on a small section to read it.
> Or, what if we slow down the game, so that the human can actually pause the game each second and consider what to do next. That's basically what the computer is allowed to do
While this would make it more fair, it would just make the micro game more similar to chess or go. I don't think humans would necessarily win in the end.
That's a good insight and yes, humans would probably be overpowered eventually. However, this is just the consequence of the fact that all games are similar if you remove external limitations such as reaction time (or, alternatively, produce a more efficient "being" which is not as subject to these limitations as some other).
Starcraft is like chess in some sense. The largest fundamental difference is that it isn't a perfect information game.
Tbh starcraft and dota shouldn’t really be the test games atm; turn based strategies (or rather, grand strategies) would be the far more appropriate evolution after chess and go, since we’re clearly more interested in AI macro than micro, and too much of its learning process is in trying to push the AI beyond micro-oriented thinking (probably many rounds of the AI tournament are lost simply because one AI found a new micro strategy to abuse)
But ofc, there’s no tbs or grand strategy currently out there with a real tournament scene, so you can’t really count on the devs implementing an AI-API, or even properly balanced / bug-free (far more user-testing goes into sc2/dota2 than say civ, simply by virtue of its playerbase).
Yes but a turn based game drastically reduces the action space compared to a real time game, something the DeepMind folks pointed out as a particularly interesting problem they wanted to tackle.
>a turn based game drastically reduces the action space compared to a real time game,
That's the primary benefit imo. The bigger action space is largely composed of non-strategic elements, at least in the sense of long-term strategies, eg micro and mini-skirmish tactics, that I don't think are as interesting. Ofc its clearly a conflict of interest, but my feeling was the most interesting aspect of Go/Chess is the AI making unintuitive discoveries that benefit the long-term. The human-collective machine is pretty good on its own at finding the shorter-term strategies; I don't think AI will make much significant impact in that space.
For games as a medium to study upcoming real-world applications (eg cars), RTS makes sense; but as a medium to study AI beating humans, TBS is more appropriate (their ability to explore large search-spaces is far more interesting/potentially impactful). Studying both would be ideal ofc, but in a pick-one situation, TBS is better imo. But only RTS are even really viable atm, which is disappointing.
Even allowing players to zoom out would give huge advantages; that's why no matter the screen size you have to play at the same zoom. There was a bug at one point that allowed players to play multiplayer zoomed out, and it was forbidden to use it in competitive games.
How about having multiple humans control the same faction, so one can focus on building, two on a couple of battle groups, another on scouting, etc.? Then they don't have to context switch nearly so much.
Aha, nice, thanks. Let's see, two players per side... not a huge number but probably a big step up from one. Looks like people aren't playing it much; some people suggest it's because that requires a partner.
I would like to see a setup akin to that of Ender Wiggin, with one commander overseeing and recommending overall strategy, and, say, five others managing different areas or groups. That seems like the way to get the best human performance, and might be enough to beat the AIs—at least to nullify chunks of their advantage.
Yeah, put an eye tracker in a pro and you'll see that the eyes are constantly changing the focus point, if you can watch the entire scene with the same precision without the need to focus on it you're already at a nice advantage.
As an aside, a few pro gamers prefer to play on windowed mode for exactly this reason.
Is the bit about reading the game through an api true? Earlier iterations of this same rl based agent that played Atari games would read just raw pixels not an api.
Yes, it’s true. A special PySC2 interface was created for the AI. Also, it’s not only that the AI doesn’t need to parse information from the much more limited screen real estate, but also that it doesn’t have to use a controller with physical constraints. So the AI has access to this superhuman controller, and it can decide to click on one extreme of the screen and then the other within 200ms.
Any game that is specifically going out of its way to support these ai’s will naturally do it through an api, though I’m only aware of dota2 and sc2 (sc:bw also does, through a community-modified client that serves the api, iirc). For adhoc games, eg atari, pixel-parsing is the natural result, but no one would intentionally set it up like that
The game is difficult to watch, but does anyone honestly believe that an AI is going to have a difficult time parsing the scene if it is trained to do so? That to me just seems like a question of resources. We're pretty good at image recognition and segmentation now, and that's without the unlimited amounts of training data one could generate when using a controlled game environment with a limited range of possible animations and effects. This is why I find the prospect of the AI agent having to parse the screen entirely uninteresting.
For real-life applications, parsing the ”scene” would matter, since a parsed scene can only convey imperfect information. In StarCraft, the information is perfect once the fog of war has been removed; together with unlimited attention (no camera viewport), this helps both action potential and macro planning. No player is ever going to be able to consider precise strategy on the whole map perfectly in their mind. If DeepMind wanted to mimic human limitations perfectly, they would have to provide imperfect information to AlphaStar, e.g. when reporting the location of an object, sample it from a probability distribution that represents the location imperfectly, and widen that distribution the longer the AI's attention wanders from the object, both spatially and temporally. Of course, the usefulness of these limitations is purely to model maximum theoretical human mental capacity, and their use case could be to help explore strategies that work for actual humans.
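To make the imperfect-information idea concrete, here is a rough Python sketch. The Gaussian noise, the linear growth of the spread, the function names, and both rate constants are assumptions picked for illustration; nothing like this is known to exist in AlphaStar.

```python
import random


def position_sigma(seconds_unattended, distance_from_camera,
                   sigma_per_second=0.5, sigma_per_distance=0.01):
    """Spread of the position estimate (map units).

    Assumed model: uncertainty grows linearly with the time since the
    object was last attended to and with its distance from the camera.
    """
    return (sigma_per_second * max(seconds_unattended, 0.0)
            + sigma_per_distance * max(distance_from_camera, 0.0))


def perceived_position(true_x, true_y, seconds_unattended,
                       distance_from_camera, rng=None):
    """Sample the position the agent is told, instead of the true one."""
    if rng is None:
        rng = random.Random()
    sigma = position_sigma(seconds_unattended, distance_from_camera)
    return rng.gauss(true_x, sigma), rng.gauss(true_y, sigma)


if __name__ == "__main__":
    rng = random.Random(42)
    # A unit being watched right now is reported exactly.
    print(perceived_position(10.0, 20.0, 0, 0, rng))
    # A unit ignored for 8 seconds, far off-screen, is reported only roughly.
    print(perceived_position(10.0, 20.0, 8, 200, rng))
```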
There is another potential use: given these limitations, an AI might be able to learn to be better strategically, which could translate to an even greater advantage once the limitations were removed later on.
You talk about a static image, but navigating the camera requires strategy, attention, and adds to the focus. If you take that away, it's just a turbo charged pen-and-paper RPG with a time limit on rounds.
They could train against the API, reinforcing the AI trying to predict the state from vision. But with limited APM it would be pretty difficult for the AI to keep track of everything. And, potentially, it would still not be the same as a human looking at it. I'm not sure whether human attention is a particularly bad example of efficient resource allocation. I'm very biased to think it is still the gold standard. But the fact that deepmind didn't focus on this implies they were not finding it interesting enough, and/or too difficult.
Anyhow, (visual) exploration is a step up from mere image recognition.
"Brute force" in an AI context is usually reserved for traversal of the entire search space. I think "superhuman micromanagement" is a better term. And before AlphaStar, superhuman micro wasn't an insurmountable obstacle for human players.
Yes, since DeepMind chose SC2 for having the right characteristics for mapping to the real world, ie imperfect information and real time response, they should have had at least one run without any speed governors. And maybe another with the CPU limited to some level we might find in an embedded system of near future.
I've recently watched a TED talk explaining how human perception has a lag of about a third of a second. Pro players might be better, but after noticing they also need to take an action.
My experience is that to beat 300ms requires there to be no conscious thought in the loop. It has to be muscle memory guided by higher level intent. It's like how the gunslinger waiting to shoot hits first, it's reflex instead of decision.
Getting sub 200ms on something like this benchmark is fairly easy [1]. While waiting for the color to change is different than processing a game like dota2 or sc2 a 200ms limit isn't too unreasonable to me.
I would love to see these AIs get handicapped even more like a full second and really force them to out think humans.
I think OpenAI would have been beaten by lots of humans, but they decided to train it with 5 unlimited, invulnerable couriers (until the TI showmatches, in which they were beaten easily).
The only way to truly have a fair fight would be to accurately model the limits of human capacities. How fast can humans move the mouse and at what accuracy? How fast can they type keyboard commands? How fast can they move their eyes? You could study those limits in a sports lab with high speed cameras, etc.
A simpler model would be to limit the bot to, say, one action per 250ms, introduce a slight delay in his reaction time, require him to move the camera to gain detailed information and take further actions, and have camera movements count as actions.
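The simpler model above can be sketched as a small gate that every bot command, camera moves included, has to pass through. This is only an illustration: the class name `ActionGate` and the 200ms reaction delay are assumptions (the comment only says "slight delay"), while the 250ms interval is the figure suggested above.

```python
from dataclasses import dataclass


@dataclass
class ActionGate:
    """Throttle for bot commands; camera moves go through it like any action."""
    min_interval_ms: float = 250.0    # one action per 250ms, as proposed above
    reaction_delay_ms: float = 200.0  # assumed value for the "slight delay"
    last_action_ms: float = float("-inf")

    def try_act(self, now_ms, observed_at_ms):
        """Return True and record the action if it is allowed at now_ms.

        observed_at_ms is when the triggering event became visible to the bot.
        """
        if now_ms - observed_at_ms < self.reaction_delay_ms:
            return False  # still "reacting" to what it just saw
        if now_ms - self.last_action_ms < self.min_interval_ms:
            return False  # previous action (or camera move) was too recent
        self.last_action_ms = now_ms
        return True


if __name__ == "__main__":
    gate = ActionGate()
    for t in (100, 200, 300, 450):
        print(t, gate.try_act(now_ms=t, observed_at_ms=0))
```

Because camera movement spends the same budget as unit commands, looking at another part of the map delays the next order there by at least one interval.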
Here's a graph of AlphaStar's APM versus a professional player's: https://i.imgur.com/TXeLkQK.png Evidently AlphaStar also has a similar Economy of Attention (where the player focuses) to a professional player, at around 30 screens per minute. Additionally, AlphaStar's reaction time is around 350ms, a significant disadvantage over a pro.
The skepticism in this thread is absolutely justified but I think it's important to note the lengths to which DeepMind has gone to address and assuage the fears of superhuman mechanical skills being employed in these games.
I watched all of the event live and I feel that that graph is deceptive. If a game is 15 minutes and has 3 main battles lasting 15 seconds each, and you use 100 average APM on non-battle time and 1000 APM during battles, your average APM will be 145 but you obviously have a superhuman advantage.
This is compounded by the fact that almost all of AlphaStar’s actions are “useful” whereas a significant amount of the human actions are spammy.
You will typically see a human select a group of units, and fast-click a location in the general direction they want the units to move (to get them started moving that way), and then keep clicking to continuously update the destination to a more precise location. Every click counts as an action. An AI can be perfectly precise and “clicks” the right place the first time.
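For anyone who wants to check the averaging arithmetic in the 15-minute example, here is a short Python version. The function name and parameters are just illustrative.

```python
def average_apm(game_minutes, battles, battle_seconds, calm_apm, battle_apm):
    """Average APM over a game with short high-APM battles and calm stretches."""
    battle_minutes = battles * battle_seconds / 60
    calm_minutes = game_minutes - battle_minutes
    total_actions = calm_minutes * calm_apm + battle_minutes * battle_apm
    return total_actions / game_minutes


if __name__ == "__main__":
    # 15-minute game, 3 battles of 15 seconds, 100 APM calm, 1000 APM in battle.
    print(average_apm(15, 3, 15, 100, 1000))  # 145.0
```

The 45 seconds of fighting contribute 750 of the 2175 total actions, yet they barely move the average.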
TLO seems to have a longer tail than AlphaStar in that graph though, so doesn't that imply that TLO peaked at an even higher APM, presumably during battles?
TLO is a Zerg player, so he probably makes a lot more errors when playing Protoss. Also, every top player estimates when to do a sequence of actions and spams it a few times to maximize the chance of execution. Meanwhile Alphastar only has to do that once.
Hm, should be interesting to force the AI to use input commands through a "filter", where it can only execute orders with human level precision. And something similar for input.
This graph is incredibly deceptive and I'm kind of upset they posted it. There are about 10-15 seconds of gametime where APM is incredibly important, and the AI boosted to 1000+ APM during those periods. During lulls it cruised at ~30 APM.
Meanwhile humans are literally spamming keys to keep their physical fingers loose and ready - they're not performing anything close to 400 useful APM on a regular basis (or in TLO's case - 1500 ... He kept walking his units straight into death while spamming keys).
I believe you are conflating latency and throughput. It might take AlphaStar 350ms to perceive a threat, but once perceived, it might issue many commands at high speed to respond.
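A back-of-the-envelope Python sketch of that distinction: latency only delays the first action, while throughput decides how many follow. The function name and the example numbers are illustrative assumptions, not measurements of AlphaStar.

```python
def actions_in_window(window_s, reaction_latency_s, actions_per_minute):
    """Actions an agent can issue in a fight of window_s seconds.

    The agent does nothing for reaction_latency_s, then acts at full rate.
    """
    usable_s = max(window_s - reaction_latency_s, 0.0)
    return usable_s * actions_per_minute / 60.0


if __name__ == "__main__":
    # Same 5-second fight, slower reaction but much higher burst rate.
    print(actions_in_window(5.0, 0.35, 1000))  # slow to react, fast to act
    print(actions_in_window(5.0, 0.20, 400))   # quick to react, slower to act
```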
How many of those 500 actions are actually useful? I haven't watched competitive StarCraft games for years but back when I did, rates were more like 300APM and even then the players basically spam clicked the background or selected random units non-stop and were probably only doing 50-100 actual effective actions.
> How many of those 500 actions are actually useful?
Exactly, a human doing 500 APM during intense moments is going to be way different than an AI bursting 1000 APM with pixel-precision during the most crucial moment in a game.
TLO spent a ton of time at >1000 APM and walked his army directly into enemy shots all the time. MaNa had much better control at ~400 APM. So APM is really irrelevant to control - for humans.
I suspect the AI, on the other hand, makes each action precise & count for something.
This graph, which I think was supposed to show that the AI was being "human", IMO is pretty damning. We saw the APM spike to >1000 during a critical moment and we saw the APM at <30 during lulls, so we know it uses its APM at important moments, presumably with important pixel-precise actions.
I suspect that once the AI becomes good enough it will be able to beat human players using a much lower total APM than human players. We're not quite there yet, but it just needs a little bit of time.
As a hopefully illustrative comparison, you could give any top player a day of play time per move against the top Chess AI being given a minute of play time per move and the AI will still win. That's how much better the AIs are than humans now. There's no reason in principle this won't be possible with StarCraft AI too.
The biggest issue with allowing the ai to have high APM is that it will inevitably learn optimal strategies that depend on that high APM, eg stalkers can take on far more immortals than we normally expect, and the AI will learn it this way, because the high APM allows a new stalker strategy (or rather, empowers an old one greatly) while not affecting immortals significantly. This also naturally means the AI leagues see a different game balance than the human leagues, leading to strategy divergence.
And then when you drop the APM limit, suddenly all the learned optimal ai strategies start falling apart, and the whole thing has to be relearned.
More annoyingly, there’s not much for human players to learn from innovative ai strategies that are based on inhuman accuracy of play (because we couldn’t possibly execute it).
What they're improving at right now isn't any specific AI model, it's how to train the AI models. It's meta-machine learning. I don't doubt that they can quickly train up a new model under different constraints now that they know how best to train up said models. It's not like they throw away all progress once they change some constraints; far from it.
I'm sure we'll get there too, I just think it's a little deceptive how they've measured the APM at the moment.
StarCraft is more random than chess, so I do think it's possible humans will always be able to take occasional games off of fairly constrained AIs just based off blind luck in picking counter builds, it will be interesting to see what % that is.
the 1000 apm thing is because of a bug in how apm is calculated in starcraft2. There is a hotkey to assign all your units to a new control group while also deleting it from all other control groups which TLO extensively uses, and while it just is one key-combination to press it records as 1 action per unit which was selected. The real APM of pro players averages at 250-400 and peaks at 600-700.
I stopped playing SC competitively because it's too stressful, both physically and mentally. Hitting 300 APM continuously in a game for up to 60 minutes at a time makes your hands go numb. And the adrenaline rush makes you want to go running afterwards. With games like LoL/DoTA at least you have a chance to take a break after a gank/farming/team wipes. With starcraft every decision has a significantly higher compounding effect.
From what I understand, the most common string instrument problems are with shoulders/neck/back, due to sitting for long periods of time with poor posture.
Most music should be playable without excessive risk of serious injury to arms / wrists / hands, but from what I understand very high notes on e.g. the violin are hard to play without using an over-flexed wrist, which is definitely a problem if playing music requiring such a position for long stretches of time, or many rapid switches between high and low notes.
Some of the string players with most risk are novices who have not been taught proper technique.
For professional PC game players, the design of the standard computer keyboard and furniture is absolutely terrible from an RSI perspective (worse than any common musical instrument, and without any of the design requirements of acoustic instruments as an excuse), and it is shocking to me that there has not been more effort to get more ergonomic equipment into players’ hands. The way game players typically use a computer keyboard is generally more dangerous than the way typists or e.g. programmers do. As someone who spent a few years thinking about computer keyboard design, I can think of at least a dozen straight-forward and fairly obvious changes that could be made to a standard computer keyboard to make it more efficient and less risky for game players. There is a lot of low-hanging fruit here.
Whether or not the equipment is changed, the most important single thing when using a computer keyboard (or any hand tool for that matter) is to avoid more than slight wrist flexion or extension, especially while doing work with the fingers. Excessive pronation and ulnar deviation of the wrist are also quite bad. Watching pro players, many of them have their wrists in an extremely awkward position while doing fast repetitive finger motions for hours per day without breaks, which is a guaranteed recipe for RSI.
Well I have heard of them, also looked up TLO mentioned above, he actually did get RSI and had to take months off.
"Liquid regretfully announces that Dario “TLO” Wünsch will be unable to play for the next few months due to the Carpal Tunnel Syndrome he experiences in both hands. He will however continue to be involved with E-Sports even as he takes a break from gaming to give his wrists time to heal. Sadly, this means that he will not be attending Dreamhack Summer or the Homestory Cup III as a player."
There would be an entire new dimension of decision making, in addition to good macro, where you have to prioritize actions. Will be interesting to see.
I said so before, but is it really a big difference from controlling a unit that can also only do one thing at a time? The agent controls itself just like another unit, with a constraint on the APM available to control other units. On the one hand, these APMs add a new parameter, if the constraint is implemented naively. On the other hand, if there are viable strategies against ultrahigh APM opponents, then the constraint is really limiting the dimensions of the decision space, and to good effect: finding viable strategies that take less effort. Hence such things are called "hyperparameters" (I think that's something different, but you know what I mean). Likewise, the game isn't so fast as to need 100 screen switches per second, if good planning allows batching and bursting actions.
I understand the spirit of the proposal but that would be like limiting a computer to add at most two numbers per second. It's OK if we want an interesting contest against humans but it wouldn't be a fair estimate of a computer math capability. It's also not the point of using computers to do math instead of a room full of accountants. I'm OK with the AI going as fast as it can and play superhuman strategies because it can be that fast. After all we'll not limit AIs output rate when we'll let them manage a country's power grid.
The purpose of limiting speed isn't to make an interesting contest, it is to accurately compare the "math" instead of the speed the math is done at.
It isn't surprising that it's fast; the surprising part is that it can make human-like decisions. The only way to compare whether its thinking is human-like is to restrain it from "brute forcing" the contest through speed.
The model has likely learned that the faster it does things the better the outcome. What it needs to be measured on is strategy.
But isn't the competency of a Starcraft player also measured on his/her speed?
In that context, you can't really measure strategy without accounting for timing/speed because a lot of tactics and strategies only become viable once the player has the required speed to actually realize them aka "micro".
Exactly, and due to superhuman micro, the AI has cornered itself into learning a small subset of the strategy space. It’s not good at strategy because it’s optimized itself for just getting into micro-handled situations.
It’s not good at strategizing with all the options available to it given its micro ability; it has “one” strategy that leveraged the micro as much as it could, and when given a strategic challenge by mana, it didn’t know what to do.
Yes, but the ultimate goal is to make an AI as "smart" as, or "smarter" than, a human. That's why they keep making AIs play against human players in Chess, Go etc. It's not to prove computers are faster than humans. It's to prove computers can be smart like humans.
They want to make an AI that can teach new ideas to humans. New strategies that human bodies are physically capable of executing, but no human was "smart enough" to think of yet. An example is when the AI built a high number of probes at the start. That's "smart".
The only way to train an AI to be able to come up with new ideas, is to force it to be "slow". Otherwise, it will just always do the easiest way to win, which is out-micro. There is nothing interesting about a game like that. That only shows the AI is fast, but it won't be clear that it's "smart"
That's exactly why it's so important to try and constrain the system to as close to human parameters as possible. You can't compare strategic prowess if the two players are playing at a completely different level. It'd be the same as saying MaNa is better than say, Maru (who has just won 3 GSL Code S's in a row), because he has stronger strategies against ~30th percentile players. It makes no sense.
Speed is only interesting as part of fair human competition. It's trivial for the AI to win with speed and it doesn't have to be remotely smart about it. Serral (dominant world #1) was easily beaten by 3 far weaker humans controlling one opponent - it wasn't even close. It's just stupid to even claim victory in those situations.
Making an AI that wins by outsmarting humans, on the other hand, is what we are all interested in.
That would be right if AI and human player had the same opportunities for micro.
They don't, because AI doesn't use physical objects to move stuff in the game. AI just "thinks" that this stalker should blink and it blinks. Human player has to deal with inertia of his hand and of mouse.
If you want a fair competition of micro - make a robot that watches the screen through its camera, moves the mouse and presses keys to play starcraft.
Then the bandwidth of the interface is the same for both players, and we can compare their micro.
You don't really need a real robot; just assign some "time cost" to each action, depending on spatial distance, the type of action, and whether it differs from the previous action. Humans are really fast when, for example, splitting a group of units, but performing multiple different actions on different areas of the screen, or even multiple screens, takes a lot longer. They don't need to fully emulate human behaviour, but getting somewhat close would really show how strong the AI is tactically and strategically without superhuman micromanagement.
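One way such a "time cost" could be modelled, as a Python sketch. Using Fitts's law (pointing time grows with the log of distance over target size) is my own choice of model, not something from the comment above, and the constants `a_ms`, `b_ms` and `switch_penalty_ms` are made-up illustration values rather than measured human data.

```python
import math


def action_cost_ms(distance_px, target_width_px, action_type, prev_action_type,
                   a_ms=50.0, b_ms=100.0, switch_penalty_ms=80.0):
    """Estimated time (ms) a human-like hand would need for one action.

    Pointing time follows Fitts's law: a + b * log2(distance / width + 1).
    A flat penalty is added when the action type differs from the previous one.
    """
    index_of_difficulty = math.log2(distance_px / target_width_px + 1)
    cost = a_ms + b_ms * index_of_difficulty
    if prev_action_type is not None and action_type != prev_action_type:
        cost += switch_penalty_ms  # assumed cost of switching between action types
    return cost


if __name__ == "__main__":
    # Repeating the same command nearby is cheap...
    print(action_cost_ms(10, 20, "move", "move"))
    # ...while a different command across the screen is expensive.
    print(action_cost_ms(1500, 20, "blink", "move"))
```

An agent would then only be allowed its next action once the accumulated cost of the previous one has elapsed, which penalises exactly the far-apart, mixed-command bursts that humans cannot perform.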
If we want to measure strategy, I agree with you, and out of curiosity we might do it. But the goal is winning, so is strategy important as long as it wins? The AI can take every shortcut it finds IMHO. People do take shortcuts.
Cars and planes bring us across the world exactly because they don't walk like people and don't fly like birds. Wheels, fixed wings and turbofans are shortcuts and we're happy with them. We can build walking and wing flapping robots but they have different goals than what we need in our daily transportation activities.
The problem with starcraft is - interface overhead is significant part of the game. AI doesn't have to cope with that - every click is perfect, and moving the mouse from one edge of the screen to the other takes no time.
If you want to make it fair - place an AI-steered robot in front of the screen, and make it record the screen with camera, and actually move the mouse and press the keys.
Then I can agree it's fair :)
But then of course AI would be incredibly bad.
Right now the advantage doesn't come from faster thinking, but from the much higher bandwidth and precision that the AI has when controlling the game. It's anything but fair.
With chess it's not a problem, because interface overhead is negligible.
Those are different engineering problems. I'm pretty sure that they could eventually build a pixel perfect camera and a fast pixel perfect robot mouse. They'll be at least as good as human eyes and hands, probably better. Done that, they'll keep winning.
It's surely interesting technology with positive impacts in a lot of areas, but is that the important part of the experiment? Humans need keyboards and mice to interface with computers, computers don't (lucky them.)
Sorry to insist on that analogy, but it looks to me as if my car should be able to fit my shoes and walk before I admit that it goes to another city quicker than me walking.
When you're trying to individually blink 30 stalkers at the perfect moment, when each has almost 1 hp left, latency is everything.
Camera has latency. Depending on various factors it takes even milliseconds of exposure for camera to gather enough light that it registers as a clear image frame. Human eye works on a different basis, but also isn't instant. You cannot cut that in software, human player cannot train to lower this. But AI doesn't need to do it - it has image provided as a memory buffer.
Image recognition has latency (both in the brain and in computer). Even as simple stuff as recognizing where the computer screen is as opposed to the background. It takes time. AI doesn't need to do it.
Muscles (engines in robot hands) have latency.
Mice and hands have inertia and can't be moved instantly. They have to be accelerated and stopped, and even if you have an optimal algorithm that is 100% accurate, it takes time.
It's not only hard to implement, it's also physically IMPOSSIBLE to do without introducing significant delays.
AI that is controlling the ui directly doesn't have to deal with most of these tasks, so it has a huge advantage in a game like starcraft. It's not that AI is so much better, it's that AI is high-frequency trading and human player is sending requests to buy/sell by telefax. By the time your request is processed the other guy had opportunity to do 10 different things.
If you want to focus on the part of the job that is doable now - sure, go ahead. But then don't abuse the unfair advantages you have and announce you "won". It's very low threshold to win in starcraft when your opponent has effectively 100 times the lag you do.
I'm sure someday we will have an AI that can beat a human player in StarCraft without abusing this advantage, and I'm pretty sure the fastest way to get there isn't to put a real robot in front of a screen, but to limit the interface bandwidth of the AI to a level similar to that of human players.
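One simple way to limit interface bandwidth is a hard cap on actions inside any sliding time window. The sketch below is an assumption about how such a cap could be enforced; the class name, the `try_act` method and the window sizes are hypothetical, and this is not how AlphaStar's limits were actually implemented.

```python
from collections import deque


class ApmLimiter:
    """Allow at most `max_actions` actions in any window of `window_s` seconds."""

    def __init__(self, max_actions: int, window_s: float = 60.0) -> None:
        self.max_actions = max_actions
        self.window_s = window_s
        self._times = deque()  # timestamps of accepted actions, oldest first

    def try_act(self, now_s: float) -> bool:
        """Return True and record the action if it fits under the cap."""
        # Forget actions that have fallen out of the window.
        while self._times and now_s - self._times[0] >= self.window_s:
            self._times.popleft()
        if len(self._times) >= self.max_actions:
            return False  # the agent has to wait, like a human would
        self._times.append(now_s)
        return True
```

A cap over a short window (say 5 seconds) on top of a per-minute cap would also suppress the brief bursts during fights, which an average-only limit lets through.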
> Sorry to insist on that analogy, but it looks to me as if my car should be able to fit my shoes and walk before I admit that it goes to another city quicker than me walking.
Let's remove the roads that we made specifically for cars and speak about this again :) Will your car move you through an untamed wilderness quicker than your legs? Possibly. Or not at all.
If I walk into a bullet train, slowly walk inside it, and walk out of it at the end of the route I will be even faster than the fastest car. Is it fair to say I'm faster than a car? After all it's not my fault the car doesn't fit inside that bullet train :)
We need to compare apples to apples, and comparing AI that doesn't need to deal with half the sources of latency with a human player that does, in a game where latency is very important - just isn't fair.
If you don't put any limits on the AI, it's not Starcraft any more.
You could make an AI which tries to hack the human computer to force a leave. That would also constitute a "win". Or one which hacks its own computer and displays "You win" immediately. Or one which tries to kill the human player, if we want to be really dramatic about it.
Chess and Go both limit computers to one move per human move, and they’re still very interesting games for AI. You’ll always have limitations. When you’re playing a game, the limitations are largely arbitrary, and you choose them to make the game better achieve whatever goal you’re after.
You are right, but the point here is to force it to win by pure decision making. Having an AI play a game was always about challenging ourselves to improve our understanding of intelligence. Limiting APM is just another way to force us to come up with new ideas.
So, in some sense this is a limitation of StarCraft. The goal of this project is presumably to have the AI play a game with high strategic depth. However, with sufficiently good micro, certain strategies that have low "macro depth" become unbeatable. So it's true the AI would win, but it plays in ways that do not expand our understanding of SC strategy; it is simply using a strategy that is simple to understand and impossible for a human to execute. Think of an aimbot in a shooting game: a human can try to play smart and attack from unusual angles, lay traps, or set up crossfires, but if the AI can simply get instant headshots, it can run straight to the objective and win. It would be a winning play, and humans understand why it would be a winning play (boringly so), but it is outside of human execution.
But it's important to be clear about what's being measured. If the AI can take and successfully win engagements that no human could because of their superior micro, it's not necessarily winning via superior strategy (as is claimed).
For now. Give them another month. This is like AlphaGo vs Fan Hui all over again -- people knocked that accomplishment at the time because he was just a master, not one of the top players in the world. Well, not much later, AlphaGo beat Lee Sedol, the best player in the world.
The ceiling here is going to be incredibly high, much higher than the level of play that people are capable of, even when restricted to a single window.
This doesn't nullify the observations that people are making here.
Part of the difficulty here is describing what a 'fair' match might be. Specifically, I think fairness has to do with a goal many people have for AI: to improve human play. The strategies in Chess or Go that were employed could conceivably be used by human players. There aren't any hard restrictions preventing humans from learning from that play, even if the AI is entirely superior.
It would follow that a 'fair' SCII match would employ strategies that humans could implement. Making extra workers, for instance, might be a real lesson from AlphaStar play. The insane stalker micro, however, could never be done by a human.
From this perspective, I think the important takeaways were:
* The AI leaned heavily on super-human stalker micro.
* The AI had some strategic blind-spots, namely the immortal harass.
* The APM comparison isn't terribly meaningful; a lot of human APM is spammy/twitchy button presses that don't do all that much, whereas the AI can presumably make each action count. There were also AlphaStar APM spikes that likely go along with the stalker-micro issue.
None of this really matters though. The AI is improving every day through training. Give it another few months of development and it'll be able to trounce humans under any "fair" set of handicaps you can think of, like limiting average and max APM throughout the game. We saw the same pattern with AlphaGo. There's no reason whatsoever to suppose that humans are fundamentally better at this game than an AI can be.
When AlphaGo first won, people said it wasn't fair because it was running on a whole cluster of computers. Well, within not much time at all, it was good enough to run on a single computer and still beat top humans. We are dealing with exponential progress here. The writing is on the wall.
It's tempting to assume the AI will just keep getting better and better, but that's not guaranteed, and I was happy to see that the DeepMind folks in the video clearly acknowledged this. In the game that MaNa won, it's possible that he did so by finding a strategy the AI agent had never encountered before, causing it to respond with nonsense (e.g. not building a Phoenix and pulling its entire stalker army back to deal with warp prism harassment). In a game with a strategy space as large as SC2, it's possible that an AI will never be able to saturate the space of viable strategies, and it will always be possible to find edge cases that the AI has no idea how to handle.
The point isn't that the AI won't improve or win with those conditions; I agree it likely will, and soon. The point is that the conditions of the match matter and that this one missed the mark.
It absolutely does matter whether the AI can use obviously super-human techniques, because then it's not nearly as interesting for human observers. I'd much rather watch an AI that was a strategic genius that won despite being hamstrung in terms of micro/techniques.
> There's no reason whatsoever to suppose that humans are fundamentally better at this game than an AI can be.
Lee Sedol was not the best player anymore at that time (not saying it wasn't an impressive/important achievement, but overstating it doesn't help either - the "beat best human players part" came later in 2017).
Lee Sedol was still top 5, certainly no worse than top 10 at the time. By all means, he wasn't the best or the most dominant, but the difference with the top was tiny.
I don't understand who's downvoting you, this is accurate. While AlphaGo/Zero improved quickly to superhuman play, we are just in this thread comparing timelines, so that is relevant.
What kind of evidence is going into this analogical reasoning? Do we also extrapolate similarly for other things? We went to the Moon in 1960s. Was Mars a month, or a year, or a decade away? Then we sent robots to Mars. Did we yet send any robots to Alpha Centauri?
Different problems have different difficulties. Solving simple problems quickly doesn't mean we'd also be able to just as easily solve the hard problems. Often the comparably simpler problems have the best reward/effort ratio and thus make quick progress, which doesn't need to be the case for hard problems.
Going to the Moon is a completely different endeavor than making an AI better at a game that it's already quite good at. This is a red herring.
If you had bet against AIs reaching parity with top human players in any previous game, whether it be Checkers, Chess, Go, etc., you'd have lost. I see no reason why StarCraft II should be any different.
We can reconvene in the comments here a year from now and see where AlphaStar is then.
It doesn't seem like hype to me -- it seems like a genuine, significant accomplishment. Sure, they might not be able to beat the best pro players consistently right now, but I suspect that is right around the corner. Would you rather they stay completely mum until they've reached that goal too? And why? I'd rather know now, and then be able to follow along as it gets better and beats higher and higher-ranked players.
This was my read as well. It seems that MaNa simply found a strategy that the AI had not found. Due to not having trained against it, the AI produced nonsense results. The commentators noted that the obvious response was to build a Phoenix and just completely shut down the harassment. The situation is similar to AlphaGo vs Lee Sedol, match 4.
One of the hardest parts about these kinds of human vs AI exhibitions is making sure the AI has explored the full possibility space, so that it can handle all situations. The techniques at play lack the ability to perceive a completely new situation and formulate a good response. (Though anyone who's lost to cheese in games they later learned easy counters for knows that humans, while better than state-of-the-art AI, aren't perfect here either.)
Mana got himself in the same situation where he was surrounded by stalkers on multiple sides, but this time the micro wasn’t so crazy that he couldn’t manage it, and he was able to take on one group at a time.
The immortal drop, while unanswered, was not really that effectual.
But it was answered: AlphaStar pulled a huge stalker army that was about to hit MaNa's base all the way back home to (attempt to) answer the drop, repeatedly. If you have more complexity to your army but fewer army units, as MaNa did, a delay like that is how you win the game.
That's what I said on Lobsters. They were always good at builds, micro, etc. The one thing they couldn't do was judge human intent, esp. if they were being misled (esp. time wasting). I was waiting for one of the players to try to screw with its head to see what it did. MaNa showed two gaps: the back-and-forth thing, and that it ignored the observers giving up constant strategy information. Then, he got the first win.
Now, the questions are how many more such glitches will show up and can they eliminate them with better algorithms?
And against human players up to Masters 3 or so :) When you're still using the all-army hotkey, defending with a small and precise group isn't happening.
That time, the ai didn’t really even try to engage. In fact, the ending of the match was marked by the entirely absent group of stalkers as the natural was engaged.
It’s likely safer to say the AI was confused in general at that point, possibly related to the camera change, but we didn’t really get to see the quality of stalker micro that game
"possibly related to the camera change, but we didn’t really get to see the quality of stalker micro that game "
In software, changes in assumptions can break what depended on them. There could be many assumptions in its neural net centered on full visibility. They should probably retrain all or just some from scratch with the camera change in from the beginning to see what happens. Then, it will be firmly encoded into the strategies over time.
The immortal drop let him keep AlphaStar occupied while he built up a critical mass of immortals (it becomes harder and harder to effectively micro stalkers against immortals as numbers go up, probably even for an AI), then let him put AlphaStar in an awkward position when it was camping near where the warp prism was hiding.
The results are obviously impressive, but even then there is a lot of work to do as far as learning efficiency goes:
"The AlphaStar league was run for 14 days, using 16 TPUs for each agent. During training, each agent experienced up to 200 years of real-time StarCraft play. "
MaNa probably played less than 2-3 years of Starcraft in his whole life (by that I mean 24hr x 365d x 3), and was learning with a much less focused/rigorous methodology.
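For a rough sense of scale, here is the arithmetic implied by the figures quoted above (14 days of wall-clock training, up to 200 years of game time per agent, and an upper bound of 3 years for MaNa). It is only a back-of-the-envelope sketch; the variable names are mine.

```python
DAYS_PER_YEAR = 365.25

agent_game_years = 200.0   # "up to 200 years of real-time StarCraft play"
training_wall_days = 14.0  # "The AlphaStar league was run for 14 days"
human_game_years = 3.0     # generous upper bound suggested for MaNa

# How much faster than real time an agent accumulated experience.
speedup = agent_game_years * DAYS_PER_YEAR / training_wall_days

# How much more game time the agent saw than the human upper bound.
experience_ratio = agent_game_years / human_game_years

print(f"experience gathered at ~{speedup:.0f}x real time")
print(f"~{experience_ratio:.0f}x more game time than the human upper bound")
```

So each agent accumulated experience at roughly five thousand times real time, and saw on the order of 70 times more play than even the generous human estimate.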
Another way to think about it is that a human brain is mostly doing transfer-learning, on top of a 99%-baked deep net that was wired up during foetal development from our DNA, where that DNA-persisted model has "seen" hundreds of millions of years of training data.
Humans don't have to learn to process, recognize, and classify objects in visual sense-data, for example. We can do that from the moment we're born, because we already have hundreds of precisely-tuned "layers" laying around in our brains for doing just that. We just need to transfer-learn the relevant classes.
This is a widely underappreciated fact when it comes to comparing the 'training experience' of humans versus bots. And it extends far beyond processing 'sense data': a human likely has some level of understanding of how the game works based on experience from other games they have played and from 'real life'. We know almost instinctively that 'high ground' is likely to give a combat advantage without having to test it in game.
Not only that, humans (and many other eusocial species) have an instinctual intuitional understanding of many aspects of game theory.
For example, humans, even from infancy, prefer games where it is possible to punish cheating (i.e. take revenge upon cheaters) to games where it is not. This isn't just "we're animals that have evolved to enact tit-for-tat strategies [by e.g. injustice triggering rage] because they lead to cooperation which leads to egalitarian utility"; this is actual analysis—instantaneous, intuitive analysis—of a system of rules, to notice, in advance of ever being slighted, whether you'll be likely to end up in an "unjust" social situation if you agree to the given ruleset. There is an "accelerated co-processor" of high-level abstract game-theoretic information—and layers to extract that information from sense-data—that ship as part-and-parcel of the human brain model. We never need to learn how to judge unfairness, any more than we need to learn how to see.
"humans, even from infancy, prefer games where it is possible to punish cheating...this is actual analysis—instantaneous, intuitive analysis—of a system of rules, to notice, in advance of ever being slighted"
All of our knowledge of how to play games and so on has come from our current lifetime. We do not have a "genetic memory" that means we have learnings from cavemen or some other such nonsense. Our DNA contains instructions on how to grow a human, it's not a mega hard drive with millions of years of collective memory.
If a 19 year old is good at Starcraft, he's good at Starcraft because he spent two or three years playing a shit load of Starcraft, and we are much more efficient at learning higher level strategies than AIs are. These AI agents need to try damn near every possibility to adjust their weightings for various actions. Humans understand pretty much the first time something goes wrong: oh, better not do that OR similar things again.
It's incredibly impressive that a given human can become GM level at Starcraft within a few years and to take an AI to that level takes 200 years of training, as well as an inhuman reaction time, perfect micro/clicking, etc. It shows how amazing our learning skills are.
We may not have "genetic memory" but a ton of human capabilities are baked in at the DNA level. Sure, we need to practice in order to specialise those abilities for particular tasks, but that's more of a calibration phase on a fantastically capable machine, rather than a construction phase.
Totally agree with how impressive humans are, though. In fact, one of the most amazing things to me about robotics is finding out how close to global optimal some humans can actually get.
The GP is underselling the fact that in the human years of being a pro player they think through many more games and may even dream of it. I certainly went to bed after a lengthy session with images of the game still in front of me. Although that might be more about micro, the macro skills are somewhat transferable from other "games". RTS simulate economy, amongst other things, after all.
GP's claim, "99%-baked deep net that was wired up during foetal development from our DNA", is also unfounded, if not completely overblown. I am far from a student of biology, much less an expert, but intelligence is still seen as an emergent property. The real kicker might be that organizing thoughts is a "game" in itself, one that is learned in development and constantly exercised. Talk about self-play.
I recently read a similar question about "inherent mathematical language", ie. capability, and the given opinion was that there is no consensus, except perhaps for basic addition, which I guess concerns vision, ie. seeing a set of things and knowing the count is +++++. That works only up to around +++++++ items at best, according to findings.
Perhaps a nit, but still fascinating: the human visual cortex finishes developing after birth. A newborn can't really distinguish between objects. The ability to differentiate, focus on and track objects is developed over the course of several months.
True. Humans are pretty unique in that regard, though; pretty much no other animal is like that. It's easier to understand human neonatal development if you just consider all humans to be born premature. (It'd be really interesting to know whether that's literally true: whether keeping a human baby in the womb for an extra few months would actually result in the same stages of mental development being passed that occur in a regular baby of that age who has been sensing and interacting with the world.)
I've read somewhere that we are basically born prematurely (as you said) because if we waited any longer then our enlarged head sizes would make delivery quite possibly fatal.
My brother was born a week or so after his due date; they induced labor for him for exactly this reason. Perhaps unsurprisingly, his head circumference was literally off the charts.
Maybe off-topic, but that's one side of the coin, and I suppose the other is that being exposed to more sensory input accelerates development, or even makes it possible (on higher levels of cognition). If this wasn't the case, why wouldn't we just be bigger and carry longer? Is size, as in megafauna, really that suboptimal for any more significant reason than being hunted by human hunters? I would almost say that longer pre-natal development would be suboptimal, because we'd either become bored, or supersmart, but anyhow superegoistic for lack of nurture.
Calling it premature is ironic, if we reach nominal maturity only after 10 or more years as far as fertility is concerned--the equivalent in AI would be the procreation of a neural net, perhaps after exploiting a bug in the game, breaking out to rewrite a better version of itself, or colluding with itself in self play. Yes, this is going off-topic.
> why wouldn't we just be bigger and carry longer?
The consensus in the evolutionary-anthropology community is that our hips (pelvic bones) have to be the size they are, in proportion to the rest of us, to make us able to walk upright. "Building bigger" doesn't really work, for the same reason that you can't make a giant robot—if you scale humans up, the pelvis would need to be made out of something stronger than bone to support the additional load.
The same is not as true, though, if you just make the person wider—because then you spread the same load over "more pelvis." (This is just a personal unfounded hunch of mine, but I think some human subgroups—e.g. midwestern Americans—who are at the genetic limits of baby head size, and who avoid C-sections, are currently selecting toward bigger-boned-ness.)
> I would almost say that longer pre-natal development was suboptimal, because we'd either become bored, or supersmart, but anyhow superegoistic for lack of nurture.
Keep in mind that we wouldn't be conscious for any of it. The development stage that "wakes you up" to the outside world would just occur later on, as occurs in animals with longer gestation periods (e.g. elephants, with a gestation period of 18-22 months.) This would give things like your ocular layers longer to finish developing, without really having an impact on the parts of your brain that learn stuff or think stuff.
Being born “prematurely” might allow for more flexible brain wiring. Adapting better to an environment quite distinct from ancient ones we had evolved in is possibly one of our key cognitive advantages compared to other animals.
Is there evidence for this? My mental model has been that DNA encodes more along the lines of hyperparameters: amount of gray matter vs white matter, locations of brain regions and folds, etc, but the connections between neurons, and their weights, were all learned. There isn't that much information you can stuff into DNA, after all.
Connections between neurons, the synapses, are encoded. So much so that they are given individual names. This is a fun one to read about to get an idea:
Human genome isn't even a gigabyte of data. That's less than a byte per neuron and a big chunk of that data actually has to go into "how to make a kidney cell" and "which way to route veins". So while some basics have to be hard-coded, it can't be remotely close to "99% transfer from ancestors".
That's not how any of this works. We do not have "millions of years" of information encoded into DNA. DNA doesn't store that much data. In fact, it's about 1.6 gigabytes only! And most of that information is basically a ruleset for growing proteins which become our body.
All the stuff we've learned about games and so on have come from our current lifetime. I don't have caveman memory for how to fight a tiger.
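The two genome-size figures quoted above ("isn't even a gigabyte" and "about 1.6 gigabytes") are consistent with each other if one counts a single copy of the genome and the other counts both copies. A rough check, assuming the usual approximations of about 3.1 billion base pairs per copy, 2 bits per base, and about 86 billion neurons (all approximate; variable names are mine):

```python
BASE_PAIRS_PER_COPY = 3.1e9  # approximate length of one copy of the human genome
BITS_PER_BASE = 2            # four possible bases (A, C, G, T) -> 2 bits each
NEURONS = 86e9               # commonly cited estimate for an adult human brain


def genome_bytes(base_pairs: float, bits_per_base: int = BITS_PER_BASE) -> float:
    """Uncompressed size of a DNA sequence in bytes."""
    return base_pairs * bits_per_base / 8


one_copy_gb = genome_bytes(BASE_PAIRS_PER_COPY) / 1e9  # a single set of chromosomes
two_copies_gb = 2 * one_copy_gb                        # both parental copies
bytes_per_neuron = genome_bytes(BASE_PAIRS_PER_COPY) / NEURONS

print(f"one copy: ~{one_copy_gb:.2f} GB, two copies: ~{two_copies_gb:.2f} GB")
print(f"~{bytes_per_neuron:.3f} bytes of genome per neuron")
```

Either way, the genome works out to roughly a hundredth of a byte per neuron, which backs the point that it cannot be storing individual connection weights.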
I said "deep net" for a reason. A DNN model almost always turns out to be far, far smaller than the training data that was used to create it.
For one example: any smartphone's face-recognition feature. Each such feature is a DNN which took millions of hours of face data to train... but the resultant model fits on an ASIC.
Our DNA doesn't directly encode such a model, but it encodes a particular morphogenic chemical gradient, and set of proteins, that go together to make specialized neural "organs" (like your substantia nigra, or your basal ganglia, or your suprachiasmatic nucleus, etc.) which manage to serve the same function to your brain that access to a pre-trained "black box" DNN model would serve an untrained NN in achieving transfer learning.
Our DNA is NOT a trained deep net, nor is it a deep net, period. Our DNA is a string of nucleotides which encodes proteins, which in turn carry out the series of tasks needed to create and operate all the structures of the brain and body.
The "training" of our deep net happens during our lifetime. We are not born with a trained deep net so your analogy that somehow we are born with a highly capable deep-net encoded into 1.6GB of DNA makes no sense.
Can you imagine how capable a human being would be if it was born into a world with no other humans or learning sources? Imagine a newborn baby born into a world with some accessible food/water close by, so it wouldn't die from lack of nutrition or wild animals, but crucially without any other humans. It would be utterly fucking useless: no language/reading means no way of assimilating new knowledge. That baby would end up being a totally incapable human, regardless of the DNA or structure of the brain.
As far as we currently understand, if infants aren't exposed to language and communication at a very young age, they are either incapable or severely stunted in terms of communication for the rest of their life.
My point is, that we are very much dependent on the learning that we get from the point of birth ONWARDS. We get the amazing capacity to learn from the structure of our brain and body, but we'd be absolutely incapable idiots without other people to teach us, our books, language etc. We understand "games" and game theory from playing games with other kids, we're not born with "game theory" encoded into our DNA as one other commenter seemed to think, the same for language learning, and everything else.
Anyway, the point of this whole debate was that it's incredibly impressive that humans can learn to play a game as complex as SC2 in a tiny fraction of the time it takes a cluster of GPUs using a huge amount of energy and resources. Not forgetting that we also have to use a physical body to control our actions in the game, which adds a whole other level of complexity since we have to understand how to manipulate a mouse/keyboard etc, whereas the AI is essentially acting directly on the game, like a human with a neural link. The other kicker is that if you just changed one aspect, like picking a new map neither player had seen, the AI would be sent hurtling back to square one, whereas the human would only be partially affected. This series of demos only makes me more impressed that, given the huge resources available to Google, they can only just about beat a human, and even then only after 200 years of training time and various other artificial advantages.
You are willfully missing the point. Animals have instincts. The complexity of humans does not make them an exception to this rule. There are in fact large amounts of brain function that are baked in at birth (or developed in a predictable timeline after birth -- humans are basically born premature). Humans are able to instinctively perform behaviors which are not taught, although the majority of critical behaviors in humans are socially learned. Feral children (like Genie) are functioning organisms with complex behaviors. They're just defective humans because humans rely on a distributed learning system called culture in order to do the work that biology cannot.
You are insisting that because humans do not have instincts at a certain level of abstraction (playing video games) that no part of these instinctive brain functions play a role in the development of skill at Starcraft. This is wrong. Abstract reasoning is not simply learned, but it is HONED by experience and neural development. An AI has to do an enormous amount of work in order to replicate functions that humans can already do. This is the basic visual problem in AI that stumped researchers in the 60s who thought that tasks like visual recognition, spatial rotation, etc would be trivial because they are trivial to evolved organisms.
You're relying on some kind of mental model where brains are just masses of neurons that form all of their connections and complexity after birth. This is ultimately a political idea, and it's wrong. No neuroscientist believes this. Brains have pre-defined areas (with fuzzy borders) and many behaviors do come baked into the template. Complex behaviors like language do not, perhaps, although even there, the underlying functionality that permits language is an evolved trait (which is why other animals can't learn language). Research the FOXP2 gene, as just an obvious example.
Edit: Your post contains "structures of the brain". What exactly do you think the structures of the brain are, if not evolved modular solutions to complex problems? Your visual center is somewhat trained after birth, but it already exists. The same goes for speech, motor control, and all of the other unconscious or semi-conscious processes that all humans (and other animals as appropriate) share.
One macro technique used by AlphaStar agents that is not used by human pros is building extra workers beyond currently exploitable capacity.
This gives them reserves when attacked and some workers killed. They can also ramp up mining at a new base quickly by moving the extra workers there.
Apparently the benefits outweigh the costs for these workers for AlphaStar. It will be interesting to see if some pros decide to adopt the technique and if it improves human performance as well.
Disclaimer: I do not have much Starcraft experience.
Workers mine 40 minerals per minute and cost 50, taking... 15 seconds to build? I forget. Workers beyond 24 provide zero benefit (better to send them to the natural).
Let's say you make 4 extra at a cost of 200 minerals and then lose 4 workers to harassment. You are out 200 minerals in both cases, but the prebuilt workers in the prebuilt case will mine an extra... 100 minerals? (40 + 30 + 20 + 10).
This doesn't take chronoboost into account though. I don't know, the gain is marginal, and the opportunity cost is having a smaller army (2 zealots for example)
Please correct my numbers if I've made a mistake; I forget build times and haven't played since HotS.
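The "40 + 30 + 20 + 10" estimate above can be written out explicitly. This sketch assumes the lost workers are rebuilt one at a time from a single Nexus with no chronoboost, and uses the rough figures from the comment (40 minerals per minute per worker, 15 seconds per worker); the function name and parameters are mine.

```python
def minerals_lost_while_rebuilding(
    workers_lost: int,
    minerals_per_min: float,
    build_seconds: float,
) -> float:
    """Income missed while replacing lost workers one after another.

    During the first build interval all lost workers are missing, during
    the second one fewer, and so on until the last replacement arrives.
    """
    per_worker_per_interval = minerals_per_min * build_seconds / 60.0
    return sum(
        missing * per_worker_per_interval
        for missing in range(workers_lost, 0, -1)
    )


# Figures from the comment: 4 workers lost, 40 minerals/min, 15 s build time.
print(minerals_lost_while_rebuilding(4, 40, 15))  # 100.0
```

With prebuilt spare workers this income is not lost, so under these assumptions the gain is about 100 minerals on a 200-mineral investment, before counting supply and the army you delayed to afford it. Plugging in different build times or mining rates only changes the result by a few tens of minerals.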
The numbers you cite are close enough that your estimations are good to work with (12 seconds to build, closer to 60 minerals at full efficiency but down to 40 for probes #17-24, etc)
The extra workers aspect was the most interesting decision-based adjustment AlphaStar made on conventional pro level wisdom of "standard" play. It has a couple of factors in play, that I trust the AI factored in and more and tested over several games for its long-term benefit to winning a game:
- every 8 probes you build requires a pylon as well. total cost of 500 minerals
- workers are safer in the main than in an unoccupied natural (long distance mining) to harassment and pressure
- when your expansion completes, having 4 workers vs 8 workers vs 16 workers potentially has huge impact to the immediate spike in income
- what you mention -- the prebuilt workers will dampen the impact of most worker harassment to purely the resource cost of the lost workers.
My guess was that well executed harassment by an opponent in practice games put AlphaStar in very limited situations with a crippled economy that it couldn't fight its way out of, so this was a catch-all harassment "counter" -- it's ok if you kill a few probes, at least it won't throw off my economy completely and I can still continue my overall gameplan.
After that I think the next most important aspect was planning ahead for a bigger income spike when their expansion was done without waiting to build out another 16 workers after the nexus was ready.
Yeah that stalker micro really showcases a particular advantage leveraged by the AI.
I'd love to watch the results of constraining the AI so instead of seeing the whole map at once it has to pan around the same way a human would to get updated information on each battle. Counting those "info-gathering" window pans against the actions tally might yield slightly fairer APM metrics.
(EDIT: Turns out they built a new agent for game 11 to do just that)
One of my biggest beefs with strategy games of this genre occurred around the time sprites went 3D and the player viewports got smaller (presumably to showcase all the cosmetic detail, and since it became harder to distinguish between visuals when zoomed out farther). I always feel too constrained on the modern games - like I can't see enough of the map at once. In my opinion that "full size viewport" gives a multi-tasking edge to the engine that the player doesn't share (beyond the human cognitive overhead from context switching you already pointed out).
On the other hand, I find it fascinating that our AIs have become strong enough at our games that we're having to handicap them to avoid players crying foul that they're not fair.
I agree. Most RTS games feel constrained because of the limited viewport. Supreme Commander has a nice feature where you can zoom all the way out at any time.
And a very important part to SupCom's zoom feature is that at a certain zoom level it switches to a rich visual overlay of unit icons and pending/queued orders.
I would agree with that. If you take a look at the exhibition match replay, there are some cases where it makes objectively suboptimal decisions. We couldn't see this during the live stream, but the double immortal warp prism caused AlphaStar to bring back its entire army from across the map, when a few units at home would have been enough to defend. It even kept trying to blink its stalkers to a place where the warp prism couldn't be reached. Perhaps this version with the limited viewpoint hadn't been trained on enough games?
Also worth noting that it starts by imitation learning from pros. I'd be curious to see if the macro can be learned without imitation; a much harder challenge. Also, playing with full visibility as was mostly the case in the demonstration is quite lame...
That's still a large advantage that humans don't have access to. Not just in the "pitiful humans can't take advantage of such a large viewing area" sense, but literally the game will not let human players zoom out that far.
Also I wonder how it handles invisible units. Because as a human player you can see the shimmer if you look close. Can it see that or are they just totally invisible to it?
I wonder if that would let you win with something like mass dark templar with phoenixes to snipe observers. You could run right past it, and it could never anticipate you.
Or better yet, imagine zerg where you can burrow every unit.
It would be the same as with a player: as soon as you do something with those invisible units, or imply that you have them (e.g. a DT shrine), it's sufficient to say that invisibility is in play, and appropriate tools should be used. It's not like you can do anything about dark templars even if you see the shimmer, if you have no sight, beyond body blocking.
Regardless, the article describes cheesing as the common tactic in early iterations, with economic-play being learned later — one of the described cheeses is dt rushes, which the AI apparently learned to deal with, so it should have some understanding of invisible units (alternatively it learned to ignore the dts and base trade or something).
I don’t think the shimmer is useful enough for its absence to be a significant loss for these prospective AIs' quest for world (SC2) domination.
If you learn, why not learn from the best, the pros? These people already have spent years figuring out what works and what doesn't. Why not draw from that pool of knowledge instead of spending extra time going through the same motions?
Because then you don't know whether the AI learned by experimentation or by mimicking. To draw an analogy, imagine the difference between somebody reading and following an algorithm to solve a Rubik's cube, as opposed to somebody being handed a Rubik's cube and experimenting. If expert-level strategies can be reproduced without being explicitly shown to the person/AI, then it means something is going right in your methodology.
An AI trained from human strategy might end up more limited than one that could learn from scratch. It could be stuck in a local maximum of play and be unable to escape.
An AI technique that requires a large dataset of pro play to learn will be much more limited in terms of applying it to other games.
It seems like in some cases at least it didn't have to move the camera (it had direct interfaces). In some of the stalker micro battles (especially in game 3 or 4?) the battles were larger than the screen space. It would not have been possible to micro that well if the control interface limited what you can control or where you can place units.
This is a great point, and something that seems a bit lost in the discussion:
In StarCraft 2, the game IS the interface. That is to say, the developers have constructed the game in such a way as to be difficult to control; and human mastery of the interface is a large percentage of the game. Strategy in the game is important, of course -- but this is not chess, where human beings are not limited by the interface of the game. In StarCraft, you are intentionally given a limited interface to monitor and control a gigantic game while under incredibly tight time controls.
And I should also note that Blizzard is extremely reluctant to add features that make it easier to control the game. I have a friend who works on the StarCraft 2 team. We talked at length about this one feature that he designed and proposed for the team to make a specific aspect of the game friendlier towards players. It was turned down for exactly the reasoning above -- the game is the interface. By making the game easier to control, it disrupts the entire experience; a StarCraft 2 that is easier to control is no longer StarCraft 2.
That would actually be an interesting thing for someone from Blizzard to do: get two similarly skilled high-level players, and compare the win/loss rate by playing two 7-game matches, with each player having one match with a 10% increased view size, and see what the impact is.
Essentially try to quantify the advantage of increased view area.
Yup, exactly. To add onto this, for people less familiar, there's a non-stupid reason for this: economy of attention.
Attention/APM is often called the "third resource" (after minerals and gas), spending it wisely when you have several areas at any given time that could use attention is part of the strategic and tactical decisionmaking. For example, usually in a battle you wanna be paying most attention to the fight rather than your base, but sometimes it's actually better to jump out back to your base to increase production or economy, and knowing which situation is which can be challenging.
Obviously, if you make the game mechanics too easy to control (letting the computer do more of the work), then this part of the game becomes less interesting, because you don't have to weigh trade-offs as much anymore.
It's a question of whether "played with human level latency and precision" should be a part of the rules of the game we are making the AI play.
I would say yes, because StarCraft was very clearly balanced for human players. We already saw some indication that when played with super-human micro, mass blink stalkers is a stronger strategy than when humans are in control. Without the active intervention of game balancing, RTS metas tend to devolve into "mass one or two units", which is what happened to every Command & Conquer game (and why SC is a respected eSport while C&C is not).
I suspect this will happen when you have agents playing with parameters that don't match what the game was balanced for. The strategic landscape will shrivel up and the game will cease to captivate us.
APM is one thing. I am curious what would happen if it could only see a limited view (as in the last game with MaNa, which it lost to him) and physical click dynamics (i.e. clicking + gaussian noise as an action, instead of giving direct commands). That way there will be misclicks, preventing this super-efficient Stalker micro.
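To make the "clicking + gaussian noise" idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the names, the pixel units, and especially the rule that the noise grows linearly with APM. That rule is just one possible handicap, not anything AlphaStar or the game API is known to implement.

```python
import math
import random


def click_sigma(apm, base_sigma=2.0, sigma_per_100_apm=2.0):
    """Standard deviation of the click error in pixels (made-up model)."""
    return base_sigma + sigma_per_100_apm * apm / 100.0


def noisy_click(target, apm, rng):
    """Return where the click actually lands: the target plus Gaussian noise."""
    sigma = click_sigma(apm)
    x, y = target
    return (rng.gauss(x, sigma), rng.gauss(y, sigma))


def hit_rate(apm, radius=8.0, trials=2000, seed=0):
    """Fraction of clicks landing within `radius` pixels of the target."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        cx, cy = noisy_click((0.0, 0.0), apm, rng)
        if math.hypot(cx, cy) <= radius:
            hits += 1
    return hits / trials


for apm in (150, 600, 1000):
    print(apm, "APM -> hit rate", round(hit_rate(apm), 2))
```

Under a model like this, bursting to 1000+ APM stops being free: most of those clicks would miss a stalker-sized target, so the agent would have to trade speed against accuracy the way a human does.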
Also these wins are not using the same inputs that humans receive (i.e. the on-screen image) or the outputs that humans are allowed. They instead use the PySC2 APIs, which have much more flexibility, perfect information and no constraints of limited screen real estate and pixels. There is a claim in that article that they have another version being trained that uses on-screen information only, but I still don’t know if the AI is allowed to bypass the physical constraints of a controller. So if the AI has access to a superhuman controller, you will see the AI performing superhuman actions like many commentators have described here.
Perfect information is a bit of a stretch. There was still fog of war. The AI just played as if the portion of the map visible and actionable at any point in time was the whole map. They retrained with a restriction to a given locus of attention that can change, akin to a screen the player is looking at and acting on.
This is exactly what I think. I'd like to see how AlphaStar reacts to a "cannon rush" or other weird build orders where you need to be "smart" to counter them, and can't just rely on insane, non-human micro.
Surprisingly not. The trick is usually to build pylons (or other cannons) such that they protect the cannons from being attacked by probes. Building them out of sight is usually too slow as a rush.
Sometimes you may see a photon cannon used to deny an enemy's natural expansion to try to gain an economic advantage. Depending on the map and matchup, it may also complicate the enemy's early attempts at scouting and aggression.
Typically, you don't see more than 1-2 photon cannons, because you don't usually want to "over-invest" and lose what advantage you gain.
That's inaccurate. The best cannon rushers generally build them visibly, but not just anywhere. If you look at someone like QuasarPrintf as an example, a player that keeps a fairly high rank on an account that literally only cannon rushes (there is no anonymity, no pretense about what's going to happen), he wins despite people knowing what's going to happen and putting the cannons mostly well in view of opponents on a lot of maps.
Printf is part of a fairly small group of cannon rushers that don't simply see it as just another cheese, because what generally defines a cheese strat is that it can be easily countered if you know it's coming; not so with their cannon rushes.
Now, with that said, Printf (or any other "I always cannon rush" player) isn't winning tournaments, but that's partly because not many players decide that they want to stake their development on any one strat like that, and if they do, it'll likely be one that's deemed more legitimate by the community.
The macro seemed fine -- AlphaStar usually had more workers than the human opponent, in every game, and was producing more army. The suboptimality seemed to be in army composition (blink stalkers) and strategic decision making (pulling all of a superior army back home to defend a single warp prism drop).
This is super deceiving and I'm kind of upset they posted this image, knowing it would mislead people not familiar with the game. The AI sits around during lulls at <30 APM - meanwhile MaNa and TLO were literally spamming keys to keep their fingers warm, not actually doing anything.
During the fights, the critical moments when MaNa would top out at ~600 humanly inaccurate APM (this is 10 inputs per second), the AI would jump up to over 1000 - we don't know exactly what it was doing, but it was presumably pixel-precise. Meanwhile the physical inertia of the mouse is a challenge for humans at that speed - imagine trying to click five totally different places with perfect precision in a single second.
APM gets inflated by counting a single action as multiple separate actions. For example, a Zerg player may want to turn larvae into 30 Zerglings; they do this by pressing one button and holding it down as the UI repeats a separate action for each larva transformed.
By comparison selecting a single stalker, and having it jump to a new location is much more effort, but counts as fewer actions.
A huge part of a human's APM is meaningless spam, for example right-clicking the same unit multiple times to attack it, or setting the same waypoint thousands of times in the early game when there's nothing to do. The computer might be at double the human's effective APM, if only we had a credible way to measure that.
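As a rough illustration of the gap between raw and "effective" APM, here is a Python sketch that simply drops back-to-back repeats of the same command on the same target. This is only a crude heuristic I made up for the example, not a credible measure: the action log format (time, command, target) is hypothetical, and it would wrongly discount legitimate repeats such as the held-down larva key.

```python
def raw_apm(actions, duration_s):
    """Every logged action counts, spam included."""
    return len(actions) * 60.0 / duration_s


def effective_apm(actions, duration_s):
    """Count an action only if it differs from the one right before it.

    `actions` is a list of (time_s, command, target) tuples in time order.
    """
    effective = 0
    previous = None
    for _time_s, command, target in actions:
        if (command, target) != previous:
            effective += 1
        previous = (command, target)
    return effective * 60.0 / duration_s


# Five spam right-clicks on the same unit, then one blink, within 6 seconds.
log = [(0.1 * i, "attack", "unit_7") for i in range(5)]
log.append((0.6, "blink", (10, 12)))

print(raw_apm(log, 6.0), effective_apm(log, 6.0))
```

Even with a filter this naive, a spam-heavy human log shrinks a lot while a log of distinct, targeted commands would barely change, which is the asymmetry being described.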
AlphaStar interacted with the StarCraft game engine directly via its raw interface, meaning that it could observe the attributes of its own and its opponent’s visible units on the map directly, without having to move the camera - effectively playing with a zoomed out view of the game
Additionally, and subsequent to the matches, we developed a second version of AlphaStar. Like human players, this version of AlphaStar chooses when and where to move the camera, its perception is restricted to on-screen information, and action locations are restricted to its viewable region.
I was really curious whether they would attempt moving the camera like a human. Sounds like it's still a work in progress, but very exciting! Even this isn't enough to make it fully like a human player, as I believe it is still getting numerical values for unit properties rather than having to infer them from the pixels on the screen. But it seems possible to fix that, likely at the cost of drastically increasing the training time.
The benefit of using pixels, of course, would be that the agent would become fully general. It would probably immediately work on Command & Conquer, for instance, while the current version would require deep integration with the game engine first. But I think the training time would be impractically long.
The live game that was just played was against this version of AlphaStar. Mana did win, but it was by exploiting some poor defense against drops and hard countering the stalkers he knew AlphaStar favours. The AI still looked very good and the developers claimed that this version of AlphaStar wasn't significantly weaker than the versions which didn't have to use the camera.
Dealing with perfect blinking is basically impossible, since you can blink back your units right before they die. Stalkers are balanced around the fact that HUMANS have limits to how well they can micro.
While the "skill cap" on blink stalkers is extremely high, there are many hard counters that can stop even perfect blink micro. MaNa won because he went for one of these. Immortals are the perfect hard counter to stalkers because:
- cost-for-cost, they are more efficient in a faceoff (resources)
- immortals are space-efficient dps (damage per second) in a battle. In a given battle, an army of 4 immortals is far more likely to all be in range of an enemy and doing damage than an army of 8 stalkers bumping against each other trying to get to the priority target
- immortal shots do not have projectiles, but are instant. No matter how perfect your stalker control, once an immortal targets a stalker, it is guaranteed to take 30+% of its hitpoints in damage.
The last point is very important. Once MaNa had 3+ immortals, even with perfect blink micro, a little bit of target fire and timing micro on MaNa's part allowed him to slaughter the stalker army one stalker per volley, while it takes them longer to clean up the immortals (especially with shield battery support).
Another thing glossed over in this discussion -- AlphaStar did more than classic blink micro. It did a very technical maneuver (the casters briefly allude to it) of triggering the barrier on one immortal with a single laser, then focusing all fire on an immortal whose barrier was already down from a previous iteration of this tactic, and then walking away until the barrier has worn off (while blink-microing weakened stalkers). Repeat. This is a detail of increasing the efficiency of trading stalkers with immortals that humans don't often even think about, let alone execute (because good blink control is often more impactful). That AlphaStar came up with this shows that it's not just about perfect execution of micro, but also perfect understanding of micro.
There was a "perfect zergling micro vs siege tanks" bot some time ago that would micro lings away from the one that was being fired at by the tanks, thereby negating all the splash damage. The effect was insanely powerful.
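I don't know how that bot was actually written, but the core idea fits in a few lines of Python. In this sketch (all names and the geometry are my own assumptions), every unit inside the splash radius of the targeted unit is pushed straight outward until it sits on the edge of the radius; the targeted unit itself stays put.

```python
import math


def split_from_target(units, target_idx, splash_radius):
    """Return new positions with every non-target unit outside the splash.

    `units` is a list of (x, y) positions; `target_idx` is the index of the
    unit being shot at. Units already outside the radius are left alone.
    """
    tx, ty = units[target_idx]
    moved = []
    for i, (x, y) in enumerate(units):
        if i == target_idx:
            moved.append((x, y))
            continue
        dx, dy = x - tx, y - ty
        dist = math.hypot(dx, dy)
        if dist >= splash_radius:
            moved.append((x, y))
        elif dist == 0:
            # Stacked exactly on the target: pick an arbitrary direction.
            moved.append((tx + splash_radius, ty))
        else:
            scale = splash_radius / dist
            moved.append((tx + dx * scale, ty + dy * scale))
    return moved


print(split_from_target([(0, 0), (1, 0), (5, 0)], target_idx=0, splash_radius=2))
```

Doing this for dozens of units on every tank shot is trivial for a program and hopeless for a human, which is why the result looks so lopsided.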
But as you say, showing that a bot can have perfect micro is not very interesting. Of course a computer can have better control of well defined tasks like moving a unit away just before it dies, especially doing so for many different units concurrently. What is interesting is the wider strategy and how the computer deals with imperfect information.
The interesting part to me is that, as far as I understand, the AI figured out this strategy by itself, basically deciding that it would be a good way for it to win games, rather than being specifically programmed to do it. That's actually pretty cool!
Other than that, I agree, and am also much more interested in what happens when you have a more level playing field (using camera movement rather than API, limiting reaction times and CPM, etc). I look forward to future matches where this happens.
I think there is some debate about what the neural net did and what was hardcoded. So far all StarCraft AIs consist of hardcoded intelligent micro ruled by a neural net that picks one out of fewer than 100 possible hardcoded choices. And things like "expand", "scout", "group units", "micro" are hardcoded outside of the neural net, part of the API in fact. When the researchers said they only used 15 TPUs for 14 days on an LSTM, this makes me think they really narrowed down the search space of the neural net and hardcoded a lot of the micro or at least trained separate micro nets.
Not really. The version which learned from scratch was scrapped as it didn’t work at all. This version learned by observing pros. So it didn’t learn by itself, it imitated and perfected pro players.
It was not programmed to do the thing, but all these tactics were in the seed replays from which the agent started its learning. So it did not actually figure out the move _by itself_; it only found it useful.
I'm curious, would the AI be able to see cloaked units? In SC1 you could see them (I think SC2 is the same), but it was very difficult. How does the 'raw' interface expose that subtlety?
This is actually a great question. Like what does it mean for a unit to be cloaked?
If humans can, under ideal circumstances, see cloaked units... Maybe the only mechanic that shows up (like for bots or an API) is the inability to be targeted using an attack command (i.e. you can still be hit with splash damage from ground targeting)
My understanding is that the AI sees things via an API the game exposes, so presumably cloaked units are completely invisible to it until they're revealed.
yeah I was disappointed to discover it worked this way.
don't get me wrong, it's a major accomplishment in AI regardless, but it's a significant advantage and it would be easier for me to appreciate the AI's skill if I didn't have to keep reminding myself that it can see the whole map at once. it's such an information advantage.
Actually, I would say this might be the strength of AI from another perspective: the ability to observe and monitor global information without losing attention. Or in other words, attend to the whole picture from get go without being overwhelmed.
While it is an unfair advantage in competitive gaming, in more realistic settings there is no requirement that an AI have only 2 eyes. It can have as many as it can handle, while humans can't scale the same way.
While that would be amazing if true, I'm pretty sure if you take away the stalker blink micro AlphaStar loses hands down to humans. This isn't taking away from Deepmind's victory at all, but I think micro was what made the AI come out ahead in this one. In many of the games, Mana had much better macro only to lose to blink stalkers.
You play the game as it's written. Come back with another version of StarCraft that isn't so micro-intensive and we can see how the AI does on that.
Chess and Go don't have any form of micro and AIs are nevertheless dominant there.
I'd say, give AI development another year and I wouldn't expect there to be any kind of game, in any genre, that humans can beat AIs at. Whether it's Chess, Go, other classical board games, Civilization, MOBAs, RTSes, FPSs, etc.
> Chess and Go don't have any form of micro and AIs are nevertheless dominant there.
Yes, but chess and go have a tiny problem space compared to something like Starcraft. People want to see an AI win because it’s smart, not because it’s a computer capable of things impossible for humans. If the goal was perfect micro they could write computer programs to do that 10 years ago.
Then maybe we need a better game than StarCraft to test this on? Some kind of RTS that's less micro-heavy, perhaps? Maybe even an RTS where you can't give orders to individual units at all, like the Total War series? You can't fault the AI for winning at the game because of the way the game itself works.
Even if you limit the AI to max human APM, it's still going to dominate in these micro-heavy battles because it's going to make every one of its actions count.
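For reference, a cap like that is easy to express. Below is a minimal sliding-window limiter in Python. It is only a sketch under my own assumptions (class name, 5-second window), not how DeepMind actually restricted AlphaStar. The short window matters: a cap averaged over a full minute would still allow a huge burst during a single fight.

```python
from collections import deque


class ApmLimiter:
    """Allow at most `max_apm` actions per minute, enforced per short window."""

    def __init__(self, max_apm, window_s=5.0):
        self.window_s = window_s
        self.max_actions = int(max_apm * window_s / 60.0)
        self.times = deque()  # timestamps of recently allowed actions

    def try_act(self, now_s):
        """Return True and record the action if it fits under the cap."""
        # Forget actions that have slid out of the window.
        while self.times and now_s - self.times[0] >= self.window_s:
            self.times.popleft()
        if len(self.times) >= self.max_actions:
            return False
        self.times.append(now_s)
        return True


limiter = ApmLimiter(max_apm=120)
# The agent attempts 15 actions in 1.5 seconds during a fight.
allowed = sum(limiter.try_act(0.1 * i) for i in range(15))
print(allowed, "of 15 actions allowed")
```

The point above still stands, though: a limiter only restricts how many actions get through, not how good each one is.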
> Even if you limit the AI to max human APM, it's still going to dominate in these micro-heavy battles because it's going to make every one of its actions count.
Right, and we saw that with the incredible precision of the stalker blink micro. There are many ways you could make it more comparable to humans. They have already tried that by giving it an APM limit.
> You can't fault the AI for winning at the game because of the way the game itself works.
But it does make the victory feel hollow when it wins using a "skill" that is unrelated to AI (having crazy high APM with perfect precision because it's a computer). Micro-bots have been around for decades, and they are really good. The whole point of this exercise is to build better AI, not prove that computers are faster than humans.
It would be like if they wanted robots to try and beat humans at soccer, and the robots won because they shoot the ball out of a cannon at 1000 KPH. They win, but not really by having the skills that we are trying to develop.
I just can't help but feel that nothing AI does will ever be good enough according to this mindset, i.e. true "intelligence" is by definition things that computers cannot do.
Beating the world champion in Chess was, at one point, considered an impossible achievement for computers. Now it's considered so routine it doesn't even count as AI according to many. And in a few months when AlphaStar is beating top human players without having to use APM or viewport advantages, what will the next goalposts be?
The point is, it's like being impressed by a calculator because it can multiply two massive numbers faster than we can... no shit, that's the whole reason we use computers, because they calculate faster than we can...
There's nothing impressive in coding something that can execute something far faster than a human, or be so accurate and beat a human. There were Quake 3 bots that could wreck any human alive 10 years ago because they react in milliseconds and shoot you in the head perfectly. So what? It's obvious a computer can do that. It's like being surprised that a bullet beats a human in a fight, that's by design.
I would be impressed if a computer learned from scratch without knowing anything about the game beforehand, about the controls, or anything else, with ordinary human limitations. Using vision processors to look at a screen to see the inputs and controlling a physical mouse and keyboard. That would be impressive. But watching a computer do perfect blink micro at 1500apm is just underwhelming, since that isn't new tech, you could hand code that without deep nets.
> The point is, it's like being impressed by a calculator because it can multiply two massive numbers faster than we can
Yeah, exactly. And when calculators first came out, people were very impressed by them. They upended entire industries and made new things possible that had simply never been possible before with manual calculation. When you're pooh-poohing the entire computational revolution you might want to take a step back and reconsider your viewpoint. It only seems not impressive now because we were born in a world where electronic calculation is commonplace and thus taken for granted.
If you don't find this achievement impressive, then go look at some turn-based game where reaction time is eliminated entirely that computers still dominate at, like Chess or Go. The AIs are coming. Or give it a few months and they'll come back with a version hard-limited to half the APM of the human players and it'll still dominate. It's clear which way the winds are blowing on this. People who bet against the continued progress of game-playing AIs invariably lose.
> Or give it a few months and they'll come back with a version hard-limited to half the APM of the human players and it'll still dominate.
And this is exactly what is being argued here. Let's see that in particular, not a demonstration that computers are faster than humans. Of course they are. Whoever argued that, ever? This has been known and envisioned even before calculators were invented.
What people here are arguing with you for is that we want human-level limitations of the controls for the AI so it can clearly win by better strategy.
> I just can't help but feel that nothing AI does will ever be good enough
It can be good enough in a certain problem space, such as chess. But unlike chess or go, which are purely mental games, Starcraft has a large physical component (vision, APM, reaction time). That can make it hard to determine when an AI has “mastered” this RTS. Like you said, it may be a few more months (years?) before AlphaStar can master Starcraft on a “mental” level. The physical component is trivial for a computer, so mastering that is not much of a milestone.
Depending on how you define Chess, seeing the pieces and physically moving them is part of it as well. Chess-playing AIs haven't been required to have robot components because that's not the interesting part of the challenge of Chess. I'd argue the same is true of StarCraft, even more so, given that it's an innately computer-based game in a way that Chess is not. It seems arbitrary to require the presence of an electronic-to-physical bridge in the form of a robot only to then operate physical-to-electronic bridges in the form of a keyboard and mouse. Just let it run via the input devices directly. Give it some years and humans will be able to do this too.
In other words, this isn't an interesting handicap to apply.
> It seems arbitrary to require the presence of an electronic-to-physical bridge in the form of a robot only to then operate physical-to-electronic bridges in the form of a keyboard and mouse.
It's not at all arbitrary. An SC2 match is won by a combination of strategy and the reflexes and physical quickness with which the actions are executed.
The whole point is to even the playing field in the area of the physical limitations so that only the strategy part is the difference. You know, the "Artificial INTELLIGENCE" part?
Is an AI that wins at Starcraft only because it has crazy high APM really going to help get to the next X? We could have built that 10 years ago. All it proves is that computers have faster reflexes than humans. That won’t help them become problem solvers for the future.
You seem to forget the way it learned to play every part of the game (not just micro fights). That is, not by having any developer code any rules, but simply by "looking" and "playing".
That's the great accomplishment and nothing like that could have been done 10 years ago.
What makes this interesting is if they can make a computer program better at Starcraft strategy than a human. How they did that is irrelevant. If having developers code rules makes a better AI than deep learning, then the former is the most impressive solution. What they did is a great accomplishment and the AI they created was amazing, but I feel like the faster-than-humanly-possible micro makes any accomplishment hollow, because that is really nothing new.
If they beat human performance in this (non-AI-building) field by humans painstakingly coding rules for specific situations, then that's cool I guess but not groundbreaking, because the solution doesn't generalise.
If they beat human performance in a field heretofore intractable by software by throwing the basic rules and a ton of compute at an algorithm and then waiting for six weeks while the algorithm figures the rest out by itself, then that absolutely is qualitatively different.
The reason being, of course, that if they can find an algorithm that works like this across a wide enough problem space then eventually they'll find an algorithm which will work on the question of "build a better algorithm." After which, as we know, all bets are off.
If you think the how is irrelevant you are completely missing the point of this exercise. Maybe to you only the result matters but for every other task and humanity the how matters.
Simply imagine next taking on a different Game like one version of the Anno series.
If developers did it by hand, you would need 50 devs sitting there for probably a couple of months, figuring out the best rules and their sequence, and putting them in. That is about $20 Million just to get a similar AI for the next game.
Compare that to downloading all available replays, which requires maybe 2-3 data scientists to get the data into shape, and renting some compute in the Google cloud, and you get the same or a better result for probably half a million $.
Watch and learn from data alone is why modern machine learning is considered a revolution and novelty. Buying compute time in the cloud is in comparison (to devs and hand coding) dirt cheap and the results are often better.
Deepmind is not working on this problem for the benefit of gamers or the Starcraft community. Making the perfect bot is not the aim. Tackling the next hurdle, next hardest problem in machine learning is. On the way to become better at generalizing the learning algorithms.
Speed of play is a fundamentally important gameplay mechanic of any real-time game. One of the main reasons the pros are better than amateurs at these types of game is because they play and react faster.
And yes, of course computers are much better at doing things more quickly than humans. It's not even remotely close for us. The AIs are clearly better. It's not cheating either; they are legitimately better at it than us.
It sounds like you're simply objecting to pitting people up against computers in real-time games entirely.
So all they really proved is computers are faster than humans. I knew that before this started.
The Deepmind team knows the challenge isn’t to beat humans at Starcraft. That is trivially easy with the advantages you mentioned. The challenge is to be better at strategy than a human. That is why they tried to add artificial rules to make the AI have similar physical limitations to a human (emulated mouse, rate limited actions, emulated screen and visibility). There have been micro AI bots for years that could outperform any human. They knew they weren’t just trying to build another micro bot, because if they were it wouldn’t be much of an accomplishment.
> The Deepmind team knows the challenge isn’t to beat humans at Starcraft. That is trivially easy with the advantages you mentioned.
It's not trivially easy at all. No one had come close before. It took an entire team of ML experts at Google to pull it off. These hard-coded micro bots you're referring to didn't holistically play the entire game and win at it. They're more akin to an aimbot in FPSes, not a self-learning general game-playing AI.
This is yet another in a long string of impressive AI achievements being minimized through moving the goalposts. It's facile and it's boring.
>It's not cheating either; they are legitimately better at it than us.
This is not 100% true: in this particular case the AI still skips the mechanical part (it doesn't have a mouse, keyboard and hands). That alone can introduce insane amounts of additional complexity, and would keep the AI from being pixel-precise.
yup. you could have 200 apm, but as long as your clicks and button presses are perfect, you are going to win against someone who has 800 but is super imprecise.
blink stalkers are basically perfect for an AI because of the precision with which it can blink them around.
I assume you’re joking, but just in case you aren’t, Scrabble bots have outperformed top humans for 20 years with little more than a basic Monte Carlo tree search.
In the TLO matchup, the ai wins with an army of disruptors, and unupgraded stalkers; ofc, TLO wasn't playing his best (in terms of micro or race), but the ai was still doing well with a micro-lacking unit (outside of blowing up its own army repeatedly)
You'll likely be happy to hear that this has been (is being) addressed.
I watched the live broadcast of this announcement where they did a recap of all 10 previous matches (against TLO and Mana) and they talked about this concern. During today's announcement they presented a new model that could not see the whole map and had to use the camera movement to focus properly. The Deepmind team said it took somewhat longer to train but they were able to achieve the same levels of performance according to their metrics and play-testing against the previous version.
However...
They did a live match vs LiquidMana (6th match against Mana) against the latest version (with camera movement) and LiquidMana won! LiquidMana was able to repeatedly do hit-and-run immortal drop harassment in AlphaStar's base, forcing it to bring troops back to defend its base, causing it to fall behind in production and supply over time and ultimately lose a major battle.
It sounds to me like, although it could see the whole map at once, the fog of war was still applied. So the bot really just got as much information as the minimap would normally give a human player.
> it could observe the attributes of its own and its opponent’s _visible units_ on the map directly
No, not true. Just had an extended argument with a friend over this. Here are some of my arguments against what you're saying:
1. While it's true that a human player could see everything the AI is seeing, the human player has to spend time and clicks to go see those things, whereas the AI sees it all simultaneously without having to use any actions or spend any time to take it in.
2. Emphasis on the computer seeing it all simultaneously. The computer can see the state of two health bars on opposite sides of the map at the same time, or 100 healthbars in a hundred places at a time. A human cannot do that, and even trying to move the view around fast enough to do so would render it impossible to actually do anything else.
3. If it's true that seeing more at once is not advantageous, then it must also be true that seeing less at once is not disadvantageous. So by that reasoning a player playing on a 1 inch x 1 inch screen would not have any disadvantage, since after all they're getting just the same amount of information as long as they move the screen around enough! Reductio ad absurdum, a player with a 1 pixel x 1 pixel screen has no disadvantage either, because they have access to the same information as long as they move around quick enough. It quickly becomes evident that smaller screens inhibit your knowledge of the game state, and therefore larger screens benefit your knowledge of the game state.
One thing they said early on in the ~2 hour video I was watching was that, while AlphaStar had access to the full data of everything within its fog of war, it seemed to need to partition its access to it, in a way that was similar to a human checking different screens, and did so about 30 (or was it 37?) times per minute.
This might be why changing to having to observe only one screenful at a time (rather than the zoomed out view) didn't seem to have as large an effect.
This is why a lot of competitive games have rightly decided not to support ultrawide monitors. Being able to observe more of the game map simultaneously is a huge advantage. The only fair way to support them would be to cripple the player, by cutting off the top and bottom of the viewable range, not by extending the left and right range.
> whereas the AI sees it all simultaneously without having to use any actions or spend any time to take it in.
Starcraft is a single-threaded game, so I would think that the AI ultimately still has to enumerate through each visible unit one-by-one to collect their information. Why is that so much different than enumerating through each visible screen and then enumerating through each unit on that screen? Either way, the AI could do it much faster than a human, whether it had to click through the screens manually or not. How would it be possible to eliminate this advantage? It seems to me that it's just part of the nature of AI.
No, that doesn't eliminate the advantage -- that's what I'm trying to say. Even if you make the AI move the screen around manually and only let it enumerate units that are on-screen, that's still going to take roughly as long as just enumerating through all the units on the map in one go. It's just a matter of executing "foreach all_units" versus "foreach screens { foreach units_on_screen }". In either case a computer could do that much faster than a human.
Let me put it the opposite way: If you gave the human player a real time list of every visible unit on the map and all of their information, such that they didn't have to move the screen around manually and could see everything at a glance just like AlphaStar can, would that take the advantage away from AlphaStar? No, it wouldn't because AlphaStar could still go through all that data much faster than any human ever could -- no matter how it's formatted or what you have to do to access it. To AlphaStar, checking all the visible screens is just as much work as scrolling through a list of units.
I get what you're saying. But screen movement is rate limited (meaning you can't loop through all possible screen positions in 1ms) so you have to actively choose where you want to focus, just like a human player. Think of it more like calls to a web server than "foreach screens".
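To make the rate-limit point concrete, here is a toy sketch. It is not AlphaStar's actual interface: the unit names, the 64-tile map, the budget of 2 camera positions per decision window, and the "most crowded screen first" policy are all made-up assumptions. It only shows that once camera moves are budgeted, the agent sees a subset of what the full-map view gives it and has to choose which subset.

```python
from collections import Counter
import random


def observe_full_map(units):
    """Unrestricted interface: every visible unit arrives in one observation."""
    return set(units)


def observe_with_camera(units, screen_of, screens_viewed):
    """Camera interface: only units on the screens actually looked at are seen."""
    viewed = set(screens_viewed)
    return {u for u in units if screen_of[u] in viewed}


# Hypothetical state: 100 visible units spread over 64 screen-sized tiles.
random.seed(0)
units = [f"unit_{i}" for i in range(100)]
screen_of = {u: random.randrange(64) for u in units}

# Assumed budget: 2 camera positions per decision window (made-up number).
camera_budget = 2

# Toy attention policy: spend the budget on the most crowded screens.
crowding = Counter(screen_of.values())
chosen = [screen for screen, _ in crowding.most_common(camera_budget)]

seen_full = observe_full_map(units)
seen_camera = observe_with_camera(units, screen_of, chosen)
print(len(seen_full), len(seen_camera))
```

With an unlimited budget the two interfaces return the same set, which is the "foreach screens" case; the difference only appears once the number of camera moves is capped.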
Can't you click on the minimap to move the camera instantly anywhere on the map?
EDIT: I guess you would still have to wait for the next frame to get rendered, which could add up. True, that does change things a bit, but of course a computer could still do that way faster than a human.
This sounds like a real advantage in the AI's favor though: It can focus its attention on a lot more things simultaneously. It's not just a UI difference; the AI is actually better at this, like how a pocket calculator is actually better at division than people. This latter bit we just accept; we don't defend humans by saying the calculator is cheating because it isn't writing out the calculation by hand.
Similarly, robots are physically stronger than people at any given task you can think of. That's a real advantage of them.
It is certainly a real advantage, but I think the argument is that it's not as interesting as an AI that could win on the strength of better decision-making, or the innovation of novel strategies, etc.
AI wins on the strength of better decision-making and novel strategies in Chess and Go, though. I have no doubt we'll see this in RTSes in the near future as well. For now we may not be quite there yet, as this is simply the first time it's beaten a pro player in any way. Compare with the AlphaGo match vs Fan Hui. A year later and it was dominant over all pro players.
> AI wins on the strength of better decision-making and novel strategies in Chess and Go, though. I have no doubt we'll see this in RTSes in the near future as well.
Yes, likely! I wasn't doubting it's possible or even likely. Only that seeing an AI do flawless 1000 APM stalker micro and macroing perfectly, while pretty cool, is not as exciting as seeing an AI use a novel strategy (edit: especially one that a human could theoretically execute)
I'm guessing that while there's a delay for decisionmaking, there's no delay between when it decides to move a camera somewhere else and when it does move the camera (direct API access), whereas humans need to move the mouse or hit a key, which is gonna take at least like 50-100ms where they're not doing anything else.
When they were talking about delay they were talking about delay between new information -> deciding/acting, which I think obscures the fact that humans have to do new information -> deciding -> acting, where acting takes non-zero time.
{{Delivered in the voice of the female British lady who would narrate Beyond 2000 series of shows - or the Modern Marvels narrator, to your mental predilection}}
After just decades in development, it is clear that the endeavors of those research scientists have finally borne fruit. And today it's in the form of:
Intent based modeling, augmented with AI, which provides the reality we see today in both gaming and weapons systems.
The user, who must be human, is provided a range of inputs based on the desired outcome of the interaction with the systems and the real world.
What results is truly remarkable.
A human is capable of multi-dimensional abstract thought, in a way that a computer cannot. As such - their intent is wired over to a swarm of objects with the most amazing capabilities.
A user can direct a swarm of either virtual bots or physical drones to accomplish a task. She can also combine the efforts of both physical and virtual for even greater effect.
Here we see a swarm of bots who are thwarted by a physical barrier.
The human driver can then instruct his virtual AI bots to attack the security system of the building to allow his drones to have passage.
But she does this through merely the intent for the portal to be open. The bots do the rest.
All the while the user is updated with real-time information on how the operation is progressing.
So, in the future, you may soon see just such technology applied to your kitchen or living room, where bots will cater to your every waking need - and sometimes your non-waking needs as well.
APM is a really really misleading metric to use here. Most starcraft pros spam keys to keep themselves loose and ready when the time comes. Even at the start of a game, you'd often see players with 500 apm warming up the fingers.
Here, there is the laughable graph of the computer apm over time. The key point here is that during the mass battles that won the game, the apm spiked to >1000. And if you look closely in slow motion, there was perfect split targeting. A human player wouldn't be able to perfectly select the exact number of stalkers to hit the enemy without wasting a surplus shot. They can, but not when it's mass stalkers. This efficiency is just beyond humans. The APM here indicates much more effective use of an action than a typical human.
This is super impressive as an achievement, but this is clearly not a smarter ai, but more so an ai like the video a while ago where zerglings could perfectly micro against siege tanks to avoid splash damage. It is clearly better than humans in certain ways, but not smarter.
The APM metric includes all clicks a player makes. AlphaStar's APM is lower than a typical pro's, but that does not mean it is making fewer actions. All pros try to keep a high "tempo" by constantly clicking the screen even when they are not making any actions. e.g. Instead of sending a unit to a specific location with one click, they will click 5-6 times while dragging the mouse to that spot. The theory is that keeping a high tempo allows you to make more useful actions overall.
Unless AlphaStar's average and maximum APM are 3-4x lower than a pro's, rather than just a 2x lower average and the same maximum, I do NOT believe that this is a fair test of the AI's strategic decision making ability.
I also wanted to note that the millisecond response time performed by AlphaStar also seems an unfair test.
While the average was around 350ms, this number is skewed by a significant long tail extending up to 1300ms for some actions. The most common response is sub-200ms and the third most common response is 65ms-100ms. The best pros can consistently hit close to 200ms but I am not sure if they can even hit below 100ms, let alone 65ms.
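A small sketch of why an average APM figure says little about the peak. The action log below is entirely hypothetical (a steady 2 actions per second for 10 minutes plus one 5-second burst of 20 actions per second); it is not AlphaStar's data, and the 5-second window is an arbitrary choice.

```python
def average_apm(timestamps_ms, game_ms):
    """Actions per minute averaged over the whole game."""
    return len(timestamps_ms) * 60_000 / game_ms


def peak_apm(timestamps_ms, window_ms=5_000):
    """Highest action rate inside any sliding window, scaled to per-minute."""
    ts = sorted(timestamps_ms)
    best = 0
    start = 0
    for end in range(len(ts)):
        # Shrink the window until it spans less than window_ms.
        while ts[end] - ts[start] >= window_ms:
            start += 1
        best = max(best, end - start + 1)
    return best * 60_000 / window_ms


# Hypothetical 10-minute game, timestamps in milliseconds.
GAME_MS = 600_000
steady = list(range(0, GAME_MS, 500))        # 2 actions per second throughout
burst = list(range(300_000, 305_000, 50))    # 20 actions per second for 5 s
actions = steady + burst

print(average_apm(actions, GAME_MS))  # 130.0
print(peak_apm(actions))              # 1320.0
```

So a game that averages a modest 130 APM can still contain a fight played at over 1000 APM, which is why capping only the average does not cap what happens in the decisive engagement.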
In the starcraft scene we have terms to describe what you're saying: you have APM and EAPM (effective APM). Historically the fastest players were around ~230 eapm and that was considered godlike.
Here is an example of what you do without artificial limitations on actions.
https://youtu.be/3PLplRDSgpo
Just goes to show how important those restrictions are for competitive gameplay against AI.
I think some card games meet your criteria. Specifically, Bridge, Hearts, and Auction Pitch (aka Setback) are all complex, turn-based, imperfect information games with some amount of cooperation. I’d love to see DeepMind take on one of these games.
AlphaCiv would be interesting as hell, but I fear that there isn't historical data for Alpha to consume. I would expect it would have a similar result to AlphaStar though-- it beats the human player before the end game.
This is really impressive. Even though the camera hack and the uncapped peak APM (only the average was capped) made this version slightly unfair, since the ability of an AI to micro insanely well with stalkers is basically unbeatable, I feel confident based on this performance that Deepmind will release a superhuman AI with lower EPM than humans very soon.
While this is super impressive we should not forget - it was just a single matchup (1/9, requires a lot of adaptation; especially on the three races) on a single map (1/N, granted in SC2 the maps are not as diverse as in the original StarCraft) using a single (outdated) patch. Humans are still masters of knowledge transfer, playing an unknown map and a lesser-known matchup with greater precision. Once an AI reaches a level where knowledge is transferred more efficiently than what we are capable of I will start getting worried - not beforehand.
I wonder how useful ai will be in balancing games in the future. Games with more than one race and multiple upgrade paths seem like a nightmare to balance so that things are even for players that are equally skilled.
It would be interesting to have a game that would auto-balance itself, especially if you wanted to add extra content without having to worry about throwing everything off.
That's a cool idea but many games already struggle to find a fair balance point at all ranges of human skill, so expanding that range to include AI could just make things more difficult. Maybe there's a way to force the AI to replicate bad players, but I'm not sure what learning objective you would give it to achieve that.
I don't follow game balancing too closely, but I (naively) assume there's some reasonable analytical solution to find tuning. Like I would imagine if you're Blizzard and have logs of all the data, you could just regress race attack/defense/movement/etc. stats to find a potential equivalence point.
I suppose there's lots of interactions, but finding "well Zerg beats Protoss X% of the time", you could balance by messing slightly with resources or blanket buff.
I think achieving suboptimal play can be pretty straightforward — keep the parameters the same, but simply limit the amount of learning. Some tweaks might be necessary here and there, but as long as you have basic gameplay down but not perfected, it should work pretty well.
Total War Warhammer is a very interesting case for that. Many, many races, units, items etc. Much more than Starcraft. And yet the balancing seems quite good! There are obviously tier1, tier2, tier3 races, but even these tiers are disputed and tier1 races rarely win tournaments
I would argue that without a proper competitive scene (which Total War Warhammer does not have) it is impossible to measure if it is actually balanced or not.
Because that is what we need to teach AI to do, build bases, extract resources, and build units to go out and kill everything else on the map :-)
It is an impressive result, it seems pretty clear to me that as a force multiplier for developing decision tree software this technique works faster and more effectively than the waterfall techniques, and it gets better post release. But beyond the game theoretic applications I am still looking for an application where it reliably creates a better back end code generator for a new architecture faster than a person can.
I think the first ever applications of a more generic AI will be in corporate and military anyway, so yeah, the AI will build bases, extract resources and send security units to kill everything else on the map indeed.
Can anyone who has full context compare this to OpenAI's work on Dota 2? Which is more impressive, both in how far along it is and the relative game difficulty?
They’re probably comparable and at similar levels of success. Talking about which game offers more entropy isn’t a good metric yet because neither AI seems to be trying to utilize the depth available to them.
E.g. the Dota game had 5 AI working as a team. That feels like it should demonstrate additional competence, but it’s not really clear that they worked as a team.
The Dota game allowed greater variation in starting conditions, but it’s not apparent that the AI adapts its strategy to this well (e.g. hero choice).
Both of them are capable of creating a basic strat and excelling at micro. Neither appears to have great depth of strategy.
This feels more impressive since in SC2 you can pilot many units while in Dota you only (generally) have one. And Deepmind is playing (albeit a slightly older) stock StarCraft II while OpenAI had to play using a very weird ruleset (invulnerable carriers).
Milestones that we have yet to see with Deepmind
- generalize on maps
- generalize on races
- zero-style training
Starcraft II pros are probably much further from perfect play than Dota2 pros (simply because the game is harder). As a result, having perfect micro is a much larger advantage in Starcraft than it is in Dota 2.
Dota is a much harder game strategically, in Starcraft it's easier to compensate with mechanical skill. You can see this in the results. The Dota ai played a very dumbed down version of the game and still got trashed by the pros.
How did they make those data visualizations? (specifically, the visually-skewed ridge plots). That's a nice approach for those types of plots since they can get cluttered without the perceptual skew.
I would try with a Seaborn facet grid? I think they’ve got something custom, but this should get close (be aware this specific example is a kde so it will normalize total area)
On a second glance this won’t get the 3D horizontal offset.
But it seems the techniques used here might be both far more efficient, as well as able to better exploit human players (as opposed to playing a pure game theoretically optimal strategy, which is what they do now).
So, your link is about Cepheus. It's important to understand that Cepheus isn't AI, it's an (asymptotically close to) optimal strategy for Limit Heads Up ("Limit" means you don't need to choose bet sizes, they are fixed, which makes the problem much simpler), and since poker is a game of probabilities the strategy is probabilistic too.
ie this is like when somebody explains Tic-Tac-Toe, there isn't anything interesting going on inside the machine, the insight was purely mathematical, that there are one or more optimal ways to play this game, and this is one of them.
You can literally look at the strategy right now, you can do it while playing against a program that plays by this strategy. But you won't beat it by doing that, that's the point of this optimal strategy, the best you can do is play this or some other equivalent optimal strategy back, in a home game you'll just pass chips back and forth forever.
In contrast Poker AI is a thing, and at _No Limit_ Heads Up the AI state of the art is clearly better than human, named Libratus - it is nothing like Cepheus, there is no fixed strategy.
For "Full Ring", the game of Poker as you've probably seen it played, which has more than half a dozen independent players, AI would be very challenging, not least because if humans realise they're at a disadvantage it would be essentially impossible to prevent them from colluding with other humans to get an advantage, even to some extent unconsciously.
I was essentially asking if the techniques used by DeepMind could be leveraged to create a much more powerful version of AIs like Libratus, or to become very strong at games too big to solve for GTO solutions, such as full ring.
> For "Full Ring", the game of Poker as you've probably seen it played, which has more than half a dozen independent players, AI would be very challenging
This has already been done. There was an AI called Sonia which played both HU and full ring at expert human level or beyond, and was not based on GTO solutions. It created a model of every player which updated in real time, and exploited them the way expert humans do.
I'm just curious if DeepMind would be able to achieve similar or better results.
> not least because if humans realise they're at a disadvantage it would be essentially impossible to prevent them from colluding with other humans to get an advantage, even to some extent unconsciously
Indeed versus a single opponent, you can play an "optimal" strategy (e.g. a Nash equilibrium strategy) and basically win/solve the game. But in a multiplayer setting (in poker and similar games), you rely on your ability to predict opponents. In the former you could essentially assume your opponent is as smart as possible ('rational' in game theory language), and if it is not rational it will necessarily do worse. In the latter however, by assuming every player is rational (e.g. playing N.E. strategies) you lack the ability to exploit weak players which are decidedly suboptimal, predict their moves, and capitalize. Thus winning depends as much on knowledge of your opponents, their skill and style, as on your own power to play probabilistic 'optimal' moves.
A good simple example is also Rock-Paper-Scissors: an N.E. strategy just plays randomly -- it clearly cannot lose in expectation, even against the most skilled players ever -- but it also cannot win versus weak, predictable opponents. So in a tournament setting it could well lose, depending on the tournament structure (in particular if tournament stages are not all pairwise elimination).
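The Rock-Paper-Scissors claim can be checked with exact arithmetic. This is a minimal sketch; the "rock-heavy" opponent (60% rock, 20% paper, 20% scissors) is a made-up example of a weak, predictable player, and payoffs are +1 for a win, 0 for a tie, -1 for a loss.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """+1 if my move wins, -1 if it loses, 0 on a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expected_payoff(my_mix, their_mix):
    """Expected payoff per round for two mixed strategies."""
    return sum(
        my_mix[m] * their_mix[t] * payoff(m, t)
        for m in MOVES
        for t in MOVES
    )


# The equilibrium strategy: every move with probability 1/3.
uniform = {m: Fraction(1, 3) for m in MOVES}

# Hypothetical predictable opponent who throws rock 60% of the time.
rock_heavy = {
    "rock": Fraction(3, 5),
    "paper": Fraction(1, 5),
    "scissors": Fraction(1, 5),
}

# An exploitative strategy tailored to that opponent.
always_paper = {"rock": Fraction(0), "paper": Fraction(1), "scissors": Fraction(0)}

print(expected_payoff(uniform, rock_heavy))       # 0
print(expected_payoff(always_paper, rock_heavy))  # 2/5
print(expected_payoff(uniform, always_paper))     # 0
```

The uniform strategy breaks even against everyone, including the exploitable player, while the tailored strategy earns 2/5 per round against that player (and would itself be punished by anyone who noticed it).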
I find it a fascinating contrast to traditional game theory, and sort of conventional view of games, that there are "good players" and "bad players" in the sense of their execution being superior in an absolute sense, independent of opponent. In reality success in a variety of games (and real world scenarios) is won by tailoring your strategy to a particular opponent, using information from outside the game, and specifically predicting his plays vs. utilizing universal strategies (again this is particularly relevant in non-pairwise tournaments, non-zero sum games, etc).
In real life, almost always the "game" is non-zero sum, multiplayer, with non-rational players, etc., even in a sports setting. Roger Federer might play slack and save his body in the early stages of a tournament vs a known weaker player, and give it all vs his most fearsome opponents.
And also finally to quote "The Art of Strategy" (introductory game theory classic), 'There's always a bigger game.' (in real life it's ultimately the entire Universe, and no one is quite sure of the Rules :p )
And to add to that, what if DeepMind could get a poker history of the players at the table to create a poker profile of each player it is playing against? Having a percentage of a player's likelihood of folding and bluffing could give it an advantage over a purely objective game theory approach to the game. Maybe a certain player is more likely to bluff 5 hours into a game based off of player history analysis. Going further, imagine if DeepMind could get access to every player's history outside of poker ( social media, purchasing records, medical records, etc ) and create a behavioral profile of each player.
Game theory, poker profile, behavior profile, etc could give it a serious advantage over human players in addition to its advantage of never getting tired, frustrated, etc.
deepmind is not necessary for that and also probably wouldn’t work.
If you have player history you can construct perfect models for their playing habits and even update it as time goes on like a multi armed bandit problem. This can be simple probability heuristics and an optimization function.
The problem with it is that the machine becomes biased to past results. Top human players are good because they adapt. I would put my money on the humans
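A rough sketch of the "simple probability heuristics" idea above, under heavy assumptions: the class name, the uniform prior, the 7-folds-and-3-calls history, and the pot and bet sizes are all invented for illustration, and the bluff is treated as a pure bluff that never wins at showdown. It is not how any real poker AI is known to work.

```python
class FoldModel:
    """Running estimate of how often one opponent folds to a bet.

    Starts from a weak prior of 1 fold and 1 call (an assumption), then
    counts what the player actually does.
    """

    def __init__(self, prior_folds=1, prior_calls=1):
        self.folds = prior_folds
        self.calls = prior_calls

    def update(self, folded):
        """Record one observed response to a bet."""
        if folded:
            self.folds += 1
        else:
            self.calls += 1

    def fold_probability(self):
        return self.folds / (self.folds + self.calls)


def bluff_ev(fold_prob, pot, bet):
    """Expected chips from a pure bluff: win the pot on a fold, lose the bet on a call."""
    return fold_prob * pot - (1 - fold_prob) * bet


# Hypothetical history for one player: 7 folds and 3 calls.
model = FoldModel()
for folded in [True] * 7 + [False] * 3:
    model.update(folded)

print(model.fold_probability())
print(bluff_ev(model.fold_probability(), pot=100, bet=50))
```

The bias mentioned above is visible in the structure: the estimate is just a tally of past behaviour, so a player who changes style mid-session keeps being judged by the old counts until enough new observations pile up.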
Probably not. The AI demonstrated today was shown to be exploited with a strategy that wasn’t especially devious. It’s good at core mechanics but not strategically clever. I would bet on the optimal strategy over this one.
I think the bigger point in the future will be not AI beating known gambling activities (gambling in the sense that you put in money and have a random outcome about whether you get money back, let's not have a discussion about poker being gambling or competitive sport). The bigger point will be AI creating gambling activities that are even less resistible than our existing options. There might be digital drugs.
I was confused too; it has a weird format that I think hurts comprehension rather than aids it by making you think you're not looking straight at the data.
Ignore the fact that the "Training Days" axis is drawn diagonally. The system is creating about 40 agents per day; by the end of day 14 it's made 610 or so. The graph shows, for any given time of training (vertical axis, going down), what is the distribution of trained agents that it's chosen in order to be unexploitable (you wouldn't want to choose rock all the time in rock-paper-scissors, for example). So, for example, at the end of day 14, it's using a selection of agents with numbers 595 through 610 or so, which means they've all been created within the last day.
I think it's to help illustrate the time dimension as going forward rather than something that goes up and down. And also to not measure the hills against some global X-axis. It is confusing.
Impressive, but there's a couple of things I'd like to see them try one day.
Plug AlphaStar into a robot that physically interacts with a keyboard and mouse to control the game. This "robot" should only have what's relevant for playing the game and emulates a human i.e. a camera that looks at a screen (this is the only knowledge it has of the game), and two arms & hands with five digits that control the mouse and keyboard. Then limit its APM to the best a human can realistically do.
The other thing I want to see them address is the virtual training time. 200 years of StarCraft is insane. LiquidMana has been playing for ~20 years and of course he hasn't played the game 24/7. Let's pretend he has played StarCraft like it's a full-time job since he was 5; 8 hours a day, 5 days a week, for 20 years. That's ~42,000 hours of playing StarCraft.
Develop an A.I. that is only trained for that many hours of virtual game time.
If they can create an A.I. with those requirements, that can defeat top-level players, I will be completely blown away.
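For reference, the arithmetic behind the ~42,000 hours figure and how it compares to 200 years of game time. The 52 working weeks per year and the 365-day, 24-hour year for the agent are my assumptions; the other numbers are the ones quoted above.

```python
# Human side: StarCraft as a full-time job for 20 years.
HOURS_PER_DAY = 8
DAYS_PER_WEEK = 5
WEEKS_PER_YEAR = 52   # assumption: no holidays
YEARS_PLAYED = 20

human_hours = HOURS_PER_DAY * DAYS_PER_WEEK * WEEKS_PER_YEAR * YEARS_PLAYED

# Agent side: 200 years of continuous game time.
AGENT_YEARS = 200
agent_hours = AGENT_YEARS * 365 * 24   # assumption: 365-day years, 24 h/day

ratio = agent_hours / human_hours

print(human_hours)       # 41600
print(agent_hours)       # 1752000
print(round(ratio, 1))   # 42.1
```

So under these assumptions the agent's experience is roughly 42 times the generous full-time-job estimate for the human.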
Human pro gamers have several advantages that mean they should need slightly (but probably not that much) less game time.
- Transfer learning from the rest of life: games are designed to be understandable to humans, with familiar concepts that AIs don't start out knowing.
- Discussion with other players: Mana benefits immensely from everyone else's 20k hours of SC2 as well.
- Selection bias: there are many, many people who try SC2, and only the people who are naturally good at it succeed. So in some sense we need to be counting the rejects' training hours as well.
I would like to see advances on training AI using less data. I just wanted to comment that the comparison in number of hours isn't quite fair.
There seems to be an impulse to deny the AI progress that people see in front of them. When IBM Watson won on Jeopardy, everyone claimed it was cheating because, after all, "everyone knows the answers, so Jeopardy is really only about who presses the button first". But what about the fact that a computer could know the answers at all, and so quickly? Many people didn't think it was possible, and as soon as it happens they attack the speed of the button press as if knowing the answer wasn't the hard part.
Anyone with this point of view does not seem to understand what is being shown here. This is not just "this ai beat this one player in this one match". It's an entire system of techniques for machine learning. The AI was not hard coded to learn those micro steps, like all other AIs have been. To ignore those things is to miss the point entirely.
I see your point and partially agree. But I think part of the problem with these kinds of stunts (for lack of a better word) is it's hard to tell the difference between actual scientific advancements and benefits of throwing hundreds of years of GPU time at it.
Exactly. It's not like the AI of the past where it was essentially flowchart plus heuristics. Now, it's almost "experiential". It tries paths and essentially sees which ones work and which don't and "learns" from it. It's similar to Dr Strange in Avengers using the Time Stone to try out possible futures to see which ones work.
It's amazing what they achieved so far, but I wonder how it would stack up against Serral, who draws a lot of his wins by supremely judging an engagement and continuously gains small advantages over the course of a game. This is a skill even most other professional players only have on a much lower level.
If they don't cap the apm at 300 or so, this means nothing. I am not impressed with super speed, only decision making matters. Deepmind, you can do better.
I think there's a lot more to limit than average APMs and reaction time, in order to have human-like capabilities. For instance:
- the context switching cost of multitasking
- the precision/speed tradeoff: pro players adjust the mouse speed according to what they want. Selecting exactly the units wanted is very difficult; AlphaStar seemed to do it perfectly.
- the timing precision of an action
With restrictions such as these, I'm honestly confident that AlphaStar won't beat human players anytime soon (once the human players adopt the interesting findings of the AI).
I'm mostly interested in how it will change the meta. Players are often biased by their experience, and regularly the meta shifts entirely without any significant balance update, simply by players finding original strategies
Many comments here are about how the AI information advantage (seeing the whole map at once sans fog of war except the last game; seeing exact unit stats like health etc) leads to higher APM-value, whether APM itself is higher or each Action is more meaningful, and discussing different ways to nerf it to bring it down to a human level.
I'm more interested in the limits that an AI could be pushed to vs humans, and if humans can't match the AI's APM, just add more humans until they can. E.g. 1v7 would allow humans to manage multiple disparate flanks at once just like an AI, and still leave someone free to manage macro play etc that suffers when a human focuses on micro.
I didn't catch if they mentioned this in the interview, but what would have happened if they let AlphaStar play against other races, not Protoss only, would it be completely lost, unable to achieve anything?
The current version would flounder because it has never seen the other races. But that is not a fundamental limitation. All it would take to fix it is more training time.
Or they trained all the combinations but it was best at Protoss vs Protoss, so they publicized those results. But given the reaction to the first AlphaZero announcement and the later follow-up paper (summary: they cheated a bit but it's still incredibly strong) I would give them the benefit of the doubt here.
Good question. What's important, I think, is to remember that this is an early iteration, and regardless of what it can do now, they will continue to improve it and in a year or two or three (probably less) it will be able to play totally unfettered in all conditions.
How I would summarize the development of AlphaStar and Mana's strategy over the series:
1. AS studies games played by humans to learn what they do.
2. AS takes advantage of high-APM, high-precision blink stalker micro to defeat immortals (something no human can cognitively/mechanically accomplish).
3. Mana realizes he cannot play vs AS as if he were playing vs a human.
4. Mana discovers an AI exploit using the warp prism + immortals to force AS's army back, keeping his own base safe. This is a specific counter-AI strategy, not something that would have worked vs a human player. AS does not know how to properly react because it has not seen any replays of humans going up against an AI by exploiting it.
5. Mana gets enough breathing room to build up a large enough force to win the game.
In short, Mana won because he "solved the problem" of how to exploit this particular AI.
This is actually not a new strategy -- several years back, the stock SC2 AI would do the same thing: pull back when you attacked its base. I could win vs AI using the exact same trick that Mana used. Blizzard has since updated the AI not to fall for this trick.
The real test of DeepMind's learning abilities is thus: if AlphaStar had seen replays of that exploit vs AI, along with all other replays of humans vs AI over the years, would Mana still have been able to win?
I wonder how far AI can progress. Can for example five robots, built to have insane reaction times and agility, outperform and safely subdue an entire army of 1 million humans equipped with the latest melee weapons?
Or how about a tiny flying robot with millisecond reaction times in a room with a crowd of people trying to catch it, being able to eg buzz and tag everyone without being so much as touched?
How about one robot which builds replicas or otherwise organizes some massive “real time strategy growth” across a city, versus the entire city and its police and army being called in? With the robot swarms coordinating distributed superhuman strategies and timing? I feel that, whatever we are discovering now, aliens who make it to Earth will have had hundreds of years advantage on that. It seems Monte Carlo Tree Search is the best we’ve got against the aliens, for now.
Think Ender’s game but for real. It will require something like that.
Anyone also remember the narration in Edge of Tomorrow? There the aliens had the additional advantage of resetting the clock to the beginning thus nullifying any win, but really with trillions of games being played by a league against itself, that rare “restart” seems quaint versus what we have here.
Or over eons perhaps long term strategy with mining asteroids with Von Neumann probes competing.
These developments sound wonderful but all I can think about is how this advanced AI is going to be used to try to control me (mostly my spending and consumption).
And now they are superhuman with regards to "Fog of war"!
A truly crushing next step would be to make "World in Flames" into a computer game, and have Alpha-Star-Zero, which is coming, to become the best it can on that.
If they do that, every single military on the planet should crap themselves, including the US, NATO, Russia, China, and EU. It means we are 20 years or less from robots able to operate at every level in an army from soldier to general, and able to comprehensively defeat the best humans at it, at every level. The first nation to the battlefield with this wins the world. One ironic part is that an excellent application is non-battlefield warfare. You don't have to drive a tank to conquer the world; perhaps you can buy a factory or make a deal. The corporate interface here should also be compelling.
Perhaps Alphabet will finally be able to use this to make non-advertising profits.
You are comparing formal systems, like Starcraft, where the entire game world is quantized, with the real world where the games are not quantized.
Not only is the game of war not “written down” in digital form, there is far too much data to ever write down. Military strategists have been trying to model war forever. But no one can even agree on what the “game pieces” are, let alone list all of the Deus Ex Machina that might show up. An AI that is better than any human commander at tank warfare would have been just as dead as any human army when the opponent shows up with an H-bomb.
But being good at tank warfare also isn’t anything like being good at a tank game. To be good at tank warfare you need to be good at building factories. Pouring concrete in swamps. Convincing grannies to buy fewer cans of beans, and eat more squash. Figuring out when your workers really need to go home and sleep. Part of the game is just realizing those things are even things you might want to strategize around.
And the nature of competition in real life is that as players gain advantages, opponents copy their skills and those advantages cease to work. Then you have to find new advantages in some aspect of the game that has never been documented before.
In a sense, high level real world competition is more like MAKING games than playing them. It’s about designing a competitive landscape where your opponent won’t be able to design their way out of it. Something AIs haven’t, to my knowledge, even begun to be able to think about.
When DeepMind announced they were working on SC2 I expected it to be good, but not this good. Compare this to the top bots in SSCAIT (a BW AI tournament) and it's already so far ahead of anything that's existed up to this point. I'm eagerly looking forward to them playing all maps and matchups, which I suspect it should be relatively easy to extend to. They've still got a long road ahead but this makes me think they can do it.
However, they keep harping on the APM being similar to a top human and it's just not. Maybe the average for a whole game is, but during fights it was bursting over 1500 APM with perfect execution. This wasn't just spam clicking corrosive bile or something like a human might to get that high, but truly coordinated targeting and unit movement. This has led it to use pathological unit compositions that wouldn't be effective for human play (like pure stalkers beating immortals). If they set a hard max APM to something like 600 I think it would develop more useful strategies.
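To make the average-versus-burst point concrete, here is a minimal Python sketch that computes both numbers from a list of action timestamps. The function names, the 5-second window, and the synthetic action log are all assumptions made up for illustration; none of this comes from DeepMind's actual logs or tooling.

```python
from collections import deque


def average_apm(timestamps, game_seconds):
    """Actions per minute averaged over the whole game."""
    return len(timestamps) * 60.0 / game_seconds


def peak_apm(timestamps, window_seconds=5.0):
    """Highest APM observed in any sliding window of the given length."""
    window = deque()
    peak = 0
    for t in sorted(timestamps):
        window.append(t)
        # Drop actions that fall outside the window (t - window_seconds, t].
        while window[0] <= t - window_seconds:
            window.popleft()
        peak = max(peak, len(window))
    return peak * 60.0 / window_seconds


# Synthetic log (invented): one action per second for 10 minutes,
# plus a roughly 4-second fight containing 100 extra actions.
steady = [float(i) for i in range(600)]
burst = [300.0 + k * 0.04 for k in range(100)]
log = steady + burst

print(average_apm(log, 600.0))
print(peak_apm(log))
```

On this made-up log the whole-game average is 70 APM while the 5-second peak is 1260 APM, so a cap that only constrains the average would not touch the burst. A cap in the spirit of the suggestion above would have to be enforced on the windowed figure.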
"In a final match against Komincz, streamed live on Twitch, the DeepMind team used a new agent that, unlike those in the other matches, could only see by moving the focus of the in-game camera. As a result, the AI was forced to move its units in a more focussed, human-like way. Despite dominating the game early on, AlphaStar lost. In all the games contested to date, the score is: AlphaStar 10 - Humans 1."
I'm not quite sure I follow the test then. How was DeepMind making decisions before? In a less-human-like way?
If I understood correctly, in the first 10 matches AlphaStar can see the whole map in detail (not just the mini map) at the same time. The humans can only see their camera's view. In the 11th match AlphaStar can only see via a camera view as well. It has to move the camera the way a human would to see other parts of the map in detail. The 11th game was a fairer test.
This, along with the documentary of the human go players vs the deep mind AI, seems to indicate we are entering an interesting time. The humans start out cocky, then gradually concede the high ground of whatever skill is supposed to be our exclusive domain. I think we will make it through to some new way of living with very capable machines. But, to transition there we are going to have some rough years. Just watching the Star Craft player's fingers flying at the end of the movie makes my carpal tunnel wrists ache. It reminds me of the folklore about "John Henry" trying to keep up with rail laying machine. TL/DR: In the story he beats it for a while, then falls over dead from the Herculean effort required to do so.
Allowing the machine to have a far-more informed view of the game than a human seems like these tests are not realistic battles. The human should have the same information about the game as the machine. I would certainly hope a machine with more information than the human could beat a human. Perhaps I am not understanding the situation?
What are the games that, so far, still look like they will be too difficult for ML to play them at the highest level? I know Go was held to be this kind of game for a long time and is now close to being dominated by AI. Magic: the gathering perhaps?
I think Deepmind definitively showed the agents are learning high-level play, which was great to see. I didn't really pay attention to AlphaGo, but did skim the AlphaZero paper, and I'm not really left with any doubts about how good RL/LSTMs can get against other AI or humans given enough time to train.
That said, it's an open question whether given all the constraints of the last live match and what was mentioned during the talk (that newer strategies keep getting discovered by humans and agents), whether humans could even win 50% of the time against an agent.
I'd love this to be released for Company of Heroes (Relic). The game has terrain that affects engagements and vehicle movement, multi-directional cover, individual model health in squad units, territory-control-based resource income, a tick-down win condition, and much more that makes it feel more tactical than SC overall.
It's got a bit of randomness where engagements are not just pure deterministic math inputs so I guess that's why it never gained pro league attention but it would be hell of an interesting challenge for AI because of it.
i dont mean to be negative but ML and even conventional computing are starting to make me tired. im always wondering what it will be next that ML can do better than humans. what is next up for automation? will this be the one to send a shock-wave through an industry / the economy? i feel like i need to constantly watch and keep track of the progress thats being made. and im starting to get tired from having to re-think life again and again.
for example, google has published voice synthesis samples, voices generated from text, that are indistinguishable from real human speech. it hasnt been perfected yet, but i think most people would agree that we basically now live in a world where voice recordings cant be automatically trusted the way they used to be. it completely changes the way you think about and navigate the world. it will open up a universe of new schemes, methods of fraud, etc etc that we will have to adapt to.
then there are deepfakes. there are limitations, and the results arent perfect, but its very early days. again i would say that the consensus among us is that we now live in a world where video evidence is basically no longer intrinsically trust-worthy in the way that it used to be.
i practically grew up inside a computer. but i am now sensing that as ML fills in, its going to be a very uncomfortable ride for me personally -- and i dont understand how it couldnt be for anyone else. and what about when AGI comes? just curious to see if anyone else shares my experience with this.
This is more about historical AI challenges. When Chess was beat people realized painfully and happily that they can't beat Go with the same approach. Therefore it was the new Mount Everest.
And Starcraft is a competition because SC1 was a game that was easily adaptable for AI hobby coders and competitions. A little like they have this Robo Soccer world championship for Robo builders. It's part of the domain culture I guess.
It sounded like that was based partly on Supply (Army size). But the AI only knows its own Army Size, and not the human's army size.
So how can the AI's outcome prediction be so accurate? In game 3 against Mana, the AI's outcome prediction changed from 60% win to 99% win before the AI decided to go up the ramp. It had no way to know if the human had more army up that ramp.
In imperfect information games (like SC2) the outcome prediction implicitly takes into account the unknown.
Given what it has observed and what it has not observed, it is essentially comparing the present state to similar situations from the past.
You can see this in the replay-- at seemingly random times, its outcome prediction jumps up, even when it hasn't had any interaction with its opponent.
But that's precisely why it's going up-- it notices that its opponent has not executed a faster rush or cheese that it's unprepared for, hasn't expanded early, and the scout has not been destroyed.
Similarly, after having won a fight, the worst thing that could happen is that a bigger force emerges from your opponent's base, destroying your army and giving them a chance to rebuild.
When that does not happen, you know that your opponent probably doesn't have such an army (because otherwise they are falling behind in resources due to being bottled up).
Either way, after a certain point it's worthwhile to press on to cause more damage, because you're now far enough ahead in resources that you will win regardless if they manage to repel that particular attack.
Yes but it seems impossible to manually assign a score to all those different situations? There are too many situations like that
Is the outcome prediction score, itself, also produced by AI training?
It has to be, right? Because it's clearly not just calculating the outcome based on Army Size of the AI and the human. There must be some non-direct way it's calculating the outcome prediction.
I mean, the hard part is accurate outcome prediction. Once you have that, it's easy to train an AI by just throwing CPUs at the problem and making the AI play a crazy number of games.
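For what a learned outcome predictor looks like in general, here is a toy Python sketch: a logistic regression fitted to past (observation, result) pairs, which then maps a new observation to a win probability. This is not AlphaStar's method, which isn't described in this thread. The two features (`own_supply_scaled`, `scout_survived`), the four training rows, and the hyperparameters are all invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def win_probability(weights, bias, features):
    """Predicted chance of winning given only what the player has observed."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def train(samples, labels, lr=0.1, epochs=500):
    """Plain stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = win_probability(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Invented history: [own_supply_scaled, scout_survived] -> 1 = win, 0 = loss.
past_observations = [[0.5, 0.0], [0.6, 0.0], [0.5, 1.0], [0.6, 1.0]]
past_results = [0, 0, 1, 1]

weights, bias = train(past_observations, past_results)
p_scout_alive = win_probability(weights, bias, [0.55, 1.0])
p_scout_dead = win_probability(weights, bias, [0.55, 0.0])
print(p_scout_alive, p_scout_dead)
```

The opponent's army size never appears as an input. The predictor only learns that, in its (made-up) history, games where the scout survived tended to end in wins, which is the "comparing to similar past situations" idea in miniature.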
There are three things this seems to do. And what's changed is the perceived value of these.
1) Increased mining rate. Although there's a prescribed "maximum", as I understand it adding more workers, up to some limit, does yield more minerals beyond the prescribed.
2) Buffer against harassment. If you are over-saturated and lose two probes, your rate of income isn't affected.
3) Bootstrapping an expansion. All of the excess probes can be moved over to the new expo.
Here's exactly how the mining mechanics work in sc2.
1) only one worker can mine from a mineral patch or gas geyser at a time
2) workers phase through each other while mining, so they don't lose time pathfinding around fellow workers.
3) every base has 4 "close" mineral patches and 4 "far" mineral patches. The close mineral patches are mined at full efficiency by only 2 workers; as soon as one finishes mining, the other has already returned and starts mining immediately. For the far patches, when one worker finishes mining, the second worker is only halfway to the minerals. Sticking a third worker on a far patch can cover that little mining gap, but the benefit is small (like ~30%).
4) therefore for efficient mining you want 16 workers on minerals. For "perfect" mining, you want 20 workers on minerals with the extra 4 on far patches only.
5) usually a triple-stacked close patch will automatically lose the excess worker to a far patch, but sometimes you need to "worker stack" manually to achieve this. You can see MaNa worker stacking manually at the beginning of his showmatch when he doesn't have anything else he could be doing with his excess apm. 24 workers on minerals guarantees maximum income because 4 workers will never fit on a patch; they'll start automatically reassigning themselves to different patches until they find a less crowded one.
6) more than 24 workers on minerals provides exactly 0 extra income. Still good to have extra workers though so you're not losing mining time when you build buildings, get harassed, etc.
7) every base has 2 vespene geysers, and 95% of the time these are in a position where 3 workers will saturate them perfectly. There are a couple of maps (or at least used to be... might be fixed now) where you can eke out the tiniest bit of extra gas by having 4 workers on a geyser.
8) every serious sc2 player already knows all of these mechanics
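The saturation rules in the list above can be turned into a rough income curve. This Python sketch assumes ideal worker placement and measures income in units of one fully efficient worker. It reads the "~30%" figure as the value of a third worker on a far patch relative to a full worker, which is my interpretation and not something stated precisely above; `relative_income` and `marginal_income` are hypothetical helper names.

```python
def relative_income(workers, third_worker_bonus=0.3):
    """Mineral income in 'full worker' units, assuming ideal placement.

    Model taken from the list above: 8 patches, 2 workers each mine at full
    efficiency (16 total); a third worker on each of the 4 far patches adds
    a small bonus; everything beyond that adds nothing.
    """
    if workers < 0:
        raise ValueError("workers must be non-negative")
    full_rate = min(workers, 16)
    third_on_far = min(max(workers - 16, 0), 4)
    return full_rate + third_on_far * third_worker_bonus


def marginal_income(workers, third_worker_bonus=0.3):
    """Extra income contributed by the last worker added."""
    return (relative_income(workers, third_worker_bonus)
            - relative_income(workers - 1, third_worker_bonus))


for n in (8, 16, 20, 24, 30):
    print(n, relative_income(n))
```

Under these assumptions the curve is linear up to 16, nearly flat from 17 to 20, and completely flat afterwards, so extra probes beyond that only pay off as a buffer or as a head start on the next expansion. Changing `third_worker_bonus` lets you plug in a different recollection of the numbers.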
I personally think Artosis was overstating the innovation of DeepMind's worker over-saturation. I agree with something Rotty said earlier in the series; sometimes you make a suboptimal move and it works, but that doesn't mean it was the "correct" play. It just means that you papered over a flaw with stronger execution somewhere else.
I'm still very impressed with these games, just not with regards to the AI's build orders.
I read that AlphaStar might be able to play this strategy because it is really confident in its defense skills. For humans, that might be taking a risk that is too big. That said, I want to try it out anyway :)
Edit: I recall that Rottie or someone said that there were 1-2 rare situations in HOTS where pros played like this.
My vague recollection is that from 16 to 19 workers the 17/18/19 worker is still harvesting >50% of a <16 worker, from 20-24 it's more like <25%, and the 25th worker onwards contributes 0.
I think this was one of the most interesting aspects of seeing AlphaStar play.
MaNa already started to use oversaturation when he played the live game against the AI. I'd bet soon this will be the new meta, and everyone will play this way.
I was always wondering why this is not so common in SC2... I mean in StarCraft that was / is well-known and having good worker(-building) control is an essential skill (for Zerg even more than for the other races due to the natural decision of where to spend the larvae).
Do you know if there's a good API for paradox games? I have off the top of my head seen extensible mods for games, but I assumed they all were based off of the predefined Clausewitz Engine.
That said, HOI4 could really use some AI love. Right now, I think the AI has a lot of hard coded values (when invasion should happen vs not). One of the potentially beautiful things about modern AI is that if you can clearly define your objectives, it's possible to find a parsimonious analytical solution to solve that problem. I have immense hope that these gaming AI systems will transform the way we game in the future.
"Along with the original title, it is among the biggest and most successful games of all time, with players competing in esports tournaments for more than 20 years."
...
Dear Mister Language Person: I am curious about the expression, "Part of this complete breakfast." The way it comes up is, my 5-year-old will be watching TV cartoon shows in the morning, and they'll show a commercial for a children's compressed breakfast compound such as "Froot Loops" or "Lucky Charms, " and they always show it sitting on a table next to a some actual food such as eggs, and the announcer always says: "Part of this complete breakfast." Don't they really mean, "Adjacent to this complete breakfast, " or "On the same table as this complete breakfast"? And couldn't they make essentially the same claim if, instead of Froot Loops, they put a can of shaving cream there, or a dead bat?
Here’s the question I have. Will it consistently beat the top player over and over?
I see so much brittleness in AI such as this. Humans are much less prone to “bugs”. In an evolutionary adversarial environment, the human brain invariably comes out on top.
In the very long run it's hard to see that we could compete with arbitrary scalability and lightning fast operations.
In present and imminent situations, I think Louis Rosenberg had an exceedingly important point in
"Imagine a flying saucer lands in Time Square and an alien steps out carrying the game of Go. He walks up the nearest person and says the classic line – “Take me to your best player.” Now, let’s assume that the alien spent years studying how humans play Go, watching replays of every major match.
If that was the situation, it would seem Humanity was being set up for an unfair challenge.
After all, the alien had the opportunity to thoroughly prepare for playing humans, while the humans had no opportunity to prepare for playing aliens. The humans would likely lose. And that’s exactly what happened last month when an “alien intelligence” named AlphaGo played the human Go master, Lee Sedol. The human lost in 4 out of 5 games. But, if we look at the big picture, it wasn’t a fair match."
I'd be interested to see its hierarchical strategy and planning, especially across such a long timespan. Does anyone have any good references for similar hierarchical planning work (Feudal Networks, etc.) to look at?
Every year or so we get another huge advance... Well, more accurately, something comes along to benchmark the state of AI research against a human activity.
Then come the HN comments.
For Alpha Go:
Oh this is impressive but can't generalize.
Wake me up when it doesn't have to have information precoded/doesn't learn from human players
For alpha go 0:
So this is cool but not amazing because they're all perfect information games.
And now here we have this, and people who haven't even watched the presentation are yammering about stuff like the APM, or how this isn't impressive because of ... Well, something?
If I believed in a terminator scenario, I would point out that a robot, too, will presumably have higher APM than all of us desk warriors.
It really feels like there are just some people who are religiously attached to being the only known intelligent entities on the planet, to the point where, when presented with evidence that hey, this is actually a thing, they will stick their fingers in their ears and shout about how unreal/unfair it all is.
I invite you to have a look back at the other announcement threads from Deepmind and OpenAI. I decided not to directly quote people as I don't want this to get personal, but I couldn't just sit here and watch the same old story play out again without at least mentioning that these same people have been wrong, and wrong, and wrong, and will presumably continue to be wrong.
It's extremely impressive, but at the same time it's still an interesting and fair question to say, "can we make an AI that beats humans while playing in a 'human like' fashion?", or alternatively, could we make an AI that would win if we put it inside a human body and made it play through those physical input restrictions?
(I do agree though that HN is way too negative overall. Partly it's just because, negative comments are more interesting. "Hey, this is cool!" doesn't have too much content. Nitpicks give you something to talk about.)
(Still, I've imagined a site where commenting is explicitly broken into two columns: Positive comments and critical comments. Then you could self-select to bask in the positivity for a bit before diving into the nitpicks.)
> "can we make an AI that beats humans while playing in a 'human like' fashion?"
Yes, now.
From the article:
“I was impressed to see AlphaStar pull off advanced moves and different strategies across almost every game, using a very human style of gameplay I wouldn’t have expected,” he said. “I’ve realised how much my gameplay relies on forcing mistakes and being able to exploit human reactions, so this has put the game in a whole new light for me. We’re all excited to see what comes next.”
I was impressed with DeepMind's work on Go and Chess. It seemed to truly grasp the strategy and "understand" the games better than any human, which is in stark contrast to previous engines that relied on brute-force tactical brilliance.
AlphaStar plays like I'd expect a computer to play. It makes some very stupid decisions that suggest a lack of real "understanding" or strategic thinking, such as its wacky unit compositions, the packs of five observers moving around together, and its miserable response to the immortal drop in the last game. It won through superhuman multitasking and micro, which is where I'd expect it to shine.
We tend to think in hierarchical terms: tactics, strategy, and the like. It allows us to quickly create imperfect viable solutions. AlphaStar is probably lacking such distinction and sees the game as a sequence of actions. So it takes a lot more time for it to learn behavior patterns that we call strategy.
I think after training for maybe 20 thousand in-game years it will have decent strategy, while still lacking a real "understanding" of what a strategy is.
We probably will not see humanlike learning speed and adaptability before development of methods for learning hierarchical representations.
It’s more so that the purpose of these tests is to see if the computer can strategize well. The result seems to be... not really, not yet at least.
To take a terminator situation, sure the robot will be a lot better at gun fighting, but if we can kite it with trivial attacks to make it walk into an obvious trap (not really that different from what happens in Starcraft today) then it’s not so threatening.
Nobody denies that machines are better at small tasks that are founded on reaction time, precise inputs, and wide span of attention. But the thing we’re trying to actually get them better at is strategic planning.
The proof is in the pudding, and we are making more and more pudding everyday. Instead of caring about naysayers, we need to be working under the assumption AI is here to stay and rapidly expanding in scope, and we need to build the social and political structures to be able to handle it.
The saying is actually "The proof of the pudding is in the eating". I don't quite understand what "the proof is in the pudding" is supposed to mean and how that relates to making a lot of pudding ^^.
>we need to build the social and political structures to be able to handle it.
This would be a massive waste of resources depending on how far you misinterpret the nature of the “AI that’s here to stay”. There is little to support that we’re near-approaching a general purpose, “true” AI, the kind of superintelligent, creative, potentially world-ending and self-improving thinking machine that brings us to the singularity. It’s much fairer to characterize current technologies as algorithms that excel in certain fields, with distinct limitations that we’re still exploring; we have some idea, but not a perfect one, of where those limits are.
Functionally, they just find probability matrices for a certain sequence of actions, and can search the problem space much faster than we did before, by simulating the event indefinitely. And they come with the issue that anything that can’t be simulated well and quickly can’t be “AI’d”, as well as the common issue of catastrophically failing due to not actually understanding the object in total (e.g. change a picture of a cat to an ostrich by editing key pixels). And afaik, they’ve shown no ability to “change the problem”, a key component of creativity (if you can’t find a good answer to something, consider changing the question; our “AI”s do not.)
And this is naturally why they excel at games (almost by definition a repeatable simulation) that typically have a very well-defined question. But at the same time, we don’t expect current AI to be capable of taking its “strategies” forward to the next update of StarCraft (the problem changes) without re-searching a lot of the problem space (there exist algorithms for training future networks; I don’t know how much progress they’ve made), because they don’t really have strategies in the first place, or a real model for how things interact (they struggle to predict new interactions without simulating them, or rather, “experiencing” them).
Which is also why it’s difficult to imagine AIs will ever truly be driving cars around with the current tech; rather, they’ll likely succeed as an awkward combination of neural networks, expert systems, heuristics and safeguards. We’d naturally expect most sci-fi use cases of AI to be the same, e.g. political decision making. And they’ll be limited to the extent that we can render simulations.
And if we pretend these distinctly limited algorithms are in fact the predecessors to our post-singularity successors, simply because they’re able to do a few things we weren’t really expecting (just as computers have proved to be a whole lot more capable than the 60’s general population thought, but far less than what 60’s scifi thought), we’ll do a whole lot of work for quite a bit of nothing.
The fact that it’s called AI doesn’t mean we’re quickly approaching Star Trek’s Data AI. It didn’t mean it in the last few AI hype cycles either.
I don't think the algorithm is changing. The infrastructure, engineering, practice regimens and environments, the training data, the selection of metrics for success, are changing. One can say broadly they are part of the algorithm, but these are learnings for curators of AI that can easily be transferred to other spaces to "generalize" as well.
Man, I can imagine the militaries the world over are rubbing their hands in anticipation when watching this. Imagine a perfectly orchestrated swarm of UAVs dealing damage like this (potentially to another swarm).
It's not "games", as it was a single game, and it came with the big caveat that they didn't have much time to actually prepare and train this one as much. The one that won against Mana was trained for 2 weeks, versus this one that only had one week of training.
While the AI did have an advantage in terms of micro/camera view, it still was able to make decent decisions and independently come up with a bunch of interesting strategies. At the end of the day, that's really the goal of the research, not whether or not it uses the right number of APM or uses the camera properly. Those are just artificial restrictions put in place to make it look fair and entertaining.
I would love DeepMind to put that AI on the ladder as they discussed during the last Blizzcon.
The APM and camera restriction is not for entertainment, it's to develop intelligence rather than 1500+ APM. The interesting part of StarCraft II is decision making and the meta of the opponent, and we didn't see that today.
Remember their Dota2 bot that was beaten by a lot of players after a single day. I want to see if AlphaStar can really adapt to the real world, therefore the meta, the real challenge of StarCraft.
I honestly do not think meta or strategy will be interesting in SC2. (Obviously as a player it will, but from an AI standpoint not.) AlphaGo already showed us that it can handle strategy well; a good AI in SC2 will simply scout the minimally needed amount of time to prepare the perfect responses.
It's certainly interesting but reminds me of Watson playing Jeopardy against humans but having the questions fed to it electronically. Half the challenge of the game is buzzing in first. For humans, that requires reading/listening and potentially making a judgement that you'll be able to answer the question and buzzing in before even hearing the whole thing. Same thing for StarCraft. If I could map out my moves in advance and feed them to the API with precise timing, I think I could likely beat a lot of pros - all the strategies mentioned in the article are well known. The dexterity and timing are a huge part of the challenge.
They mention that the latency of the neural network, from input to action, is around 360ms, which is on the lower end of professional level.
It can do context switches between micro/macro very quickly, which is where its strength lies, and it did some very impressive micro in the later matches against MaNa, but it didn't win solely off the back of "computers are fast".
I feel like this was a lost opportunity to play it centaur style, letting a human choose and play an overall economic strategy (macro) and let the AI do the combat (micro).
It would be interesting to see a match of [human controlling macro while computer handled micro] vs [computer controlling macro and micro].
I haven't played the game.
They did the same with AlphaGo back then. First get some "good" player in there to see if they are on the right track. Then get a better player in there. Finally, prepare for a real showdown. In this case it would be ShowTime, Neeb or even some Korean pro (e.g., Stats). Maybe it's time to switch matchups and make it PvZ - get Serral and then let's see if the current SC2 champion is good enough to beat this AI.
So... I’m curious how long this will be before we can apply this to real life?
At this point it seems like it’d be fairly complicated, but you could build a solid simulator for battles. Then direct humans and / or robots around the battlefield as necessary to win a battle.
Upload a virtual map utilizing some point clouds, estimate densities, start with estimating enemy combatants, add some scoring metrics negatively impacting civilian deaths and probably a lot of other stuff. Run real life scenarios in training environments, and bam.
There are a lot more variables that need to be added for an "AI general".
- There needs to be a system for simulating real-world battles. (Since we need to iterate the AI, after all.) In WW2 the WATU was a good simulation of German submarine vs. Allied Convoy battles, but I imagine that ground battles are messier. Link for background: https://www.youtube.com/watch?v=fVet82IUAqQ
- Autonomous directions. If a unit loses contact with the AI, what orders should they follow?
- Need to react quickly to changes in the effectiveness of weapons. If army-B has a surface-to-air missile that has an 80% hit rate, rather than the estimated 40% hit rate, the AI needs to adapt.
- Different armies have different tolerances for casualties, both military and civilian.
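To make the hit-rate bullet above concrete, here is a minimal sketch of one way an estimate could be revised as observations come in, using a Beta-Bernoulli update. This is only an illustration: the function name, the prior strength (about 10 shots' worth of belief) and the field data (16 hits in 20 launches) are all made-up assumptions, not anything from the article.

```python
def update_hit_rate(prior_hits, prior_misses, observed_hits, observed_misses):
    """Posterior mean of a Beta(prior_hits, prior_misses) belief after new shots."""
    a = prior_hits + observed_hits
    b = prior_misses + observed_misses
    return a / (a + b)

# Assumed prior: roughly a 40% hit rate, worth about 10 shots of evidence.
prior_hits, prior_misses = 4.0, 6.0

# Hypothetical field data: 16 hits out of 20 launches (80%).
estimate = update_hit_rate(prior_hits, prior_misses, 16, 4)
print(f"revised hit-rate estimate: {estimate:.2f}")
```

A weaker prior would move toward the observed 80% faster; how much weight to give the original 40% estimate is exactly the kind of judgement the bullet is pointing at.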
I suspect you're getting downvoted because people don't like the idea of military-general AI. I don't really love it either, but it's going to happen. Hopefully we can encourage its programmers to include the Geneva convention rules for war.
I would really like to see this for Age of Empires II. I think AOE has far more races and is a far more complex game ( although I'm biased because I haven't played SC2 as much as AOEII ).
I have played both games and am a fan of both. Starcraft is definitely more complex than AoE for AI development and that's why the researchers must have chosen it. The complexity of AI depends on how many potential decisions you can make at any point of time. Here are a few reasons why:
1) Starcraft races have completely different build trees and different advantages. This has a large cascading effect of early decisions in the game.
2) Starcraft has much more micro potential than AOE. There are many units with sort of super powers: Stalkers can teleport and infestors can take control of enemy units. In AOE, you can only issue attack commands to most units.
3) Variety of units. Since Starcraft also has air units which may or may not be able to attack ground units, you have more options within a race to create a unique army/air force combination.
4) The Starcraft map terrain is hierarchical whereas AOE happens on a flat map. There are interesting locations on the map where a smaller army may be able to defeat a larger army based on positioning.
AoE maps are not flat and units have an attack bonus when uphill (and defense penalty when downhill). Top players will place castles on hills for example and micro their units so that they are more elevated than their opponents.
AoE has monks which can take control of enemy units.
The fact that Starcraft has more micro potential should make AI development easier, not harder. Micro management is a relatively mechanical task that is time consuming for humans but which a computer should excel at.
Micro is often written off as a mechanical task, but this is only relevant with expert systems (like microbots).
There is a lot of tactical nuance and understanding in micro. For a neural network/learning algorithm to understand and optimize it is actually very impressive, just as much as macro.
Knowing it's optimal to blink when your shield runs out before taking hull damage? To dance your weakened stalker to bait your opponent to overextend? To focus fire but not overkill? To wait for a projectile to be mid-air before blinking or picking in a transport? There's so much more to micro than precision and APM.
To your other points, SC maps have lots of ramps and cliffs and high-ground/low-ground impact (no bonuses, but vision constraints and battle surface area and choke points).
AI development is not easier because of micro potential. It means the AI needs to be prepared to deal with a wider range of effectiveness for any given unit or set of units; they won't always scale linearly in impact or across different players' playstyles.
There is the concept of height. Units reveal the fog of war around them up to a certain distance unless they are a ground unit and the tile is higher than them. This leads to things like marching your units up a ramp, and not seeing that there's a ton of enemy units in there until you're right in the middle of them.
Also, if units are walking through a valley, units on the high ground can shoot the ones on the low ground but not vice versa.
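The vision rule described above can be sketched in a few lines. This is a simplified illustration of the rule as stated in this comment, not the game's actual implementation; the `Unit` class, its fields and the `can_see` function are hypothetical names.

```python
import math
from dataclasses import dataclass

@dataclass
class Unit:
    x: float
    y: float
    height: int          # terrain level the unit is standing on (or flying at)
    sight_range: float
    is_air: bool = False

def can_see(unit, tile_x, tile_y, tile_height):
    """A tile is revealed if it is in range and, for ground units, not uphill."""
    in_range = math.hypot(tile_x - unit.x, tile_y - unit.y) <= unit.sight_range
    uphill = tile_height > unit.height
    return in_range and (unit.is_air or not uphill)

# Hypothetical ramp: a ground unit at the bottom, an air unit hovering next to it.
ground_unit = Unit(x=0.0, y=0.0, height=0, sight_range=10.0)
air_unit = Unit(x=0.0, y=0.0, height=0, sight_range=10.0, is_air=True)

print(can_see(ground_unit, 5.0, 0.0, tile_height=1))  # high ground, ground unit
print(can_see(air_unit, 5.0, 0.0, tile_height=1))     # high ground, air unit
```

Under this rule the ground unit only learns what is at the top of the ramp once it has walked up, which is the ambush scenario described above.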
I haven't played AOE for ages, but aren't the races only different in a few details like 2 units and some passive effects? In StarCraft races are vastly different.
I can't really judge AOE's complexity, but from what do you draw your conclusion? Just based on the number of rules?
Each civ (race) has about 2 different bonuses (e.g. Indian villagers work faster) and in addition a unique unit. Competitive players now play with about 20 civs. Each civ also has a "team bonus" (e.g. Teams with a Spanish player get 33% more gold per trade trip, teams with a Persian player get upgraded buildings ). In addition to that, each civ has a different tech tree, as well as starting resources or -- in rare cases -- starting units.
Same, totally agreed. I'm a really experienced competitive AoE2 player (not pro, just a serious casual), and I think the long-term planning inherent to AoE is far more complex than SC. Not to mention the procedurally generated maps that are different every game-- it's far more nuanced in my view and requires a truly vast amount of knowledge to make good decisions. There are so many different valid approaches to any given situation.
They really should have used BroodWar for deep mind. SCII is way too volatile. Way more gimmicks available that will make it difficult for AI to come close to a human player.
As much as I love brood war, great micro over a large number of units with wide awareness (things which appear to be easy to AlphaStar) would be ridiculously overpowered, even with completely average strategy. Brood War is an awesome game because it's constantly asking too much of its players at all times. I can't imagine games against an AI which has far more attention and micro to give would be too interesting.
They are planning on limiting the APM of the AI anyway. With BW, you can focus on an AI learning basic strategy with incomplete information. With SC2, you mix in a whole assortment of crazy abilities like force fields and recall that it has to figure out. It's going to take much longer for an AI to anticipate when its opponent is going to force field the ramp and warp into the main.
> They are planning on limiting the APM of the AI anyway.
They did that in these games. (Or at least, it didn't abuse absurdly high APM.)
It still had insane micro a) because it had a FoV which basically extended to the combined FoV of all of its units[1] rather than having to move a screen-size FoV, and b) when it micros it never really "misclicks" like a human would do under pressure.
(This was most obvious in how it could micro against MaNa's army on 3 fronts in game 4 and how it was able to basically perfectly drain the Immortal barriers in the game where MaNa actually should have been able to defend against a human mass stalker build ~100% of the time.)
One thing they weren't clear on was how it could tell how much health, etc. each enemy unit had -- did it have to spend an "action" (like a player would have to click) to do that? If so, then that's even more insane in terms of micro ability.
Anyway, disagree about BW. Perfect micro in BW is possibly even more devastating than in SC2, IMO, because there are all these weird glitches that you can do -- the best players can do them some of the time, but no players can do it perfectly all of the time.
[1] This was not the case for the final game which it lost.
>It still had insane micro a) because it had a FoV which basically extended to the combined FoV of all of its units[1] rather than having to move a screen-size FoV, and b) when it micros it never really "misclicks" like a human would do under pressure.
This is an oversight that I imagine they will eventually fix as well. Doesn't make sense to allow the AI to do this, because the focus is on the AI understanding the game.
>Anyway, disagree about BW. Perfect micro in BW is possibly even more devastating than in SC2, IMO, because there are all these weird glitches that you can do -- the best players can do them some of the time, but no players can do it perfectly all of the time.
I didn't mention micro specifically, I mentioned abilities. The perfect micro issue can again be solved by limiting the AI's APM. They may be able to execute micro tricks, but not constantly.
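As a rough sketch of what "limiting the AI's APM" could mean in practice, here is a sliding-window limiter that allows short bursts but not sustained ones. The class name and the numbers are my own assumptions for illustration; nothing here is taken from how DeepMind actually constrains the agent.

```python
from collections import deque

class ApmLimiter:
    """Allow at most `max_actions` actions in any window of `window_s` seconds."""

    def __init__(self, max_actions, window_s):
        self.max_actions = max_actions
        self.window_s = window_s
        self.recent = deque()  # timestamps of actions that were allowed

    def try_act(self, now_s):
        # Forget actions that have fallen out of the window ending at now_s.
        while self.recent and self.recent[0] <= now_s - self.window_s:
            self.recent.popleft()
        if len(self.recent) >= self.max_actions:
            return False  # over budget: the action is dropped
        self.recent.append(now_s)
        return True

# Hypothetical budget: 3 actions per second (180 APM).
limiter = ApmLimiter(max_actions=3, window_s=1.0)
results = [limiter.try_act(t) for t in [0.0, 0.1, 0.2, 0.3, 1.05]]
print(results)
```

With a budget like this the agent can still pull off a trick now and then, but it has to choose which actions are worth spending on.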
Once machines exceed normal human play, further research does not focus on trying to make the machines play badly so that it's fun to face humans again. Whatever would be the point of that?
Instead we just watch the machines versus other machines.
This is why we didn't see "AlphaZero plays chess versus grand master" games - they'd be dull, A0 wipes the floor with grand masters because it's an AI and grand masters aren't, boring.
But lc0 and similar have been entering computer chess competitions with (a clone of) the Google Alpha Zero design. It does pretty well.
Just as with TAS in speed running, you get a synergy. On the one hand, the machines play a distinctly different game, perfect on its own terms: a TAS run never succeeds in a frame-perfect trick on the second or third try, always the first, and the AI will never mis-blink a stalker to a pointless death. But human play continues, not against the machine but parallel to it, and learning from it. GoldenEye speedrunning was hugely influenced by TAS findings. Modern human chess is influenced by the machine chess play styles.
Are you so sure? APM limits on the AI aren't necessary, which is a real insight FTA:
>In its games against TLO and MaNa, AlphaStar had an average APM of around 280, significantly lower than the professional players, although its actions may be more precise. This lower APM is, in part, because AlphaStar starts its training using replays and thus mimics the way humans play the game.
Where does it say the APM spiked during battles? If it's using it head-to-head with the human player, then that's bad, but how do you know it's not just herding workers and units individually?
I get the analogy, but based on the distribution, 75% of the APM for AlphaStar is below the mean APM for the human during gameplay. Following that analogy, both cars are on the racetrack, except one is applying acceleration with more precision, i.e. accelerating out of a turn rather than always accelerating. Capping acceleration wouldn't matter in this case.
The APM maxed out at 1500 at one point, far in excess of any human, especially considering many human actions are meaningless actions or spam clicks. The AI could win based on inhuman micro with stalkers and blink. I would be far more interested in seeing the agent with a fully realistic camera and fully realistic apm limit.
It does make a difference, because if the APM spike is during a head-on interaction with the human then it provides a competitive advantage, if it's randomly during another part of the match, then it doesn't provide a competitive advantage.
I'm not really sure why you think Brood War is such a pure example of basic strategy without difficult to comprehend mechanics. Even in your example, recall is far crazier in Brood War. Arbiters can also create a version of a force field on a ramp that's uncounterable and lasts far longer. They're also harder to defend against because they have so much HP.
The fact that arbiters are stronger or weaker than the equivalent abilities in SCII isn't the issue. The issue is that getting the AI tech to the point where it can figure out that you have to sac a base, because it will waste time sending your army there only to have them recall, is going to take a lot longer. An AI can figure out that there is an arbiter on the map, and that means there is a potential for it to recall some units into your base. That's easy.
BW isn't the perfect example of pure strategy, but there are dramatically fewer game-changing abilities in it. Think cyclone/marauder cheeses or 2 base warp gate all-ins or ravagers sieging your wall. SCII in general makes attacks come quicker relative to scouting. Injects, chrono, and mules make tech and attacks come sooner compared to BW. Reliable scouting is still basically tier 2.
What's going to be uninteresting is when their AI can never get past bronze league because the AI can't figure out that his bunker is going to get force-fielded or that observer is going to be used for a blink all-in.
I had a feeling we'd see Brood War elitism in here. There are many, many gimmicks in BW. Every time I watch ASL and KSL, I see numerous quick stomps based on gimmicky builds, just like in SC2, Dota, and every other strategy game with fog of war and a collision of messy strategies.
In the games it won, it had access to inputs human players physically can't use (controlling units off-screen without panning the camera first), so I'm hesitant to call the victories legitimate.
Edit: And reviewing the game it lost, the decision making was questionable at best.
The problem I see is the AI is not better than the human, it's just 1000x faster at APM / macro. It could have a "shit" bo and still win because it's so much faster than a human. It's not really interesting tbh.
Wrong. Did you watch the presentation? Its APM was less than half that of its human opponent. If you aren't going to watch the presentation first, don't try to get involved in the discussion in such a derisive and dismissive way. "It's not really that interesting tbh"....wow. It's something that's never been algorithmically done before, no matter who tried. How can you write that off as "not interesting" as if it's totally insignificant?
That's instantaneous APM, not general APM over the course of the game. The human player's instantaneous APM went well past 1000 at certain moments during the course of the matches, too. All that means is a few clicks as fast as possible.
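To illustrate the difference between the two numbers being argued about, here is a small sketch that computes both from a list of action timestamps. The function names and the example game are invented for illustration; they are not the measurements from the presentation.

```python
from collections import deque

def average_apm(timestamps, game_length_s):
    """Actions per minute averaged over the whole game."""
    return len(timestamps) * 60.0 / game_length_s

def peak_apm(timestamps, window_s=5.0):
    """Highest APM reached in any sliding window of `window_s` seconds."""
    window = deque()
    best = 0
    for t in sorted(timestamps):
        window.append(t)
        # Drop actions that fell out of the window ending at t.
        while window[0] <= t - window_s:
            window.popleft()
        best = max(best, len(window))
    return best * 60.0 / window_s

# Hypothetical game: one action per second for 10 minutes,
# plus a 2-second burst of 40 extra actions during a fight.
steady = [float(s) for s in range(600)]
burst = [300.0 + i * 0.05 for i in range(40)]
actions = steady + burst

print(average_apm(actions, 600))
print(peak_apm(actions, window_s=2.0))
```

The same stream of actions gives a modest game-long average and a peak above 1000, so quoting one figure without the other says little. Neither number says whether the actions in the burst were precise or spam, which is the other half of this argument.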
Yes, but individual perfect micro of 50 blink stalkers is beyond the capability of any pro, whereas I can spam-create Zerg units by holding down a key and have higher APM for a split-second, and I'm just a garbage diamond player.
You can't compare the two. No human can do 1000 meaningful actions per minute. Even when a human is doing 400 APM, probably 100 of those are meaningless spam clicks. All of AlphaStar's actions are meaningful actions.
So when AlphaStar is doing 1000-1500 APM, there's no way a human could do that or compete with that. This doesn't impress me because it's obvious if you create something that can do something way faster than a human it will win.
I watched it and it doesn't mean anything. They said it has lower APM than a human, but what does APM mean for a computer anyway? APM is just a way of measuring how fast you click on a mouse / keyboard, but the decision making is not related to the APM speed. My point is it could click 10x slower than a human and still beat them at macro.
The real issue I have here is the game can be won by just outplaying the other at macro.
Edit:
They actually say exactly what I just said:
> These results suggest that AlphaStar’s success against MaNa and TLO was in fact due to superior macro and micro-strategic decision-making, rather than superior click-rate, faster reaction times, or the raw interface.
It feels like bruteforce not really AI, that's my point.
If a computer knows everything about the game (movement speed, projectile speed, etc.), how can you call it smart when it's just computing power to know when to engage and when not?
....but.....but...that's EXACTLY the challenge of RTS games.....no AI has been able to beat a human at macro until now. That's the whole point...
> was in fact due to superior macro and micro-strategic decision-making
That's exactly the crux of the accomplishment! How are you writing this off as not impressive?! Like, what exactly do you want the AI to be able to do if not win by superior strategic decision-making??
> how can you call it smart when it's just computing power to know when to engage and when not.
But it ISN'T just computing power, any more than AlphaGo was. If it was, then they would have already done this years ago, wouldn't they? Anyone could just rent some servers off AWS and do it. It isn't even remotely that simple.
It's a major work of algorithmic innovation, and you don't seem to understand that.
GP has it backwards. The AI won on micro, not macro.
Computer APM is not the same as human APM. Humans have numerous unnecessary actions, whereas computers don’t. This is visible just watching the game. The computer dominated on micro.
When the computer lost, it was because it failed to cope with a basic macro curveball.