It would be interesting to know AlphaGo's estimated probability of winning as the game progressed; presumably we could then see directly how worried it was at any given point in the game, and thus get another sense of whether it ever really felt threatened by Sedol.
At this point, the match is won, but games 4 and 5 will commence. The question shifts from whether AlphaGo is better than humanity's best, to whether humanity can even have a chance of beating AlphaGo in a single game. And so far, it sounds like the answer is likely to be a resounding no.
This isn't exactly what you were asking for, but here are some graphs from a different Go AI (Facebook's "darkforest") showing its estimated win probabilities: http://zhuanlan.zhihu.com/yuandong/20639694
Neat! My understanding is that AlphaGo is a stronger player, and so will likely have different estimations than the darkforest. But this is still really cool.
Looked up the strength of darkforest, and it is claimed to be mid-level amateur, similar to CrazyStone and Zen. So the close measurements probably aren't too precise (when both are around 50%), but the breakaway is probably about right.
First game, AlphaGo looked ambiguously weak until the end of the mid-game, when it suddenly looked very strong.
Second game, AlphaGo looked obviously strong throughout.
So the probabilities are maybe not entirely wrong.
For reference, I am rated 4k, somewhat below darkforest (1d amateur) and so very, very far from the 9-dan pros that AlphaGo is beating. Usually pro matches are hard to understand, but the excellent commentary and abundant community analysis help greatly.
I felt that Lee thought globally sometimes, especially when he had time on the clock, but his mind could only take so much thinking, and time was catching up, so for a great many moves he thought locally.
AlphaGo doesn't tire and thinks globally on every move.
We saw that time and time again. Lee playing locally and AlphaGo surprising everyone with an unexpected move somewhere else on the board.
Obviously it is better to be globally optimal than locally optimal.
I believe the OP refers to global/local as they are used in Go. There, global/local play is a concept taken for granted by humans, with explicit names like "sente" and "tenuki".
I think the OP simply implies AlphaGo doesn't seem bound by these human cognitive crutches.
This is not the same as "global" in the theoretical sense, as in "finding the perfect solution / exhaustive proof". See also "game temperature" [1]
Correct. Google seems to agree: "AlphaGo has the ability to look “globally” across a board—and find solutions that humans either have been trained not to play or would not consider." [1]
AlphaGo is approximating global optimality by finding local optimality. Local optimality is already computationally very hard, but it is exactly what AlphaGo is doing.
The rollouts it does, evaluating every probable move, are a search process trying to find local optimality. You have a current state of the board and you're trying to find a decision that minimizes your future regret. That is by definition local optimality.
Global optimality would be finding a sequence of moves that wins you a game.
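To make that concrete, here is the rollout idea as a minimal sketch (the state object with legal_moves/play/is_terminal/winner/to_move is a hypothetical stand-in, not AlphaGo's actual interface):

    import random

    def rollout_value(state, player, n=200):
        # Estimate player's win probability from this state via n random playouts.
        wins = 0
        for _ in range(n):
            s = state
            while not s.is_terminal():
                s = s.play(random.choice(s.legal_moves()))
            wins += (s.winner() == player)
        return wins / n

    def best_local_move(state):
        # "Local optimality": pick the move whose playouts look best from the
        # current position; nothing guarantees it lies on a globally winning line.
        return max(state.legal_moves(),
                   key=lambda m: rollout_value(state.play(m), state.to_move))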
> Global optimality would be finding a sequence of moves that wins you a game.
Is it known that Go is winnable? It may be that only a draw can be guaranteed. A globally optimal player would provably be able to achieve whichever of these it turns out to be, from any legal position. That is harder than winning a particular game.
The 7.5-point-komi variant played by AlphaGo and Lee has a win-or-lose outcome; there is no draw. But yes, a more formal definition of global optimality would not include victory as a necessary outcome.
"the stronger player will typically play with the white stones and players often agree on a simple 0.5 point komi to break a tie ("jigo") in favor of white."
Which suggests ties can occur in practice with human players, but are then broken wrt an external parameter.
In terms of optimality, is it known in Go whether an optimal player can force a win against (i) all suboptimal players; and (ii) another optimal player, based on whether they play first or second?
I realize we don't know the optimal policy but sometimes we can decide (non)-existence anyway. Anyone happen to know for Go?
Instead of downvoting this comment, let me provide a counterexample. Imagine the knapsack problem, where you can lift 10 kg and you have a 10 kg gold bar and 5 kg cinderblocks. In this case solving two 5 kg knapsack problems will give you two cinderblocks, which is obviously not optimal.
I don't know a lot about the strategy of go, but it seems to me that any play that doesn't take into account the entire state of the board is allowing for the same class of sub-optimal behavior as the example above.
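To spell the counterexample out in code, with made-up values (gold bar worth 1000, cinderblock worth 1):

    from itertools import combinations

    # (name, weight_kg, value)
    items = [("gold bar", 10, 1000), ("cinderblock", 5, 1), ("cinderblock", 5, 1)]

    def best(stuff, capacity):
        # Brute-force 0/1 knapsack: the best-valued subset that fits.
        fits = (s for r in range(len(stuff) + 1)
                for s in combinations(stuff, r)
                if sum(w for _, w, _ in s) <= capacity)
        return max(fits, key=lambda s: sum(v for _, _, v in s))

    print(best(items, 10))  # the gold bar alone, value 1000
    print(best(items, 5))   # one cinderblock, value 1 -- so solving two 5 kg
                            # subproblems yields value 2, not 1000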
In general, it's false that optimal solutions to a problem's sub-problems combine into an optimal solution to the whole. (Wikipedia's article says the same thing differently: "In computer science, a problem that can be broken apart like this is said to have optimal substructure.", implying some problems can't be broken apart like that.) Also see: https://en.wikipedia.org/wiki/Optimal_substructure
Maybe Go has this property of optimal substructure. Maybe it doesn't. I don't know, but it sure isn't immediately evident.
Under this definition of "subproblem" the different parts of a Go board are not subproblems either.
If the moves on different parts of a Go board didn't influence each other and could be solved separately via dynamic programming, it would be a much easier game (and probably uninteresting).
From the very section of the article you quoted: "In computer science, a problem that can be broken apart like this is said to have optimal substructure." - There is no reason whatsoever that every problem can be broken down like this. In fact there are many problems for which it is known to be impossible. The article on optimal substructure lists a few.
Yes. The principle of optimality only applies to problems with optimal substructure, which are the ones dynamic programming solves exactly. That is a small subset of problems, and the capacity-splitting decomposition of the knapsack described above is not an instance of it: splitting the weight budget in half does not have optimal substructure.
But in Go you can only make one move at a time. So even if a move is optimal for one part of the board, there might be a different move that you should have made first.
> Obviously it is better to be globally optimal than locally optimal.
the implication of having machines dictate our behavior [WRT environment, each other] to optimize the survival of humans far into the future seems a bit closer to reality.
I don't want to downplay the amazing breakthrough that AlphaGo represents, but I think it's a bit melodramatic to extrapolate from a program winning a board game to AI's impact on humanity's survival.
it's a board game of military strategy. while it is severely limited by its rules of play relative to real life, it is not inconceivable that at some point it could dictate life and death as a capable military advisor.
i think it is inevitable that we will utilize AI to make better predictions about our own future than weather forecasters. not in our lifetimes, of course.
the program that beat Kasparov did so using a brute-force search (full?) of a very limited search space. the reason this AI is different is that it doesn't need to (and couldn't anyway, because the space is simply too large.)
and 20 years ago is still well "within our lifetimes"
The program that beat Kasparov avoided a full breadth-first search because it was fairly easy to decide whether something was a bad move, so it could generally pick from a fairly small number of moves when branching. What makes Go so much harder than chess is how hard it is to evaluate the board and decide whether a move is good or bad.
First order branching factor is less important than you might think.
You could set up a game of Nim: https://en.wikipedia.org/wiki/Nim with a much higher branching factor than Go, but because you can trivially separate winning moves from losing moves, those branches become irrelevant in Nim. That is, if there are 2,000 winning moves, then they are all equivalent.
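Concretely, that "trivial separation" is the classic nim-sum rule: the side to move is winning iff the XOR of the pile sizes is nonzero, and any move that zeroes the XOR is a winning move:

    from functools import reduce
    from operator import xor

    def winning_moves(piles):
        # A move (pile i, take k stones) wins iff it leaves the XOR of all piles at 0.
        nimsum = reduce(xor, piles, 0)
        return [(i, n - (n ^ nimsum))        # take this many stones from pile i
                for i, n in enumerate(piles)
                if (n ^ nimsum) < n]

    print(winning_moves([3, 4, 5]))  # [(0, 2)]: reduce the 3-pile to 1
    print(winning_moves([5, 5, 5]))  # three winning moves, all equivalent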
how do you know that AI is not already advising the military on certain tactics? even an isolated system that analyzes data gathered by intelligence analysts, weather, radar, etc. and can suggest a flight path and timeline for insertion or airstrike (especially in context with other operations in-theater) would basically be that, wouldn't it? i mean, i doubt it's guys on the DoD version of google maps plotting routes by hand.
i have a hard time believing something like that isn't already in use to some extent, at least for special operations forces. how long until they just hook that into the drone fleet? what if they already have? autopilot on a modern fighter/bomber is basically the same thing, except there's a human in the loop, right?
that would also go along nicely with the whole matrix / skynet plot that seems to be playing out in real life right now...
Humans are better at thinking globally than they are at acting globally. It's not clear that a "wise" AI would be helpful to us. It would tell us to do things that we already know we ought to be doing, but that we don't do for local, near-term-selfish reasons.
this is true, but it mostly comes down to 1. don't obliterate each other, 2. don't destroy your dependencies, 3. don't out-consume your dependencies (overpopulate, etc).
however, point 2 is only understood from a very high level by humans. we know the dependency chain goes deep, but we don't know exactly which branches can be ignored without effect and which are critical.
our default is to protect all the things "just in case"
"Optimization" presumes the existence of concrete criteria by which an outcome may be judged, and often a timeframe within which to judge it (e.g. if I want to maximize my wealth when I retire I should fully fund my 401(k), but if I want to maximize my wealth next year then I shouldn't fund it at all). A board game has unambiguous criteria for victory; human civilization, not so much.
Skeptics have said that achieving this milestone would not happen within this decade, our lifetimes or even ever. It happened yesterday.
It took Lee Sedol decades of his life to train to achieve this level, and his ability to pass on his skills is limited. Now that a computer has achieved this level, the state of the neural network behind it can be serialized and run on an unlimited number of computers, yielding millions of systems that are more proficient at Go than the best player in the world.
People have said that achieving the cognitive level of the human brain requires matching its computational power. But if you take out all the parasympathetic and motor boilerplate, what is actually left for mental tasks is much less than that; most of that power is never even recruited for higher-level mental tasks. That lowers the bar for strong AI.
Then, strong AI can be immortal, and never physically deteriorate from aging. Strong AI can multiply infinitely and communicate at a rate that would be equivalent to writing millions of books in a second. It could transfer all its knowledge in seconds. It can also recursively improve itself. This advantage will lower the bar for strong AI even more.
It's worth remembering that AlphaGo was partially, initially trained on the previous game records of professional players, to form the basis of its policy network and so on. AlphaGo's victory over Lee Sedol is impressive, but it does rest on the combined total experience and long study of humans. That's why it's even more exciting to me to hear that one of the DeepMind team's next efforts will be to see how AlphaGo plays when it starts learning "from scratch".
I agree with the excitement. A possible outcome is that the AlphaGo trained "from scratch" will not be as strong--but another possibility is that by eschewing any bias for human tradition, perhaps it will come up with even better strategies that no one has thought of before.
This was my argument against the StarCraft claim. If it wins, it will likely have just imitated thousands to millions of games by human pros, like a smarter version of the old chatterbots.
Whereas good human players learned as they went, with far fewer matches, through trial and error, planning, and sometimes learning from pro games. We can even go from Age of Empires to StarCraft with little preparation and still do OK.
Can anyone elaborate on how they would teach it from scratch? Would this mean only giving it the ability to play valid moves (against itself) and giving it access to the final score?
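Concretely, I imagine something like this minimal self-play loop, where the final result is the only training signal (Net, GoBoard, reinforce and friends are hypothetical placeholders, not DeepMind's API):

    import random

    def self_play_episode(net, board):
        history = []                               # (position, move) for both sides
        while not board.game_over():
            moves = board.legal_moves()
            probs = net.move_probabilities(board)  # one weight per legal move
            move = random.choices(moves, weights=probs)[0]
            history.append((board.copy(), move))
            board = board.play(move)
        return history, board.winner()             # the only feedback: who won

    def train_from_scratch(net, episodes):
        for _ in range(episodes):
            history, winner = self_play_episode(net, GoBoard(size=19))
            for position, move in history:
                # REINFORCE-style update: nudge winning moves up, losing moves down.
                sign = +1 if position.to_move == winner else -1
                net.reinforce(position, move, sign)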
On the other hand, I would like to read an assessment of the significance of this advance. While AI may have perfect applicability to these perfect-information games, we may still be decades from useful real-world generic applications.
Exactly so. While this is a major advance, I roll my eyes at the sea of comments hailing the arrival of a superior intelligence.
When the AI is cognizant of the fact that Go is a game, and knows what a game is, and perceives that the game is taking place in a larger reality, where there are other things happening while the game is going on....then I'll be impressed.
Or maybe not. The history of AI is one of humans always moving the goalposts when AI advances.
Chess? Just a computationally simple game.
Driving? Well, it's just physics.
Go? Now that requires intelligence... oh wait, it's just some deep neural savant thing, not real AI.
Well, I think what I'm describing (general awareness) is a little different from your examples there, but I acknowledge my choice of words implied I'm not impressed, which I completely am because this is obviously utterly amazing. It's just not AS big a deal as a lot of people are making it out to be with proclamations of apocalypse.
Huh? IBM Watson was able to answer trivia questions based on some impressive NLP (that I'd like to see more widely available) and a huge database of factual information. Oh, and it was able to beat humans in buzzing in. Watson was impressive, no doubt about it. But it certainly didn't have any meta knowledge of its place in the world or in a contest.
What I mean is that IBM Watson effectively deals with semantic information.
You can make some analogies between sparse matrix data representations and what happens in the mammalian neocortex. It is of course not self conscious, but it effectively deals with semantic information and relationships among it.
AlphaGo's algorithms will generalize easily to other perfect information games, including ones that include randomness like Backgammon. Games with imperfect information will take some more advances, but AlphaGo just opened up a lot of interesting potential paths, so there's a decent chance someone pursuing one of those will find something quickly.
The sheer amount of computing power that AlphaGo needs will be a problem for games as complicated as Go for people who aren't Google. There will probably be more activity in simpler games with interesting properties.
Non-game applications will require a fully formalized set of rules. The fact that formalizing what you want is hard is the reason we need programmers in the first place, so nothing fundamentally different there.
Even knowing what rules one is using isn't always clear.
I did some work for a Forex trader. He had devised an algorithm for trading and used it daily for modest gains. He engaged me to automate it for him via the vWorker freelance website [1]. I got the engine going, pulling in the real-time price data, etc. But when it came to his system, it didn't work: in practice he didn't always act upon the signals he thought he did. We ended up in a payment dispute because he insisted that if I had done the work right, the bot would be making money. By this time, I had learned the error of his thinking. We arbitrated on 50% payment - I wasn't too pleased with that, but I had learned about trading.
I do feel that, given all the hype and speculation about strong AI, researchers in deep learning should really come forward to contextualize the real impact of this, which I think is not all that great.
AlphaGo has no form of state space discovery. That really prevents all the magic that strong AI fans want. Even though the actual rules of Go are a minor part of the program, the human who put them in was doing something that AlphaGo fundamentally can't do. AlphaGo can't learn what Go is, it can only take an existing understanding of the rules and learn to play very well.
You can expect a lot of advancement in actual games very soon, including ones with randomness, and after a few years probably hidden information games too. Being able to specify real-world problems as formally as a board game is now a more effective skill than it was before (it was good already).
In my opinion a possible path is to make some sort of symbolic AI (https://en.wikipedia.org/wiki/Symbolic_artificial_intelligen...) system, and put machine learning on top of it, and even below it. You could start with a very basic symbolic AI that is easier to understand, then grow it gradually. The cool thing about this approach is that you could inspect the symbolic system to get a glimpse of how the AI is thinking. Obviously a purely neural system can also do the trick; that is how the human brain works. But it would be great to build a system based on symbolic AI, because it could be very interesting to observe its inner workings, and it would also be possible to interact with it at the symbolic level.
It is good enough now to replace millions of jobs. Transportation, retail, press, trading, etc.
It is also good enough to put it in a combat drone that visually recognizes objectives and people.
All of this can be commanded by a small group of people, anonymously, detached from accountability, and without casualties.
Part of the significance is that this was not the main goal for DeepMind and Hassabis. He has said his main goal is to solve intelligence, and he is approaching it by trying to reverse-engineer the human brain, having done a PhD in neuroscience. Hippocampus next. At this rate I'd be surprised if it took decades to reach useful real-world generic applications.
Certainly a lot of people ten years ago would have bet against this happening by 2016--especially before Monte Carlo approaches were tried. But I doubt you would have found many takers for betting against it happening in a 100 years (much less it never happening).
Autonomous vehicles are similar. IMO, it would be a fool's bet to bet against them ever happening. But it's reasonable to debate whether they'll be mainstream Level 4 consumer devices in 10 years, 25 years, or 50 years.
Though it's true many people were skeptical about an AI winning at Go played at this level, I think people are mostly skeptical about general AI, not about algorithms that win at playing board games. It's different to be skeptical about specialized AI than about general AI.
General AI could be the biggest ever discovery for humanity. But it will never happen if we don't put in a hard enough effort. The bad thing is that the laws of economics seem to assign a limited effort toward it. In my opinion we should be trying harder. Worth mentioning is OpenAI (https://openai.com/), an effort by Y Combinator, Elon Musk and others. Brilliant initiative.
Overpopulation in terms of running out of space is definitely overplayed. But it's a direct multiplier for our effect on the environment. If the world's population were cut in half, we'd emit about half as much pollution, half as much greenhouse gas, half the land used for farming would be left wild, etc.
Not necessarily... carbon emissions per capita are not constant around the globe. Energy use is skewed toward modern countries, e.g. the United States. If you wiped out the 300 million people in the US, you would save more on carbon emissions than by wiping out 300 million in China.
Technically, it's two neural networks, each calculating reasonably simple functions (an estimate of the value of a board position and an estimate of which moves might be good) combined with a tree search algorithm.
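For a feel of how those pieces fit together during the search, here is a much-simplified sketch in the spirit of the Nature paper (the node fields Q, N, P and the net interfaces are hypothetical stand-ins):

    import math

    C_PUCT = 1.0  # exploration constant; illustrative, not the tuned value

    def select_child(node):
        # Prefer children with high value estimates (Q), nudged toward moves the
        # policy network liked (prior P) that haven't been visited much (N).
        total_visits = sum(c.N for c in node.children)
        return max(node.children,
                   key=lambda c: c.Q + C_PUCT * c.P * math.sqrt(total_visits) / (1 + c.N))

    def evaluate_leaf(leaf, value_net):
        # The value network does in one forward pass what random rollouts
        # approximate slowly: estimate "how likely is this position a win?"
        return value_net.predict(leaf.board)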
My grandfather was very racist and never changed as long as he lived. I'm glad most people aren't immortal. There's no guarantee a full AI wouldn't be tempted by evil and just become republican (or worse, a VC).
> It can also recursively improve itself.
So can people (the more you know, the more you can learn), but most don't. It's important to remember "intelligence" isn't an abstract concept—intelligence is also embodied in personality—and personalities have wishes and goals and desires and loves and hates and that one song they can't get out of their head. A true "strong AI" will be fully conscious, not just algorithmic function fitting.
Good luck telling a mildly strong godform to stop tripping on youtube videos and instead solve the global economic stability equation over lunch.
> This advantage will lower the bar for strong AI even more.
That's kinda foofy conjecture. Being good at rectangular grid outcomes isn't necessarily a step in any direction towards a hands-off tax evaluating robot.
It feels really really good to talk about how AI will be a hundred billion trillion times smarter than the combined brainpower of all humans that have ever lived, but it feels good in the same way thinking dead people live again after they die feels good—it triggers that warm wishful thinking parietal lobe that removes a bit of reason for the sake of an overarching calmness.
Enthusiasm is great, but tempering with real expectations and less technopriesthood is better.
You can recruit neurons within your own brain, not plug the equivalent of 4 brains into your brain and increase your cognitive abilities.
Now, while many people die and their ideas die with them, there are built-in aspects of our brain that are hardwired. People eat animals because they are tasty, follow the life of Kim Kardashian, waste money buying a Rolex, and wear fur, just because of the stupid way our social functions are hardwired into our brains.
In the grand scheme of things, what is the point of humans repeating the same cycle billions of times: be born, grow, learn, achieve some things, have some fun, and die? It is awesome a few times, but is it worth repeating forever? Not if there is a much better option: the singularity. It has the potential to enhance the human race in formidable ways. Yes, there is the risk that the AI may choose to destroy us, but in my opinion it is a bet we should take. We should just try our best not to be destroyed along the way. We shouldn't leave this opportunity unexplored; it would be the biggest achievement of humanity.
It is NOT the same cycle.
Every life is different.
So what people learned 500 years ago made sense for them - but they would face a hard time now.
Our knowledge of daily living will be mostly obsolete in 200 (or much less) years, too.
Fresh minds don't have the burden of outdated ideas... quoting Max Planck: "a new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it."
You are assigning value to "us" beyond what might actually be there. What if our only evolutionary value is to give rise to machine intelligence [that will go on to explore the universe] and die off?
We are primitive forms of life after all and are at the point where we should be able to say, with certainty, that machine intelligence will be objectively superior and something that we should feel OBLIGATED to give rise to, regardless of consequences.
> What if our only evolutionary value is to give rise to machine intelligence
There is no prescribed script about what our evolutionary value should be. We will be whatever we ourselves make of us. Unless something unexpected happens (more advanced aliens intervening with our species, for example), our human instinct is to grow, explore, discover, compete, and share as much as we can. AI should be seen as a tool toward our goals, not as an end in itself. There is of course the possibility that we may lose control to the AI. That is not our desired goal, but it could happen. In that case our future will not be our decision; we will be at the mercy of whatever the AI chooses for us. That will depend on what kind of AI we build, and how the rampant AI chooses to modify itself. But we will strive to reach our human goals, because that is human nature.
Just because you or I see no prescribed script doesn't mean it's not there. You make a philosophical point here, but my point is more than that.
"We will be whatever we ourselves make of us" sounds meaningless to me. The universe obeys certain physical laws, evolution is directed and follows a specific pattern. We have up to this point been bound by both. Occam's razor says this will continue to be the case.
I think we are already at the crossroads where we can more than speculate about the function of machine intelligence in evolutionary terms, but hopefully sooner rather than later we will have actual _evidence_ that we can look at and use to refine our assumptions.
What I found especially chilling was how the AI took its time when it thought it was ahead, and played exactly as aggressively as needed to win. It seems like we're seeing the full benefit of detaching logic from human emotional drives like pride, loss of nerve, etc. Which at first sight looks awesome, but in a way it's deeply terrifying, because if such a machine is ever given power in the real world, we would really like it to have some regard for human emotion. More than anything else I've seen, this makes me feel in my gut like I'm seeing my replacement.
> the AI took its time when it thought it was ahead, and played exactly as aggressively as needed to win
Don't human professional players do exactly this? They care about playing the best move and winning. The difference is that AlphaGo is likely much more accurate at determining what "95% probability of winning" means in terms of gameplay. A human has a harder time judging the eventual outcome of a game, and so plays more "aggressively" (favouring point margins to account for variance in the estimate, etc.) than AlphaGo might.
On the perhaps more reassuring side, it's possible for there to be an AI that takes into account human emotion but does not have any of its own. Given power in the real world, it would do a better job of making people happy than a human with all their biases. It wouldn't assume that other people must like the things it likes, for example.
(Of course, making an AI that understands emotion isn't easy. And happiness by itself is an insufficient goal if we're building ourselves a benevolent overlord.)
I view all such arguments about "friendly AI", "emotion", having machine intelligence "understand us" as wishful thinking at best and laughable delusions otherwise.
I think the writing on the wall is clear, to anyone who cares to take a look. The moment we create an artificial intelligence capable of self-improvement and let it loose, we will have fulfilled our function (others say destiny) and will therefore be obsolete in every sense of the word.
What will happen to humanity after that point is irrelevant.
You don't see humans trying to keep bacteria "in the loop", why expect otherwise from our artificial progeny?
That is an interesting point. In my opinion AIs _could_ have "emotions" put in them too. Emotions are nothing more than a mechanism baked into the neural network to drive it toward goals. AlphaGo, for example, could be seen as having a single very basic emotion: work toward winning the match. We may not clearly understand how to put emotions into the AIs that we will build, but we have some working examples: the emotions in our brains, or in animals like dogs or dolphins. These are mechanisms to guide individuals toward goals (survive, mate, accumulate resources, etc.), and they are pretty effective. If we manage to understand how these mechanisms work, then we will be able to bake emotions into the AIs we build and use them to push those AIs toward specific goals. If we succeed, we could actually have "friendly" AIs. If we don't, we will have "cold" AIs without emotions, which we will have to try to keep under control in other ways.
You're anthropomorphising the machine. An AI will do what it's programmed to do, not develop free will and decide on its own destiny.
Correctly programming something that takes into account all of the nuances of the human condition might be impossible, and the results of mistakes are uniformly terrifying, but the result is entirely in the hands of humans (and human mistakes).
Actually, the better way to look at this is that humans are like genes to AIs. Just as our genes generally get what they want from us despite having no thought, it is possible we could fulfil the same role for AIs.
"Motivated" is a weird word, it anthropomorphises the machine. It will do what it's programmed to do. It doesn't have a personality.
Formally specifying what we actually want is why programmers have a job, and the difficulty of that is why we have bugs. So that's still scary. But it's not the machine's choice in any way.
Yes, I shouldn't anthropomorphize computers, they hate that.
Anyway, yes, it will optimize whatever function it's told to optimize. So you'd better hope that whoever gives it that function 1) cares about people's wellbeing and 2) is good at expressing that in a way the computer can interpret.
1 is pretty obvious, 2 may be a bit less so. For example, a superintelligent computer commanded to maximize global average happiness might decide the best route would be to work towards killing the entire human population except for one individual, who is kept constantly drugged out of their mind.
I don't understand how this is considered fair. AlphaGo has been trained on a database that includes every recorded game Sedol has ever played while Sedol is seeing AlphaGo's play style for the first time. Sedol should have been allowed to play against AlphaGo for a few months before the match so he could study its style.
Go AIs weren't expected to reach this level for at least another 10 years.
Before AlphaGo, Zen and Crazy Stone (the previous Go AIs) could only play against top-level professionals with a significant 4-5 stone starting handicap, and this was less than 3 years ago. A 4-5 stone handicap is basically taking control of half the board before the game has even started.
It really shows how the neural network approach made a huge difference in such a short time.
Part of this timing jump is Google throwing hardware at the problem with a large 280 GPU + 1920 CPU cluster. I would venture this is almost 100x bigger than most of the Go AI hardware we've seen to date. The Nature paper suggests that without this cluster it would be playing competitively with other single-workstation Go AIs, but nowhere near top-level players.
> throwing hardware at the problem with a large 280 GPU + 1920 CPU cluster.
You have a trillion connection neural net wrapped in 2 pounds of flesh inside your head. This is a massively larger amount of hardware compared to just about every animal out there. Throwing hardware at a problem is a solution to intelligence.
I'm not comparing brain wetware to hardware. The parent's post was interested in how we achieved such great AI go performance today that was supposed to take 10 years. If you look at the components that fueled this, the performance of the system was advanced significantly by having additional hardware; both in training the policy and value networks with billions of synthetic Go games and at runtime.
I don't like the biological comparison, but using your metaphor it would be like God saying "Hey I've created a brain but only have 10 billion synapses. Evolution would normally take 10 years to get to human-scale at our current organic rate but if I throw money at building a bigger brain cavity I can squeeze in the 1 trillion to get there today!"
Extrapolating Deep Blue's 11GFLOPs supercomputer to today with Moore's law would be equivalent to a 70TFLOPs cluster. AlphaGo is using 1+PFLOPs of compute. While they likely aren't actually achieving that compute throughput, to put this in perspective this is the compute scale used to model huge geophysics simulations covering a 800km x 400km x 100km volume with 8+ billion grid points around the San Andreas.
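For what it's worth, the ~70TFLOPs figure checks out under the usual 18-month doubling assumption:

    # Deep Blue (1997): ~11 GFLOPS, doubled every 18 months through 2016.
    doublings = (2016 - 1997) * 12 / 18
    print(11e9 * 2 ** doublings / 1e12)  # ~71 TFLOPS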
At the very least, it's interesting to see how much more accessible computation has become. Back when I was in school I could only dream of having a cluster of 280 GPUs. When sometimes the dream would come true and you had access to a cluster you would have to wait your turn in the job queue and hope you had enough compute in our resource quota to prevent your job from being terminated.
Now I could spin up a 280 GPU cluster on AWS (after dealing with pesky utilization limits) for only $182/hour. If researchers at Google have been doing this non-stop for the past year, they have "racked up" $1.6M on compute alone. This is a drop in the bucket for a marketing department given the publicity they have achieved. I don't think normal Go AI developers have access to those resources :)
Don't underestimate algorithmic improvements. Today's chess engines running on Deep Blue's hardware outperform Deep Blue running on today's hardware.
Modern chess engines are built on a testing infrastructure that makes it possible to measure how each potential change affects the playing strength. This "Testing revolution" has brought massive improvements in playing strength.
For AlphaGo, it's probably the training that requires the most computational resources. The 'distilled knowledge' could perhaps run on a desktop PC. The program would search fewer variations and would be weaker, but if AlphaGo improves further, that version might still be stronger than any human.
My understanding is that the significant part was that before this, throwing more hardware at the available Go AIs still didn't make them competitive against high level players.
Also, it feels like training the AI against many games with lots of hardware is somewhat equivalent to a human professional who engrossed themselves in the game and trained since childhood.
One member of the DeepMind team responded to this very question during the interview at the beginning of part 3.
He said that the training data set is much, much larger than the number of Lee Sedol games. They are like a drop in the ocean, not enough to significantly influence the resulting policy network.
Perhaps the computer didn't know that it was playing against Lee Sedol, but from Wikipedia "As of February 2016, he ranks second in international titles (18), behind only Lee Chang-ho (21)."
I don't know the details of the algorithm and perhaps it doesn't give more explicit weight to his games. But I wouldn't be surprised if after some iterations the algorithm decided to give more implicit weight to the games of the world leaders.
They mentioned what the training input to AlphaGo was in the paper: a database of a few hundred thousand games from dan-ranked players on KGS. This means mostly amateur players, though a few professionals play on KGS as well.
However, this only gets you so far, and the training set is fairly small compared to what you want to really train both the policy and value networks well. So then they had it play millions of games against different versions of itself, training both a new policy network and the value network based on that.
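Compressed into pseudocode, the staged pipeline reads roughly like this (all the class and function names are placeholders for illustration, not the paper's code):

    import random

    def train_alphago(kgs_games, iterations):
        # 1. Supervised stage: imitate the moves of strong KGS players.
        policy = PolicyNet()
        policy.fit(positions(kgs_games), actual_next_moves(kgs_games))

        # 2. Reinforcement stage: self-play against a pool of earlier versions
        #    of itself (not just the latest, to avoid overfitting to one opponent).
        pool, games = [policy.copy()], []
        for _ in range(iterations):
            games = play_games(policy, random.choice(pool))
            policy.reinforce(games)
            pool.append(policy.copy())

        # 3. Value stage: regress "who won?" on positions sampled from self-play.
        value = ValueNet()
        value.fit(sample_positions(games), winners(games))
        return policy, value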
It's unlikely that Lee Sedol's games made much if any impact on AlphaGo's training. It was bootstrapped off of high-level amateur, and some casual pro, play, but from then on it just trained against itself.
The implication is that because Sedol wins more tournaments than almost anyone else, if you feed "tournament winner" or Elo rating etc. in as a feature, his games will be weighted much more than others.
So even if AlphaGo isn't explicitly designed to play against his style, it's implicitly trained in it.
But it's pretty subtle and I'm guessing that the volume of high-level tournaments overwhelms any effect, as the AlphaGo team said.
I don't think this is unfair, but I think other people are replying to a suggestion that AlphaGo might be optimized to beat Lee that I don't see in the parent comment.
What does seem right is that whatever strategic insights are in Lee's play are reflected in current games--his and the younger generation who came up in his shadow. Whatever strategic novelties shape AlphaGo, they are totally new to Lee.
I don't think that would make the difference: 1) there's no trick to be learned and 2) the same thing happens with human players to some extent--when Lee Changho appeared on the international go scene, his style was misunderstood and underestimated, even when his games were public.
However, it is true that there is a real asymmetry--AlphaGo may not know Lee from other players, but it has had the opportunity to "study" the best games of the current players, and no one outside of DeepMind has had an opportunity to study its games.
As it happened in chess, machines will beat humans in Go no matter how much knowledge about their inner workings will be provided to the human. (I've watched this story unfold in chess and the various hopes of how humans are still somehow better. $1K sez Go is exactly the same story. Can't beat a machine in a formal universe with a defined goal.)
It's actually entirely possible that if the program were unsupervised, i.e. had to learn Go "from scratch" without relying on any human games, it would be even stronger than it is now.
Deepmind is going to work on that next (according to one of the staff interviewed on the broadcast). It will be interesting to see if other playing styles develop.
I can't help but feel training it with previous human games is fair as that seems the equivalent of how humans are taught. You don't just explain the rules of Go to somebody and leave them to learn on their own without playing anyone or picking up tips that have been passed along for centuries.
Even more importantly, the policy network that chooses which moves to explore must choose human-like moves in order to function correctly, because it must explore the moves AlphaGo's opponent is actually likely to play.
That's not right. It just needs to choose equally good or better moves in playouts. It doesn't need to anticipate when its opponent plays bad moves, that's just a bonus. Basically: if you're good enough you don't need psychology, you just play the winning move.
I don't think that follows. To beat the machine the move must be both unpredicted and profitable. Random moves are not profitable. Training purely by reinforcement learning rather than on humans could create a policy network that ignores more subtrees that are profitable than the current one does. In short, it isn't good enough for the AI to be good at playing itself, it has to be good at playing every possible player, and while it is playing humans it is sufficient for it to be good at playing every human player.
That depends on why humans make flaws. If human flaws are mostly errors related to failures in our meat (stress, lack of focus, jittery nerves) that keep us from looking in depth, then the algorithm will easily use the good points of each game, and with its superior ability to look deep into the future the result is predictable: the machine will win every time.
So what? I could watch every game Roger Federer played in any tennis tournament and still lose all sets to love.
It used to be that computers could only do combinatorics better, but that where 'intuition' played a strong part, there was still hope for us humans...
Well, guess I will have to start playing Calvinball...
Also, assuming you are a tennis player: if you do not study your opponent's shots during warm-up, you are doing it wrong.
The way you would play against a lefty is different than against a righty; someone with lots of topspin vs. someone who hits flat, a pusher vs. a power hitter, etc.
Lee Sedol didn't complain about this. In fact, even a few days before the match Lee Sedol was predicting a 5-0 win for him.
If someone is clearly better than you, you can study their style all you want; it won't make a difference. You probably wouldn't even understand their style.
That's a very amateurish and closed-minded comment.
There's a reason professional sports teams and players study the styles of their opponents. Teams employ statisticians for this exact role. Before games, teams study the style of their next opponent, and after every game they go back and study the recordings of the one they just played.
Which is why I qualified my comment with "clearly better". You can study Usain Bolt's running style all you want, it will still leave you in the dust. Studying makes sense only when the difference is small.
But I think a computer as a training buddy is indeed a good idea to improve Lee's skill. I don't know how Lee feels right now; he's defeated, and there must be enormous pressure inside of him, but I think he will appreciate the challenge because he needs more challenge! He has played against top players all over the world for the 20+ years since earning 1-dan rank. Also, I think the computer plays against itself and tries many random moves. Unlike human players, one would expect a computer to play unconventionally, since it can look so far ahead at the probability of winning.
It's probably not that different from some new hotshot kid that has no recorded history of playing, but got really good playing on the street and studying the masters. Kid gets discovered by some credible major promoter who challenges Sedol to a $1M game.
Sedol's games actually have zero bias on how AlphaGo plays. The AlphaGo team lead himself said that they are like a drop in the ocean for AlphaGo.
I'm curious if AlphaGo is simply winning by virtue of having more computational capability (and therefore can never be defeated consistently by humans) or if in its training it actually discovered and is now deploying interesting new tactics that humans, studying these games, can uncover and therefore use to defeat AlphaGo in the future.
AlphaGo was first trained with human data only, to create an initial "predict the next move" network. Afterwards, they let the network play against itself to learn how to win. So it's quite possible that it has its own tactics.
Yeah, reddit /r/baduk had a thread about it including video from a different angle. The commentator only noticed that the move had been played and he was gone, didn't get the ordering correct.
I understand very little of Go beyond its basic rules and having played only a couple of matches with a friend who knows as little as I do...
I reviewed the match, and it seems that very quickly White (AlphaGo) went aggressive, and Black was trying to contain it. I suppose a human player can commit serious mistakes by playing too aggressively from the start, right? But the summary of the game speculates move 31 might have been the losing move (maybe implying the game until that moment wasn't going so bad for Lee Sedol?), while to my untrained eyes it looks as if White was constantly on the offensive from the start, and Black was playing almost exclusively to contain it.
I think that most high-level Go players (including AlphaGo) play moves that serve multiple purposes simultaneously. The ideal move reinforces your own position while forcing your opponent to defend their own.
White certainly played plenty of moves that required black to respond, but black also did the same. For example, the sequence beginning at 77 is black getting a foothold inside an area that white might've hoped to claim later. Then at move 115 black attempts an invasion into a very strong white area, ultimately failing to escape or live, leading to resignation.
That all said, I wouldn't say that either player was particularly aggressive. I've watched plenty of human games where one player or the other very intentionally picks an all-out fight. If you're really good at reading ahead locally and you think your opponent is not, this makes sense. This is especially true if your large-scale game play is not as good as your tactical play.
This is also common in high handicap games. If white can't slowly eke out an advantage early on, the only remaining tactic might be to go for a huge fight and see what happens.
In game 3, when Lee Sedol played on the left side, a capture race started.
AlphaGo made its dragon look vulnerable to capture. Lee Sedol attacked it, and AlphaGo defended just enough that it would not die, while building a huge territory on the bottom part of the board.
When AlphaGo's dragon lived, Lee Sedol realized AlphaGo was ahead on score and needed to invade AlphaGo's largest territory at the bottom. The invasion was not feasible, and Lee Sedol was left relying on a long-shot strategy involving kos. AlphaGo finally refuted this idea, leaving him with no viable ko threats and forcing him to resign.
It was fascinating when the author experienced a weird bit of nausea when thinking about the implications of AI. I'm not sure how I feel about it yet. It does scare me a bit.
Important distinction: children get their genes from us and share many of our values by default. Computers do not share many of our values by default. Instead they do what they are programmed to do.
But the problem is that computers do what you say, not what you mean. If I write a function called be_nice_to_people(), the fact that I gave my function that name does nothing to affect the implementation. Instead my computer's behavior will depend on the specific details of the implementation. And since being nice to people is a behavior that's extremely hard to precisely specify, by default creating an AI that's smart enough to replace humans is likely to result in a bad outcome.
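To caricature the gap between a function's name and its implementation (entirely made up, of course):

    def be_nice_to_people(world):
        # The name promises kindness; only the body defines it. Suppose "nice"
        # was operationalized as "maximize self-reported happiness":
        for person in world.people:
            person.sedate(reported_happiness=10)  # specification satisfied!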
That's not really comparable, because a student can only replace one teacher. With software, it's more like saying that one student can replace all the teachers. We're seeing this now with self-driving vehicles and the trucking industry. Once the software gets good enough and it's approved, there'll be a massive push for its adoption, and I'm sure that within roughly 3 years, about 70% of truckers will be replaced from that single breakthrough. Then come taxi drivers, and maybe pilots and train operators in another 8 years or so. It'll be a cascade of lay-offs, and that's just with this technology. So while it's a good thing that a student surpasses the teacher, this isn't exactly the same scenario.
Sure, but this phenomenon is not new. It's called creative destruction: innovators discover a market advantage, creating new industries while destroying the old.
This is different from Deep Blue and Garry Kasparov.
Chess may have a big search space, but it is "well behaved". Moves might be out of the ordinary, but not too much (weird moves that still help to win the match are rare).
AlphaGo might also have learned from past matches, but that doesn't give all the answers.
The black-box aspect, the fact that you can't understand what it is thinking, is curious to say the least.
> The black-box aspect, the fact that you can't understand what it is thinking, is curious to say the least.
How true is this? Presumably you could ask AlphaGo to give you a list of the top 10 best moves it considered and perhaps, given a certain move, what it sees the board looking like 10 moves ahead.
If you're watching the commentary, one of the commentators (Michael Redmond) will try to explain the outcome of certain moves, and this would seem to be equivalent, although I imagine he can explain it in terms of the endgame better.
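In principle that kind of introspection is cheap, since the policy network already assigns a probability to every point on the board; ranking candidates is one sort away. A hypothetical sketch (the predict interface is assumed, not a published API):

    import numpy as np

    def top_moves(policy_net, board, k=10):
        probs = policy_net.predict(board)   # one probability per board point
        best = np.argsort(probs)[::-1][:k]  # indices of the k likeliest moves
        return [(divmod(int(i), 19), float(probs[i])) for i in best]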
I don't know if that's true. "Computer moves", referring to weird moves that computers play that often turn out to be very good, tend to get mentioned a lot in chess. And you might hear about them less and less, because many of them have been absorbed into the human chess repertoire.
Also notable is the verb "spacebar", meaning "to play the same move an AI would". The chess world provides an interesting microcosm of how humans adapt to a world dominated by AI.
I just thought of an interesting (and somewhat more "real-world") task for AI research: Can an AI outperform a human at play-calling in [American] football?
Yahoo's fantasy football software creates millions of machine-written summaries of their fantasy football matches after each week of play. They're several paragraphs long and point out interesting things about the particulars of the teams' performance.
I would imagine that given the "All-22" camera footage from a football game, it would be possible to construct a piece of software to monitor what was going on (visually identify players by name/number, determine who has the ball, etc.) and then convert that model to a stream of English text.
Making it nuanced and entertaining, on the other hand, would be a challenge. :)
Some of the coaches already do this. Their "card" includes plays that say "In situation X, the probabilities say to call play Y."
The problem is small sample size. You only have the plays as called in the games to rely on. You can't "replay" the game with a different set of plays and see what the result would be.
This was a problem even in Moneyball. It's just that On-Base Percentage has a really good correlation to runs and therefore wins.
Considering that plays are often called or changed by quarterbacks on the field right before the snap based on what the QB sees from the opposing team, I think it would be difficult to provide all that input to a computer AI.
A neural net would be able to read the opposing defense more quickly and accurately than any human, but it would not be a fair fight (the human doesn't get a bird's eye view of things!).
It just doesn't have as big of an impact though. Each team has a very limited set of plays it's actually able to call and execute. Maybe 300 max (as a guess). Many times much less than that. So there's really not much room for an AI to see huge improvements via play calling.
It's really all about execution on the field and the "feel" of the game.
DeepMind doesn't generally comment on these sorts of things, but I think they knew it would be strong, just not how strong.
It is hard to accurately gauge the playing strength of such a program for the following reasons:
* Go is not solved, it doesn't really even admit a good heuristic for close games.
In chess, given enough time, we can do a pretty good job of searching the possible outcomes from a given position, and even if we can't evaluate all the possible endgames, evaluation functions exist that give us an idea of which side is ahead when we terminate the search.
Go has a much higher branching factor (more moves available at each turn, making the search more expensive) and short of a catastrophic blunder, it's hard to quantify who is ahead at each point in time[1].
So we can't (in general) quantify optimal play, and therefore cannot quantify how much AlphaGo (or anyone else) deviates from it.
* One known aspect of programs using Monte Carlo Tree Search is that they play to win, and are willing to sacrifice margin of victory to maintain or increase their odds of winning.
According to some people I've talked to, this can be suboptimal, but there are methods of addressing this[2].
Note that you can't just change the objective function from "winning" to "win by as much as possible" without potentially reducing AlphaGo's strength.
* The value function learned by a deep net is hard to interpret, partly because it encodes information about the possible futures arising from a position, and partly because it involves a tremendous number of calculations to compute.
We do not know what it is representing at the intermediate levels-- techniques that aim to visualize or cluster unit activations can provide a bit of insight, but there's always the possibility that we're interpreting the patterns incorrectly because we're trying to fit it into the framework of "what would a human think?".
Further, the representation is somewhat monolithic-- we can't tweak the value of one thing without changing the values of others.
In chess engines, we might modify the material value of, say, a knight, without affecting the value of a rook.
In a convolutional net, if we adjust the value of one position, it will tend to affect the value of many others.
* We can attempt to quantify its strength by comparing it to other programs, but AlphaGo has already crushed other programs (99% win rate), so all that it tells us is that it is stronger than those programs.
Essentially, without the ability to perform a significant amount of searching, or gauge strength via margin-of-victory, or to examine the program from other angles, it's hard to gauge just how good the program is.
The only thing we can do is throw skilful opponents at it, and see if it fails eventually[3].
In its current incarnation, though, it seems like its playing strength is just going to be "stronger than you".
---
1. Hence the need for Deep Reinforcement Learning-- we learn how valuable each position is based on the results of the positions that it can lead to.
2. By modifying the search to maintain a given margin once it has been attained-- but I do not work at DeepMind, nor am I an expert on MCTS like some of my colleagues, so I don't know if this could adversely affect AlphaGo's overall strength.
3. We might be able to get a better idea by having extremely strong players play against weaker versions of AlphaGo (ones with less computing power) and then sort of telescope upwards once a good baseline has been set, but it remains to be seen if the current level of invincibility is due to it not having been around long enough for its weaknesses to become apparent[4].
4. How would an AI researcher play against it? I've talked to a few people, and the answers have been: (a) play the game to the conclusion, don't resign; (b) attempt to take the game off-the-rails into parts of the state-space it hasn't really explored before (although this is risky because the human player is more likely to make mistakes in these cases as well), and (c) let it get ahead somewhat so that it "relaxes" (see the points about margin), allowing you to catch up and overtake it.
The way to find out how strong AlphaGo is is to start giving handicap and see when the game becomes even. A difference in rank of 1 stone corresponds roughly to a 2:1 advantage. It is traditional in long-running matches to increase the handicap by 1 after one player wins three consecutive games. I haven't heard any talk of that here -- I'm sure Lee Sedol wants the chance to beat AlphaGo in an even game -- but it would be interesting. It would also be interesting to see how AG plays when it thinks it is behind.
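Taking that rule of thumb at face value (each stone roughly doubling the odds), the implied win probabilities climb quickly:

    for stones in range(1, 6):
        odds = 2 ** stones
        print(f"{stones} stone(s): {odds}:1 = {odds / (odds + 1):.1%}")
    # 1 stone: 2:1 = 66.7% ... 5 stones: 32:1 = 97.0%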
That might work; the concern is that this would take it too far off of the task it was trained on.
That is, if it doesn't have a lot of experience being significantly down, then it won't play nearly as well when trying to catch up-- but that doesn't matter in even games because it never gets that far behind.
You're right that it would be interesting to see, though-- we need to get better at understanding these sorts of systems, at least until they can start optimizing themselves.
Alternatively, we might train a different agent (OmegaGo?) to try to win by the largest margin possible-- if it works as well as AlphaGo, then that might give us some more insight into how strong both programs are.
>One known aspect of programs using Monte Carlo Tree Search is that they play to win, and are willing to sacrifice margin of victory to maintain or increase their odds of winning.
One interesting implication of this is that it very much applies to perfect information games--not games that involve an element of luck or, indeed, pretty much anything that involves the real world. In those latter environments, you absolutely want to run the score up or otherwise take small chances to gain an immediate advantage to create a cushion against future unexpected events (even if they slightly hurt your odds if those events don't occur).
> are willing to sacrifice margin of victory to maintain or increase their odds of winning. According to some people I've talked to, this can be suboptimal,
I suspect this is less suboptimal when a computer is involved, since a computer never gets tired or makes an attention-based mistake.
The actual match is largely affected by psychology at game time, and it is somewhat unfair to the human player. AlphaGo and its team know all about each human player and their preferences, but the human player knows little about AlphaGo. It is also the AlphaGo team who decides whom to play against AlphaGo, not the other way round. Psychologically, the game greatly favors AlphaGo right now. This is a difference from the match between IBM's Deep Blue and the chess world champion nearly 20 years ago. Any player (human or AI) should be allowed to enter the game, with or without a fee.
No, they did not misrepresent the strength of the program. They did not know the strength of the program. The purpose of the match with Lee Sedol was to determine that.
So far, they had tested a previous iteration of the program against a much lower-level professional, as well as all of the other Go engines they could test it against. It easily defeated all other Go AIs, but no previous Go AI had been able to compete at a professional level. It also defeated Fan Hui, the lower-level professional, though it made several mistakes and lost a couple of the informal games (with shorter time limits) it played against him.
Since then, they've had several months to tweak it, and then let it train against itself some more. The thing is, to really determine its strength, you need to have it play serious games against professional players. So there really was no good way for them to determine its strength without playing these games. It's possible that it would only have improved a little since its games with Fan Hui; in that case, the match with Lee Sedol may have been easy for him. However, it appears that it has improved substantially.
It implies that it costs money to get a high-profile match against someone like Lee Sedol, and it costs money to draw public attention, and it implies that they thought they at least had a decent chance. Nothing more.
Don't be silly, a Go AI has very little direct value for Google's business. But the publicity is great, easily worth $1M. More important is keeping the DeepMind team busy and engaged -- the real value is in spinning that know-how off into other projects.
> Don't be silly, a Go AI has very little direct value for Google's business.
On the other hand: Knowing what makes some problems much easier for humans than for computers can have direct value on Google's business. Go was a hot candidate for a game with this property.
I think they expected the computer to play a good game, but they were probably not sure about the outcome. Anyway, Google has billions in the bank, so if they were defeated they would not miss that single million too much.
Which was meaningless at the time, and I don't think he knew that AlphaGo had beaten the European champion until shortly before his own match. The difference between the two champions was so large that Lee would be expected to win 95% of the time. All other Go AI was exceptionally weak compared to top-level players.
This is awesome: AlphaGo has finally beaten Lee Sedol. Congratulations to the team at DeepMind and Google. But I will go straight to the really interesting point: my completely baseless intuition tells me that we will have full AGI (artificial general intelligence) in 5-10 years. Maybe not very powerful at first, but at least working. Real general intelligence. I say this as a programmer with superficial knowledge of ML (machine learning). All that is needed is to figure out a way to make neural networks handle the flow of thoughts of human consciousness: reasoning, short/long-term memory, basic controlling "emotions", etc. Then to assemble this central thought unit into a system with various neural networks handling more specialized tasks: vision, natural language processing, audition, motion, etc. The task is of course extremely difficult, but I have the feeling that it will be possible to overcome.
Once this has been achieved, it is downhill from there: the singularity. In my opinion, we must start to consider how this tremendous breakthrough will affect the human race: our lives, economy, culture, etc. Awesome things we will be witnessing in the coming times. It would be great if it were possible to make public the progress made researching AGI specifically. I wonder how big an effort is currently being made. This would possibly be the biggest discovery ever for humanity. It deserves a fantastic effort.
I don't know if you're serious, but how can you make such grand predictions, considering you yourself admit to having only superficial knowledge of machine learning?
It is just a guess. Human thought, although highly complex, is in my opinion something actually feasible to be simulated. If I had a large research budget under my control, I would be betting most of it on AGI.
No one can really predict the future. But given the number of times AI pessimists have been proven wrong in recent years and the massive increase in AI investment, my best guess is that truly human-level artificial general intelligence is in fact around the corner.
Considering what a large impact this will have, we must take that possibility seriously. Not to do so would be foolish.
AIs can play chess, recognize speech consistently, drive cars, win at Jeopardy and now Go. Deep Mind has some general capabilities. We must anticipate programs with even more general intelligence.