If I read correctly, the agent only controls one player at a time: on offense it controls the player with the ball, and on defense it presumably controls the player closest to the ball. The other players are controlled by the built-in AI. Controlling a single player takes away some of the appeal of deep RL: that entire teams can learn to coordinate in novel and optimal ways.
> Modeled after popular football video games, the Football Environment provides a physics based 3D football simulation where agents control either one or all football players on their team, learn how to pass between them, and manage to overcome their opponent’s defense in order to score goals.
It looks like the AI agent can control one or all of the players on their team.
> by default, our non-active players are also controlled by another rule-based bot. In this case, the behavior is simple and corresponds to reasonable football actions and strategies, such as running towards the ball when we are not in possession, or move forward together with our active player. In particular, this type of behavior can be turned off for future research on cooperative multi-agents if desired.
I guess I don't get it... What does this game have that SC2/Dota doesn't?
As far as I can tell, the main goal for reinforcement learning research is to cut sample complexity so an agent doesn't need 10k training sessions to learn what a human can learn in a single session, and to make self-training without hand-crafted guiding scenarios feasible.
Should be much cheaper to run despite being a physics simulation:
"The Football Engine is written in highly optimized C++ code, allowing it to be run on off-the-shelf machines, both with GPU and without GPU-based rendering enabled. This allows it to reach a performance of approximately 25 million steps per day on a single hexa-core machine."
Also has some features geared towards generalization, and benchmarking:
"With the Football Benchmarks, we propose a set of benchmark problems for RL research based on the Football Engine. The goal in these benchmarks is to play a “standard” game of football against a fixed rule-based opponent that was hand-engineered for this purpose. We provide three versions: the Football Easy Benchmark, the Football Medium Benchmark, and the Football Hard Benchmark, which only differ in the strength of the opponent. "
"As training agents for the full Football Benchmarks can be challenging, we also provide Football Academy, a diverse set of scenarios of varying difficulty. This allows researchers to get the ball rolling on new research ideas, allows testing of high-level concepts (such as passing), and provides a foundation to investigate curriculum learning research ideas, where agents learn from progressively harder scenarios."
So, as compared to SC2/Dota, you can train much faster, with good baselines to benchmark against or compete with, and built-in support for curriculum learning. SC2/Dota weren't designed for RL, so they're just harder to work with in RL - SC2 has a curriculum which was added on afterwards (the minigames), for example, but Dota 2 still doesn't.
Soccer is also more popular than either SC2 or Dota 2, so that may be a draw for researchers. It's more interesting to work on something you like and already know about.
Dunno if all of that together is enough to make it a worthwhile testbed, but it's not obviously worthless or redundant.
> This allows it to reach a performance of approximately 25 million steps per day on a single hexa-core machine.
That works out to roughly 290 steps per second overall, i.e. about 24 steps per second per hyper-thread (or 48 per core).
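Back-of-the-envelope check, assuming a hexa-core box with 12 hyper-threads:

```python
# Throughput arithmetic for the quoted 25M steps/day figure.
steps_per_day = 25_000_000
steps_per_sec = steps_per_day / (24 * 60 * 60)
print(steps_per_sec)        # ~289 steps/s overall
print(steps_per_sec / 12)   # ~24 per hyper-thread
print(steps_per_sec / 6)    # ~48 per core
```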
This doesn't seem that impressive: much more complex games run at that frame rate, and FIFA games from the '90s don't look much worse yet certainly achieved those frame rates on much older hardware.
That's not the point. Yes it superficially appears to be a simpler and less polished game than SC2/Dota, but the point is that it's a different kind of learning environment. It has different actions (things the AI can do, how it controls player movements), different observations (what the AI perceives/sees is structured differently), different rules/physics (this has a ball you can run to and kick).
In other words, you can perform different kinds of experiments and learn different things by studying this environment. The kind of AI that would excel at this game could have a different architecture and be trained differently than what you need to be effective at SC2/Dota. Just as you would learn different things designing a Quake bot using deep learning (your Quake bot would need to learn to navigate and map out 3D space, for one).
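To make the "different actions / different observations" point concrete: as far as I can tell from the repo, the environment exposes a small discrete action set (movement directions, pass, shot, sprint, etc.) and lets you choose between pixel and compact state-vector observations. A hedged sketch, with names again from memory rather than the docs:

```python
# Inspect the action/observation spaces; the exact sizes are assumptions
# based on the paper's description (19 discrete actions, 115-float state vector).
import gfootball.env as football_env

env = football_env.create_environment(
    env_name="11_vs_11_easy_stochastic",   # full-game benchmark scenario
    representation="simple115",
)
print(env.action_space)        # expected: Discrete(19)
print(env.observation_space)   # expected: Box(...) of shape (115,)
```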
> I guess I don't get it... What does this game have that SC2/Dota doesn't?
For starters, it doesn't require you to buy it and it won't bring Activision trademark lawyers with copyright claims down on your YouTube channel if you record it.
I see this as somewhere between Go and SC2/Dota. The latter are incomplete-information games, whereas in a game of soccer you can see the entire field. Also, unlike SC2/Dota, most of the game is focused on a single point (the ball). You also only have a small number of units with very limited control, unlike Dota/SC2, which have hundreds of different characters, items, and combinations.
An ordinary grad student can't reasonably train an SC2 agent to do anything on their 1-GPU workstation.
That's the motive: an RL environment that's a little more interesting than MountainCar or Atari, that doesn't require a $500 software license like the MuJoCo-based physics sims, but that is simple enough to do something on a small machine and can scale up to a really interesting, big strategic game-playing challenge.
Perhaps this will be used in live sports in the future. Giving real time feedback to players for optimum positioning. Would be a cool test but I still prefer to watch sports played the ‘traditional’ way.
Tell me about this 'traditional' way you speak of. Professional sports have always been about competition/winning (within the rules).
If technology can give a team an advantage, it's only a natural progression!
> The Football Engine is written in highly optimized C++ code, allowing it to be run on off-the-shelf machines, both with GPU and without GPU-based rendering enabled. This allows it to reach a performance of approximately 25 million steps per day on a single hexa-core machine.
Well, to be fair, statically guaranteed memory safety is a completely separate category from potential memory safety of well-written code. That said, the meaningless calls for Rust at each possible opportunity are a bit unnecessary.