If I read correctly, the agent only controls one player at a time: on offense it controls the player with the ball, and on defense it presumably controls the player closest to the ball. The other players are controlled by the built-in AI. Controlling a single player takes away some of the appeal of deep RL: that entire teams can learn to coordinate in novel and optimal ways.
> Modeled after popular football video games, the Football Environment provides a physics based 3D football simulation where agents control either one or all football players on their team, learn how to pass between them, and manage to overcome their opponent’s defense in order to score goals.
It looks like the AI agent can control one or all of the players on their team.
> by default, our non-active players are also controlled by another rule-based bot. In this case, the behavior is simple and corresponds to reasonable football actions and strategies, such as running towards the ball when we are not in possession, or move forward together with our active player. In particular, this type of behavior can be turned off for future research on cooperative multi-agents if desired.
I guess I don't get it... What does this game have that SC2/Dota doesn't?
As far as I can tell, the main goal for reinforcement learning research is to cut sample complexity so an agent doesn't need 10k training sessions to learn what a human can learn in a single session, and to make self-training without hand-crafted guiding scenarios feasible.
Should be much cheaper to run despite being a physics simulation:
"The Football Engine is written in highly optimized C++ code, allowing it to be run on off-the-shelf machines, both with GPU and without GPU-based rendering enabled. This allows it to reach a performance of approximately 25 million steps per day on a single hexa-core machine."
Also has some features geared towards generalization, and benchmarking:
"With the Football Benchmarks, we propose a set of benchmark problems for RL research based on the Football Engine. The goal in these benchmarks is to play a “standard” game of football against a fixed rule-based opponent that was hand-engineered for this purpose. We provide three versions: the Football Easy Benchmark, the Football Medium Benchmark, and the Football Hard Benchmark, which only differ in the strength of the opponent. "
"As training agents for the full Football Benchmarks can be challenging, we also provide Football Academy, a diverse set of scenarios of varying difficulty. This allows researchers to get the ball rolling on new research ideas, allows testing of high-level concepts (such as passing), and provides a foundation to investigate curriculum learning research ideas, where agents learn from progressively harder scenarios."
So, as compared to SC2/Dota, you can train much faster, with good baselines to benchmark against or compete with, and built-in support for curriculum learning. SC2/Dota weren't designed for RL, so they're just harder to work with in RL - SC2 has a curriculum which was added on afterwards (the minigames), for example, but Dota 2 still doesn't.
Soccer is also more popular than either SC2 or Dota 2, so that may be a draw for researchers. It's more interesting to work on something you like and already know about.
Dunno if all of that together is enough to make it a worthwhile testbed, but it's not obviously worthless or redundant.
> This allows it to reach a performance of approximately 25 million steps per day on a single hexa-core machine.
That works out to roughly 290 steps per second overall, i.e. about 24 steps per second per hyper-thread (or 48 per core).
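Back-of-the-envelope check, assuming a hexa-core box with 12 hyper-threads:

```python
# Throughput arithmetic for the quoted 25M steps/day figure.
steps_per_day = 25_000_000
steps_per_sec = steps_per_day / (24 * 60 * 60)
print(steps_per_sec)        # ~289 steps/s overall
print(steps_per_sec / 12)   # ~24 per hyper-thread
print(steps_per_sec / 6)    # ~48 per core
```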
This doesn't seem that impressive: much more complex games run at that frame rate, and FIFA games from the '90s don't look much worse yet certainly achieved those frame rates on much older hardware.
That's not the point. Yes it superficially appears to be a simpler and less polished game than SC2/Dota, but the point is that it's a different kind of learning environment. It has different actions (things the AI can do, how it controls player movements), different observations (what the AI perceives/sees is structured differently), different rules/physics (this has a ball you can run to and kick).
In other words, you can perform different kinds of experiments and learn different things by studying this environment. The kind of AI that would excel at this game could have a different architecture and be trained differently than what you need to be effective at SC2/Dota. Just as you would learn different things designing a Quake bot using deep learning (your Quake bot would need to learn to navigate and map out 3D space, for one).
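To make the "different actions / different observations" point concrete: as far as I can tell from the repo, the environment exposes a small discrete action set (movement directions, pass, shot, sprint, etc.) and lets you choose between pixel and compact state-vector observations. A hedged sketch, with names again from memory rather than the docs:

```python
# Inspect the action/observation spaces; the exact sizes are assumptions
# based on the paper's description (19 discrete actions, 115-float state vector).
import gfootball.env as football_env

env = football_env.create_environment(
    env_name="11_vs_11_easy_stochastic",   # full-game benchmark scenario
    representation="simple115",
)
print(env.action_space)        # expected: Discrete(19)
print(env.observation_space)   # expected: Box(...) of shape (115,)
```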
> I guess I don't get it... What does this game have that SC2/Dota doesn't?
For starters, it doesn't require you to buy it and it won't bring Activision trademark lawyers with copyright claims down on your YouTube channel if you record it.
I see this as somewhere between Go and SC2/Dota. The latter are incomplete-information games, whereas in a game of soccer you can see the entire field. Also, unlike SC2/Dota, most of the game is focused on a single point (the ball). You also only have a small number of units with very limited control, unlike Dota/SC2, which have hundreds of different characters, items, and combinations.
An ordinary grad student can't reasonably train an SC2 agent to do anything on their 1-GPU workstation.
That's the motive: an RL environment that's a little more interesting than MountainCar or Atari, that doesn't require a $500 software license like the MuJoCo-based physics sims, but that is simple enough to do something on a small machine and can scale up to a really interesting, big strategic game-playing challenge.
Perhaps this will be used in live sports in the future. Giving real time feedback to players for optimum positioning. Would be a cool test but I still prefer to watch sports played the ‘traditional’ way.
Tell me about this 'traditional' way you speak of. Professional sports have always been about competition/winning (within the rules).
If technology can give a team an advantage, it's only a natural progression!
> The Football Engine is written in highly optimized C++ code, allowing it to be run on off-the-shelf machines, both with GPU and without GPU-based rendering enabled. This allows it to reach a performance of approximately 25 million steps per day on a single hexa-core machine.
Well, to be fair, statically guaranteed memory safety is a completely separate category from potential memory safety of well-written code. That said, the meaningless calls for Rust at each possible opportunity are a bit unnecessary.