I think neural networks will be part of the solution, but they are probably not ...

an_opabinia · on Feb 5, 2019

> where BAD achieves an average score of 24.174 points in the two-player setting, surpassing the best previously published results for learning agents by around 9 points and approaching the best known performance of 24.9 points for (cheating) open-hand gameplay

I didn't realize Hanabi is already that close to being solved.

My gut reaction is that the game is a lot simpler than it appears. I guess your simpler "matrix" game points to that: you already had an intuition for reducing Hanabi. Indeed, looking at the code you shared for the "matrix game," it would seem that Hanabi's problem is that, like Chess and Go, it resembles something that can be expressed literally in TensorFlow more than it resembles more sophisticated games.

jakobnicolaus · on Feb 5, 2019

The good news is that we have open-sourced the environment, so if you think it's easy, I would love to see a simple method that solves it.