
I think neural networks will be part of the solution, but they are probably not the entire answer. For an example of a method that combines deep RL with Bayesian reasoning, you can take a look at our recent paper (https://arxiv.org/abs/1811.01458). BAD achieves the best known scores for 2-player Hanabi in self-play.
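To make the "Bayesian reasoning" half concrete, here is a toy sketch of a belief update over a partner's hidden card after observing their action. This is not the paper's code: the function name `bayes_update`, the card labels, and the action probabilities are all made up for illustration, and a real agent would get the likelihoods from a learned policy rather than a hand-written table.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Posterior over hidden states given a prior and P(observed action | state).

    Both arguments are dicts keyed by hidden state (hypothetical helper).
    """
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed action has zero probability under the prior")
    return {s: p / total for s, p in unnormalized.items()}


# Assumed toy setup: the partner holds one of three cards, uniform prior.
prior = {
    "red1": Fraction(1, 3),
    "red2": Fraction(1, 3),
    "blue1": Fraction(1, 3),
}

# Assumed policy: probability the partner takes the observed action
# when holding each card. These numbers are invented.
likelihood = {
    "red1": Fraction(9, 10),
    "red2": Fraction(1, 10),
    "blue1": Fraction(0),
}

posterior = bayes_update(prior, likelihood)
```

With these made-up numbers, seeing the action moves almost all of the belief onto "red1" and rules out "blue1" entirely, which is the sense in which an action can carry information about a hidden hand.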


> where BAD achieves an average score of 24.174 points in the two-player setting, surpassing the best previously published results for learning agents by around 9 points and approaching the best known performance of 24.9 points for (cheating) open-hand gameplay

I didn't realize Hanabi is already that close to being solved.

My gut reaction is that the game is a lot simpler than it appears. I guess your simpler "matrix" game points to that: you already had an intuition for reducing Hanabi. Indeed, looking at the code you share for the "matrix game," it would seem that Hanabi, like Chess and Go, doesn't resemble more sophisticated games so much as it resembles something that can be expressed literally in TensorFlow.


The good news is that we have open-sourced the environment, so if you think it's easy I would love to see a simple method that solves it.



