Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> where BAD achieves an average score of 24.174 points in the two-player setting, surpassing the best previously published results for learning agents by around 9 points and approaching the best known performance of 24.9 points for (cheating) open-hand gameplay

I didn't realize Hanabi is already that close to being solved.

My gut reaction is that the game is a lot simpler than it appears. I guess your simpler "matrix" game points to that--you already had an intuition for reducing Hanabi. Indeed, looking at the code you share for the "matrix game," it would seem that Hanabi's problem is that, like Chess and Go, it doesn't really resemble more sophisticated games as much as it resembles something that can be literally expressed in Tensorflow.



The good news is that we have open-sourced the environment, so if you think it's easy I would love to see a simple method that solves it.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: