
Games have an obvious win/loss condition, though; the same is not true for debates.



Right, as I said, this is an unsolved problem.


I wonder whether the problem could ever become sufficiently well defined to admit any agreed-upon loss function. Something like: you must debate with the goal of maximising the aggregate wellbeing (definition required) of all living and future humans (and other relatable species)?


It would require some sort of continuously tuned arbiter, i.e. similar to the one in RLHF, as well as an adversarial-style scheme à la GANs. But I really am spitballing here; research could absolutely go in a different direction.

But let's say you reduced it to some sort of 'trying to prove a statement' task that can be verified, and paired that with a discriminator model. You could then compare two iterations based on whether they accurately prove the statement in English.
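
To make that comparison concrete, here is a rough sketch, very much in the same spitballing spirit. Everything in it is an assumption for illustration: `compare_iterations`, `toy_verifier`, and `toy_discriminator` are hypothetical names, and the toy stand-ins are nowhere near a real proof checker or a trained discriminator. It only shows the shape of the idea: verification decides first, and the discriminator score breaks ties between two valid arguments.

```python
from typing import Callable

# Hypothetical interfaces (assumptions, not an existing API):
Verifier = Callable[[str, str], bool]   # (statement, argument) -> does the argument prove it?
Discriminator = Callable[[str], float]  # argument -> quality score in [0, 1]


def compare_iterations(
    statement: str,
    arg_a: str,
    arg_b: str,
    verifier: Verifier,
    discriminator: Discriminator,
) -> str:
    """Return 'a', 'b', or 'tie' for two model iterations arguing the same statement."""
    valid_a = verifier(statement, arg_a)
    valid_b = verifier(statement, arg_b)

    # Verification dominates: a valid argument always beats an invalid one.
    if valid_a != valid_b:
        return "a" if valid_a else "b"
    # Neither argument verifies, so neither iteration wins.
    if not valid_a:
        return "tie"

    # Both verify: fall back to the discriminator's score.
    score_a = discriminator(arg_a)
    score_b = discriminator(arg_b)
    if score_a == score_b:
        return "tie"
    return "a" if score_a > score_b else "b"


def toy_verifier(statement: str, argument: str) -> bool:
    # Toy stand-in: "verified" just means the statement text appears in the argument.
    return statement.lower() in argument.lower()


def toy_discriminator(argument: str) -> float:
    # Toy stand-in: longer arguments score higher, capped at 1.0.
    return min(len(argument.split()) / 20, 1.0)
```

The hard part is of course hidden inside the verifier: for open-ended debates there is no obvious way to build one, which is the unsolved problem mentioned above.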



