The hallucinations are a result of RLVR. We reward the model for producing the right final answer and then force it to reason its way there, even when the base model may not have that information.
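
To make the point concrete, here is a minimal sketch of what an RLVR-style reward looks like (purely illustrative, names are made up): the score depends only on whether the final answer matches a verifiable target, not on whether the reasoning was grounded in anything the model actually knows.

```python
def rlvr_reward(model_answer: str, verified_answer: str) -> float:
    """Reward 1.0 for a matching final answer, 0.0 otherwise.

    Nothing here checks whether the reasoning that produced the
    answer was supported by information the base model has.
    """
    return 1.0 if model_answer.strip() == verified_answer.strip() else 0.0
```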
Well, let's reward them for producing output that is consistent with documentation retrieved from a database, and penalize them hard for output they cannot justify - like we do with humans.
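
A rough sketch of what that reward could look like, assuming a hypothetical `is_supported` check (say, an entailment model or a citation match against the retrieved documents):

```python
from typing import Callable, List


def grounded_reward(
    claims: List[str],
    retrieved_docs: List[str],
    is_supported: Callable[[str, List[str]], bool],
    bonus: float = 1.0,
    penalty: float = 2.0,
) -> float:
    """Add a bonus for each claim supported by the retrieved documents,
    and subtract a larger penalty for each claim that is not.
    """
    score = 0.0
    for claim in claims:
        if is_supported(claim, retrieved_docs):
            score += bonus  # claim is consistent with the documentation
        else:
            score -= penalty  # unjustified claim costs more than a grounded one earns
    return score
```

The asymmetric penalty is the point: making an unsupported claim should cost more than a grounded one earns, so the model learns to say less rather than make things up.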