
The Gold Answer is not the output of the model but the expected answer in the benchmark. The benchmark probably contains multiple consecutive images of the same patient.



Thanks, I probably shouldn’t be commenting on an LLM article if I don’t even know what a gold answer is! Oh well.



