
> “ Question: Write a detailed radiology note based on the chest X-ray. Gold Answer: AP upright and lateral views of the chest were provided. Left chest wall pacer pack is again seen with leads extending into the right heart. ”

The bit about a “wall pacer pack is again seen…” leads me to believe this was based on another doctor’s note about a similar-looking X-ray, which was probably paired with other information like another scan at the time. That would be problematic imo.




The Gold Answer is not the output of the model but the expected answer in the benchmark. Probably the benchmark contains multiple consecutive images of the same patient.


Thanks, I probably shouldn’t be commenting on an LLM article if I don’t even know what a gold answer is! Oh well.


Problematic for the functionality? If it works well enough, I’m pretty fine with them stealing data to create a useful medical tool.


It’s problematic because the LLM is describing another person’s scan and not the one presented to it. It should at least present the other scan as its workings, along with the percentage difference between the two. Finding a similar-looking scan is very useful, no doubt, but if the result is hallucinated then it is less so. Dangerous, even. There is no confidence percentage, and there should be.




