
> “ Question: Write a detailed radiology note based on the chest X-ray. Gold Answer: AP upright and lateral views of the chest were provided. Left chest wall pacer pack is again seen with leads extending into the right heart. ”

The bit about a “wall pacer pack is again seen…” leads me to believe this was based on another doctor’s note about a similar-looking X-ray, which was probably paired with other information like another scan at the time. That would be problematic imo.




The Gold Answer is not the output of the model but the expected answer in the benchmark. Probably the benchmark contains multiple consecutive images of the same patient.


Thanks, I probably shouldn’t be commenting on an LLM article if I don’t even know what a gold answer is! Oh well.


Problematic for the functionality? If it works well enough, I’m pretty fine with them stealing data to create a useful medical tool.


It’s problematic because the LLM is describing another person’s scan and not the one presented to it. It should at least present the other scan as its workings, along with the percentage difference between the two. Finding a similar-looking scan is very useful, no doubt, but if the result is hallucinated then it is less so. Dangerous, even. There is no confidence percentage, and there should be.




