For all the complaints about AI-generated content showing up in scientific journals, I'm excited for the flip side, where an LLM can review massive quantities of scientific publications for inaccuracies/fraud.
Ex: Finding when the exact same image appears in multiple publications, but with different captions/conclusions.
The evidence in this case came from one individual willing to volunteer hundreds of hours producing a side by side of all the reports. But clearly that doesn't scale.
I'm hoping it won't have the same results as AI Detectors for schoolwork, which have marked many legitimate papers as fraud, ruining several students' lives in the process. One even marked the U.S. Constitution as written by AI [1].
It's fraud all the way down, where even the fraud detectors are fraudulent. Similar story to the anti-malware industry, where bugs in security software like CrowdStrike, Sophos, or Norton cause more damage than the threats they protect against.
> For all the complaints about AI-generated content showing up in scientific journals, I'm excited for the flip side, where an LLM can review massive quantities of scientific publications for inaccuracies/fraud.
How would this work? AI can't even detect AI generated content reliably.
Not in a zero-shot approach. But LLMs are more than capable of handling a scenario similar to the one presented:
- Parse all papers you want to audit
- Extract images (non AI)
- Diff images (non AI)
- Pull captions / related text near each image (LLM)
- For each pair of images with > 99% similarity, use an LLM to classify whether the conclusions differ (i.e. highly_similar, similar, highly_dissimilar).
Then aggregate the results. It wouldn't prove fraud, but it could definitely highlight areas for review, e.g. "This chart was used in 5 different papers with dissimilar conclusions."
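To make that concrete, here is a rough Python sketch of the diff and aggregate steps. It is only an illustration under some assumptions: the images are assumed to be already extracted and downscaled to small grayscale grids, the similarity measure is a simple average hash that I picked for brevity (a real pipeline would likely use a sturdier perceptual hash or feature matching), and `classify` is a hypothetical stand-in for whatever LLM call compares the two captions. All names here (`Figure`, `flag_reused_figures`, and so on) are made up for the example.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List, Tuple

# Assumption: each image is already decoded and downscaled to a small
# grayscale grid with values 0-255.
Pixels = List[List[int]]


@dataclass(frozen=True)
class Figure:
    paper_id: str
    caption: str
    pixels: Pixels


def average_hash(pixels: Pixels) -> Tuple[bool, ...]:
    """One bit per pixel: True if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(p > mean for p in flat)


def similarity(a: Figure, b: Figure) -> float:
    """Fraction of matching hash bits, between 0.0 and 1.0."""
    hash_a, hash_b = average_hash(a.pixels), average_hash(b.pixels)
    if len(hash_a) != len(hash_b):
        return 0.0  # differently sized grids are treated as not comparable
    return sum(x == y for x, y in zip(hash_a, hash_b)) / len(hash_a)


def flag_reused_figures(
    figures: List[Figure],
    classify: Callable[[str, str], str],  # hypothetical LLM wrapper
    threshold: float = 0.99,
) -> List[Tuple[str, str, str]]:
    """Return (paper_a, paper_b, label) for near-identical images whose
    captions the classifier labels as highly_dissimilar."""
    flagged = []
    for a, b in combinations(figures, 2):
        if a.paper_id == b.paper_id:
            continue  # only cross-paper reuse is of interest here
        if similarity(a, b) > threshold:
            label = classify(a.caption, b.caption)
            if label == "highly_dissimilar":
                flagged.append((a.paper_id, b.paper_id, label))
    return flagged


def naive_classifier(caption_a: str, caption_b: str) -> str:
    """Placeholder for the LLM step: exact caption match or not."""
    same = caption_a.strip().lower() == caption_b.strip().lower()
    return "highly_similar" if same else "highly_dissimilar"
```

The comparison is pairwise, so it is quadratic in the number of images. At journal scale you would probably bucket by hash first and only compare within buckets. The output is a list of candidates for a human to review, not a verdict.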
Wouldn’t it be cool if people got credit for reproducing other people’s work, instead of only for novel things? It’s like having someone on your team who loves maintaining but not feature building.
LLMs might find some specific indications of possible fraud, but then fraudsters would just learn to avoid those. LLMs won’t be able to detect when a study or experiment isn’t reproducible.
Of course, but increasing the difficulty of committing fraud is still good. Fraudsters learn to bypass captchas as well, but they still block a ton of bad traffic.
Won't the scientist use some relatively secure/private model to fraud-check their own work before submitting? If it catches something, they would just improve the fraud.