Hacker News

> although it's possible Claude 4 was trained on that discussion lol

This is why we can't have consistent benchmarks



Yeah, I agree. Also, what is the use of that benchmark? Who cares? How does it relate to stuff that does matter?



