
They trumpet the exam results, but isn't it likely that the model has just memorized the exam?


It's trained on pre-2021 data. Looks like they tested on the most recent tests (i.e. 2022-2023) or practice exams. But yeah, standardized tests are heavily weighted toward pattern matching, which is what GPT-4 is good at, as shown by its failure at the hindsight neglect inverse-scaling problem.


I believe they showed that GPT-4 reversed the trend on the hindsight neglect problem. Search for "hindsight neglect" on the website and you can see that its accuracy on the problem shot up to 100%.


oh my bad, totally misread that


Well, yeah. It's an LLM, it's not reasoning about anything.



