
Reasoning LLMs especially should have no problem with this sort of trick. If you ask them to list out all of the implicit assumptions in (question) that might possibly be wrong, they do that just fine, so training them to do that as the first step of a reasoning chain would probably get rid of a lot of eager beaver exploits.
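For what it's worth, a minimal sketch of that two-step idea at inference time (the comment is about training, this just mimics the same behaviour with prompting; the openai Python client and the "gpt-4o" model name here are illustrative assumptions, any chat-completion API would do):

    # Sketch: surface implicit assumptions first, then answer with them in view.
    # Assumes the openai Python client; model name is a placeholder.
    from openai import OpenAI

    client = OpenAI()

    def answer_with_assumption_check(question: str, model: str = "gpt-4o") -> str:
        # Step 1: ask only for the implicit assumptions that might be wrong.
        assumptions = client.chat.completions.create(
            model=model,
            messages=[{
                "role": "user",
                "content": (
                    "List every implicit assumption in the following question "
                    "that might possibly be wrong. Do not answer the question.\n\n"
                    f"Question: {question}"
                ),
            }],
        ).choices[0].message.content

        # Step 2: answer only after the assumptions are on the table.
        answer = client.chat.completions.create(
            model=model,
            messages=[{
                "role": "user",
                "content": (
                    f"Question: {question}\n\n"
                    f"Possibly-wrong assumptions identified so far:\n{assumptions}\n\n"
                    "If any assumption is actually wrong, say so instead of "
                    "answering as if it were true. Otherwise answer the question."
                ),
            }],
        ).choices[0].message.content
        return answer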






