Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Eagerness to tool call is an interesting observation. Certainly an MCP ecosystem would require a tool biased model.

However, after having pretty deep experience with writing book (or novella) length system prompts, what you mentioned doesn’t feel like a “regime change” in model behavior. I.e it could do those things because its been asked to do those things.

The numbers presented in this paper were almost certainly after extensive system prompt ablations, and the fact that we’re within a tenth of a percent difference in some cases indicates less fundamental changes.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: