
They definitely can.

I rolled out reasoning for my interactive reader app and tried to extract R1's reasoning traces to use with my existing models, but found its CoT for writing wasn't particularly useful*.

Instead of leaning on R1, I came up with my own framework for getting the LLM to infer the reader's underlying frame of mind through long chains of thought. With enough guidance and some hand-edited examples, I was able to get reasoning traces that demonstrated real insight into reader behavior.

Obviously it's much easier in my case because it's an interactive experience: the reader is telling the AI what action they'd like the main character to try, and that in turn is an obvious hint at how they want things to go otherwise. But readers don't want everything to go perfectly every time, so it matters that the LLMs are also getting very good at picking up on non-obvious signals in reader behavior.

With CoT, the model infers the reader's expectations and state of mind in its own way and then "thinks" itself into how to subvert those expectations, especially in ways that will have a meaningful payoff for the specific reader. That's a huge improvement over an LLM's typical attempts at subversion, which tend to bounce between being too repetitive to feel surprising and too unpredictable to feel rewarding.
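As a rough sketch of what this kind of guided CoT setup can look like (the prompt structure, field names, and example trace below are my own illustrative assumptions, not the actual framework):

```python
# Sketch: a few-shot CoT prompt that asks the model to first reason about
# what the reader's chosen action reveals about their expectations, then
# plan a subversion with a payoff. All names here are hypothetical.

# A hand-edited reasoning trace used as a style example (invented for
# illustration).
HAND_EDITED_EXAMPLE = """\
Reader action: "Sneak past the guards instead of fighting."
Reasoning: The reader has avoided combat twice already; they likely enjoy
tension over action. A clean escape would feel flat, so let the sneak
almost succeed, then force a choice that raises the stakes without a fight.
"""


def build_cot_prompt(reader_action: str, recent_actions: list[str]) -> str:
    """Assemble a CoT prompt seeded with a hand-edited example trace."""
    history = "\n".join(f"- {a}" for a in recent_actions)
    return (
        "You are narrating an interactive story. Before writing, reason\n"
        "step by step about what the reader's chosen action reveals about\n"
        "their expectations, then plan a payoff that subverts them.\n\n"
        f"Example of the reasoning style:\n{HAND_EDITED_EXAMPLE}\n"
        f"Recent reader actions:\n{history}\n\n"
        f'Reader action: "{reader_action}"\n'
        "Reasoning:"
    )


prompt = build_cot_prompt(
    "Confess to the theft",
    ["Hid the locket", "Lied to the innkeeper"],
)
print(prompt)
```

The point of the hand-edited trace is to anchor the *shape* of the reasoning (infer reader state, then plan the subversion) rather than any particular plot outcome.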

(* I agree that current reasoning-oriented post-training over-indexes on math and coding, mostly because the reward functions are easier. But I'm also very OK with that, as someone trying to compete in the space.)


