Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Continual feedback means continual training. No way around it. So you’d have to scope down the functional unit to a fairly small lora in order to get reasonable re-training costs here.


That's not quite true. The system prompt is state that you can use for "training" in a way that fits the problem here. It's not differentiable so you're in slightly alien territory, but it's also more comprehensible than gradient-descending a bunch of weights.


If you treat it as vectors instead of words, it might be differentiable.


Or maybe figure out a different architecture

Either way, the end user experience would be vastly improved




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: