Continual feedback means continual training. No way around it. So you’d have to ...

regularfry · 2025-04-13T18:15:57 1744568157

That's not quite true. The system prompt is state that you can use for "training" in a way that fits the problem here. It's not differentiable so you're in slightly alien territory, but it's also more comprehensible than gradient-descending a bunch of weights.

immibis · 2025-04-13T23:19:09 1744586349

If you treat it as vectors instead of words, it might be differentiable.

nico · 2025-04-13T17:30:24 1744565424

Or maybe figure out a different architecture

Either way, the end user experience would be vastly improved