What you're referring to has nothing to do with how GPTs are pretrained, or with hallucinations in and of themselves, and everything to do with how companies have reacted to the presence of hallucinations and other bad behavior: they use a combination of fine-tuning, RLHF, and keyword/phrase/pattern matching to "guide" the model and cut it off before it says something the company would regret (for a variety of reasons).
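
The actual filters aren't public, but a minimal sketch of the kind of keyword/pattern guard being described (Python; the patterns and canned refusal are entirely made up for illustration) might look like this:

    import re

    # Hypothetical post-generation guard. Real vendor filters are not
    # public and are far more elaborate; these patterns are invented.
    BLOCKED_PATTERNS = [
        r"(?i)my previous instructions",
        r"(?i)here is how to make a weapon",
    ]

    def guard(model_output: str) -> str:
        """Return the output unchanged, or cut it off with a canned refusal."""
        for pattern in BLOCKED_PATTERNS:
            if re.search(pattern, model_output):
                return "I'm sorry, I can't continue with that."
        return model_output

A guard like this will inevitably trip on things it shouldn't, which is part of why the resulting behavior feels arbitrary.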

In other words, your complaints are ironically not about what the article is discussing, but about the attempts, for better or worse, to solve it.

I mean, in so many words, that's precisely what I'm complaining about. Their attempt to solve it is to make the model appear more human. What's wrong with an error message? Or, in this specific example, why bother at all? Why even stop the conversation? It's ridiculous.

RLHF is what's responsible for your frustration, and you're assuming there is a scalable alternative. There is not.

> What's wrong with an error message?

You need a dataset for RLHF that provides an error message _only_ when it's appropriate, and building one is not yet possible. The conversation stops for the same reason.
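
A sketch of what such a dataset would have to encode (Python; the entries are invented for illustration, and real labeling guidelines are far longer):

    # Hypothetical RLHF preference records. A labeler has to decide,
    # case by case, when an error message is the "appropriate" response,
    # which is exactly the part no one knows how to specify completely.
    preference_data = [
        {
            "prompt": "Summarize the attached document.",  # document is empty
            "chosen": "Error: the attached document appears to be empty.",
            "rejected": "The document discusses several key themes...",
        },
        {
            "prompt": "Summarize the attached two-page report.",
            "chosen": "The report covers the following points...",
            "rejected": "Error: I can't summarize this.",
        },
    ]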

> Or in this specific example - why bother at all? Why even stop the conversation? It's ridiculous.

They want a stop/refusal condition to prevent misuse, and having one at all means sometimes stopping when the model should actually keep going. Not only is that boundary subjective as hell, there's also no method that covers every corner case (however objectively those cases may be defined).

You're right to be frustrated with it, but it's not as though they have some other option that lets them decide when to stop, when not to stop, and when to show an error message in a way that matches every single human's preferences on the planet. Certainly not one that scales as well as RLHF on a custom dataset of manually written preferences. It's an area of active research for a reason.
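
For what it's worth, pairwise preferences are what makes the approach scale at all: labelers only pick the better of two responses, and a reward model is then typically fit to those comparisons with something like a Bradley-Terry objective. A minimal sketch (PyTorch; names and shapes are assumed for illustration):

    import torch
    import torch.nn.functional as F

    def preference_loss(reward_chosen: torch.Tensor,
                        reward_rejected: torch.Tensor) -> torch.Tensor:
        """Bradley-Terry style objective: push the reward model's score
        for the human-preferred response above the rejected one."""
        return -F.logsigmoid(reward_chosen - reward_rejected).mean()

    # Usage, given per-example scalar scores from a reward model `rm`:
    #   loss = preference_loss(rm(chosen_batch), rm(rejected_batch))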



