Per the article, the critics for the critic are human RLHF trainers. More specifically, those humans are exploited third-world workers making between $1.32 and $2 an hour, but OpenAI would rather you didn't know about that.
OpenAI may well still be employing plenty of people in third-world countries for this. But there are also contracts paying anywhere from $20 to $100+ an hour for this kind of work on more complex prompt/response pairs.
I've done work at what is (to the best of my knowledge) the very high end of that scale (not for OpenAI) to fill gaps, so I know firsthand that it's available. Sometimes the work is complex enough that a single response takes over an hour to evaluate, because the requirements often include not just reading and reviewing the code but ensuring it works, including fixing bugs. Most responses then pass through at least one more round of review of the fixed/updated versions. One project I worked on involved 3 reviewers (none of whom were on salaries anywhere close to the Kenyan workers you referred to) reviewing my work, providing feedback, and prompting a second pass of adjustments. Four high-paid workers altogether to process every response.
Of course, I'm sure plenty of lower-level/simpler work had been filtered out to be addressed with cheaper labour, but I wouldn't be so sure their costs for things like code are particularly low.
Exploited? Are you saying that these employees are forced to work for below-market rates, and would be better off taking other opportunities available to them? If that's the case, it's truly horrible on OpenAI's part.
A human reviewer might have trouble catching a mistake, but they are generally pretty good at discerning whether a report about a mistake is valid. For example, finding a bug in a codebase is hard. But if a junior sends you a code snippet and says "I think this is a bug for xyz reason", do you agree? It's much easier to confidently say yes or no. So it basically changes the problem from finding a needle in a haystack to discerning whether a statement is a hallucination or not.
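As a toy illustration of that asymmetry (hypothetical snippet, nothing from the article): verifying the junior's specific claim is a one-minute check, while finding the bug cold could take far longer.

    def moving_average(xs, window):
        # Junior's report: "I think this drops the last window, because
        # range() stops at len(xs) - window instead of len(xs) - window + 1."
        return [sum(xs[i:i + window]) / window
                for i in range(len(xs) - window)]

    # Verifying the claim is one call: for [1, 2, 3, 4] with window=2 this
    # prints [1.5, 2.5] when a correct version would give [1.5, 2.5, 3.5].
    print(moving_average([1, 2, 3, 4], 2))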
We are not assuming that. The iteration happens by taking the report and passing it to another reviewer, who reviews the first review. Their comparison is human reviewer -> human reviewer vs. CriticGPT -> human reviewer vs. CriticGPT+human reviewer -> human reviewer.
How do they know it's better? The rate of mistakes is the same for both GPTs, so now they have two sources of errors. If the error rate were lower for one, they could always apply it and reduce the error rate of the other. They're just shuffling the deck chairs and hoping the boat with the hole in it gets a little further before disappearing underwater.
Whether adding unreliable components increases the overall reliability of a system depends on whether the system requires all components to work (in which case adding components can only make matters worse) or only some (in which case adding components can improve redundancy and make it more likely that the final result is correct).
In the particular case of spotting mistakes made by ChatGPT, a mistake is spotted if it is spotted by the human reviewer or by the critic, so even a critic that makes many mistakes itself can still increase the number of spotted errors. (But it might decrease the spotting rate per unit time, so there are still trade-offs to be made.)
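To put rough numbers on that (the rates below are made up, and this assumes the human and the critic err independently, which they won't entirely):

    # OR-composition: a mistake is caught if *either* reviewer catches it.
    p_human = 0.6    # hypothetical chance the human spots a given mistake
    p_critic = 0.5   # hypothetical chance CriticGPT spots it

    p_either = 1 - (1 - p_human) * (1 - p_critic)
    print(p_either)  # 0.8; even an unreliable critic adds coverage

    # AND-composition: a pipeline where every stage must get it right.
    p_series = p_human * p_critic
    print(p_series)  # 0.3; here the same unreliable component hurts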
I see what you're saying, so what OpenAI will do next is create an army of GPT critics and run them all in parallel to take some kind of quorum vote on correctness. I guess it should work in theory, if the error rate is small enough and adding more critics actually reduces the error rate. My guess is that in practice they'll converge to the population-average error rate and then pat themselves on the back for a job well done.
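For what it's worth, the quorum idea has well-understood theory behind it (Condorcet's jury theorem): with independent critics that are each right with probability p > 0.5, a majority vote gets more accurate as you add critics, while p < 0.5 makes it worse, and correlated errors (your convergence-to-the-population-average scenario) break the independence assumption entirely. A sketch with made-up error rates:

    from math import comb

    def majority_correct(p, n):
        # Probability that a majority of n independent critics (each
        # correct with probability p) votes correctly; odd n avoids ties.
        return sum(comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(n // 2 + 1, n + 1))

    for n in (1, 5, 25):
        print(n, round(majority_correct(0.60, n), 3))  # 0.6, ~0.683, ~0.846
        print(n, round(majority_correct(0.45, n), 3))  # 0.45, ~0.407, ~0.31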
"In our experiments a second random trainer preferred critiques from the Human+CriticGPT team over those from an unassisted person more than 60% of the time."
Of course the second trainer could be wrong, but when the outcome tilts 60% to 40% in favour of the *combination* of a human + CriticGPT, that's pretty significant.
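For a sense of how significant: the quote doesn't give the number of comparisons, but a quick binomial check shows a 60/40 split becomes very hard to explain by chance once the sample is in the hundreds (the n values below are hypothetical):

    from math import comb

    def tail_prob(k, n, p=0.5):
        # One-sided p-value: the chance of k or more "prefers assisted"
        # votes out of n comparisons if raters were actually indifferent.
        return sum(comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(k, n + 1))

    # Hypothetical n; the actual sample size isn't in the quoted line.
    print(tail_prob(30, 50))    # ~0.10: a 60/40 split is borderline at n=50
    print(tail_prob(120, 200))  # ~0.003: convincing at n=200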
From experience doing contract work in this space: it's common to use multiple layers of reviewers to generate additional data for RLHF, and if you can improve the output of the first layer that much, it'll have a fairly massive effect on the amount of training data you can produce at the same cost.
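Rough arithmetic on why (every number here is hypothetical): each response pays for all the review layers, but only the fraction that survives review becomes usable training data, so lifting first-layer quality cuts the cost per usable response across the whole pipeline.

    # Back-of-envelope, all numbers made up: four reviewers at ~$50/hr,
    # roughly an hour of review per response per layer.
    hourly_rate = 50
    layers = 4

    def cost_per_usable(acceptance_rate):
        # Every response is paid for in full; only a fraction is usable.
        return layers * hourly_rate / acceptance_rate

    print(cost_per_usable(0.5))  # $400 per usable response
    print(cost_per_usable(0.8))  # $250: same budget yields ~60% more data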