If it were easy to make an LLM that quickly parsed all of StackOverflow and produced new answers that worked most of the time, within the timeframe of an interview, it would have been done by now.
ChatGPT is clearly disruptive, being the first useful chatbot in forever.
It kind of depends on how you frame the solution. Google can answer LeetCode questions, and LeetCode's answers section can answer them as well. If ChatGPT is solving them, that's one thing, but if it's just mapping the question to a solution found somewhere, then it's not so impressive.
The hiring tests are designed to serve as a predictor for human applicants. How well an LLM does on them doesn’t necessarily say anything about the usefulness of those tests as said predictor.
Well, what it shows is that hiring tests are not useful as Turing tests. But nobody designed them to be or expected them to be! At best it "proves" that hiring tests are not sufficient. But again, nobody thought they were. And even so, the assumption that a human is taking the hiring test still seems reasonable. Why overengineer your process?
> the jury is still out on whether ChatGPT is truly useful or not
I'd pay $100 a month for ChatGPT. It allows me to ask free-form questions about some open-source packages with truly appalling docs, usually gets them right, and saves me a bunch of time. It helps me understand the technical language in the stats papers I'm reading at the moment. It's been useful for finding good Google search terms for various bits of history I wanted to learn more about.
I don't think the jury is out at all on whether it's useful. The jury is out on the degree to which it can replace humans for tasks, and I'd suggest the answer is "no" for most tasks.
I just used it to write a function for me yesterday. I had previously googled a few times and come up dry, then asked ChatGPT, and it came up with a solution I had not considered that was better than what I was thinking.
You don't understand the take that just because ChatGPT can pass a coding interview doesn't mean the coding interview is useless or that ChatGPT could actually do the job?
What part of that take do you not understand? It's a really easy concept to grasp, and even if you don't agree with it, I would expect at least that a research scientist (according to your bio) would be able to grok the concepts almost immediately...
> doesn't mean the coding interview is useless or that ChatGPT could actually do the job
Aren't these kind of mutually exclusive, at least directionally? If the interview is meaningful you'd expect it to predict job performance. If it can't predict job performance then it is kind of useless.
I guess you could play some word games here to occupy a middle ground ("the coding interview is kind of useful, it measures something, just not job performance exactly") but I can't think of a formulation where this doesn't sound pretty silly.
ChatGPT can provide you with a great explanation of the how.
Oftentimes the explanation is correct, even if there's some mistake in the code (probably because the explanation is easier to generate than the correct code, an artifact of being a high-tech parrot).
Finding a single counterexample does not disprove correlation or predictive ability. A hiring test can have both false positives and false negatives and still be useful.
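To make that concrete, here's a toy simulation in Python (all numbers invented, purely illustrative): the interview score is just true skill plus noise, so the screen produces plenty of false positives and false negatives, yet the pool of candidates who pass is still noticeably better than the base rate.

    import random

    random.seed(0)

    def simulate(n=100_000, noise=1.0, cutoff=0.5):
        hired_good = hired = good = 0
        for _ in range(n):
            skill = random.gauss(0, 1)              # latent "can actually do the job"
            score = skill + random.gauss(0, noise)  # the interview measures skill, noisily
            if score > cutoff:                      # pass/fail decision on the noisy score
                hired += 1
                hired_good += skill > 0             # passed AND can do the job
            good += skill > 0
        print(f"good candidates in the whole pool: {good / n:.2f}")
        print(f"good among those who passed:       {hired_good / hired:.2f}")

    simulate()

With these made-up parameters the pass pool is meaningfully better than random selection, despite the test misclassifying a large fraction of individuals in both directions.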
I don't think I had a militant attitude, but I do think saying, "I don't understand..." rather than "I disagree with..." puts a sour note on the entire conversation.
You literally went to their profile and called them out about how they should be able to understand something you’re describing as so easy to understand.
Yeah, what is the problem with that? They engaged dishonestly by claiming they didn't understand something, why should I do anything other than call them on that?
OK — just don’t be surprised when people think you’re being a jerk because you didn’t like the words someone chose. I’d assert you’re acting in bad faith more than the person you responded to.
It’s really very easy to understand. When someone gives you the same crap back that you just got done giving someone, you don’t like it and act like that shouldn’t happen.
Did I say I didn't "like" (I'd use the word "appreciate") it, or that I didn't think it should happen? If so, could you please highlight where?
I just see, in what you're doing, a wild lack of self awareness. You're criticizing me for doing to someone else a milder version of what you're trying to do to me now; I'm genuinely confused how you can't see that, or how you could possibly stand the hypocrisy if you do understand that.
I'll try to phrase it so that even someone who is not a research scientist (?) can understand. I'm not one, whatever that means.
Let's define the interview as useful if the passing candidate can do the job.
Sounds reasonable.
ChatGPT can pass the interview and can't do the job.
The interview is not able to predict ChatGPT's poor job performance, and it is therefore useless.
Some of the companies I worked for hired ex-FAANG people as if it were a mark of quality, but that hasn't always worked out well. There are plenty of people coming out of FAANGs having just done mediocre work for a big paycheck.
> Let's define the interview as useful if the passing candidate can do the job.
The technical term for this is "construct validity": that the test results are related to the thing you actually want to learn about.
> The interview is not able to predict the poor working performance of ChatGPT and it's therefore useless.
This doesn't follow; the interview doesn't need to be able to exclude ChatGPT because ChatGPT doesn't interview for jobs. It's perfectly possible that the same test shows high validity on humans and low validity on ChatGPT.
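A toy illustration of that point (Python again, every number here is made up): compute the same score-vs-performance correlation on two populations, humans whose scores track the skill that drives job performance, and a hypothetical ChatGPT-like test taker whose high scores are unrelated to on-the-job output. The same test is predictive for one population and worthless for the other.

    import random

    random.seed(1)

    def corr(xs, ys):
        # plain Pearson correlation, no dependencies
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs)
        vy = sum((y - my) ** 2 for y in ys)
        return cov / (vx * vy) ** 0.5

    n = 50_000
    # Humans: interview score is skill plus noise, so it tracks performance.
    human_skill = [random.gauss(0, 1) for _ in range(n)]
    human_score = [s + random.gauss(0, 1) for s in human_skill]
    # ChatGPT-like takers: uniformly high scores, uniformly poor job output.
    bot_score = [random.gauss(2, 0.5) for _ in range(n)]
    bot_perf = [random.gauss(-1, 0.5) for _ in range(n)]

    print(f"validity on humans: {corr(human_score, human_skill):.2f}")  # roughly 0.71
    print(f"validity on bots:   {corr(bot_score, bot_perf):.2f}")       # roughly 0.00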