
I had an interesting experience with triplebyte which wasn't as objectively bad as yours, but it also makes me skeptical of the company.

First round was multiple choice questions, relatively straight-forward. Second-round was skype-call and just felt incredibly subjective. I was asked questions around building out memcached to support arbitrarily-sized values, and I got the same "smug" vibe you sensed.

The interview style was very "Him: How would you do X?" "me: Well that's not a simple problem, there are a lot of solutions each with tradeoffs." "Him: Okay so name one" "Me: So you could do X" "Him: BUT THEN Y [GOTCHA!]" "Me: Yes, that's one of the tradeoffs of X"

It wasn't clear to me what the heck he was even looking for. Was he hoping I'd list race-condition problems? Had he not even considered race-condition problems? Was he looking for a theoretical solution or a real-world solution? Also he kept going on random tangents ("That brings me to an interesting question, how would you shift a gigabyte of memory 1 bit?"). He seemed very concerned with efficiently bit-packing the header in this problem, which seems silly to me when we're talking about storing gigabytes.

My understanding was that triplebyte was seeking to be the SATs of engineering, however SATs do heavy validation with test-retest reliability and such, I had no particular reason to suspect triplebyte's interview was any more objective than any other company's.




We actually put a bunch of effort into consistency/repeatability checks. Every interview is recorded (video), and we re-watch and re-grade a percentage of them to measure consistency. A long-term experiment we're running is comparing qualitative scores (code quality, good process, how good the interviewer felt the candidate was) with quantitative features (which tests passed, how long it took, which design -- picked from a decision tree -- the candidate chose). We calibrate the qualitative scores with the recorded interviews. So far, quantitative scoring is winning (when judged against predicting interview results at companies). We're waiting, however, until we can see which better predicts job success.


It sounds like your ability as an interviewer is pretty poor. There are several examples of the same smug behavior. What sort of training have you gone through to ensure that you're actually an appropriate and qualified person to be interviewing?


I'm his co-founder and this comment is unnecessarily personal. Ammon has done over 900 technical interviews (https://www.reddit.com/r/cscareerquestions/comments/5y95x6/i...) and there's one negative reference to him specifically on this thread.


Nah, you can't use forum threads as proof. You need to do the analysis scientifically, which, based on the feedback here, it sounds like you still need to do. You don't bother quantifying the ability of you and your interviewers; you just assume you're great.


I interviewed with Ammon. He did not come across as smug. I went through the process purely out of curiosity because I was fed up with the typical interview process. In case people think I have a favorable view because I got a job through them: I haven't followed through with anything else related to Triplebyte in terms of an actual job.


Wasn't there a study by Google a while back, where they found, as a trend, that their most successful people had only marginally passed their job interview?


The finding was that people who had received a "no hire" recommendation yet still received an offer tended to do well. The reasoning: to overcome the poor feedback, someone else on the hiring loop must have believed in the candidate, seen something exceptional, and been willing to go to bat for them.


That sounds like the Pareto-optimal solution to a job interview. It's like how you're not doing grad school properly if your grades are better than C.

Also, candidates who do too well could be too good for the job, since Google supposedly likes having incredibly overqualified people maintain do-nothing internal apps.


> better predicts job success

How do you rate job success?


being happy with the job and not having left after 1 year


Wait so job success is based on the interviewee, and not the company?


Notably absent from that list is "Are we verifying we're doing a good job as interviewers?"

It doesn't matter how good the interviewer feels the candidate is, or whether a design was picked from a decision tree. All that matters is whether the candidate can do the work at actual companies.

I think people here are reacting to irrelevancies during the interview process -- questions which cannot possibly be reflective of a candidate's real-world competency. (When was the last time you shifted a gigabyte of memory? And even if you did, that's not what companies are going to employ people to do. So why ask the question? Are you sure it isn't trivia?)


Interviews absolutely should be grounded in trying to predict how a candidate would do on a job. That's the whole ballgame. The question is how best to do that. First, you need to run a repeatable process (my previous comment). Second, you need to look at the right skills. The approach we take is to track a lot, and figure out what works best over time. What we've found to be most predictive (so far) is a base level of coding competency, plus max skill (how good an engineer is at what they are best at). So (beyond the coding portion) we don't actually care very much about what a candidate is bad at. We care about how good they are at what they are good at. To give as many candidates as possible the opportunity to show strength, we cover a number of areas. This includes back-end web development, distributed systems, debugging a large codebase, algorithms, and -- yes -- low-level systems (concurrency, memory, bits and bytes). We do not expect any one engineer to be strong in all of these areas (I'm weak on some of them). But they are all perfectly valid areas to show strength (and we work with companies that value each of the areas).

We've recently moved to a new interview process organized around this idea of max skill. It's working great in terms of company matching and predictive ability. However, it seems we may have underestimated the cost to candidates of being asked about areas where they are weak. There's more negative feedback here than we've seen in previous HN discussions, and I think the interview change may be behind that. I'm taking that to heart. I think we can probably articulate it better (that we measure in a bunch of areas and look for max strength). We're also running an experiment now where we ask engineers at the start of the interview which sections they think they'll do best on. I'm excited about this. If engineers can self-identify their strongest areas, we'll be able to make the process shorter and much more pleasant!

So, the bit shift question: that came up down one branch of a system design question we used for a while (we've since moved to a more targeted version that is more repeatable). The (sub)issue involved adding a binary flag to a large data blob (this came up as part of a solution to a real-world caching problem). Adding a single-bit flag to the front of a 1GB blob has a problem: to really add just one bit, you'd have to bitshift the entire 1GB. This is clearly not worth it to save 7 bits of storage (ignoring that those bits would not actually be saved in any case). You can just use a byte (or word), or add the flag at the end. When candidates suggested adding a bit flag at the front, we would follow up asking them how they'd do it (to unearth whether they were using 'bit' as shorthand for a reasonable solution, or whether they really are a little weak in binary data manipulation). This was one small part of our interview. By itself it in no way determined the outcome of the interview, or even of the low-level systems section. Plenty of great engineers might get it wrong. But I don't think it was unfair.
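
To make the trade-off concrete, here's a rough sketch (purely illustrative; the function names are made up for this comment). Prepending a whole byte is a single copy, while prepending a single bit means rewriting every byte of the blob with a carry:

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* Cheap option: spend a whole byte on the flag. One copy, no bit math. */
    uint8_t *prepend_flag_byte(const uint8_t *blob, size_t len, uint8_t flag)
    {
        uint8_t *out = malloc(len + 1);
        if (!out) return NULL;
        out[0] = flag;                /* header byte */
        memcpy(out + 1, blob, len);   /* body copied unchanged */
        return out;
    }

    /* "Save 7 bits" option: prepend a single bit, which forces a 1-bit shift
     * of every byte in the (potentially 1GB) blob -- and still rounds up to
     * a whole extra byte of storage. */
    uint8_t *prepend_flag_bit(const uint8_t *blob, size_t len, int flag)
    {
        uint8_t *out = malloc(len + 1);
        if (!out) return NULL;
        uint8_t carry = flag ? 0x80 : 0x00;
        for (size_t i = 0; i < len; i++) {
            out[i] = (uint8_t)(carry | (blob[i] >> 1));
            carry = (uint8_t)(blob[i] << 7);
        }
        out[len] = carry;
        return out;
    }

Both versions end up allocating len + 1 bytes anyway, which is the point: the single-bit version saves nothing and still costs a full pass over the blob.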


> Interviews absolutely should be grounded in trying to predict how a candidate would do on a job. That's the whole ballgame.

This is directly in conflict with this comment from Harj:

> The metric we optimize for is our onsite success rate i.e. how often does a Triplebyte candidate onsite interview result in an offer.

If you want passing your interview to mean that a candidate will pass their onsite interview because your interview is tailored to deliver people who look good in an onsite, you're acting as a recruiter, trying to give companies whatever they say they want. This model provides a lot of value to companies but zero value to candidates, since anyone who passed your interview would have gotten the job anyway.

If you want passing your interview to mean that a candidate should pass their onsite interview because they would perform well in the job, you're acting as a credential, telling companies what they want. This model provides value to companies and to candidates (assuming you can tell who will perform well). These aren't the same model.

Your candidate-directed advertising leans heavily towards the second model ("just show us you can code!"). That makes sense, since that's the model that provides value to candidates. It's disappointing to hear Harj say that what you really believe in is the first model, and disconcerting to see you openly disagree with your cofounder about what your company is trying to do.


It seems a bit disingenuous to accuse them of openly disagreeing - if you assume that the client company's interview is optimized for finding folks who will perform well on the job and Triplebyte's process matches you to the employers who are most likely to hire you, the statements are perfectly consistent. Of those two assumptions the latter seems demonstrably true, while the former is obviously a bit suspect. I don't think Triplebyte is in a good position to change it right now, but maybe some day they will get enough actual performance data to start influencing it.


Given their existing positioning of "the interview process is broken and we're here to fix it", I see no reason to credit them with the opinion that client company interviews are optimized for finding people who will perform well. There is no way to reconcile "the current interview process is broken" with "our goal is to find people who do well under the current interview process".


"The current interview process is broken because strong candidates are excluded due to weak resumes and companies and candidates do a terrible amount of duplicative work just to confirm a candidate is baseline competent."

Assuming that, would you accept the position to be consistent?


> The approach we take is to track a lot, and figure out what works the best over time.

That's really, really bad statistics.


Why? Sure you will get some things that might initially end up looking significant but aren't, but it's not that hard to retest those things from scratch to minimize that.

In general if you're going to provide a critical comment I think it would be better for the community here to expound on it a bit so everyone can understand your argument.


> So, the bit shift question: that came up down one branch of a system design question we used for a while (we've since moved to a more targeted version that is more repeatable). The (sub)issue involved adding a binary flag to a large data blob (this came up as part of a solution to a real-world caching problem). Adding a single-bit flag to the front of a 1GB blob has a problem: to really add just one bit, you'd have to bitshift the entire 1GB. This is clearly not worth it to save 7 bits of storage (ignoring that those bits would not actually be saved in any case). You can just use a byte (or word), or add the flag at the end. When candidates suggested adding a bit flag at the front, we would follow up asking them how they'd do it (to unearth whether they were using 'bit' as shorthand for a reasonable solution, or whether they really are a little weak in binary data manipulation). This was one small part of our interview. By itself it in no way determined the outcome of the interview, or even of the low-level systems section. Plenty of great engineers might get it wrong. But I don't think it was unfair.

Of course it's unfair. The candidate isn't actually programming a solution when they're talking to you. They're on a tight time crunch, under a microscope, in front of an interviewer. The answers to your questions will literally make or break their future with you. Did you specify to them in your original question that the entries in the cache are 1GB large? If you assigned them the task of implementing a solution to your caching question, they would immediately notice that using a 1-bit flag is a poor design decision.

The point is:

> Plenty of great engineers might get it wrong.

That says quite a lot more about Triplebyte's question than the engineers. A wrong answer doesn't mean they're weak in bit manipulation or that they decide to implement poor solutions. It says they're suffering from interview jitters. They're weak in the artificial environment you've constructed for the purposes of the interview, which may or may not correlate with their actual ability.

This may sound like useless theorizing, but unfortunately a massive number of excellent engineers are awful in an interview setting. But if you give them a problem to actually solve, they pass with flying colors.

Triplebyte does give problems to candidates to solve, but it sounds like you also care about whether they can pass your interview (by demonstrating sufficient max skill when prompted) instead of whether they can implement solutions to the problems you assign to them. This rules out candidates who would otherwise do very well, which is the type of candidate you're trying to find.

I know that you're saying the verbal section of the interview isn't the whole process, but are you sure it's an effective one?

It might be positively misleading. A candidate who is very strong in the area you're looking for is also likely to be someone who will get your questions completely wrong, because they're not programming. They're talking. So it sounds like you're selecting for people who can talk well: those who can show strength during your interview when prompted verbally. Is that the right metric to find talented candidates?

If you were to put together a pipeline where e.g. you give candidates an Xcode codebase and say "There are bugs in this codebase, and <specific missing features>. Implement as many fixes or improvements as you wish or have time for, then send us the code," you would have a mechanism which selects for candidates who are ~100% competent, since that's exactly the type of work they'll be doing on a day-to-day basis.

Some candidates wouldn't want to do that, so perhaps there should be an alternative for them. But it'd be vastly more effective than quiz-style questions during a timeboxed interview.

It's possible to come up with endless reasons why it might be a bad idea to set up a pipeline like that. But all the companies that have set it up have been shocked how well it works when they rely solely on that test. Instead of an opportunity to show strength during an interview, the candidate is able to directly answer the question "Can they do the work?"

EDIT: From one of the other comments (https://news.ycombinator.com/item?id=13834231):

> I applied through their project track. It was described as a low-pressure way to write your code ahead of time and talk about it in the interview. The interview was, instead, about making changes to my project while Ammon watched. (Also, there was a request to derive a formal proof while Ammon watched. I didn't get it.) After which I got a rejection saying that my project was great but my interview performance was so poor that they wouldn't move forward.

It sounds like Triplebyte almost has the pipeline described above, but it won't work if you watch the candidate or ask them to do more work. The project alone has to be set up to be a sufficient demonstration of skill.


Well, my objection was much more fundamental. I never suggested packing a bit at the beginning; I suggested using a header of about 20 bytes or so (trivial against a GB). As for the shift-by-1-bit question, my complaint was that I was in the middle of answering one question and then another question was brought up as a non sequitur. The fact that I was asked it made me assume there must be some answer superior to iterating over all the data, which I don't believe there is.

The other reason I dislike this type of interview question is that if the interviewer never proposes a superior alternative at the end, you get no opportunity to challenge them. How do we know my solution didn't solve problems the other party didn't see?

For context, it seems the "preferred" solution was to build a wrapper around existing memcache. However, as a real-world engineer, the problems I was solving for (at a company of over 100 people, 90% of your clients won't use this wrapper, so we want to avoid key collisions with people who aren't using this driver) were not the theoretical ones (how could we make the header only 4 bytes!) that the interviewer was evaluating on.
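
For concreteness, a ~20-byte chunk header for this kind of wrapper might look something like the following (a sketch only; the field layout is illustrative, not something that actually came up in the interview):

    #include <stdint.h>

    /* Hypothetical header prepended to each chunk of a large value that the
     * wrapper splits across multiple memcached keys. 20 bytes of fields;
     * serialize them field-by-field rather than relying on struct layout. */
    struct chunk_header {
        uint32_t magic;         /* identifies values written by this wrapper */
        uint32_t total_chunks;  /* how many chunks the full value was split into */
        uint32_t chunk_index;   /* position of this chunk within the value */
        uint64_t total_length;  /* size of the reassembled value, in bytes */
    };

Against values measured in gigabytes, 20 bytes versus an aggressively packed 4 is noise, and a magic field is one cheap way to notice keys that were written by clients not using the wrapper.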

Plus, on top of all this, I have no idea whether the person interviewing me is unaware that memcached supports atomic increment, knows but doesn't care, or is deeply concerned about preserving this functionality. There are dozens of facets that an individual could consider "important" in this type of problem, and there's no objective basis for most of these concerns without the context of the user (because that's what all engineering comes down to, after all).

My guidance to companies in this type of situation is: A great engineer can do 80 hours of work in 15 hours, but not necessarily do 30 minutes of work in 29.


This is all just really hard (and counterintuitive). We've tried take-home projects, with and without asking the engineer to make changes live. This is actually very controversial (a lot of engineers are, reasonably, against the time commitment). And (we found) to get an equivalent level of consistency (the first step toward accuracy), the project needs to be pretty big. The work really needs to be at the level of what engineers do in a full day or more on the job. Shorter projects than this introduce the noise of how much time the candidate spent on the project. You can impose a hard time limit, but then you're back in interview-stress territory.

Large take-home projects (and trial employment) totally are better ways to evaluate engineers than interviews. Unfortunately, they require major time commitments from the candidate that many engineers are not able to give. Most (about 80%) engineers select a regular interview if given the choice. I do think they are a good option to offer, but they can't (unfortunately) replace interviews in the majority of cases.

We've also tried debug sections (where we give the candidate a program with bugs in it and ask them to fix test cases). This works great as a portion of the interview (but misses some people with other skills, so it can't be the entire interview).


We didn't find this at all. You guys should feel free to reach out sometime; I'm happy to talk about what we did at NCC/Matasano.


We have found that well-crafted small problems are enormously successful predictors of good hires. Sizing them to require an evening, possibly two, of work is what to aim for.


Studying for in-person interviews is a major time commitment. One that's reusable across employers, sure, but a Triplebyte take home project should be just as reusable as studying for a Triplebyte interview.


Have you guys considered that seeking to quantify engineering skill may be fundamentally at odds with the goal of hiring good engineers?

I'd be interested in seeing the argument for why this is a good thing, beyond the fact that it enables a company like triplebyte to exist.

To me it seems like these concerns always boil down to the same thing: tests that try to quantify something that may be unquantifiable. I suspect this is because it's not cost effective to facilitate a process that digs well beyond engineering trivia.


There are two things to measure: ability to get an offer (easy) and effectiveness on the job (hard). Triplebyte has clearly provided an improvement on the former. The latter seems very difficult to measure, but I don't know why anyone would assume quantifying it would be negatively correlated with performance; at worst one might assume it is not correlated.


Have you guys thought about trial employment with payment -- that was an idea suggested a couple months ago. I think the upfront cost would be better than having a false positive.


Yeah. Trial employment had better be paid! :) Even with pay, unfortunately, folks with an existing job often can't take the time off. And when people have the option between one company making an offer and another offering a trial period, they often go for the offer (which makes sense). The other issue is that a company can't do a trial period with every person who applies (too much cost for the team), so there has to be a screening step in front, which sort of just moves the problem to an earlier stage.

I do think trial employment can be a great thing. But it's not a universal replacement.


Trial employment immediately screens out candidates who do not have the time or patience to spend an entire day working very hard with a prospect. It also introduces an NDA, a need for a sandbox, etc.


How would you actually bit shift 1 gigabyte worth of data? Would you start at the end and read in words? Possibly using some temp variable?
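
Roughly: a single pass over the buffer with a one-bit carry, e.g. (a sketch, byte-at-a-time; a word-at-a-time version is the same idea but has to account for endianness):

    #include <stddef.h>
    #include <stdint.h>

    /* Shift an entire buffer left by one bit, in place, treating it as one
     * long big-endian bitstream: the top bit of buf[0] falls off the front
     * and a zero bit is shifted in at the end. */
    void shift_left_one_bit(uint8_t *buf, size_t len)
    {
        if (len == 0) return;
        for (size_t i = 0; i + 1 < len; i++)
            buf[i] = (uint8_t)((buf[i] << 1) | (buf[i + 1] >> 7));
        buf[len - 1] = (uint8_t)(buf[len - 1] << 1);
    }

Shifting the other direction (inserting a bit at the front) is the same single pass run from the end of the buffer; either way you touch every byte of the gigabyte, which is exactly why nobody would do it just to save seven bits.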


so like every other interview ever?



