Interviews absolutely should be grounded in trying to predict how a candidate would do on a job. That's the whole ballgame. The question is how to best do that. First, you need to run a repeatable process (my previous comment). Second, you need to look at the right skills. The approach we take is to track a lot, and figure out what works best over time. What we've found to be most predictive (so far) is a base level of coding competency, plus max skill (how good an engineer is at what they are best at). So (beyond the coding portion) we don't actually care very much about what a candidate is bad at. We care about how good they are at what they are good at. To give as many candidates as possible the opportunity to show strength, we cover a number of areas. This includes back-end web development, distributed systems, debugging a large codebase, algorithms, and -- yes -- low-level systems (concurrency, memory, bits and bytes). We do not expect any one engineer to be strong in all of these areas (I'm weak on some of them). But they are all perfectly valid areas to show strength (and we work with companies that value each of the areas).
We've recently moved to a new interview process organized around this idea of max skill. It's working great in terms of company matching and predictive ability. However, it seems we may have underestimated the cost to candidates of being asked about areas where they are weak. There's more negative feedback here than we've seen in previous HN discussions, and I think that the interview change may be behind that. I'm taking that to heart. I think we can probably articulate it better (that we measure in a bunch of areas and look for max strength). We're also running an experiment now where we ask engineers at the start of the interview which sections they think they'll do best on. I'm excited about this. If engineers can self-identify their strongest areas, we'll be able to make the process shorter and much more pleasant!
So, the bit shift question: that came up down one branch of a system design question that we used for a while (we've since moved to a more targeted version that is more repeatable). The (sub)issue involved adding a binary flag to a large data blob (this came up as part of a solution to a real-world caching problem). Adding a single bit flag to the front of a 1GB blob has a problem: to really add just one bit, you'd have to bitshift the entire 1GB. This is clearly not worth it to save 7 bits of storage (ignoring that those bits would not actually be saved in any case). You can just use a byte (or word), or add the flag at the end. When candidates suggested adding a bit flag at the front, we would follow up asking them how they'd do it (to unearth whether they were using 'bit' as shorthand for a reasonable solution, or whether they really were a little weak in binary data manipulation). This was one small part of our interview. By itself it in no way determined the outcome of the interview, or even of the low-level systems section. Plenty of great engineers might get it wrong. But I don't think it was unfair.
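To make the cost difference concrete, here's a minimal Python sketch (purely illustrative) of the two approaches:

    def prepend_bit(blob: bytes, flag: bool) -> bytes:
        # Packing the flag as the very first bit means every original bit
        # moves over by one, so every byte of the 1GB blob gets rebuilt --
        # and the result still grows by a full byte (7 pad bits at the end).
        as_int = int.from_bytes(blob, "big")
        packed = (int(flag) << (len(blob) * 8 + 7)) | (as_int << 7)
        return packed.to_bytes(len(blob) + 1, "big")

    def prepend_byte(blob: bytes, flag: bool) -> bytes:
        # A whole-byte flag (or one appended at the end) costs the same
        # extra byte of storage but leaves the payload untouched.
        return bytes([int(flag)]) + blob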
> Interviews absolutely should be grounded in trying to predict how a candidate would do on a job. That's the whole ballgame.
This is directly in conflict with this comment from Harj:
> The metric we optimize for is our onsite success rate i.e. how often does a Triplebyte candidate onsite interview result in an offer.
If you want passing your interview to mean that a candidate will pass their onsite interview because your interview is tailored to deliver people who look good in an onsite, you're acting as a recruiter, trying to give companies whatever they say they want. This model provides a lot of value to companies but zero value to candidates, since anyone who passed your interview would have gotten the job anyway.
If you want passing your interview to mean that a candidate should pass their onsite interview because they would perform well in the job, you're acting as a credential, telling companies what they want. This model provides value to companies and to candidates (assuming you can tell who will perform well). These aren't the same model.
Your candidate-directed advertising leans heavily towards the second model ("just show us you can code!"). That makes sense, since that's the model that provides value to candidates. It's disappointing to hear Harj say that what you really believe in is the first model, and disconcerting to see you openly disagree with your cofounder about what your company is trying to do.
It seems a bit disingenuous to accuse them of openly disagreeing - if you assume that the client company's interview is optimized for finding folks who will perform well on the job and Triplebyte's process matches you to the employers who are most likely to hire you, the statements are perfectly consistent. Of those two assumptions the latter seems demonstrably true, while the former is obviously a bit suspect. I don't think Triplebyte is in a good position to change it right now, but maybe some day they will get enough actual performance data to start influencing it.
Given their existing positioning of "the interview process is broken and we're here to fix it", I see no reason to credit them with the opinion that client company interviews are optimized for finding people who will perform well. There is no way to reconcile "the current interview process is broken" with "our goal is to find people who do well under the current interview process".
"The current interview process is broken because strong candidates are excluded due to weak resumes and companies and candidates do a terrible amount of duplicative work just to confirm a candidate is baseline competent."
Assuming that, would you accept the position to be consistent?
Why? Sure, you will get some things that initially look significant but aren't; it's not that hard, though, to retest those things from scratch to minimize that.
In general if you're going to provide a critical comment I think it would be better for the community here to expound on it a bit so everyone can understand your argument.
> So, the bit shift question: that came up down one branch of a system design question that we used for a while (we've since moved to a more targeted version that is more repeatable). The (sub)issue involved adding a binary flag to a large data blob (this came up as part of a solution to a real-world caching problem). Adding a single bit flag to the front of a 1GB blob has a problem: to really add just one bit, you'd have to bitshift the entire 1GB. This is clearly not worth it to save 7 bits of storage (ignoring that those bits would not actually be saved in any case). You can just use a byte (or word), or add the flag at the end. When candidates suggested adding a bit flag at the front, we would follow up asking them how they'd do it (to unearth whether they were using 'bit' as shorthand for a reasonable solution, or whether they really were a little weak in binary data manipulation). This was one small part of our interview. By itself it in no way determined the outcome of the interview, or even of the low-level systems section. Plenty of great engineers might get it wrong. But I don't think it was unfair.
Of course it's unfair. The candidate isn't actually programming a solution when they're talking to you. They're on a tight time crunch, under a microscope, in front of an interviewer. The answers to your questions will literally make or break their future with you. Did you specify in your original question that the entries in the cache are 1GB in size? If you assigned them the task of implementing a solution to your caching question, they would immediately notice that using a 1-bit flag is a poor design decision.
The point is:
> Plenty of great engineers might get it wrong.
That says quite a lot more about Triplebyte's question than about the engineers. A wrong answer doesn't mean they're weak in bit manipulation or that they decide to implement poor solutions. It says they're suffering from interview jitters. They're weak in the artificial environment you've constructed for the purposes of the interview, which may or may not correlate with their actual ability.
This may sound like useless theorizing, but unfortunately a massive number of excellent engineers are awful in an interview setting. Give them a problem to actually solve, though, and they pass with flying colors.
Triplebyte does give problems to candidates to solve, but it sounds like you also care about whether they can pass your interview (by demonstrating sufficient max skill when prompted) instead of whether they can implement solutions to the problems you assign to them. This rules out candidates who would otherwise do very well, which is the type of candidate you're trying to find.
I know that you're saying the verbal section of the interview isn't the whole process, but are you sure it's an effective one?
It might be positively misleading. A candidate who is very strong in the area you're looking for is also likely to be someone who will get your questions completely wrong, because they're not programming. They're talking. So it sounds like you're selecting for people who can talk well: those who can show strength during your interview when prompted verbally. Is that the right metric to find talented candidates?
If you were to put together a pipeline where e.g. you give candidates an Xcode codebase and say "There are bugs in this codebase, and <specific missing features>. Implement as many fixes or improvements as you wish or have time for, then send us the code," you would have a mechanism which selects for candidates who are ~100% competent, since that's exactly the type of work they'll be doing on a day-to-day basis.
Some candidates wouldn't want to do that, so perhaps there should be an alternative for them. But it'd be vastly more effective than quiz-style questions during a timeboxed interview.
It's possible to come up with endless reasons why it might be a bad idea to set up a pipeline like that. But all the companies that have set it up have been shocked how well it works when they rely solely on that test. Instead of an opportunity to show strength during an interview, the candidate is able to directly answer the question "Can they do the work?"
> I applied through their project track. It was described as a low-pressure way to write your code ahead of time and talk about it in the interview. The interview was, instead, about making changes to my project while Ammon watched. (Also, there was a request to derive a formal proof while Ammon watched. I didn't get it.) After which I got a rejection saying that my project was great but my interview performance was so poor that they wouldn't move forward.
It sounds like Triplebyte almost has the pipeline described above, but it won't work if you watch the candidate or ask them to do more work. The project alone has to be set up to be a sufficient demonstration of skill.
Well, my objection was much more fundamental. I never suggested packing a bit at the beginning; I suggested using a header of about 20 bytes or so (trivial against a GB). As for the shifting-one-bit question, my complaint with it was that I was in the middle of answering one question and then another question was brought up as a non sequitur. The fact that I was asked it made me assume there must be some answer superior to iterating over all the data, which I don't believe there is.
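To be concrete about the header idea (the exact fields below are made up for illustration, not the ones I described in the interview, but this is the general shape -- 20 bytes is noise next to a 1GB value):

    import struct

    # ~20-byte fixed header prepended to the blob: magic, version, flags,
    # reserved, payload length, checksum. Readable without touching the
    # payload at all.
    HEADER_FMT = ">4sBBHQI"
    HEADER_SIZE = struct.calcsize(HEADER_FMT)   # 20 bytes

    def pack_entry(payload: bytes, flags: int, checksum: int) -> bytes:
        header = struct.pack(HEADER_FMT, b"CBLB", 1, flags, 0, len(payload), checksum)
        return header + payload

    def unpack_header(entry: bytes):
        return struct.unpack(HEADER_FMT, entry[:HEADER_SIZE])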
The other reason I dislike this type of interview question is that if the interviewer never proposes a superior alternative at the end, you get no opportunity to challenge them. How do we know my solution didn't solve problems the other party didn't see?
For context, it seems the "preferred" solution was to build a wrapper around existing memcache. However, as a real-world engineer, the concerns I was solving for (at a company over 100 people, 90% of your clients won't use this wrapper, so we want to avoid key collisions with people who aren't using this driver) were not the theoretical ones (how could we make the header only 4 bytes!) that the interviewer was evaluating on.
Plus on top of all this, I have no idea if the person interviewing me is unaware that memcached supports atomic increment, knows but doesn't care, or is deeply concerned about preserving this functionality. There are dozens of facets that an individual could consider "important" in this type of problem, and there's no objective basis for most of these concerns without a context of the user (because that's what all engineering comes down to after all).
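For reference, the atomic increment I mean is memcached's server-side incr. A minimal sketch, assuming pymemcache as the client (python-memcached exposes the same operation):

    from pymemcache.client.base import Client

    client = Client(("localhost", 11211))
    client.set("counter", "0")
    # incr happens on the server, so there's no read-modify-write race
    # between clients.
    client.incr("counter", 1)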
My guidance to companies in this type of situation is: A great engineer can do 80 hours of work in 15 hours, but not necessarily do 30 minutes of work in 29.
This is all just really hard (and counterintuitive). We've tried take-home projects, with and without asking the engineer to make changes live. This is actually very controversial (a lot of engineers are, reasonably, against the time commitment). And (we found) to get an equivalent level of consistency (the first step toward accuracy) the project needs to be pretty big. The work really needs to be of the level of what engineers do in a full day or more on the job. Shorter projects than this introduce the noise of how much time the candidate spent on the project. You can introduce a hard time limit, but then you're back in interview stress territory.
Large take-home projects (and trial employment) totally are better ways to evaluate engineers than interviews. Unfortunately, they require major time commitments from the candidate that many engineers are not able to give. Most engineers (about 80%) select a regular interview if given the choice. I do think they are a good option to offer, but they can't (unfortunately) replace interviews in the majority of cases.
We've also tried debug sections (where we give the candidate a program with bugs in it and ask them to fix test cases). This works great as a portion of the interview (but misses some people with other skills, so it can't be the entire interview).
We have found that well-crafted small problems are enormously successful predictors of good hires. Sizing them to require an evening, possibly two, of work is what to aim for.
Studying for in-person interviews is a major time commitment. One that's reusable across employers, sure, but a Triplebyte take home project should be just as reusable as studying for a Triplebyte interview.
Have you guys considered that seeking to quantify engineering skill may be fundamentally at odds with the goal of hiring good engineers?
I'd be interested in seeing the argument for why this is a good thing, beyond the fact that it enables a company like Triplebyte to exist.
To me it seems like these concerns always boil down to the same thing: tests that try to quantify something that may be unquantifiable. I suspect this is because it's not cost effective to facilitate a process that digs well beyond engineering trivia.
There are two things to measure - ability to get an offer (easy) and effectiveness on the job (hard). Triplebyte has clearly provided an improvement on the former. The latter seems very difficult to measure, but I don't know why anyone would assume quantifying would be negatively correlated with performance - at worst one might assume it is not correlated.
Have you guys thought about trial employment with payment -- that was an idea suggested a couple months ago. I think the upfront cost would be better than having a false positive.
Yeah. Trial employment better be paid! :) Even with pay, unfortunately, folks with an existing job often can't take the time off. And when people have the option between one company making an offer and another with a trial period, they often go for the offer (which makes sense). The other issue is that a company can't do a trial period with every person who applies (too much cost for the team), so there has to be a screening step in front, which sort of just moves the problem to an earlier level.
I do think trial employment can be a great thing. But it's not a universal replacement.
Trial employment immediately screens out candidates who do not have the time or patience to spend an entire day working very hard with a prospect. It also introduces an NDA, a need for a sandbox, etc.