Hiring the Smartest People in the World

drewcrawford · on May 2, 2012

People tend to hire, not so much the best people they can find, but the people who are most like themselves.

And if you hire people like yourself, and your job ad says "We hire the smartest people," then you must be the smartest people! I've found it to be a huge negative signal. These sorts of interviews seem to mostly be about developers patting themselves on the back about how they stumped the applicants in interviews.

I think we need to have the talk about the "hire the smartest developers" meme in our industry like the web hosting industry talks about five 9s of uptime. The first nine is writing FizzBuzz. The second nine is being able to design and maintain large systems. The third nine is knowing the finer points of Prim's and Dijkstra's and a copy of TAOCP with arbitrary dust coating on the shelf.

Do you really need talent that meets a higher standard than that? Maybe if you are Google or NASA. But if you are not launching things into space, you don't need to hire the world's smartest people. Certainly, there is a spectrum between 'we write CRUD software' vs 'we read journals and do some novel algorithms work', and a company in the second category will want to say something positive about the intelligence of their employees as a signal. But "we hire the smartest people" is absolutely the wrong thing to say.

"Smarter" is not even better once you've hit the threshold of "acceptable number of nines to write this program." I routinely pass over hiring extremely smart people. Some of them have abrasive personalities, or are the wrong cultural fit, or are not interested in the problem domain, etc.

gaius · on May 2, 2012

    Maybe if you are Google

Yes, ad sales is the most pressing issue facing humankind.

Wilya · on May 2, 2012

While I don't specially disagree with you, the difficulty of a problem and its benefit to humanity are two completely orthogonal issues.

The fact that it's ultimately aimed at selling ads doesn't make what Google does (for example) in NLP an easy task.

mattmanser · on May 2, 2012

How many google programmers are actually doing that though?

Compared to something inane and dull like the number trying to get the logins for all the different services actually working with each other properly?

sparknlaunch12 · on May 2, 2012

Team fit is important. However definitely want smarter people around us. That is why they are selected otherwise we would do the job ourselves.

troels · on May 2, 2012

    otherwise we would do the job ourselves

Only if you have infinite time at hand.

noonespecial · on May 2, 2012

"Hiring the smartest people" is always great right up until the "paying slightly above market rate!" part.

If you really are hiring the smartest people, what you are paying them should leave average people with their chins on the ground(1). If this is not the case, then I wish they'd quit bragging, because the smartest people aren't working there.

(1) This and/or have about the coolest place to work on the most interesting problems in the world.

jobmatchbox · on May 2, 2012

There are people who volunteer to work on things like Presidential campaigns and for startups because of who they are working with rather than because of what they are being paid.

cheald · on May 2, 2012

The last problem seems trivial enough, and I'm no mathematician. The trick lies in that you get to pick the starting point.

Imagine you have a circular track, and four segments of the circle, each representing the gas station's distance potential (the left edge shall be the gas station's location, the right edge shall be how far that station's gas will carry you). Place the segments any way you like on the track (though they may not overlap; overlap is double-burning gas. We 'compensate' for that by placing them adjacent, representing the total distance travellable on the gas in those two stations). You have a 5th segment representing the initial tank. All 5 segments form a complete circle.

In order for the trip to be impossible, you would have to arrange the 4 pieces in such a way that it would not be possible to cut up the 5th piece and fit it into the track without space left over. Since we know that the sum of the 5 segments (no matter what their division or arrangement) always equals a full circle, we can conclude that the trip must be possible.

We can then conclude that after placing the first 4 station pieces, any empty space in the track before the station with the most gas is a valid starting point that will enable you to complete the full trip.

In the edge case where your initial tank is empty (that is, your 5th segment is null-sized), you may start at the station with the most gas and complete the trip.

lllbean · on May 2, 2012

Solution to problem 1: summing the numbers and then subtracting the sum from 1 + 2 + 3 + ... + n gives the deleted number x.

Solution to problem 2: let x be the deleted number and let y be the duplicated number. Summing the numbers and subtracting the sum from 1 + 2 + 3 + ... + n gives x - y.

Xor-ing the numbers and xor-ing the result with (1 xor 2 xor 3 xor ... xor n) gives x xor y. Now x xor y (expressed as a binary string) must contain at least one 1. So there is some digit where the binary representation of x differs from the binary representation of y. Let this digit be the ith digit.

Let's say that a number is good if the ith digit of its binary representation is a 1. Let's call numbers that aren't good bad. Observe that exactly one of x, y is good, and the other one is bad.

Let S be the number that we get by doing this:

  S = 0

  for j in 1..n:

    if j is good: S = S + j

    if j is bad: S = S - j

Let T be the number that we get by doing this:

  T = 0

  for j in our_list_of_numbers:

    if j is good: T = T + j

    if j is bad: T = T - j

We find that S - T = x + y or -(x + y). Since x + y is positive, we can find x + y in either case.

We have x + y and x - y, so we can find x and y.

Solution to problem 3: imagine that we have a car, car 1, that has a really big tank with a bunch of gas in it. Put car 1 somewhere on the track and have it drive one complete revolution. The amount of gas G in the tank of car 1 varies over time, but at some point P on the track, G reaches its minimum value. P has to be at a gas station. (Otherwise G would be lower at a point right in front of P.)

If we take our regular car and start it (with an empty tank) at point P, it's pretty easy to see that it'll make it all the way around the loop.

Because if it didn't, then at some point Q it would run out of gas. Q wouldn't be a gas station, otherwise our car would refuel; so there's some point Q' in between Q and the next gas station from Q. Then driving a car from P to Q' results in a net loss of gas. Then when car 1 drove around the track, it should have had less gas at Q' than it had at P. (That's not 100% obvious in the case where car 1 drove past point Q' before it drove past point P, but it follows from the fact that car 1 had the same amount of gas when it finished as when it started.) Contradiction.

Fuck yeah, I'm a badass.

panic · on May 2, 2012

Here's a direct proof for problem 3 that doesn't use any contradiction (perhaps someone will find it clearer).

Say you have N gas stations. For any n between 1 and N, say:

  * f_n is the amount of fuel in the nth station, and
  * d_n is the distance from the nth station to the (n+1)th.

You want to make sure that at each step along your path, you have more fuel than the distance you need to travel. Using our notation, for each step along the path, we want the sum of the f_n so far to be larger than the sum of the d_n. That's the same as saying the sum of (f_n - d_n) should never dip below zero.

So how do we show that we can visit all the gas stations while keeping the sum of (f_n - d_n) non-negative? Well, since we have just enough gas to get around the track, we know that the sum of (f_n - d_n) along the entire path is zero. In other words, if we write out the partial sums:

  S_1 = f_1 - d_1
  S_2 = f_1 - d_1 + f_2 - d_2
  ...
  S_N = f_1 - d_1 + ... + f_N - d_N

then S_N is zero, since the distances and the fuel amounts cancel each other out.

Now look through this list and find the smallest partial sum. Call it S_m. If we change our path to visit gas station 'm' last (and start at station 'm+1'), we get these updated partial sums as we move along our new path (call them U_n):

  U_1 = f_{m+1} - d_{m+1}
  U_2 = f_{m+1} - d_{m+1} + f_{m+2} - d_{m+2}
  ...
  U_{N-m} = f_{m+1} - d_{m+1} + ... + f_N - d_N
  U_{N-m+1} = f_{m+1} - d_{m+1} + ... + f_N - d_N + f_1 - d_1
  ...
  U_{N-1} = f_{m+1} - d_{m+1} + ... + f_N - d_N + f_1 - d_1 + ... + f_{m-1} - d_{m-1}
  U_N = f_{m+1} - d_{m+1} + ... + f_N - d_N + f_1 - d_1 + ... + f_m - d_m

Now note that:

  U_1 = S_{m+1} - S_m
  U_2 = S_{m+2} - S_m
  ...
  U_{N-m} = S_N - S_m
  U_{N-m+1} = S_N - S_m + S_1
  ...
  U_{N-1} = S_N - S_m + S_{m-1}
  U_N = S_N - S_m + S_m

Since S_N is zero, all the U_n are of the form S_i - S_m (for various i). And S_m is the least partial sum -- that means S_i >= S_m for all the S_i in the list. So each U_n is greater than or equal to zero, just like we wanted.

Therefore our last stop should be the gas station with the biggest fuel deficit compared to its distance from the next station (and we should begin our trip at that next station to guarantee we never run out of gas).

pantaloons · on May 2, 2012

1. Not constant space. Not linear time.

2. Not constant space. Not linear time.

3. Looks correct.

cncool · on May 2, 2012

1. How is this not linear time?

drostie · on May 2, 2012

Because as the numbers increase, their storage requirement also increases.

Suppose you have the array [1, 2, 3, ..., n]. How does storage of that array scale? You need log(n) bits to store the largest element and you need about n elements of that size (at the very least the last n/2 will have the last bit set) -- so you are operating on a data source which has size O(n log n). At least, if n is the above number.

Given this, any algorithm which only uses O(n) instructions cannot asymptotically read the whole array, and this presents a fundamental limitation on the problem. Technically, there cannot be a solution better than O(n log n) for this reason -- unless you get lucky and find the duplicated element in the very first O(n) of the array, you will miss it in the last O(n log n) of the array.

The larger problem is that there is a fundamental confusion here. Really you should say that the array is some permutation of [1, 2, 3, ... m] and that n = m log m is the input size, and then ask whether algorithms run in linear time relative to n, not relative to m. This allows you to read in the array. You still have the problem that you have storage which grows like 1 + log m, though.

pantaloons · on May 2, 2012

The larger problem is the cult of "Big O" and its use as a form of signalling in interviews.

If I wanted to test whether a candidate understood Big O, I'd ask them if there was a difference between O(1) and O(0). What these companies want is to see if the candidate understands program performance in general. Big O is the vogue (in my view misguided) way to do that.

alexchamberlain · on May 2, 2012

I disagree. The array is stored outside of your algorithm. The algorithm is constant space.

JoachimSchipper · on May 2, 2012

At least the sum needs ~log(N^2) bits. We typically assume that stuff fits in the native word size and thus takes a single cycle independent of the actual length of the number, but that's obviously not actually true (for numbers >= 2^32 or 2^64, etc.)

IsTom · on May 3, 2012

You still need to read O(nlogn) bits of the array. It can't take less than O(nlogn) time.

alexchamberlain · on May 2, 2012

1. How's it not constant space?

hristov · on May 2, 2012

All of your solutions work. I am curious however, are you new to HN or did you start a new ID just so you can brag about how much of a badass you are?

cncool · on May 2, 2012

For question 3, it clearly states two things:

"the total amount of gas available at the stations and in the car is exactly enough for the car to drive around the road once"

and

"the car completes a full circle without running out of gas"

Given the first condition, in your solution, at the exact moment the car completes the drive around the track, it runs out of gas. Therefore, it is impossible to complete it without running out of gas.

cheald · on May 2, 2012

1) Momentum!

2) The implication of the question is not "can you complete the trip with gas left over", but rather "can you complete the trip". If the requirement is that you must have gas left over, then barring a trick answer, the problem is trivially answerable as "false".

timthorn · on May 2, 2012

I used to work at a company inordinately full of extremely smart people. The issue there was the lack of absolutely competent but not stellar thinkers - the people to do the grunt work. I'm convinced that in most engineering companies you need a good mix, biased towards the competent types.

lhnz · on May 2, 2012

Most companies do not need to hire the 'smartest people in the world'. They also don't need the 'hardest working people in the world' or the 'most creative people in the world'.

Sometimes a company needs the very best people -- and they can try to define what this means to themselves and then filter for it. However, generally they need people that can solve the problems that they face to an above market-rate level of quality, and that are willing work for the least amount of money.

Unless your companies selling point is solving hard mathematical problems, or designing the most beautiful/elegant interfaces you should probably hire for 'good value', and a few people that you believe to be 'multipliers'.

hnhg · on May 2, 2012

You forget the value of people who work well together as a team. Individually they might not be anything special but together they can produce great results.

Personally, I hate the idea of abstracting people out to fungible goods that a company buys in because it's a cop-out and an over-simplification. Putting together a group of people in a team is as much a social challenge as it is ticking technical requirements.

ckuehne · on May 2, 2012

I very much doubt that the OP can solve either problem 1 or problem 2 with constant space. Hint: even storing an additional integer N requires \Omega(log(N)) space.

Edit: For the down voter: I searched around and found a discussion of the puzzle [1]. Please go ahead and read it.

[1] http://books.google.de/books?id=415loiMd_c0C&lpg=PP1&...

hristov · on May 2, 2012

I am not sure how you got downvoted but the constant space requirement seems a little puzzling. When the problem says that you are given a list of n numbers (or, more correctly, n-1) you are already in linear space. You have to store the list somehow.

Maybe they meant "constant additional space". Or constant space for the data you are using in addition to the space required for the inputs.

That however tends to go against the main principles of complexity theory and O-notation. The main principle of O notation is that you mostly care on how things grow as n becomes large. Thus, constant terms should be deleted if you have a linear term. In other words, if you have something that adds constant space to linear space, you get linear space. Thus, if you already have something that requires linear space, and have to add something else to it, it is not that important (for complexity theory purposes) if the second thing is in constant space, or log space or even linear space, as long as it is smaller or equal to linear space.

gnud · on May 2, 2012

It's fairly standard to talk about how much auxiliary space an algorithm needs - in sorting, for example. I guess that's what the author means.

ckuehne · on May 2, 2012

Thinking about it, it comes down to the level of analysis you choose. If you assume constant space per integer, then it is indeed solvable with O(1) additional space. For example, this is usually assumed when analyzing sorting algorithms and I guess this is also what the author had in mind.

pantaloons · on May 2, 2012

If you limit n then than the algorithm is actually O(1) time too -- his answer is wrong either way.

6ren · on May 2, 2012

For the first one, if they are in an order, and in an array, you can do a binary search, O(log2 n) time - better than linear time. Perhaps the question was misstated. Also, the friend's relationship is accidentally disclosed as a son.

zheng · on May 2, 2012

The problem wasn't misstated, when they aren't in order the solution isn't obvious unless you remember a bit about sums. Most interview problems, if restricted, can either be solved very quickly, or aren't interesting anymore. Either way, as it was stated, it's a fairly common interview question as another comment has pointed out.

6ren · on May 2, 2012

The question states they are in an order - ah, by that, they probably mean "not ordered".

zheng · on May 2, 2012

You're probably right, although as of right now it reads "in some order", ergo there is no guarantee that i comes before i+1 for all i. It could have been updated though.

jyothi · on May 2, 2012

yeah i noticed the "son's interview" part in first read itself - it was amusing O(n) solution for who the friend is.

matthieupiguet · on May 2, 2012

I always beware of the "we want the best of the best" companies. Usually they are looking for the "best that don't show it if they are smarter than the boss". That's a shame ...

olliesaunders · on May 2, 2012

Better title: Acing The Interview Doesn’t Mean You’ll Be Hired. No explanation why though. Envy?

arethuza · on May 2, 2012

"The problems were so difficult that he wanted to sit with me and read them together to make sure that I understood them"

That sounds slightly condescending to me, especially as the problems don't sound that complex to describe and he is a mathematician! It doesn't surprise me that someone who takes that attitude reacts badly to someone easily solving a problem that they couldn't.

theallan · on May 2, 2012

I'd suggest more that the interviewer felt threatened rather than envy. From their point of view, they would have someone smarter than them working under them, who might show them up to senior management or displace their current position in the company.

maak · on May 2, 2012

Problem solving is only part of what makes a good employee.

pja · on May 2, 2012

Because being able to solve puzzle problems isn't actually a good predictor of employee quality? (It remains an open question why interviewers use them at all in that case.)

Many interviewers do not (in reality) want to employ people who are obviously better than them, whatever they may say in public?

alexchamberlain · on May 2, 2012

A quick Google will find you the answer to the first question. The second just requires an additional equation, where people are suggesting powers in finite groups. This seems rather complex. Can we not reach the same constraints by only summing the even numbers, for instance?

nabb · on May 2, 2012

An easy way to do the second is to consider the sum and the sum of squares. This gets you a-b and a^2-b^2. Recall that the latter is (a-b)(a+b) and the answer follows immediately.

drostie · on May 2, 2012

That is probably the classiest way I've seen yet to do the second problem. Since I gave Python code above, here it is for your version:

    >>> from random import shuffle
    >>> def test(n, missing, extra):
    ...     t = [i if i != missing else extra for i in range(1, n + 1)]
    ...     shuffle(t)
    ...     return t
    ...
    >>> def solve(list):
    ...     diff, diffsq, n = 0, 0, 1
    ...     for i in list:
    ...         diff += i - n
    ...         diffsq += i*i - n*n
    ...         n += 1
    ...     sum = diffsq / diff
    ...     return {'extra': (sum + diff)/2, 'missing': (sum - diff)/2}
    ...
    >>> solve(test(7633507, missing=3518688, extra=4057456))
    {'missing': 3518688L, 'extra': 4057456L}

drostie · on May 2, 2012

At the very least, you could do it with logs like so:

    >>> from math import exp, log
    >>> ref  = [i for i in range(1, 500000)]
    >>> test = [i if i != 7922 else 59148 for i in ref]
    >>> s = sum(test) - sum(ref)
    >>> r = exp(sum(map(log, test)) - sum(map(log, ref)))
    >>> b = s/(r - 1); b
    7921.999994157095
    >>> (int(round(b)), s + int(round(b)))
    (7922, 59148)

The use of logs is really just a convenience for finding product(test) and product(ref), so that we have both a - b (from the sums) and a/b (from the ratio of the products).

Python of course has bignums, so we can also calculate these products directly, but that gets to be a very large calculation; you don't want to be storing N! for N ~= 2^64 if at all possible.

If we know that these numbers are bounded by 2^n (e.g. they're all longs) then we might be able to do this product modulo some larger base, perhaps even 2^n, with the assurance that when we do the division all of that crap will cancel; I don't know and haven't tried it.

Feanim · on May 2, 2012

1°: s is the sum of the numbers in the array, x = n(n+1)/2-s

2°: s sum, p product, solve the linear system {n(n+1)/2-x+y=s; n!y = px}

hristov · on May 2, 2012

The second solution works in linear time only if one assumes that multiplication is a constant time operation. But is it really? If you look at how a processor does multiplication it becomes clear it is a O(log n) operation where n is one of the numbers being multiplied. Thus, the second solution is probably a O(nlogn) solution.

alexchamberlain · on May 2, 2012

It also requires one to assume n! can be computed. Factorials get huge very quickly.

alexchamberlain · on May 2, 2012

Sorry, can you expand on what you meant by that?

gsa · on May 2, 2012

Sum of first n natural numbers is n(n+1)/2. The sum of the array, represented by s, is n(n+1)/2 - x + y (subtract x - the missing number and add y - the duplicated number) gives:

1: n(n+1)/2 - x + y = s

Product of first n natural numbers is represented as: n! The product of the numbers in array is p.

2: p x / y = n!

(multiply by x - add missing number to product and divide by y - remove duplicated number from product).

emillon · on May 2, 2012

If a is duplicated and b missing, computing the sum and the product of all the numbers can give you a-b and a/b, from which you can deduce a and b.

raverbashing · on May 2, 2012

So, here's the thing, and you can be guilty of this too!

If you're too worried about the answers matching your little pop quiz, sorry, you're doing it wrong

I know G needs the users to know what they ask, and their business stand on scalability, still

There is so much more! The article is just one example where google fails (yes, it doesn't say it's google: it's google) And you solved it in O(n) where they want an O(n log n) solution?! What are they thinking of refusing this?

You can be a specialist on several other technologies and techniques that are not asked in the interview it's not even funny. Discrete math, code optimization, machine learning (yeah, ok, Norvig is there, and they do that a lot, still), graphics rendering, etc, etc

sakri · on May 2, 2012

so was it her friend or her son?

jyothi · on May 2, 2012

it was either Sergei or Alexey. Brenstein btw

Gupie · on May 2, 2012

Aren't the first two problems simple to solve:

1. create an array length n of booleans set to false, set the each index i to true for the integers in the array, the index of one still false is the missing integer 2. same as 1 but with a count

Both are O(N) but perhaps my understanding of what constant space means is flawed?

brohee · on May 2, 2012

Your solution is in O(N) space, since you create a N sized array.

The elegant solution to the first problem requires big integers in a realistic setting, and simply involves summing all the numbers and comparing with what the sum of 1 to n should be (n(n+1)/2 iirc).

Robin_Message · on May 2, 2012

I don't think it requires big integers. If you do it with machine words, the overflow/underflow will exactly cancel out so you will be fine. I think.

If that's not true, then adding each x and subtract i as you process it is sufficient to ensure you only need integers comparable in size to N, not N^2.

cncool · on May 2, 2012

"create an array length n of booleans"

This could have a space complexity of O(n), not O(1).

thaumasiotes · on May 2, 2012

yes, "create an array of length n" requires O(n) space.