It's worth remembering that when selecting things at random from a given pool, there is always a chance of collision. If you think of a hash as taking a value to a "random" result, with different objects going to results uniformly "at random", then you have to expect collisions. This is why the term "perfect hash" is defined so carefully.
For those who are interested, the probability of collision, the size of the hash destination, and the number of objects chosen are related by this formula:
Choosing from N items, with N large, and wanting a
probability T of having no collisions, the number of
items you can choose at random with replacement is:
k ~ sqrt( -2 N ln(T) )
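A quick way to convince yourself the bound is right is to check it against simulation. This is just a sketch; the function name and the pool size of 10,000 are made up for the example:

```python
import math
import random

def max_draws(n, t):
    """Birthday bound: roughly how many draws (with replacement) from a
    pool of n values keep the no-collision probability at about t."""
    return math.sqrt(-2 * n * math.log(t))

n, t = 10_000, 0.5
k = round(max_draws(n, t))  # ~118 draws for even odds of no collision

# Monte Carlo check: draw k values and see how often they are all distinct.
trials = 2000
no_collision = sum(
    len({random.randrange(n) for _ in range(trials_i * 0 + k)}) == k
    for trials_i in range(trials)
)
print(k, no_collision / trials)  # empirical rate should land near t = 0.5
```

The empirical no-collision rate comes out close to the target T, which is the point: the square-root scaling, not the pool size itself, is what governs when collisions start appearing.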
There's always a chance of a collision, but for a pool large enough relative to the number of objects, the chance may be so much smaller than the chance of hardware failure that it can legitimately be ignored. For something where the entire space must in some way be realized (e.g. a hash table) this won't be the case, obviously.
Most of the time, that isn't true: hardware failures are (barring quality control issues) independent, while hash collisions are not.
For example, if you assume your hash will not have collisions, and design your hash table without any mechanism for coping with one, and it happens to have a clash on "Jones" and "Smith", your address book app will become unreliable for a rather large fraction of your users.
But at the scales we're talking about, it won't happen to have a clash on "Jones" and "Smith". This won't be a hash table - this will be things like a git commit hash, where even generating one hash per nanosecond would take tens of millions of years to reach even odds of a collision, and just storing that many hashes would take yottabytes.
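Plugging git's numbers into the formula above makes the scale concrete. A rough sketch, assuming git's 160-bit SHA-1 hashes and a rate of one hash per nanosecond:

```python
import math

# Pool size for a 160-bit hash (git's SHA-1 commit hashes).
N = 2 ** 160

# Birthday bound: hashes needed for even odds (T = 0.5) of a collision.
k = math.sqrt(-2 * N * math.log(0.5))

# Time to generate that many hashes at one per nanosecond.
seconds = k / 1e9
years = seconds / (365.25 * 24 * 3600)

# Storage just to hold the hashes themselves (20 bytes per SHA-1 hash).
yottabytes = k * 20 / 1e24

print(f"{years:.2e} years, {yottabytes:.1f} yottabytes")
```

Both numbers are far beyond anything physically realizable, which is why treating git hashes as collision-free is a reasonable engineering assumption.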
Source: http://www.solipsys.co.uk/new/TheBirthdayParadox.html?HN2013...