Hacker News new | past | comments | ask | show | jobs | submit login

Another example is hash maps: all the large companies built better maps in cpp (folly, absl), but the number of apps that are performance sensitive and still use std::unordered_map will be astounding forever.

(Why not upstream? ABI compatibility which is apparently a sufficient veto reason for anything in cpp)




No, that's rubbish that became a meme. There is no universally the best design of a hash-map. Each design always comes with its own trade-offs and so is the case with open-addressing vs separate chaining in conflict resolution. Iterator instability and memory-bandwidth intensive rehashing just to name a few.

Design of your hash-map, or any other algorithm or data-structure, is always and only dictated by your workload. Microbenchmark suite from xyz company/person will hardly represent that reality.

So the more generous answer would be it's not because of "ABI" meme but because std::unordered_map is just fine for the most cases out there. And this is what general purpose standard library is ought to be able to support.


Lol this is so incorrect it's almost funny. Google saved something like single digit % of CPU and memory fleet wide by switching to this map. That's not a micro benchmark


Learn to read with understanding and learn to have some respect as well. Never have I said that it cannot make a difference but that it's not universal and that it depends on the workload.


Ok let me try to unpack what I'm saying so that we're on the same page:

1. Until c++ [toolchains] do an abi break, you will find better performance in libraries like boost/folly/absl than in the STL. Titus wrote about this in "abi now or never" ~4 years ago. His thesis was "the language is leaving performance on the table, and by default if you care about performance you should not use the STL." Afaict this is all still true. If you think this is a meme, you're wrong, it's existential for companies like Google.

2. You mention a bunch of things that one might consider when picking a hashmap. "There might be a workload where I need one map vs another." Yes, that's true. However, my original statement was saying: if you care about performance you should not use unordered map. This is sort of like saying "if you care about performance you should not use bubble sort." Of course there are lots of considerations when choosing a sorting algorithm. Does it need to be cache tuned/oblivious? Do you have a lot of runs in your data (eg is it mostly sorted?)? HOWEVER, I will happily tell you to prefer a gently optimized quick sort over bubble sort for ~all workloads, and if you need further performance then go consider those things.

The performance of the absl/folly maps is not a joke. It's the basis for the rust hash map. These are good, general purpose hash maps. That's not a meme. It's also not a meme to say that the cpp working groups are generally unwilling to make ABI breaks for small improvements (indeed, in the past decade they haven't made an ABI break to bundle large improvements)


Microbenchmark suite is very different from fleet wide profiling

The whole point of switching every single use of unordered map to node/flat hash map is because it is always better. It always uses less memory (much less memory per node).

Edit: what did I say that was disrespectful? I called you out for being wrong because you're very wrong




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: