Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

How so? Sorting changes the order of the list, to the extent it's adding information (the sort order) it's also destroying the same amount of information (the previous order).

It's easy enough (including by accident) to deliver insertion ordered iterator behaviour for small hash maps that have never had any items removed, but after that things get sticky very quickly.



You were stating that "[...] it definitely is using memory it wouldn't need if it didn't preserve order.".

There is no reason for that. A sorted list doesn't need more memory than an unsorted list. And an insertion-sorted map doesn't need to use more memory than a map that doesn't keep the order.

Independently, some properties we deem valuable are often side-products of the way the data-structure works. For example, an efficient binary search tree will naturally tend to be balanced. In the case of efficient hash maps, it is now common to add the key/value pair at the end of an internal vector. Maintaining the insertion order this way doesn't add any cost.


It actually does have a cost. In fact I'm pretty sure it has two separate types of cost, so let's talk about both of them.

Firstly, and this is orthogonal to my original point, whenever people say "Cost" they actually silently mean "... that I care about" and so we're involuntarily obliged to agree to stop caring about anything else. Because of Hyrum's law even if you don't promise ordering (as Python's "Compact Dict" developer proposes at one point) your users will rely upon it and you're stuck with whatever you implemented. This is a real cost for a language or standard library. Look at the sad state of C++ standard collections.

But my main point is that the requirement for ordering forces you to dispose of otherwise permissible optimisations which could destroy the order. You actually ran head first into this in the article, when repeatedly removing and adding elements blows up by clogging your data structure with tombstones.

Swiss Tables wouldn't clog up with tombstones. Since they don't care about ordering they only need tombstones at all in uncommon special cases (specifically, if the entire rest of the group is full or tombstones), in most cases Swiss Tables get to just set removed (key, value) pairs as empty. But this optimisation isn't open to you.


Sticking tombstones in the backing array doesn't actually cost you anything though. The backing array still needs to be the same size. Like maybe you end up having to re-hash more often?

>Because of Hyrum's law even if you don't promise ordering (as Python's "Compact Dict" developer proposes at one point) your users will rely upon it and you're stuck with whatever you implemented. This is a real cost for a language or standard library. Look at the sad state of C++ standard collections.

Yes but this cost goes away if you commit to it, and the alternative (like what go does) to force the order to change all the time also comes with a complexity cost vs. doing "nothing" and having the order be arbitrary and sometimes, but not actually, reliable.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: