> but in practice every job where I've been tasked with making something fast, only linear time complexity was good enough.
But of course we remember from CS school that some things simply cannot be done in linear time. I’m not sure what happens when your employer tells you to sort a big list and insists that only linear time complexity is good enough.
I find the answer to that is usually: find ways to avoid needing to sort the list! That's viable for more problems than you might think.
There are also vastly faster ways of sorting lists of bounded numbers (or tuples), the most linear being: cut straws of exactly the right length, each labelled with an id, bang them on a table so the unlabelled ends all line up, then read the ids back in order.
I only said it took linear time, I didn’t say it was fast...
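The same trick works in software as counting sort. A minimal sketch, assuming the keys are non-negative integers bounded by some known `k` (the straw lengths), each tagged with an id:

```python
def counting_sort(pairs, k):
    """Sort (key, id) pairs in O(n + k) time, assuming 0 <= key < k."""
    buckets = [[] for _ in range(k)]    # one slot per possible straw length
    for key, ident in pairs:
        buckets[key].append(ident)      # cut a straw of length `key`
    # Reading the buckets back in order is the bang-on-the-table step.
    return [(key, ident) for key in range(k) for ident in buckets[key]]

print(counting_sort([(3, "a"), (1, "b"), (3, "c"), (0, "d")], k=4))
# [(0, 'd'), (1, 'b'), (3, 'a'), (3, 'c')]
```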
At one job we used to joke, "I don't care if it's right, as long as it's fast."
Of course, it was only half a joke. For some pages we really did want the data quickly, so we were willing to accept stale or somewhat inaccurate data if that helped. For other pages/sections, accuracy was more important, so caching wasn't possible or invalidation was critical.
But product design is always about tradeoffs between what features are desirable, what features are possible, what features are easy to build, and how features interact. Features that are possible but not easy to build can sometimes be replaced with features that are easy to build and close enough.
Radix sort is still O(n log n). More precisely, it's O(w * n) where w is the size in bits of the keys being sorted, but w scales with log n: n distinct items require at least log n bits each, and hence you arrive at O(n log n).
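To make those w passes concrete, here's a minimal LSD radix sort sketch over fixed-width non-negative integers: one stable O(n) bucketing pass per byte, repeated w/8 times, giving O(w * n) overall:

```python
def radix_sort(nums, width_bytes=8):
    """LSD radix sort: one stable bucketing pass per byte, O(w * n) overall."""
    for shift in range(0, width_bytes * 8, 8):
        buckets = [[] for _ in range(256)]
        for x in nums:
            buckets[(x >> shift) & 0xFF].append(x)  # O(n) per pass
        nums = [x for b in buckets for x in b]      # stable concatenation
    return nums
```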
Sure, but it has the classic problems of the memory/storage required for the sort (at minimum, double all your keys), and a classic math issue with its speed being O(k*N): since k is the number of bits/digits, N can only be as large as those k bits/digits allow for.
This means quicksort and the like end up faster in practice for most "sort this list" operations. Radix/bucket sorting can be SUPER awesome when all you need is a single level of bucketing rather than exact order.
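As a sketch of that single-level bucketing (here grouping 32-bit keys coarsely by their top byte, in one O(n) pass; the layout is just illustrative):

```python
def bucket_by_top_byte(nums):
    """One O(n) pass: group 32-bit keys coarsely by their top byte."""
    buckets = [[] for _ in range(256)]
    for x in nums:
        buckets[x >> 24].append(x)
    return buckets  # ordered across buckets, unordered within each
```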
> Sure, but it has the classic problems of the memory/storage required for the sort
No problem, just create a file of the necessary size on disk if you don't have enough RAM available.
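A minimal sketch of that with Python's stdlib `mmap` (the filename and size are placeholders):

```python
import mmap

SIZE = 1 << 30  # 1 GiB scratch buffer; size it to what the sort needs

with open("scratch.bin", "wb+") as f:
    f.truncate(SIZE)                   # create a file of the necessary size
    buf = mmap.mmap(f.fileno(), SIZE)  # the OS pages it in and out for you
    buf[:4] = b"\x00\x01\x02\x03"      # use it like an in-memory bytearray
    buf.close()
```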
Today's comment was brought to you by a large number and the letters big-O and c... where c, a large number, is the hidden constant of proportionality waiting to bite you every time you mention big-O. :-)
If we're reading from disk, then we DEFINITELY don't want radix: it's almost guaranteed that the time will be dominated by loading pages rather than by the CPU cache locality that linear-time algorithms leverage, which is what this entire discussion is about.
If we're talking about linear in the number of characters, an easy answer is building a trie and traversing it left to right. (Cache-friendly variation: https://en.wikipedia.org/wiki/Burstsort)
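A minimal sketch of that trie sort (plain dicts rather than burstsort's buckets; the traversal visits each node once, and ordering a node's children is constant work for a fixed alphabet):

```python
def trie_sort(words):
    """Sort strings by inserting them into a trie and reading it back in order."""
    END = ""  # end-of-word marker; "" sorts before every real character
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node[END] = node.get(END, 0) + 1  # count duplicates

    out = []
    def walk(node, prefix):
        for ch in sorted(node):           # constant-size alphabet per node
            if ch == END:
                out.extend([prefix] * node[ch])
            else:
                walk(node[ch], prefix + ch)
    walk(root, "")
    return out

print(trie_sort(["banana", "band", "ban", "ban"]))
# ['ban', 'ban', 'banana', 'band']
```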
Anything you didn't learn in Christian Science courses at university, you can find and learn more at your local Christian Science Reading Room! Most cities and towns in the USA have one, and even many airports for people on the go. If you're flying through SFO, you can visit the one in terminal 2!