In blitsort.h it'll run about 25% faster. It's probably still slower than a native C++ implementation, since it'll evaluate to (a > b) > 0 rather than (a > b); I'm not sure whether the compiler will optimize that away.
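For concreteness, here's a minimal sketch of the pattern being discussed (hypothetical names, not blitsort's actual macros):

```c
#include <stdio.h>

/* qsort-style comparator: returns <0, 0, or >0. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    int x = 3, y = 5;

    /* Generic call sites test the sign of the result:
           if (cmp(&x, &y) > 0) ...
       Hardcoding the comparison turns that into
           if ((x > y) > 0) ...
       which is identical to (x > y), because a relational
       expression in C already evaluates to 0 or 1 -- so an
       optimizing compiler should fold the extra "> 0" away. */
    printf("%d %d\n", cmp_int(&x, &y) > 0, (x > y) > 0); /* prints: 0 0 */
    return 0;
}
```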
As a layman I'd expect advanced sorting algorithms to depend on the dimensions of the underlying memory architecture, e.g. the size of CPU caches, memory latency, bandwidth, and so on.
Copyright (C) 2014-2021 Igor van den Hoven ivdhoven@gmail.com
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
These are all pretty slow on 32-bit integers, which you're using for testing. Could you please include a comparison with pdqsort, or at least mention that pdqsort is to be preferred when an element fits in a register (as it certainly will be if you're only beating timsort by 20% or so)? It's frustrating to see sorting algorithms with some effort spent on benchmarking and still have to guess at whether it's practical based on how much slower the comparison points are than the state of the art.
Radix sorts (like your own wolfsort!) will also be much faster on the benchmarks presented, although that's sort of misleading since they are very good at uniformly random input and not so good at other distributions.
You should explain these issues in the repository, not in Hacker News comments!
A benchmark of 100,000 32-bit integers suggests that you'll be comparing against algorithms that do well on that kind of input, when that's not the case at all. The concept of stability doesn't apply when elements that compare the same are indistinguishable, which is the case for integers. And I think it's an exaggeration to say pdqsort has "killer inputs". In your benchmarks the patterned data always sorts faster than random, so the most you can say is that merge sort exploits some kinds of structure better (I have seen patterns where it does worse, but not by much). I expect timsort has some similar advantages over blitsort because of its ability to find natural runs while blitsort always starts at size 32; try an up-down pattern with runs of length 50, for instance. The absolute worst case for pdqsort is a fallback to heapsort. With pivot randomization it's pretty much impossible to hit on large arrays unless they are engineered to make pdqsort fail, and then you end up sorting the array three times slower or something. Not much of a DOS attack.
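For anyone who wants to try that, a minimal pattern generator might look like this (a sketch; the run length and array size are arbitrary):

```c
#include <stdio.h>

/* Fill buf[0..n) with an up-down pattern: ascending runs of length
   `run` alternating with descending runs of the same length. A
   run-detecting sort (timsort) can exploit runs of length 50; a
   sort that always starts from fixed 32-element blocks cannot. */
static void updown_pattern(int *buf, int n, int run)
{
    for (int i = 0; i < n; i++) {
        int pos  = i % run;               /* position within this run */
        int base = (i / run) * run;       /* first value of this run  */
        buf[i] = ((i / run) % 2 == 0)
               ? base + pos               /* ascending run  */
               : base + run - 1 - pos;    /* descending run */
    }
}

int main(void)
{
    int buf[200];
    updown_pattern(buf, 200, 50);
    for (int i = 45; i < 55; i++)         /* show a run boundary */
        printf("%d ", buf[i]);
    printf("\n");                         /* 45 46 47 48 49 99 98 97 96 95 */
    return 0;
}
```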
Having a fast merge sort is definitely useful in a lot of situations (my thoughts at [1]), and I've been interested in quadsort particularly for that purpose (blitsort seems to emphasize lower external memory usage which I've never had a use for). Half as fast as pdqsort on random data is actually somewhat better than I expected. It feels dishonest to see "Blitsort has exceptional performance" followed by selective benchmarks showing worse algorithms, and it makes it much harder to figure out what blitsort is good for.
> The concept of stability doesn't apply when elements that compare the same are indistinguishable, which is the case for integers.
If you’re considering the raw comparison value, then sure, stability isn’t applicable. But it's really rare that people are just sorting normal lists of numbers.
Stable sorts come into the picture when the comparisons refer to records, i.e. rows in a DB.
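A quick illustration of why that matters (a toy insertion sort stands in for any stable sort; the record type is hypothetical):

```c
#include <stdio.h>

/* Hypothetical "DB row": a sort key plus a payload. */
struct row { int dept; char name[8]; };

/* Insertion sort is stable: equal keys keep their input order,
   because the inner loop shifts only on a strict comparison. */
static void stable_sort_by_dept(struct row *a, int n)
{
    for (int i = 1; i < n; i++) {
        struct row tmp = a[i];
        int j = i;
        while (j > 0 && a[j - 1].dept > tmp.dept) {  /* strict >: stable */
            a[j] = a[j - 1];
            j--;
        }
        a[j] = tmp;
    }
}

int main(void)
{
    /* Input is already ordered by name; now sort by dept. */
    struct row rows[] = {
        {2, "alice"}, {1, "bob"}, {2, "carol"}, {1, "dave"},
    };
    stable_sort_by_dept(rows, 4);
    /* Stable result: bob, dave, alice, carol --
       within each dept the names remain alphabetical. An unstable
       sort (e.g. C's qsort, which makes no such guarantee) may not
       preserve that. */
    for (int i = 0; i < 4; i++)
        printf("%d %s\n", rows[i].dept, rows[i].name);
    return 0;
}
```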
> Agreed on pdqsort being 3x slower worst case on "killer" input, but it's my understanding that's why it's not being used to replace std::sort.
This is just not true. Introsort (the current std::sort in all C++ compilers I know of) also has 'killer' inputs, and will switch to heapsort twice as slow as pdqsort does in those. Worse, the libc++ implementation of std::sort has a quadratic killer sequence that I found 7 years ago, and it is still not fixed: https://bugs.llvm.org/show_bug.cgi?id=20837. In my opinion there is no good reason that pdqsort is not adopted yet as std::sort, as it is faster for many patterns, faster for random input, and rarely if ever slower.
Note that any deterministic quicksort variant that does not spend a lot of time on choosing its pivots will always have some 'killer' pattern that forces it to use its fallback algorithm (or worse, go O(n^2)), as you can always shuffle the input array such that poor pivots are chosen. What pdqsort does is shuffle things around so that those 'killer' patterns are a lot less trivial, and are not as likely to occur in real-world input. For me determinism was very important so I didn't do it in https://github.com/orlp/pdqsort, but note that the variant implemented in Rust uses non-deterministic swapping and thus it is impossible to force a 'killer' input against it.
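A minimal sketch of the idea (illustrative only, not pdqsort's actual code): make the pivot choice depend on entropy the attacker can't predict, and no precomputed input can reliably force bad pivots.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Tiny xorshift64 PRNG; in practice the seed would come from some
   per-run entropy source (ASLR, time, etc.). */
static uint64_t rng_state = 0x9e3779b97f4a7c15ULL;

static uint64_t xorshift64(void)
{
    uint64_t x = rng_state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return rng_state = x;
}

/* Deterministic rule (middle element here): an adversary who knows
   the rule can permute the array so this pivot is always near-minimal,
   forcing the fallback (or O(n^2) without one). */
static size_t pivot_deterministic(size_t lo, size_t hi)
{
    return lo + (hi - lo) / 2;
}

/* Randomized rule: no fixed input can reliably produce bad pivots
   across runs, so there is no single precomputable 'killer'. */
static size_t pivot_randomized(size_t lo, size_t hi)
{
    return lo + (size_t)(xorshift64() % (hi - lo + 1));
}

int main(void)
{
    printf("%zu %zu\n", pivot_deterministic(0, 99), pivot_randomized(0, 99));
    return 0;
}
```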
I guess that makes sense, but I never would have guessed that the category is in-place stable algorithms, as it's only mentioned in the "about" line and not the text itself.
Sorting on an integer key isn't the same as sorting a list of integers. They have different performance considerations.
I updated the README slightly, it's mentioned twice now. :)
As for performance considerations, blitsort's performance on integers is indicative of its performance on strings, while this isn't the case for pdqsort.
I haven't run a specific benchmark, but I assume that on string data blitsort will beat pdqsort on every metric except random with many equal items.
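For what it's worth, a comparison sort takes strings through the same qsort-style interface either way; the sketch below uses the standard library's qsort (blitsort appears to take the same (array, nmemb, size, cmp) arguments, but treat that as my reading of the repo):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for an array of char*: each comparison walks the
   strings, so comparison cost dominates -- which is why relative
   performance with expensive comparisons matters more than raw
   integer benchmarks. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

int main(void)
{
    const char *words[] = { "pear", "apple", "fig", "banana" };
    size_t n = sizeof words / sizeof words[0];

    /* Standard qsort shown here; swapping in a qsort-compatible
       sort should be mechanical. */
    qsort(words, n, sizeof words[0], cmp_str);

    for (size_t i = 0; i < n; i++)
        printf("%s\n", words[i]);
    return 0;
}
```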
What do you mean by "suffers from killer inputs"? It is always O(n log n) by falling back to heapsort, so I don't think any input can be described as a "killer".
Radix sort wouldn’t allow for the “possible to sort any other data type” part.
I do wonder about not enabling inlining of structs. I don't know where the crossover happens, and it certainly varies with hardware characteristics, but I'm sure a data structure made up of, say, an int and two longs would be better sorted in contiguous memory than as an array of pointers to random places in the heap.
In the source code, it is not apparent where the size restriction comes from. It seems to me that it can be easily modified to work with any data types. Maybe I am wrong. I am more concerned with this sentence:
> Blitsort's performance is similar to that of quadsort as long as the auxiliary memory is greater or equal to the square root of the array being sorted, which comes out at 262,144 elements with the default stack of 512 elements. Performance on larger arrays degrades marginally.
I wonder what "marginally" means here. What if we sort 10 million integers?
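To put numbers on the square-root claim (plain arithmetic, not a measurement): 512² = 262,144, which is where the quoted figure comes from. Sorting 10 million elements would need √10,000,000 ≈ 3,163 elements of auxiliary memory to stay in the "similar to quadsort" regime, so the default 512-element stack is roughly 6× short there, and the question is how gracefully performance falls off.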
I am thinking more about sorting, say, 20-byte or 24-byte custom structs. Directly sorting arrays is probably faster than sorting pointers to arrays. At least in my applications, I rarely just sort numbers; I more often sort numbers along with other data associated with the numbers.
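As a sketch of that difference (a hypothetical 24-byte record; the standard qsort stands in for any comparison sort):

```c
#include <stdlib.h>

/* Hypothetical record: an int key plus two longs of payload --
   24 bytes on a typical LP64 system, counting padding. */
struct rec { int key; long a, b; };

static int cmp_rec(const void *x, const void *y)
{
    const struct rec *p = x, *q = y;
    return (p->key > q->key) - (p->key < q->key);
}

/* Direct: sort the records in place. Moves 24 bytes per swap, but
   every access reads contiguous, prefetch-friendly memory. */
void sort_direct(struct rec *v, size_t n)
{
    qsort(v, n, sizeof *v, cmp_rec);
}

static int cmp_rec_ptr(const void *x, const void *y)
{
    const struct rec *p = *(const struct rec *const *)x;
    const struct rec *q = *(const struct rec *const *)y;
    return (p->key > q->key) - (p->key < q->key);
}

/* Indirect: sort pointers. Moves only 8 bytes per swap, but every
   comparison chases a pointer to a potentially random heap address,
   which is a likely cache miss once the data outgrows the cache. */
void sort_indirect(struct rec **v, size_t n)
{
    qsort(v, n, sizeof *v, cmp_rec_ptr);
}

int main(void)
{
    struct rec direct[3] = { {3, 0, 0}, {1, 0, 0}, {2, 0, 0} };
    sort_direct(direct, 3);

    struct rec *indirect[3] = { &direct[2], &direct[0], &direct[1] };
    sort_indirect(indirect, 3);
    return 0;
}
```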
Functions look clean and fast. Also a great README (documentation, evaluation, ...).
I did not see a license file though. Is the repository intended as public domain?
UPDATE: thanks, apparently I'm trained to skip file head sections.