Thanks for sharing :) Any thoughts on what kinds of things you were looking for and didn't find?
I cannot recall anyone saying this kind of thing is a bottleneck for them.
We don't use std::ranges, but searching for a negative value can look like:
https://gcc.godbolt.org/z/8bbb16Eea
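For readers who don't want to open the link: the snippet below is not the code behind it, just a plain scalar sketch of the operation being discussed (find the first negative element), written with the classic iterator algorithms. The name `FindFirstNegative` and the "return `count` when nothing is found" convention are assumptions made for this illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical scalar reference, not taken from any library.
// Returns the index of the first negative element in in[0, count),
// or count if there is none.
template <typename T>
std::size_t FindFirstNegative(const T* in, std::size_t count) {
  // A null pointer is only acceptable for an empty range.
  assert(in != nullptr || count == 0);
  const T* end = in + count;
  const T* it = std::find_if(in, end, [](T v) { return v < T(0); });
  return static_cast<std::size_t>(it - in);
}
```

A SIMD version has to produce the same index as this for every input, which makes a scalar helper like this handy as a test oracle.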
Can you write the second one too? With two ranges? That's where I believe the variadics will be.
FYI:
The codegen is smaller because the loop is not unrolled. That's 2x slower in my measurements.
Also, I don't see any aligning of memory accesses; that would give you roughly another one-third improvement when the data is in L1.
You really should fix that.
We have a different philosophy: not supporting/encouraging needlessly SIMD-hostile software. We assume users properly allocate their data, for example using the allocator we provide. It is easy to deal with 2K aliasing in the allocator, but much harder later. At least in my opinion, this seems like a better path than penalizing all users with unnecessary (re)alignment code.
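To make the trade-off concrete, here is a rough sketch of the bookkeeping that "(re)alignment code" implies: checking whether a pointer is aligned, and counting how many leading elements a loop would have to handle separately before reaching an aligned address. The 64-byte `kAlignment` and both helper names are assumptions for this example, not part of either library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed alignment for illustration (a typical cache-line size).
constexpr std::size_t kAlignment = 64;

// True if p sits on an `alignment`-byte boundary.
inline bool IsAligned(const void* p, std::size_t alignment = kAlignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Number of leading elements before the first aligned address.
// Returns 0 when p is already aligned, i.e. no peeling is needed.
template <typename T>
std::size_t ElementsUntilAligned(const T* p,
                                 std::size_t alignment = kAlignment) {
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
  // Preconditions: p is aligned for T, and alignment is a multiple of
  // sizeof(T), so the byte distance divides evenly into elements.
  assert(addr % sizeof(T) == 0);
  assert(alignment % sizeof(T) == 0);
  const std::uintptr_t misalign = addr % alignment;
  return misalign == 0 ? 0 : (alignment - misalign) / sizeof(T);
}
```

With data that comes from an aligned allocator, `ElementsUntilAligned` is always 0 and the peeling path is dead code, which is the case being argued for here.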
We have not added a FindIf for two ranges because no one has yet requested that or mentioned it is time-critical for their use cases.
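For clarity about what such an operation would compute, a scalar sketch might look like the following. `FindIf2` is a made-up name for this example and does not exist in the library; it assumes both inputs have the same length and returns that length when no pair matches.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical two-range search: returns the first index i for which
// pred(a[i], b[i]) is true, or a.size() if no such index exists.
template <typename T, typename Pred>
std::size_t FindIf2(const std::vector<T>& a, const std::vector<T>& b,
                    Pred pred) {
  // Assumption of this sketch: both ranges have equal length.
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (pred(a[i], b[i])) return i;
  }
  return a.size();
}
```

For example, `FindIf2(a, b, [](int x, int y) { return x > y; })` locates the first position where `a` exceeds `b`.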
Let me write the std::ranges code and ask you to write it with Highway.
https://godbolt.org/z/3s1b8P3sj
PS: this is how it looks in EVE: https://godbolt.org/z/Kzxqqdrez