If you're interested in results, it's also worth reading the follow-up, which shows that on an equal footing LZSSE8 is a more general choice than LZSSE2, which works well on text and in high-compression scenarios (http://conorstokes.github.io/compression/2016/02/24/compress...).
Posts involving vectorization are always nice, especially if they involve compression. I'd love to see a vectorized version of arithmetic coding; that would definitely be interesting.
Finite State Entropy is the leaner, faster entropy encoder that you probably want to use instead of arithmetic coding. It's also much more likely to be something that could be made parallel--at least for encoding, arithmetic coding traditionally uses division, and there's no SSE integer division.
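To make the division point concrete, here is a minimal sketch of one interval-narrowing step in a textbook range coder. The function name `encode_step` and its parameters are made up for illustration, and renormalization and carry handling are left out, so treat it as a rough picture rather than any particular library's code. The thing to notice is the `rng // total` on the per-symbol path: that integer divide is the operation with no packed SSE equivalent.

```python
# Hypothetical sketch of one encoding step in a range (arithmetic) coder.
# Names are illustrative only; renormalization and carry handling are omitted.

def encode_step(low, rng, cum_freq, freq, total):
    """Narrow the interval [low, low + rng) to the slice owned by one symbol.

    cum_freq: sum of the frequencies of all symbols ordered before this one
    freq:     frequency of this symbol
    total:    sum of all symbol frequencies
    """
    step = rng // total          # integer division on the per-symbol hot path
    low = low + step * cum_freq  # move to the start of the symbol's slice
    rng = step * freq            # shrink the range to the symbol's share
    return low, rng


# Toy model: symbol 'a' has frequency 3, 'b' has frequency 1 (total 4).
low, rng = 0, 1 << 16
low, rng = encode_step(low, rng, cum_freq=0, freq=3, total=4)  # encode 'a'
low, rng = encode_step(low, rng, cum_freq=3, freq=1, total=4)  # encode 'b'
```

If `total` is constrained to a power of two, the divide can become a shift, which is one common workaround, at the cost of a less flexible model.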
This is pretty cool. It competes well with lz4 even if it is amd64 only. I wonder what an ARM version would look like. Here's hoping the author keeps working on this.
LZSSE author here. Evan Nemerson has started a portable version of LZSSE that uses his own SSE emulation layer and has apparently got it working on the Raspberry Pi 2.
This just barely counts, but... The game Destiny uses a custom SIMD VM to script loading constant buffers for its shaders. They mentioned it at GDC this year.
It would be fun to write a SIMD-based VM in Terra P: