I think it's unfortunate that many people look to size alone when considering data compression. Decompression speed is also an important factor. For example, xz usually compresses about 8% better than Zstandard, but at the same time, Zstandard can usually decompress about 10 times faster. Yet without fail, people try to rationalize the slightly smaller output and hand-wave away any other factors.
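For anyone who wants to check this tradeoff on their own data, here's a minimal sketch. The file name is a placeholder, and the third-party zstandard bindings (pip install zstandard) are an assumption; lzma (xz) ships with the standard library.

```python
# Rough sketch: measure compression ratio vs. decompression time.
import lzma
import time
import zstandard

def bench(name, compress, decompress, data, runs=5):
    blob = compress(data)
    start = time.perf_counter()
    for _ in range(runs):
        decompress(blob)
    elapsed = (time.perf_counter() - start) / runs
    print(f"{name}: ratio={len(data) / len(blob):.2f}, "
          f"decompress={elapsed * 1000:.1f} ms")

# Any reasonably large text file will do; this path is just an example.
data = open("enwik_sample.txt", "rb").read()

bench("xz  ", lzma.compress, lzma.decompress, data)
cctx = zstandard.ZstdCompressor(level=19)
dctx = zstandard.ZstdDecompressor()
bench("zstd", cctx.compress, dctx.decompress, data)
```

The exact numbers depend heavily on the data and compression levels chosen, but the general shape (xz slightly smaller, zstd much faster to decode) tends to hold.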
First, the competition does place a time limit (in fact several resource limits). I agree that those may be too lax to judge consumer-oriented compression algorithms, but that is not what this competition is about.
The goal is maximizing compression ratio, with the aim of advancing AI: an AI system can be seen as a lossy compressor, where some worldly input is compressed into something that we hope mimics "understanding". Improvements in compression may translate to AI advances [0], and potentially vice versa.
To draw a parallel away from computing:
One could say we are compressors that experience the world and compress it to lose what we don't care about. Improving our compression is basically akin to "knowing more" or "remembering more of the right things". See Schmidhuber [1].
> One could say we are compressors that experience the world and compress it to lose what we don't care about.
That's how lossy compression works, but this competition is about lossless compression. It makes me wonder whether the competition should really be about lossy compression, since it's pretty obvious our minds aren't lossless compressors.
This argument has been made over and over, and the FAQ offers counterarguments [1]. Also, if you can build a good lossy compressor, it's relatively easy to convert it into a lossless one (residual coding), but the inverse does not hold in general.
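To make the residual-coding point concrete, here's a toy sketch. Coarse quantization stands in for the lossy model and zlib stands in for the entropy coder; neither choice matters for the argument.

```python
# Toy residual coding: wrap a lossy codec (coarse quantization of byte
# samples) into a lossless one by also storing the residual.
import zlib

def lossy_encode(samples, step=16):
    return bytes(s // step for s in samples)              # drop low bits

def lossy_decode(code, step=16):
    return bytes(min(255, c * step + step // 2) for c in code)

def lossless_encode(samples):
    code = lossy_encode(samples)
    approx = lossy_decode(code)
    residual = bytes((s - a) % 256 for s, a in zip(samples, approx))
    return zlib.compress(code), zlib.compress(residual)

def lossless_decode(code_z, residual_z):
    approx = lossy_decode(zlib.decompress(code_z))
    residual = zlib.decompress(residual_z)
    return bytes((a + r) % 256 for a, r in zip(approx, residual))

data = bytes(range(256)) * 4
assert lossless_decode(*lossless_encode(data)) == data   # exact round trip
```

The better the lossy model predicts the data, the smaller and more regular the residual, and the cheaper it is to store, which is why the conversion is easy in this direction but not the other.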
I agree with this in many cases, but I read some of the Hutter Prize documentation and it seems like the people running the prize have very specific reasons for neglecting compressor and decompressor runtimes for this contest (although I'm not sure I could do justice to them).
You make a good point. I've often wondered why there aren't compression container formats that can do magic like handling media file compression differently (using lossless formats), but the time to both compress and decompress would likely be prohibitive (and, personally speaking, I'd just leave it in the smaller lossless format, assuming my players will play it).
From a business perspective, decompression complexity wouldn't be the biggest issue as long as it doesn't cause a noticeable slowdown on the client (the side doing the decompressing). At a SaaS, for example, if you can save 30% on outbound transfer, it might be worth an extra 10 ms of time-to-paint.
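A quick back-of-envelope illustration; all the numbers here (payload size, link speed, decode penalty) are made up for the example, not measured.

```python
# Back-of-envelope: transfer time saved vs. extra decode time on the client.
payload_bytes = 1_000_000        # 1 MB response (assumed)
link_mbit_per_s = 20             # client bandwidth (assumed)
transfer_savings = 0.30          # 30% smaller payload
extra_decode_ms = 10             # slower decompressor on the client (assumed)

transfer_ms = payload_bytes * 8 / (link_mbit_per_s * 1_000_000) * 1000
saved_ms = transfer_ms * transfer_savings
print(f"transfer saved: {saved_ms:.0f} ms vs extra decode: {extra_decode_ms} ms")
# transfer saved: 120 ms vs extra decode: 10 ms -> net win on this link
```

On fast links the transfer savings shrink while the decode cost stays roughly constant, so the break-even point depends heavily on who your clients are.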