For a quick test, I ran it over ~1.3GB of JPEG pictures I had locally; the final result is 810MB, 66% of the original size, which is very impressive considering it's lossless. It only deals with JPEG files though: no PNG, no ISO, no ZIP, no formats other than JPEG.
If someone can do this for video files, that will be Pied Piper coming to life.
The reason this can be done with JPEGs is that the compression hasn't been updated. There are folks who have used H.264 for compressing images, and WebP uses VP8 compression, both with much better results than JPEG.
Lepton is cool because it helps make existing technology a whole lot better, but what we actually need is a better image format.
You wouldn't see the same leap for videos, because people have been working hard to make those great for a while, while JPEG has been left to rot.
All that is true, but it misses out a key point: Lepton is risk-free.
I could re-encode all of my JPEG photos with a better codec, but the problem is that then they have gone through a lossy compression scheme twice: once with JPEG, and once with the new codec. This could ruin the image quality on some photos, giving them nasty artefacts.
The newer codecs might even be hindered further by the fact that they are being asked to compress an image that's been JPEG'd. I don't think any of the codecs you mentioned are tuned to work their best with image data that already has been butchered by JPEG. They expect to be given the 'pure' original image. This may well cause the codec to perform worse than expected.
Using Lepton gives none of these risks, since jpeg<->lepton is lossless. I could throw away all the JPEGs afterwards, safe in the knowledge that if I want to go back, I can.
Also, it's not really true that JPEG has been left to rot. There are lots of programs available that greatly improve upon the basic JPEG compression, while still outputting a JPEG file that any compliant decoder can read. And there's an even simpler way to shrink your photo file sizes - just drop the JPEG quality level slightly. Most cameras and programs default to a very high quality value. Dropping the default by even a tiny amount can produce remarkably smaller files, and you'll probably never notice the miniscule image quality loss.
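To make that concrete, here's a quick Pillow sketch (filenames are placeholders; note that re-encoding an existing JPEG is itself lossy, so this really only applies when you're exporting from the original source):

```python
import os
from PIL import Image  # pip install Pillow

# Hypothetical source file; substitute one of your own originals.
img = Image.open("photo.png").convert("RGB")

# Re-export at a few quality settings and compare the resulting file sizes.
for quality in (95, 90, 85, 80):
    out = f"photo_q{quality}.jpg"
    img.save(out, "JPEG", quality=quality)
    print(quality, os.path.getsize(out), "bytes")
```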
I suspect that JPEG is so popular because it is 'good enough'. Most people and programs don't care so much about squeezing out smaller files, so they'll keep using JPEG regardless.
> FLIF is a novel lossless image format which outperforms PNG, lossless WebP, lossless BPG, lossless JPEG2000, and lossless JPEG XR in terms of compression ratio.
PNG was not designed to be used for photograph-like images. The rest of those were not designed to be lossless formats, the lossless version is just a tacked-on afterthought.
Very unsurprising to find a codec that can beat those.
In my own tests, I found FLIF generally beats PNG (PNG Crush/Optipng both in brute-force mode) for comics (greyscale, majority white) as well, but by a less significant margin.
It's also worth noting that there aren't many other lossless formats, so it's still a valid comparison. I'm sure neither TIFF nor RAW outperform FLIF either.
That's not a valid comparison for TIFF. TIFF is a container which supports multiple compression algorithms. You could plug FLIF in as a compression algorithm and use it directly.
It also loads progressively and has a novel feature that lets a client determine how much detail to render, then stop loading any more data, all while using the same file.
This would be a totally awesome feature if the quality of a truncated FLIF came anywhere close to files tailored to that size in "real" lossy formats. Unfortunately, according to that example page, truncated FLIF falls far behind. It might find a niche where data reduction and scale reduction fall together (the last example).
The lack of processing comparisons raises that question pretty loudly, and it's definitely important in a mobile world. There's more to performance than size
Indeed. In one project where I converted jpegs to webp I saw around 70% savings at similar quality. This was on Android where it's fully supported. On the web pretty much only Chrome supports it, I'm not sure why Firefox, IE and Safari are hesitant.
This [1] is from 2013, so I don't know how it holds up, but I found it in the criticism section of the Wikipedia article for webp [2].
edit: This [3] is a follow-up from 2014 to the article/study from 2013.
TL;DR: "We consider this study to be inconclusive when it comes to the question of whether WebP and/or JPEG XR outperform JPEG by any significant margin. We are not rejecting the possibility of including support for any format in this study on the basis of the study’s results. We will continue to evaluate the formats by other means and will take any feedback we receive from these results into account."
Where did you read that? Since Apple has yet to commit to H.265 (due to insane patent licensing payments), I am hoping they jump on board with the Alliance for Open Media.
It wouldn't get much smaller. The reason DVDs have such a high bitrate compared to Handbrake rips isn't that MPEG2 is inefficient; it's that there's a keyframe twice a second. Most movie rips have a keyframe every 10 seconds.
It's mostly to let you fast forward, but there is a technical issue there. MPEG2 decoders aren't all mathematically identical, so what happens is the picture tends to drift away from the real thing after a while, and there's hacks like frequent keyframes and flipping the smallest DCT coefficient to get around it…
I've always wondered if it would be possible to losslessly convert the MPEG2 DCT coefficients, motion vectors etc to the equivalent subset of h.264 and then take advantage of the better prediction and entropy encoding of the later standard.
Maybe you could even go further and actively remove keyframes (in a fully reversible way, just keep a record of where they were)
I'm not sure how much you would save, and for losslessly archiving DVDs you might be better off creating a special format like Lepton for mpeg2
The first part of your question was posed on Doom9 in 2009. It's the opinion of Dark Shikari, longtime lead x264 developer, that it would be possible [1].
Not sure what you mean by the keyframe removal, though. Such an act would be lossy, would significantly impair any P-frames or B-frames (unless you majorly modify them), and frankly doesn't sound very reversible. Mind elaborating?
What I mean is transforming I-frames into equivalent P-frames (or even B-frames) that decode to the exact same pixels. The transformation might result in something that is bigger than a directly encoded P-frame (which can tolerate some small errors) but I suspect the new P-frame will be smaller than the original I-frame.
There is no restriction that P/B-frames only reference I-frames, so you don't even need to touch those frames.
For conversion back to MPEG2 it would be ideal to detect or mark the original I-Frames, so you can convert them back (a simpler transformation). But you could also pick any random P/B-frame and convert it to an I-frame with little issue.
Why bother trying to losslessly compress your DVDs when they're already a low quality, compressed MPEG2 source? The small amount of content that isn't available in higher quality won't take up that much space left as is.
In the case you're being facetious, maybe you could clarify exactly how big your video archive is and what sort of compression you're trying to achieve on it?
The article (and the person you're replying to) is referring to lossless compression. Converting to H264 and AAC may maintain a high quality at a lower bitrate but they're definitely not lossless.
While there's a lossless H.264, it doesn't fit the implied use-case proposed in the thread. Using lossless H.264 to recompress lossy MPEG-2 would be akin to using PNG to recompress a JPEG -- silly and wasteful.
Sure, it's an inexact analogy because PNG is not a block-based DCT coder, but all lossless H.264 does is set the quantizer as absurdly high as it needs to go to losslessly encode a particular frame. Starting from a compressed source, this is not a good recipe for achieving a space-saving result.
The JPEG file format is pretty detailed; there are lots of places in the file where you could indicate that the image has been compressed with a different algorithm. But it wouldn't really help, because older decompressors would still not be able to interpret the new codec. They would be able to parse the file and read the metadata in it - image width and height, timestamps, camera details, and so on - but they would have no way to decompress the data.
PNG is a very nice, extensible container format. In principle you can use it to store arbitrary image data, and still make use of all established metadata fields.
In practice you probably shouldn't do that without changing the file extension and the magic bytes to avoid confusion among users and poorly written software.
While PNG (and libpng) is reasonable for its intended purpose as a GIF replacement for simple RGB and greyscale graphics, it doesn't come close to the power and flexibility of TIFF, from sample format and depth, to sample count, image orientation, tiling and many additional features.
TIFF is showing its age at this point, with some high-end scientific applications switching to HDF5. But PNG simply isn't featureful enough for this type of data.
Thanks for testing this! For archival purposes, be sure to check the exit code of the lepton binary after compressing each JPEG: the default parameters only support a subset of JPEGs (you need to pass -allowprogressive and -memory=2048M -threadmemory=256M to support a wider variety of large or progressive JPEG files). Lepton will not write images that it is unable to compress using the settings provided. In those cases be sure to keep the original JPEG.
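If you're scripting that, here's a minimal sketch of the "check the exit code" advice (assuming a `lepton` binary on PATH that takes the flags above followed by input and output paths; adjust to your build):

```python
import subprocess
from pathlib import Path

def try_compress(jpeg: Path) -> bool:
    """Attempt to compress one JPEG; return True only if lepton succeeded.
    Keep the original whenever this returns False."""
    lep = jpeg.with_suffix(".lep")
    result = subprocess.run(
        ["lepton", "-allowprogressive", "-memory=2048M", "-threadmemory=256M",
         str(jpeg), str(lep)],
        capture_output=True,
    )
    return result.returncode == 0 and lep.exists()

# "photos" is a placeholder directory name.
for jpeg in Path("photos").rglob("*.jpg"):
    if not try_compress(jpeg):
        print("kept original:", jpeg)
```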
Hi Daniel, very cool project! I am interested in hearing your thoughts on alternative image formats / compression methods.
WebP (also based on VP8) is very promising for lossless and lossy compression. Have you considered using it to compress PNGs in the same way Lepton is compressing JPGs? Odds are it wouldn't be bit-perfect though (despite being pixel-perfect).
Also interested in hearing about the tradeoffs between server-side decoding and client-side. Not to keep focusing on it, but WebP has native support in Chrome and javascript decoders for everything else.
I think it is a very exciting time for image formats with several promising new ones on the way (WebP, FLIF, maybe BPG but possible legal issues).
I'm running over several terabytes and on the 5k images I've done so far (3.6GB), the compression has been 0.78x, which is damn good for lossless compression!
EDIT: 17GB now (my server is kinda slow) and it's still holding at 0.78x.
Yes and no. It's not a lossless way to encode an image, but the Lepton step itself doesn't add any loss that wasn't already present in the JPEG to begin with, and it generates a smaller file overall.
Someone just had to mention that _stupid_ show. Seriously. By empowering that show, you're just making fun of yourself and all the other programmers on here.
I really admire Dropbox for open sourcing this; it shows their commitment.
Saving almost a quarter of space for most images stored is something that truly gives a competitive edge. (I say most because people probably primarily have JPEG images).
Especially considering how many images are probably stored on services like Dropbox.
IMO, it shows that they don't plan to make any money from this, don't consider it a large competitive advantage, and think that the largest advantage of open sourcing it is that 'free' open source volunteers will improve the software.
It's interesting that they don't consider it a large enough competitive advantage.
Either that or they are using this to attract engineers.
> Either that or they are using this to attract engineers.
On a related note (I can't speak for Dropbox specifically) there are many engineers who desire their work to be open source for their own motivations, and when it's not a significant business risk to do so, nice companies will allow it.
So it works on two fronts, as a "hey, we'll let you open source your stuff" and a "hey, we've got people here who care about and contribute to open source."
You also open source if it's in your best interest for something to become the 'standard'.
At Google there is regret that they didn't open source a lot of stuff, because it ends up being reproduced outside of Google in some form. Companies that open source have an advantage in hiring and in the overall advancement of their product, by adopting the open source version in place of whatever their internal version was.
You also see it in facebook's open source initiatives, like react, buck, haxe and so on.
Without taking anything away from their generous announcement, it's very much in their interest for .lep files to become a standard so that they do not have to transcode to jpeg when users are viewing files on DropBox. I'm happy they released it.
Wait, since when is Facebook pushing for Haxe? I tried looking it up, but I'm not finding much. It would be really cool (I love Haxe). But I'm guessing you only went a bit over the top with that particular example.
Then why do you admire them? Would you also admire them if you were their investor? Dropbox management is obligated by law to act in the best interests of their shareholders, i.e. to make them as much profit as possible.
It's more likely that they have released it because of some profit-seeking interest. They are not charity.
> Dropbox management is obligated by law to act in the best interests of their shareholders, i.e. to make them as much profit as possible.
Can you cite the law which you believe makes that obligation?
The reason you can't is that it's not actually a legal requirement, and the reason is obvious: it's hard to say what the best interest is over all but the shortest time frame:
Dropbox's management might argue that they benefit more from open-source than they're giving away, that this kind of favorable attention will help them hire the top engineers who make far larger contributions to their bottom-line, that the pricing models are complex enough that this just doesn't matter very much, etc. Absent evidence that they're acting in bad faith, it's almost impossible to say in advance whether those arguments are right or wrong.
"Dropbox management is obligated by law to act in the best interests of their shareholders, i.e. to make them as much profit as possible." - no they are not.
Even if that was their obligation (which it's not, as others have pointed out), there's no reason to believe that keeping the algorithm proprietary would be more profitable. They're not handing a critical capability to their competitors (competing with Dropbox isn't fundamentally about cheap storage). They don't have the infrastructure to licence the algorithm (who should they licence it to? For how much? Under which terms? Those are not simple questions to answer.)
Releasing it freely allows it to work its way into browsers, allowing Dropbox to serve up the smaller images directly to users, saving money on bandwidth. It allows other users to contribute improvements. It markets Dropbox as a desirable place to work for engineers.
I admire them because they certainly considered the pros and cons of doing so, and decided that they feel secure enough in their market position and prefer to give this back to the open source community.
Pretty much every modern company saves tremendous amounts of money from open source (Linux and upwards in the stack), and so the OSS community should rightly be considered a STAKEholder.
The share vs stakeholder obsession in the space of large companies and corporations represents a lot that's wrong with our current markets.
--
Also, the only upside to open sourcing this is getting others involved in development. I just tested on 10k images, and the promise on both compression rate and bit parity after decompression holds true.
Seems to be a pretty stable product, so that motivation is probably minuscule.
There's also the upside of attracting other talented developers to come work at Dropbox if Lepton is representative of the kind of project they might be working on.
Does it make a difference? Can't public companies do the same at shareholder meetings, or declare what the intentions of the company are before going public? For (a poor) example, Google declared at IPO they would never pay dividends.
A similar approach has also been developed for use with zpaq, a compression format which stores the decompression algorithm as bytecode in the archive:
[The configuration] "jpg_test2" by Jan Ondrus compresses JPEG images (which are already compressed) by an additional 15%. It uses a preprocessor that expands Huffman codes to whole bytes, followed by context modeling.
http://mattmahoney.net/dc/zpaqutil.html
Backblaze storage is $0.005/GB/Month = $5k/PB/Month.
The GitHub repo has 7 authors, perhaps costing Dropbox $200k/year each and taking most of a year, so roughly $1M to develop this system.
So this might pay for itself after 200PB*Months, assuming Dropbox's storage costs are the same as Backblaze's prices, and assuming CPU time is free. (TODO: estimate CPU costs...)
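Spelling out that arithmetic (same assumptions as above; Python used as a calculator):

```python
# $0.005/GB/month at Backblaze prices -> dollars per petabyte-month
cost_per_pb_month = 0.005 * 1_000_000   # = $5,000/PB/month
dev_cost = 1_000_000                    # rough guess: ~7 engineers for most of a year

# Petabyte-months of *saved* storage needed to pay back the development cost
break_even = dev_cost / cost_per_pb_month   # = 200 PB-months
print(cost_per_pb_month, break_even)
```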
Of course, advancing the state of the art has intrinsic advantages, but again, it's interesting to look at the purely financial point.
I think you may be crunching the numbers the wrong way...
Dropbox is valued at $10b (regardless of what you think of that number, someone was willing to buy at that price). But they only have 1800 employees. If by some miracle 80% of them are engineers that is $7M per engineer. All this for a "pure software" business. No inventory, no manufacturing, they only recently started doing their own operations.
Investors would be telling them "more engineers" => "more software" => "better Dropbox". So now you have to go out and snap up engineers in Silicon Valley, which is very very hard. But you don't offer the perk of a Google/Apple line on a resume nor the compensation of e.g. Microsoft. What do you do?
Answer: you offer people the chance to work on cutting edge breakthrough technology. Not only that but it's all open source! This is a dream job for some people. They will turn down every other gig for this one.
It doesn't matter that PB/mo is chump change because an investor doesn't know that, they only know the engineering headcount. If by some miracle an investor does know this is a waste of time, then you just point to this HN post and observe how many developers are interested in this technology and how it is attracting developer mindshare that can be exploited for additional hires down the road.
I'm not saying it's rational–it's not. But I think there were strong incentives to greenlight a project like this, even if the actual cost savings were zero.
Lepton can decompress significantly faster than line-speed for typical consumer and business connections. Lepton is a fully streamable format, meaning the decompression can be applied to any file as that file is being transferred over the network. Hence, streaming overlaps the computational work of the decompression with the file transfer itself, hiding latency from the user.
In my humble opinion, dropbox's image gallery webapp is considerably faster than any other I've seen, especially when compared to imgur.
> I don't think they're compressing this and decompressing it client side.
The speed quotes made it sound like client-side was a concern. Why would you go to all the effort of devising a new image compression format saving 20%+ storage and on the wire, and not have it decompressed client-side, especially when you control the client?
My rough understanding, as someone who works at Dropbox and knows some of the people who worked on this (but isn't directly involved), is that this currently only runs on our servers. The perf requirements are primarily that we don't want to slow down syncing / downloads significantly - and also want to keep the CPU cost under control. As is, the savings in storage space should easily pay for the extra compute power required.
The typical encoding/decoding rates are measured with a Xeon... my guess is that lower powered phones/netbooks will be considerably slower. I'd trade off a 20% longer sync time in exchange for not running my laptop battery down.
They will once Silicon Valley takes them to task for it. Just like season 1 pretty much wiped "making the world a better place" from the lingo of SV companies.
> To encode an AC coefficient, first Lepton writes how long that coefficient is in binary representation, by using unary. [...] Next Lepton writes a 1 if the coefficient is positive or 0 if it is negative. Finally, Lepton writes the absolute value of the coefficient in standard binary. Lepton saves a bit of space by omitting the leading 1, since any number greater than zero doesn't start with zero.
The wording almost implies that this is novel, but it is actually Gamma coding [1], which in the signal compression community is often called Exp-Golomb coding [2]. I wonder why this is not acknowledged, considering that they mention the VP8 arithcoder instead.
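For the curious, here's a toy transcription of the quoted scheme, which is indeed just gamma / Exp-Golomb-style coding (illustrative only, not Lepton's actual bitstream):

```python
def encode_coefficient(c: int) -> str:
    """Encode a nonzero coefficient as: unary bit-length, sign bit,
    magnitude in binary with the implicit leading 1 dropped."""
    magnitude = abs(c)
    length = magnitude.bit_length()
    unary = "1" * length + "0"             # bit-length written in unary
    sign = "1" if c > 0 else "0"           # 1 = positive, 0 = negative
    mantissa = format(magnitude, "b")[1:]  # every positive number starts with 1
    return unary + sign + mantissa

# e.g. encode_coefficient(5) == "1110" + "1" + "01"
```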
So, am I to read this as to mean that when I send Dropbox a JPEG, they are behind the scenes further compressing it using Lepton? Then when I request it back, they are re-converting it back to JPEG?
That's the idea behind the algorithm, yes. And since it's lossless, every original bit is preserved. The same idea could be applied on the Desktop client instead of on the server, which would save 22% of the bandwidth as well and make syncing faster.
This is amazing, we've been struggling with JPG storage and fast delivery at my lab (terabytes and petabytes of microscopy images). We'll be running tests and giving this a shot!
jpg is a weird format to be storing microscopy images, no? Usually end up in some sort of bitmap TIFF (or their Zeiss/etc. proprietary format) from what I've seen.
So the weird thing is that we're not only a lab, but also a web tool. We have the files backed up in a standard format in one place, but delivering 1024x1024x128 cubes of images over the internet has been tricky. We don't need people to always view them at full fidelity, just good enough.
We tried JPEG2000, which was better quality per file size, but the web worker decoder was slower than the JPEG one, adding seconds to the total download/decode time.
EDIT: We're currently doing 256x256x256 (equivalent to a 4k image) on eyewire.org. We're speeding things up to handle bigger 3D images.
EDIT2: If you check out Eyewire right now, you might notice some slowdown when you load cubes, that's because we're decoding on the main thread. We'll be changing that up next week.
The size difference between "high quality" jpeg and lossless compressed TIFFs can be quite significant - so it might not be feasible to use eg. TIFF for archiving. And jpegs might very well be "good enough".
Eg. I just tested on a small 10MP Sony ARW raw image - the raw file is 7MB, the camera jpeg is 2.3MB, and a tiff compressed with LZW is 20MB (uncompressed 29MB). The raw tiff run through lzma is 9.3MB. Either way, if ~7MB is the likely lossless size and the JPEG is good enough at ~2.3MB, that's a pretty big difference if we're talking petabytes, not megabytes.
(I'll get around to testing lepton on the jpegs shortly)
DNGs and RAWs aren't generally (AFAIK) uncompressed. But ideally they're losslessly compressed. They're all(?) "TIFF files" - but AFAIK saying something is a valid TIFF, is almost as helpful as saying something is "a file".
Apparently Sony uses some kind of lossy compression for its files - I just tested with a jpeg2000 encoder on the same file above, and the size of the j2k file is approximately the same as the ARW: 7MB. Btw, the lep-file is 1.7MB.
Note that the uncompressed (flat) PPM file is 29MB as is the uncompressed TIFF - but simply running the TIFF through lzma reduces the size to 9.3MB. So ~7MB isn't that far off.
[ed: And while the lep-file was 1.7MB, shaving a bit off the original jpg, mozjpeg with defaults+baseline created a jpeg (at q=75, per default) 472k in size. Lep managed to shave a bit off that too - ending up with PPM->mozjpeg->lepton resulting in a 359K file. The (standard) progressive mozjpeg ended up at 464K.
This is not quite apples to apples, though, I think the comparable quality setting for mozjpeg would probably be 90 to 95 or so -- ending up around 1.6MB. But for this particular (rather crappy) image - I couldn't readily tell any difference.]
Raw files store only one channel per pixel. A TIFF has been demosaiced and stores 3 channels per pixel.
On top of that, most sensible raw formats only store 12 or 14 bits per pixel, instead of 16.
And then most are compressed, some losslessly and some lossily (like the infamous Sony format that packs it down to an average of 8 bits per pixel but does exhibit artifacts).
Some TIFF-derived container is the most common representation. But note even these could use JPEG/J2K if desired.
Most microscopy images are stored uncompressed or with lossless compression. But unfortunately this doesn't scale with newer imaging modalities. Here are two examples:
Digital histopathology. Whole-slide scanners can create huge images e.g. 200000x200000 and larger. These are stored using e.g. JPEG or J2K in a tiled BigTIFF container, with multiple resolution levels. Or JPEG-XR. When each image is multiple gigabytes, lossless compression doesn't scale.
SPIM involves imaging a 3D volume by rotating the sample and imaging it from multiple angles and directions. The raw data can be multiple terabytes per image and is both sparse and full of redundant information. The viewable post-processed 3D image volume is vastly smaller, but also still sparse.
For more standard images such as confocal or brightfield or epifluorescence CCD, lossless storage is certainly the norm. You don't really want to perform precise quantitative measurements with poor quality data.
We're storing them as RAWs for our backups, but it's cheaper and faster to store them compressed and transmit them without a transformation step. I'll keep FLIF in mind, but it looks kind of unstable reading the website?
Yes, the FLIF format is still being fully fleshed out. I would actually give WebP a shot instead of JPG. It offers better compression and already has native implementations in Chrome and on Android.
This looks very useful for archiving, but as others have pointed out, less useful for web development. I did lots of research on image compression for a book recently and found quite a few helpful tools.
jpeg-archive [^1] is designed for long term storage and you can still serve the images over the web. imageflow [^2] has just been kickstarted and looks really promising for use with ASP.NET Core.
mozjpeg is also showing progress and if FLIF takes off then that will be great. Scalable images would be fantastic. No more resizing and all the security issues that brings [^3].
The technical rigor in these recent Dropbox blog posts is admirable. Seems like an impressively talented engineering team (or maybe just good at marketing :)
I'm interested in a 'super lossy' (deep learning based?) compression. You should be able to compress movies down to a screenplay and a few stage directions.
I started something like this long ago but didn't get very far with it. This was before "deep learning" and I was groping in the dark, but I think the concept is sound up to a point.
The idea was to train a neural net and build up a database of features (maybe on the order of 1-10 GB, or whatever is just small enough to ship) to estimate the missing details from downscaled and extremely over-compressed JPEGs. If it worked, I think it would also improve the quality of all the 10-20 year old images out there where the uncompressed source is long gone. Sort of a Blade Runner-style "enhance" tool, but of course it would only be filling in aesthetically plausible details.
Google Photos already has an incredibly lossy compression, right? Images organized under Dog, Cat, etc. :) Might take a bit more effort to get the extra 999 words.
Sounds a bit similar to the fractal compression experiments[0][1] that were eventually repurposed in stuff like Perfect Resize. IIRC, it worked a bit like RLE, but by generating an IFS on the fly that could be used to recreate the image, instead of just storing pixel sequences.
The reasons given for Rust as stated in https://blogs.dropbox.com/tech/2016/06/lossless-compression-... would seem valid here too:
> For Dropbox, any decompressor must exhibit three properties:
>
> 1. it must be safe and secure, even against bytes crafted by modified or hostile clients,
> 2. it must be deterministic—the same bytes must result in the same output,
> 3. it must be fast.
I really do wish that Rust would provide nice alignment guarantees (eg 32 byte) without depending on customizing the allocator, and builtin, safe, SIMD instructions
> I really do wish that Rust would provide nice alignment guarantees (eg 32 byte) without depending on customizing the allocator, and builtin, safe, SIMD instructions
Same could be said about C/C++. I'm guessing the answer is much simpler: the author(s) are comfortable with C++. And they probably don't deploy much rust code yet.
Hmm, I'll have to try it on WMS/TMS tile storage for web cartography. That uses JPEG files also, but with fixed sizes like 256x256. Maybe the predictor needs to be tuned for that, because aerial imagery is a bit different from smartphone photos.
Would someone be kind enough to explain how one could store the compressed .lep files, serve them, and then have the browser render them, without using a JavaScript library?
You would either need a JavaScript library (which doesn't exist yet) or you would need to convince browser vendors to support lepton files natively.
In Dropbox's case they control both client and server and can just compress/decompress in the dropbox client. Using lepton on websites wasn't really Dropbox's goal (but it would be cool if a sufficiently fast JavaScript library existed).
I wonder if they use a rolling checksum too, to avoid duplicating a complete file if only a few bytes shifted (for example adding a line of text at the beginning of a file).
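For context, the "weak" rolling checksum rsync uses for exactly this purpose can be slid one byte at a time with constant work per step; a toy sketch (not Dropbox's actual chunking code):

```python
def rolling_checksums(data: bytes, window: int):
    """Yield an rsync-style weak checksum for every window position,
    updating it incrementally instead of rehashing each window."""
    MOD = 1 << 16
    a = sum(data[:window]) % MOD
    b = sum((window - i) * data[i] for i in range(window)) % MOD
    yield (b << 16) | a
    for i in range(window, len(data)):
        old, new = data[i - window], data[i]
        a = (a - old + new) % MOD
        b = (b - window * old + a) % MOD
        yield (b << 16) | a
```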
It probably wouldn't hit the most important cases either; dedup is typically most powerful & valuable on large media files, software packages, disk ISOs, and the like, which do not frequently have arbitrary text inserted at the start of the file!
rANS is really cool, but I think we tried it too early in the lepton research phase while some of the other ideas were still brewing...
We might revisit rANS now that v1.0 is out!
Interesting that they are using VP8 to compress the JPEGs. This is one degree of separation away from Google's WebP [1]. It would be interesting to see how they stack up (WebP has a lossy and lossless mode).
(EDIT: removed wording that Lepton produces files that conform to the JPEG spec. It doesn't. It losslessly compresses into a custom format, that losslessly decompresses into a JPEG)
Lepton uses the arithmetic coder [1] from VP8. Using arithmetic coding instead of Huffman encoding to get better compression was always an option in JPEG, but it has been historically avoided due to patents [2].
Compared to VP8-Intra, the compression used in lossy WebP, JPEG is missing the prediction step, usually called 'filtering' [3], which is the single largest contributor to WebP's compression outperforming JPEG.
Reading through the Lepton blog post, it seems they're using a different method of prediction, based on observations about typical gradients and correlations between AC and DC coefficients. VP8 uses a more 'traditional' approach of predicting your neighboring pixels, which was borne out of run-length encoding, but also very applicable to video's moving macroblocks. A comparison would indeed be enlightening.
Just a random idea: could machine learning algorithms that do object recognition help to improve the compression of images or videos? Maybe a lossy algorithm could compress away "irrelevant" things. This way a high resolution frame might have lower resolution objects inside, but it would be ok because the important part of the content is preserved.
Absolutely, yes. There is a very deep link between compression and "understanding". I think we have every reason to believe that networks that can understand/"explain away" the content/statistics of a scene ought to be able to compress them better.
I presume someone (or likely many people) are working on exactly this.
Very cool! If anyone who comes across this particular thread knows about papers/research being written about this topic, I'd be very interested to learn more.
There is an award-winning compressor that uses many statistical models (and a 3-layered dense neural network) to compress the data losslessly: http://www.byronknoll.com/cmix.html
Overall I think we are yet to see the full potential of deep learning unleashed on data compression. For example the neural network in cmix compressor is quite primitive compared to modern architectures. Someone will certainly find a way to do better than that!
Oh, what I was imagining was more like this: if I was watching a news broadcast, all I care about is the news anchor and the main on-screen graphic/text. You could compress down other things like the table or the background, or write an algorithm that selectively streams resolutions tied to the object streamed instead of the frame streamed. Is that a viable form of "compression" using existing computer vision tech?
Autoencoders are a long known approach to use neural networks for lossy compression. Some papers about using these models as aids for compression are googleable.
Are we able to take the EXE and run it on JPEGs from a Windows machine without issue? (I haven't tried it myself, but probably will using the instructions I saw on Github a moment ago and report back...it looks like it should work though):
https://github.com/dropbox/lepton/releases
However I am wondering about the two different EXEs available on the page (one has an avx prefix that I need to try and figure out). If anyone has info on that that'd be useful.
I could see this potentially being useful for schools and businesses doing quite a bit of digital scanning so I'm going to try running some tests using it and some images I think we have available somewhere.
I want this working seamlessly on the file system. That means: I see JPGs, but they are Lepton-compressed in reality. It would be a great use case for existing file servers. How could this be (theoretically) achieved on a Linux machine?
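One theoretical route is a small FUSE layer that exposes each stored .lep file under a .jpg name and shells out to the lepton binary on read. A rough, read-only sketch using the fusepy package; there's no cache eviction or write support, and the `lepton in.lep out.jpg` invocation follows the project README, so treat it as a starting point rather than finished code:

```python
#!/usr/bin/env python3
"""Read-only FUSE sketch: a directory of .lep files appears as .jpg files,
decompressed on demand via the `lepton` binary. Requires fusepy
(pip install fusepy) and lepton on PATH."""
import os
import subprocess
import sys
import tempfile

from fuse import FUSE, Operations


class LeptonFS(Operations):
    def __init__(self, root):
        self.root = root
        self.cache = {}  # virtual .jpg path -> decompressed JPEG bytes

    def _real(self, path):
        return os.path.join(self.root, path.lstrip("/"))

    def _decompress(self, path):
        if path not in self.cache:
            with tempfile.TemporaryDirectory() as td:
                out = os.path.join(td, "out.jpg")
                # `lepton input.lep output.jpg` per the project README.
                subprocess.run(["lepton", self._real(path)[:-4] + ".lep", out],
                               check=True, capture_output=True)
                with open(out, "rb") as f:
                    self.cache[path] = f.read()
        return self.cache[path]

    def readdir(self, path, fh):
        names = [".", ".."]
        for name in os.listdir(self._real(path)):
            # Present compressed files under a .jpg name.
            names.append(name[:-4] + ".jpg" if name.endswith(".lep") else name)
        return names

    def getattr(self, path, fh=None):
        if path.endswith(".jpg") and not os.path.exists(self._real(path)):
            st = os.lstat(self._real(path)[:-4] + ".lep")
            size = len(self._decompress(path))  # expensive; cache better in practice
        else:
            st = os.lstat(self._real(path))
            size = st.st_size
        return {"st_mode": st.st_mode, "st_nlink": 1, "st_size": size,
                "st_uid": st.st_uid, "st_gid": st.st_gid,
                "st_atime": st.st_atime, "st_mtime": st.st_mtime,
                "st_ctime": st.st_ctime}

    def read(self, path, size, offset, fh):
        if path.endswith(".jpg") and not os.path.exists(self._real(path)):
            return self._decompress(path)[offset:offset + size]
        with open(self._real(path), "rb") as f:
            f.seek(offset)
            return f.read(size)


if __name__ == "__main__":
    # usage: python leptonfs.py /dir/with/lep/files /mountpoint
    FUSE(LeptonFS(sys.argv[1]), sys.argv[2], foreground=True, ro=True)
```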
Some statistics. TL;DR: Lepton indeed provides about 22% size savings.
When running Lepton over JPEG photos downloaded from Flickr, I got about 23.37% size savings.
When running Lepton over JPEGs generated by mozjpeg (default settings: -quality 75, progressive) from JPEG photos downloaded from Flickr, I got 22.63% size savings.
The mozjpeg output is about 3.83 times smaller than the original JPEG photo, on average.
Just because you know another thing with that name doesn't make it illegal to use it; the word has a meaning of its own anyway. There is also a CMS called Lepton. Pretty sure this will soon be the most popular product with this name anyway.
It's not illegal, it's just dumb. Sure, you could implement your entire C++ application inside the `std` namespace, and I'm sure it'd work fine, but you /shouldn't/. If you're going to start a project, at least google the name first.
Just because it messed with your own personal namespace - you know that a lepton is a particle and not a thermal camera, right? Camera marketers just got to you first. The real problem is polluting useful words to market things in the first place.
They work because while most possible data is unstructured, most real-world data is highly structured. On average over all possible inputs, any compression scheme must have 1x compression.
Consider images of slides for a presentation that are text on a flat background. If you know the value of the pixel just to the left of the current pixel, then if you guess that the current pixel will be the same, you will be right most of the time. This is obviously not true for random noise. Consider a really simple compression scheme where a pixel is stored as a single '1' bit if it is the same color as the previous pixel; otherwise the color is stored as usual, with an additional '0' bit prepended. When you guess wrong, you pay a tax of 1 bit, but when you guess right you save N-1 bits, where N is the number of bits per pixel.
For random noise, this will grow the input quite a bit, but for simple flat-shaded graphics it will shrink the input quite a bit.
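Spelled out in code, that toy scheme looks like this (illustrative Python, not a real codec):

```python
def encode(pixels, bits_per_pixel=24):
    """Emit a single '1' bit when a pixel matches its predecessor,
    otherwise a '0' bit followed by the full pixel value."""
    out = []
    prev = None
    for p in pixels:
        if p == prev:
            out.append("1")                                     # right guess: 1 bit
        else:
            out.append("0" + format(p, f"0{bits_per_pixel}b"))  # wrong guess: 1 + N bits
        prev = p
    return "".join(out)

# Flat-shaded slide: mostly repeated pixels, big savings.
print(len(encode([0xFFFFFF] * 100)))  # 124 bits vs 2400 uncompressed
# Random noise: the 1-bit "tax" on nearly every pixel makes it grow instead.
```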
On average, any lossless compression scheme must have >= 1x compression. Lossy compression schemes can have less (since they're discarding information), and lossless compression schemes certainly can average out to > 1x (this is trivially observable by taking any 1x-average scheme and adding a fixed-length header to the output).
It's very easy to make a JPEG file that will not compress at all. Luckily it might look like snow from a television set rather than a typical image produced by a camera.
We live in a world where it is common for blue sky to occupy a portion of the frame and green grass to occupy another portion of the frame. Since images captured of our world exhibit repetition and patterns, there are opportunities for lossless compression that focuses on serializing deviations from the patterns.
Impressive work, Daniel! Do I understand correctly that any image prediction for which the deltas are smaller in absolute value than the full JPEG/DCT coefficients would offer continued compression benefits? As in, if you could "name that tune" to predict the rest of the entire image from the first few pixels, the rest of the image would be stored for close to free (and if not, it essentially falls back to regular JPEG encoding).
If that's the case, then not only could we rely on the results of everything we've decompressed so far to use for prediction (which is like one-sided image in-painting), but we also could store a few bits of semantic information (e.g. from an image-net-based CNN, from face detection) about the content of the original image before re-compression, and use that semantic information for prediction as well via some generative model. All of this would obviously be trading computation for storage/bandwidth, but it this seems like an exciting direction to me. Again, nice work.
Hi GrantS:
That's pretty close to how it works. We always use the new prediction, even where it's worse than JPEG though (very rarely), to stay consistent.
As for having the mega-model that predicts all images better: well, it turns out that with the lepton model you only lose a few tenths of a percent by training the model from scratch on each image individually. We have a test case for training a global model in the archive (it's https://github.com/dropbox/lepton/blob/master/src/lepton/tes... ). That trains the "perfect" lepton model on the current image, then uses that same model to compress the image (it's not meant to be a fair test, but it gives us a best-case scenario for potential gains from a model that has been trained on a lot of images), and in this case it doesn't gain much, even in a controlled situation like the test suite.
However the idea you mention here may still be a good idea for a hypothetical model--but we haven't identified that model yet.
I think you are explaining how both lossy and lossless compression work without explaining how this does it differently.
I can't read the article due to technical constraints, but I understand that e.g. JPEG has a lossy quantisation pass followed by a lossless encoding/compression pass over the result of the first stage. If they're reproducing a bit-identical result to the input JPEG, it must be a (very good) optimisation of the latter stage. [How'd I do?]
JPEG has a very poor lossless encoding stage; this is well known. Several archiving tools (most notably winRAR) already do some lossless reencoding of JPEG.
> Luckily it might look like snow from a television set rather than a typical image produced by a camera.
That would actually compress rather well. A hard to compress random image would not look like TV snow (white dots, with space between them), but rather randomly colored dots that are continuous in the image.
Random link: "I am trying to design a cloth that, from the point of view of a camera, is very difficult to compress with JPG, resulting in big-size files (or leading to low image quality if file size is fixed)."
No compression algorithm works on completely random data. Fortunately (for compressors) there are in practice almost no completely random data sets people care about in the world. Almost all data is correlated in some way. In pictures you can do a good job of predicting the colours of neighbouring pixels (they will, most of the time, be a similar colour).
In this case they are making use of the fact that JPEGs store a certain mathematical formulation of a picture. It turns out JPEG doesn't store that mathematical formulation very well, so you can squash it losslessly into a better formulation, then later turn it back into the original JPEG.
There were techniques discussed as part of the JPEG-2000 efforts where just reordering coefficients before doing the entropy coding would gain you a good deal of compression (though at the expense of the block based nature of JPEG).
It's always good to see new techniques out in the open.
jpegoptim has lossless and lossy modes. In lossless mode it preserves all pixels, but it doesn't preserve the file itself. Lossless jpegoptim is comparable to Lepton. In general, Lepton tends to give better improvements, because it uses a different output file format. How much better depends on your input files; you should try both.
I'd say 22% for Lepton and 5% for jpegoptim, based on fading past memories of mine.
I wonder if Dropbox is going to do steganography detection before using Lepton compression, because "pixel-for-pixel identical" is obviously not the same as "byte-for-byte".
You store the delta to the estimate. If the estimate is bad (e.g. because someone designed content for maximum surprise of the estimator function), compression rate goes down, not quality.
>For the standard test image the new LenPEG 3 compresses the image so efficiently that data storage space is actually freed on the computer right up to the entire capacity of the storage devices