
I believe ZIP supports updates about as well as SQLite does. You append the new file to the end and write the "header" (the central directory) after it. This leaves a hole in the middle where the old copy was, but a later addition can reuse that space. There are some details about internal page size and fragmentation, but I think the cost could amortize to about the same. There's something to be said about quality of implementation, but that's not an argument about the file format per se.
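For what it's worth, here is a minimal sketch of the append step using Python's standard `zipfile` module. The helper name `append_member` and the member names are made up for illustration. It only shows appending; it does not try to reuse holes left by replaced or deleted members.

```python
import io
import zipfile

def append_member(archive, name, data):
    """Append one member to an existing archive; earlier members are left in place."""
    # Mode "a" opens the existing archive, adds the new member after the
    # existing member data, and writes a fresh central directory on close.
    with zipfile.ZipFile(archive, mode="a", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, data)

# Build a small archive in memory (a real one would be a file on disk).
buf = io.BytesIO()
with zipfile.ZipFile(buf, mode="w") as zf:
    zf.writestr("a.txt", b"first")
size_before = len(buf.getvalue())

append_member(buf, "b.txt", b"second")

with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
    first = zf.read("a.txt")
```

Note that appending a member whose name already exists adds a second entry rather than replacing the first, so real update logic needs more than this.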



How? There's a massive difference in usability and performance between a SQL database, with all the data querying and manipulation it provides, and a ZIP file that just straps together a bunch of compressed payloads with a key/value file structure.

The linked page even has a case study of the OpenDocument file format that shows exactly how SQLite is better than a Zip file: https://www.sqlite.org/affcase1.html


If you're using SQLite as a pile of files, you're not doing much querying of the compressed data inside.

> Newer machines are faster, but it is still bothersome that changing a single character in a 50 megabyte presentation causes one to burn through 50 megabytes of the finite write life on the SSD.

That's funny, since Firefox would burn through gigabytes of writing to SQLite. All comes down to quality of implementation.


The point is that you don't just use it as a pile-of-files when you have SQL access to the data and instead use a high-level data model. This is not possible at all with ZIP.
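To make that concrete, here is a rough sketch with Python's built-in `sqlite3` module. The `slide` table and its columns are a hypothetical schema invented for this example, not anything taken from OpenDocument or the SQLite case study.

```python
import sqlite3

# An in-memory database stands in for a document file on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE slide (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO slide (title, body) VALUES (?, ?)",
    [
        ("Intro", "Welcome"),
        ("ZIP", "Central directory"),
        ("SQLite", "Pages and B-trees"),
    ],
)

# Editing one slide is a single UPDATE, not a rewrite of the whole document.
conn.execute(
    "UPDATE slide SET body = ? WHERE title = ?",
    ("Pages, B-trees, SQL", "SQLite"),
)
conn.commit()

# The content is queryable directly, without unpacking anything first.
titles = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM slide WHERE body LIKE ? ORDER BY id", ("%SQL%",)
    )
]
```

With a ZIP container, the equivalent search means extracting and parsing each member in application code.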


ZIP is a horrible, horrible format - it's best left in the past.


I honestly don't know much about the relative differences between formats like this. What is horrible about ZIP? Also, my impression is that more compressed files end up in RAR these days. I know that's not a new format either. Is it similarly horrible?


To read a zip file, you need to read it backwards: https://github.com/corkami/pics/raw/master/binary/zip101/zip....

There's also a separate extension called ZIP64 to allow for larger files (the original spec has a 4 GB maximum).

I was going to say that RAR is proprietary, but at least there's (non-FOSS) source code and good specs: https://www.rarlab.com/technote.htm
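For anyone curious what "reading it backwards" looks like, here is a sketch in Python that locates the end of central directory record by searching from the tail of the file. The function name `find_central_directory` is made up, and it ignores ZIP64, multi-disk archives, and the case where the signature bytes happen to appear inside the archive comment.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"               # end of central directory signature
EOCD_FMT = "<4sHHHHIIH"                # fixed part of the record, little-endian
EOCD_SIZE = struct.calcsize(EOCD_FMT)  # 22 bytes
MAX_COMMENT = 0xFFFF                   # the trailing comment length is a 16-bit field

def find_central_directory(data):
    """Return (entry_count, directory_size, directory_offset) for a ZIP image."""
    # The record sits at the very end, followed only by an optional comment,
    # so only the last 22 + 65535 bytes need to be searched.
    start = max(0, len(data) - EOCD_SIZE - MAX_COMMENT)
    pos = data.rfind(EOCD_SIG, start)
    if pos < 0:
        raise ValueError("end of central directory record not found")
    (_sig, _disk, _cd_disk, _entries_here,
     entry_count, cd_size, cd_offset, _comment_len) = struct.unpack_from(
        EOCD_FMT, data, pos)
    return entry_count, cd_size, cd_offset

# Build a small archive in memory to try it on.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", b"alpha")
    zf.writestr("b.txt", b"beta")
data = buf.getvalue()

entry_count, cd_size, cd_offset = find_central_directory(data)
```

The directory size and offset are 32-bit fields here, which is where the 4 GB ceiling comes from and why ZIP64 adds wider ones.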


The central directory is a great feature of zip. Makes it easy to append to and to list the files or decompress individual files without decompressing the entire archive.

The actual horrible thing about zip is that there is no standard whatsoever for the encoding of filenames. So mangled filenames are common and most utilities don't even let you manually select the encoding if you know what it is.
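A rough illustration of the filename problem in Python: `zipfile` decodes a name as CP437 unless the entry has the UTF-8 flag (general purpose bit 11) set, so a name stored in some other legacy encoding comes out mangled. The helper `fix_name` below is hypothetical, and the caller has to guess the real encoding; nothing in the archive records it.

```python
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: the name is stored as UTF-8

def fix_name(info, assumed_encoding):
    """Re-decode a member name, assuming the archiver used `assumed_encoding`."""
    if info.flag_bits & UTF8_FLAG:
        return info.filename  # already decoded as UTF-8, nothing to repair
    # zipfile decoded the raw bytes as CP437; recover them and decode again.
    return info.filename.encode("cp437").decode(assumed_encoding)

# Simulate what a reader sees for a Shift_JIS name stored without the flag.
original = "日本"
mangled = original.encode("shift_jis").decode("cp437")
info = zipfile.ZipInfo(filename=mangled)

repaired = fix_name(info, "shift_jis")
```

This only works when the guess is right; a wrong guess either raises `UnicodeDecodeError` or yields different garbage.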


Yeah, I would occasionally encounter errors decoding zip files along the lines of "the file name was too long".


Are there zip libraries that facilitate this usage pattern?

Either way, I tend to think you'd probably be better off with SQLite, since it would give you more flexibility (for instance, if you later realize that your application has relational data).



