I've recently been looking into this same issue because I analyse a lot of data like sosreports or other tar/compressed data from customer systems. Currently I untar these onto my zfs filesystem, which works out OK because it has zstd compression enabled, but I end up decompressing and recompressing, which is quite expensive since the files are often several GB even compressed.

But I've started using a tool called "ratarmount" (https://github.com/mxmlnkn/ratarmount), which creates an index once (something I could have our upload system generate in advance, though you can also just build it locally) and then lets you FUSE-mount the file (see the sketch below). This works pretty great, with the only exception that I can't create scratch files inside the mounted directory layout, which in the past I'd wanted to do.
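Roughly, the workflow looks like the sketch below. It's a minimal Python wrapper around the ratarmount CLI, assuming its documented basic invocation (`ratarmount ARCHIVE MOUNTPOINT` and `-u` to unmount); the archive name and the paths inside it are just placeholders.

    import subprocess
    from pathlib import Path

    archive = Path("sosreport-example.tar.xz")   # placeholder customer archive
    mountpoint = Path("sosreport-mount")
    mountpoint.mkdir(exist_ok=True)

    # Mount the archive read-only via FUSE; ratarmount builds (or reuses)
    # its index on first access, so later mounts are fast.
    subprocess.run(["ratarmount", str(archive), str(mountpoint)], check=True)

    try:
        # Files inside the archive can now be read lazily, no full untar needed.
        for path in mountpoint.rglob("*"):        # walk the mounted tree
            if path.is_file():
                print(path, path.stat().st_size)
    finally:
        # Unmount when done (ratarmount's -u is its unmount helper).
        subprocess.run(["ratarmount", "-u", str(mountpoint)], check=True)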

I was surprised how hard a problem it is to get a bundle file format that is both indexable and compressed with a good, fast compression algorithm, which mostly boils down to zstd at this point.

While it works quite well, especially with gzip and bzip2, sadly zstd and xz (and some other compression formats) don't allow decompressing only parts of a file by default: the formats can support it, but the default tools don't produce seekable archives with multiple independent blocks/frames. The nitty-gritty details are summarised here: https://github.com/mxmlnkn/ratarmount#xz-and-zst-files
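To illustrate why the frame layout matters: if you write the data as many independent zstd frames and keep an offset index on the side, you can decompress just the frame covering the byte range you want. This is only a sketch of the idea using the `zstandard` Python package, not the official zstd seekable format (which stores its index inside a skippable frame) and not what ratarmount does internally.

    import io
    import zstandard

    CHUNK = 1 << 20  # 1 MiB of uncompressed data per independent frame

    def compress_indexed(src: bytes):
        """Compress `src` as independent zstd frames, returning (blob, index).

        Index entries are (uncompressed_offset, compressed_offset, compressed_len).
        """
        cctx = zstandard.ZstdCompressor(level=19)
        out = io.BytesIO()
        index = []
        for off in range(0, len(src), CHUNK):
            frame = cctx.compress(src[off:off + CHUNK])  # one complete frame
            index.append((off, out.tell(), len(frame)))
            out.write(frame)
        return out.getvalue(), index

    def read_at(blob: bytes, index, uncompressed_offset: int) -> bytes:
        """Decompress only the single frame covering `uncompressed_offset`."""
        dctx = zstandard.ZstdDecompressor()
        for uoff, coff, clen in index:
            if uoff <= uncompressed_offset < uoff + CHUNK:
                data = dctx.decompress(blob[coff:coff + clen])
                return data[uncompressed_offset - uoff:]
        raise ValueError("offset past end of archive")

    data = b"x" * (5 * CHUNK)
    blob, idx = compress_indexed(data)
    print(len(read_at(blob, idx, 3 * CHUNK + 123)))  # only one frame is decompressed

With a single monolithic frame (what `zstd` produces by default) you'd have to decompress everything up to the offset you want, which is exactly the problem.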

The other main option I found was squashfs, which recently grew zstd support, and there is some preliminary zstd support in the zip format, but there are multiple competing standards for that, which is not helpful!
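For the squashfs route, the rough shape is: build a zstd-compressed image from the extracted tree, then mount it (unprivileged via squashfuse, or with a loop mount as root). A hedged sketch, again shelling out from Python with placeholder paths; the `-Xcompression-level` option depends on your squashfs-tools version.

    import subprocess

    # Build a zstd-compressed squashfs image from an already-extracted report.
    subprocess.run(
        ["mksquashfs", "sosreport-extracted/", "sosreport.squashfs",
         "-comp", "zstd", "-Xcompression-level", "19"],
        check=True,
    )

    # Mount it without root via squashfuse (or `mount -o loop` as root).
    subprocess.run(
        ["squashfuse", "sosreport.squashfs", "sosreport-mount/"],
        check=True,
    )

The downside versus ratarmount is that you still pay for one full decompress-and-repack pass per archive up front.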
