Not normalizing file names across platforms is sometimes beneficial, and so is ignoring symlinks. At other times, though, these are undesirable. The same is true of solid compression, concatenability, etc. Also, being able to update an archive efficiently means giving up some other things, so it has both advantages and disadvantages.
I dislike the command-line format of Hop; it seems to be missing many features. Also, 64-bit timestamps and 64-bit data lengths (and offsets) would be helpful, as some other people have mentioned; I agree with them that it should be improved in this way. (You could use variable-length numbers if you want to save space.)
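To illustrate the variable-length number idea, here is a minimal sketch in Python using the common unsigned LEB128 scheme (7 data bits per byte, with the high bit meaning "more bytes follow"). This is only one possible encoding, not something Hop uses; the names `encode_varint` and `decode_varint` are made up for this example.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = value & 0x7F      # low 7 bits of what is left
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one varint at `offset`; return (value, offset after it)."""
    result = 0
    shift = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated varint")
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, offset
        shift += 7
```

With this scheme, lengths below 128 take a single byte, while a full 64-bit value takes ten bytes, so small files pay almost nothing for the larger range.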
My own opinion for the general case is that I like to have a concatenable format with separate compression (although this is not suitable for all applications). One way to allow additional features might be to have an extensible set of fields, so you can include or exclude file modes, modification times, numeric or named user IDs, IBM code pages, resource forks, cryptographic hashes, multi-volume support, etc. (I also designed a compression format with an optional key frame index; this way the same format supports both solid and non-solid compression, whichever you prefer, and it works independently of the archive format being used.)
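As a rough sketch of what an extensible set of fields could look like, the Python below stores each entry's metadata as tag/length/value records, so a writer includes only the fields it wants and a reader skips tags it does not recognize. Everything here is an assumption made for illustration: the tag numbers, the 1-byte tag and 2-byte length layout, and the function names do not come from Hop or any existing format.

```python
import struct

# Hypothetical tag numbers, invented for this example.
TAG_END = 0      # terminates the field list
TAG_NAME = 1
TAG_MODE = 2
TAG_MTIME = 3
TAG_SHA256 = 4

KNOWN_TAGS = {TAG_NAME, TAG_MODE, TAG_MTIME, TAG_SHA256}


def pack_fields(fields: dict) -> bytes:
    """Serialize {tag: value_bytes} as tag (1 byte), length (2 bytes), value."""
    out = bytearray()
    for tag, value in fields.items():
        out += struct.pack("<BH", tag, len(value))
        out += value
    out += struct.pack("<BH", TAG_END, 0)
    return bytes(out)


def unpack_fields(data: bytes, offset: int = 0):
    """Read fields until TAG_END; return (known_fields, offset after the list)."""
    fields = {}
    while True:
        tag, length = struct.unpack_from("<BH", data, offset)
        offset += 3
        if tag == TAG_END:
            return fields, offset
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated field")
        offset += length
        if tag in KNOWN_TAGS:
            fields[tag] = value  # unknown tags are skipped, not rejected


# Usage: a writer that only cares about the name and modification time.
header = pack_fields({
    TAG_NAME: b"notes.txt",
    TAG_MTIME: struct.pack("<q", 1700000000),  # 64-bit timestamp
})
```

Because the field list is self-delimiting, entries written this way can still be concatenated, and an older reader keeps working when a newer writer adds a tag it has never seen.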
For making backups I use tar with a specific set of options (do not cross file systems, use numeric user IDs, etc.); the output is then piped to a compression program, stored on a separate partition, and then recorded on DVDs.
For some simple uses I like the Hamster archive format. However, it limits each individual file inside to 4 GB (although the entire archive is not limited in this way), and no metadata is possible. Still, for many applications this simplicity is very helpful, and I sometimes use it. I wrote a program to deal with these files; it has a good number of options (which I have found useful) without being too complicated. Options that belong in external programs (such as compression) are not included, since you can use separate programs for that.
> I dislike the command-line format of Hop; it seems to be missing many features.
I agree 100%. I wrote most of it in like three hours; it's not a polished product.
> My own opinion for the general case is that I like to have a concatenable format with separate compression (although this is not suitable for all applications). One way to allow additional features might be to have an extensible set of fields, so you can include or exclude file modes, modification times, numeric or named user IDs, IBM code pages, resource forks, cryptographic hashes, multi-volume support, etc. (I also designed a compression format with an optional key frame index; this way the same format supports both solid and non-solid compression, whichever you prefer, and it works independently of the archive format being used.)
I'm wary of slowing it down by adding lots of features. I think that, generally speaking, _more_ purpose-built binary formats should exist.
Engineers do this all the time with YAML files and JSON, but why not binary files?