The encryption used by e.g. IBM tape drives is two-layered: strong/slow and weak/fast. The asymmetric-key encryption is the "strong" layer, but it is far too slow to encrypt a whole tape. So IBM encrypts the tape data with the weak/fast symmetric algorithm and then encrypts that algorithm's key with the strong/slow asymmetric one.
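If you compress the symmetrically encrypted data, you will indeed gain some capacity: not as much as you would by compressing the raw data, but a visible amount, because the symmetric algorithm doesn't have the property of indistinguishability. In other words, its ciphertext can be told apart from random data, and whatever structure remains is what a compressor can exploit.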
That property is not needed for tape storage, because anyone who has the tape can safely assume it contains ciphertext: the tape's header information says whether it is encrypted or not.
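To make the indistinguishability point concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `toy_xor_encrypt` is a deliberately weak repeating-key XOR that I made up, not the algorithm any tape drive uses, and `os.urandom` merely stands in for ciphertext that cannot be told apart from random data.

```python
import os
import zlib
from itertools import cycle


def toy_xor_encrypt(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR. A toy cipher that leaks structure; hypothetical,
    NOT what IBM or LTO drives use. Applying it twice decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def compressed_size(data: bytes) -> int:
    """Size in bytes after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


# Highly repetitive sample data, as backups often are.
plaintext = b"backup record 0001: status=OK\n" * 2000
key = os.urandom(16)

weak_ciphertext = toy_xor_encrypt(plaintext, key)
# Stand-in for the output of a cipher that is indistinguishable from random.
random_like = os.urandom(len(plaintext))

sizes = {
    "raw": compressed_size(plaintext),
    "toy": compressed_size(weak_ciphertext),
    "random": compressed_size(random_like),
}

for label, size in sizes.items():
    print(f"{label:>6}: {len(plaintext)} bytes -> {size} bytes compressed")
```

The toy ciphertext still repeats (the 30-byte record and the 16-byte key line up again every 240 bytes), so the compressor finds plenty to work with, while the random-like bytes do not shrink at all. How much a real cipher leaves behind depends entirely on the cipher.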
If you read the links you posted, you will find that it's not "most" cryptosystems that have this property, it's some. Some applications just don't need indistinguishability, and LTO tapes are one of them.
All that said, IBM does compress first and then encrypt, to make the most of the data compression; the resulting ciphertext just isn't completely random.
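Why the order matters can be sketched in a few lines. This is only an illustration under stated assumptions: `keystream_xor` is a hypothetical toy stream cipher built from SHA-256 (not AES, not a vetted construction, and not IBM's implementation), chosen only because its output looks close to random, and the key and sample data are made up.

```python
import hashlib
import zlib


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || counter) blocks.
    Illustration only. Applying it twice with the same key decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


data = b"2024-01-01 00:00:00 backup job finished, 0 errors\n" * 2000
key = b"hypothetical demo key"

# Order 1: compress the raw data, then encrypt the (small) result.
compress_then_encrypt = keystream_xor(zlib.compress(data), key)

# Order 2: encrypt first, then try to compress the ciphertext.
encrypt_then_compress = zlib.compress(keystream_xor(data, key))

print("original:             ", len(data))
print("compress then encrypt:", len(compress_then_encrypt))
print("encrypt then compress:", len(encrypt_then_compress))

# Reading back order 1: decrypt, then decompress.
restored = zlib.decompress(keystream_xor(compress_then_encrypt, key))
```

With a cipher whose output looks this random, compressing afterwards gains nothing, which is why compressing first is the order that makes the most of the compression.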