It looks like they store the extended attributes in the __MACOSX tree too. And everything you download from the web has extended attributes describing the original URL and the page that linked to it (at least with Chrome and Safari; I haven't checked Firefox).
It looks like the Finder's "Compress" function includes __MACOSX, while the command-line `zip` doesn't.
If you run `xattr -l` on something in your Downloads, you should see the kMDItemWhereFroms metadata. `mdls` shows it too, but also includes other data that is extracted from the file itself.
> Zip files can encode their file names in two ways: CP437, or unicode.
> Each operating system does it wrong, but in a different way. For instance, Mac OS encodes its zip files as unicode, but doesn't set bit 11 correctly, so Python (correctly) reads them as CP437, and garbles the non-ASCII characters in file names.
> I wrote a quick and dirty workaround for Mac OS archives: if the file doesn't exist, encode the name as CP437 and check again. I'll think of something more clever if I ever switch to another OS.
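For anyone who wants to try that workaround, here is a minimal sketch in Python. It assumes the situation the quote describes: the name was stored as UTF-8, bit 11 was not set, and so the reader decoded the bytes as CP437. The helper name `recover_name` is made up for illustration; `flag_bits` and `filename` are the real attributes on `zipfile.ZipInfo`.

```python
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: "file name is UTF-8"


def recover_name(name: str, flag_bits: int) -> str:
    """Undo a CP437 mis-decoding of a name that was really UTF-8.

    Hypothetical helper: returns the name unchanged whenever the
    round trip fails, so it never makes a good name worse.
    """
    if flag_bits & UTF8_FLAG:
        return name  # the reader already decoded it as UTF-8
    try:
        # Get the original bytes back, then decode them as UTF-8.
        return name.encode("cp437").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return name  # not CP437-representable, or not valid UTF-8


def list_names(path: str) -> list[str]:
    """List entry names in an archive, with the workaround applied."""
    with zipfile.ZipFile(path) as zf:
        return [recover_name(i.filename, i.flag_bits) for i in zf.infolist()]
```

A name that really was CP437 and also happens to be valid UTF-8 would be changed by this, so treat it as a heuristic rather than a fix.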
This is my script to “clean up” a typical macOS zip file.
It assumes you’ve started by asking the Finder to “Compress” (which creates a zip file but one riddled with macOS-isms).
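    #!/bin/bash
    # Strip macOS-specific entries from a zip archive created by the Finder.
    zipfile=$1
    if [ -z "$zipfile" ]; then
        echo "$0: .zip file expected" >&2
        exit 1
    fi
    # Remove the __MACOSX tree and any .DS_Store files, at any depth.
    zip -d "$zipfile" "__MACOSX*"
    zip -d "$zipfile" ".DS_Store"
    zip -d "$zipfile" "*/.DS_Store"
    # List what remains, sorted by name (the 4th column of `unzip -l` output).
    unzip -l "$zipfile" | sort -k 4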
Hidden dot files were introduced as a bug and left to linger because they were kind of useful. It all started when someone tried to hide . and .. from the output of ls and messed up the if statement to only check if the first letter in the filename was a period instead of testing for the intended use cases. People then copied that behaviour around because it was a cool new trick, not because it was set up as a standard.
The Windows method, leveraging file attributes, is actually much cleaner in my opinion. You can set the hidden attribute in ZIP files, and most tools do so for OS-specific files and folders. But I don't want my ZIP tool to put files on my file system that I don't get to see first, so I always keep hidden files visible.
Windows does the same thing with desktop.ini, but I rarely encounter those anymore. It used to be that every ZIP had a bunch of Thumbs.db files, but Microsoft seems to have cut that out.