Wine is faster than Emscripten because asm.js is a slow target, not because HLE produces fundamentally better results than transpilation.
Wine has a variant, libwine, that can be linked into a Windows binary at compile time, replacing that binary's linkage to Windows libraries with linkage to Wine. If you have the opportunity, this is always the more processor-efficient way to go (and I'm surprised so many companies choose to "port" their software to Linux using Cider rather than taking their source and compiling in libwine).
If you didn't have the source of a Windows binary, you could create a Linux transpiler that statically replaces already-generated Windows symbols with Wine symbols, while also potentially rewriting the ISA and calling conventions. The only reason one doesn't exist is that the overhead of running a very similar platform via HLE (i.e. one with the same ISA) is negligible. If OSX were still on PPC, for example, running Wine-on-PPC-OSX would work much better via transpilation than via HLE.
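To make the "replace symbols statically" idea concrete, here is a minimal sketch. It assumes the binary's import table has already been parsed into (library, symbol) pairs; the `SHIM_LIBRARIES` mapping and `rewrite_imports` are hypothetical names for illustration, not anything Wine ships.

```python
# Hypothetical mapping from Windows DLL names to shim libraries.
# The target names are an illustrative assumption, not Wine's real layout.
SHIM_LIBRARIES = {
    "kernel32.dll": "kernel32.dll.so",
    "user32.dll": "user32.dll.so",
}


def rewrite_imports(imports):
    """Split imports into those retargeted at a shim and those left unresolved.

    `imports` is a list of (library, symbol) pairs taken from the binary.
    """
    rewritten, unresolved = [], []
    for library, symbol in imports:
        # Windows library names are case-insensitive, so normalise first.
        target = SHIM_LIBRARIES.get(library.lower())
        if target is None:
            unresolved.append((library, symbol))
        else:
            rewritten.append((target, symbol))
    return rewritten, unresolved
```

A real tool would also have to patch the call sites and, across ISAs, the calling convention; this only shows the symbol-table half.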
Speaking of PowerPC, a better candidate for "transpilation for performance purposes" would have been the PPC apps run under the Rosetta HLE emulator built into OSX. These apps could have been chewed through once to produce a Carbon-linked x86 binary, with that binary cached in the PPC binary's resource fork and executed directly from then on. But Apple themselves had little reason to do this: they didn't want to give anyone an extra incentive to continue using PPC software instead of moving to x86/x64 software.
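The translate-once-then-cache scheme is easy to sketch. Everything here is an assumption for illustration: `TranslationCache` is a hypothetical name, the cache is an in-memory dict rather than a resource fork, and `toy_translate` is a stand-in for a real PPC-to-x86 translator.

```python
import hashlib


class TranslationCache:
    """Translate each distinct source binary once and reuse the result."""

    def __init__(self, translate):
        self._translate = translate  # function: source bytes -> target bytes
        self._store = {}             # content hash -> translated bytes
        self.translations = 0        # how many times the slow path ran

    def get(self, source_bytes):
        # Key on the content so a changed binary is translated again.
        key = hashlib.sha256(source_bytes).hexdigest()
        if key not in self._store:
            self._store[key] = self._translate(source_bytes)
            self.translations += 1
        return self._store[key]


def toy_translate(source_bytes):
    # Stand-in for an expensive whole-binary translation pass.
    return source_bytes[::-1]
```

The first launch pays the translation cost; every later launch of the same binary is a lookup.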
I wouldn't say that whether the shim libraries are built into the emulator or statically linked into the transformed output is very important. There are really two mostly orthogonal axes here:
- Instruction set: JIT (emulators, Rosetta) vs. static recompilation (only niche projects when the source is a machine ISA, but arguably Java AOT compilers, OdinMonkey, etc. count) vs. doing nothing because the source and target have the same ISA (Wine, VM software for the most part).
- APIs: HLE (emulate semantics) vs. LLE (emulate hardware). The more abstract the API, the better the former works, and vice versa. If you are not emulating things like interrupts and register pokes, that is HLE by definition.
In some cases you can /directly/ translate source API usage into target API usage without any shims, but only for relatively simple APIs. For anything complicated, the semantics are likely different enough that transforming calls to it would require complicated global analysis, which would be quite pointless from a performance perspective unless the API is called ridiculously often.
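A rough sketch of that distinction, with made-up APIs: `sleep_ms`, `sleep_s` and the handle scheme are hypothetical and only there to show the shape of the problem. The simple call can be rewritten at translation time; the stateful one needs a shim that lives at run time.

```python
def translate_sleep_call(ms):
    """Directly rewrite a source `sleep_ms(ms)` call into a target `sleep_s` call.

    Only the unit changes, so no run-time shim is needed.
    """
    return ("sleep_s", ms / 1000)


class HandleShim:
    """Run-time shim for a source API that hands out integer handles.

    The target API is assumed to use opaque objects instead, so the
    handle-to-object mapping has to be tracked while the program runs.
    """

    def __init__(self):
        self._objects = {}
        self._next_handle = 1

    def open(self, target_object):
        handle = self._next_handle
        self._next_handle += 1
        self._objects[handle] = target_object
        return handle

    def lookup(self, handle):
        return self._objects[handle]

    def close(self, handle):
        del self._objects[handle]
```

Getting rid of `HandleShim` would mean proving at translation time where every handle flows, which is the global analysis that is rarely worth the effort.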