In theory, PROT_SAO should be useful for qemu, and trivial to make patches implementing there. That's assuming the kernel actually sets it, though. The problem I encountered when I set out to do it a year or so ago, was that I couldn't find a good test case to fail without it...
I used box64 as a test case, where I had a game that would run in emulation, but only if I pinned it to a single core. On ARM64, it also worked, as the JIT translator on box64 uses manually inserted memory fences to force strongly ordered access.
The game never worked correctly, even after I patched the kernel to mark every page on the system as SAO, and confirmed this worked by checking the set memory flags. This might be a mistake in my understanding of what SAO should do, though. (or another failure in box64 on ppc64le)
One thought I've had recently is perhaps it's like the recently discovered tagged memory extension and only worked in big endian? There's nothing in the docs to suggest this, but since the only test case was BE-only, maybe?