They mention that they don't have access to the threat actor's obfuscating compiler itself. But given that they've released a purpose-built deobfuscator, it occurs to me that they could certainly develop a ScatterBrain-like compiler. I wonder if doing so might enable useful heuristics that could reveal the quiet existence of the ScatterBrain compiler in some sample, archive, darknet tools repo, compromised host, torrent, etc.
Just as they have supplied IOCs, perhaps they could provide signatures or heuristic rules that scanners in various places could ingest and apply to discover some latent copy of the compiler itself. That would be useful in and of itself, and also for the breadcrumbs and inferences that could be drawn from where and when it was spotted, if it was.
Judging by this analysis, one simple approach would be to search for control-flow desynchronizations. This is easier on ARM, since instructions are aligned on 2- or 4-byte boundaries, but the variable-length instruction format of x86/x64 is used here to make the decompiler's life harder.
However, you can store a map of where instructions are placed; any case where instructions overlap and decode to different sequences should be a big red flag for an AV tool. (That said, it's not impossible to disguise instruction targets enough that an analyser would need to be nearly Turing-complete to find even this.)
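To make the "map of instructions" idea concrete, here is a minimal sketch in Python. It assumes something else (say, a recursive-descent disassembler following every branch target) has already produced `(start_address, length)` pairs for each decoded instruction; the function name and input format are made up for illustration and are not taken from the analysis.

```python
def find_overlapping_instructions(spans):
    """Return sorted (earlier_start, later_start) pairs of instructions that
    share at least one byte but begin at different addresses.

    `spans` is an iterable of (start_address, length) tuples, one per decoded
    instruction. Producing them is out of scope here (assumed input).
    """
    owner = {}         # byte address -> start of the first instruction claiming it
    conflicts = set()
    for start, length in spans:
        for addr in range(start, start + length):
            prev = owner.setdefault(addr, start)
            if prev != start:
                # Two different instructions both cover this byte.
                conflicts.add((min(prev, start), max(prev, start)))
    return sorted(conflicts)


# Illustrative input: bytes EB FF C0 at 0x1000. The short jump at 0x1000
# (EB FF) targets 0x1001, i.e. its own second byte, where FF C0 decodes as
# a different instruction.
spans = [(0x1000, 2), (0x1001, 2), (0x1003, 1)]
for a, b in find_overlapping_instructions(spans):
    print(f"overlap: {a:#x} and {b:#x}")
```

This only catches overlaps the disassembler actually discovered, so it inherits the limitation above: if the branch targets are computed indirectly and never resolved, the second decoding never shows up in the map.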
So what I was suggesting is, I guess, a detector for compiled code (lol, and possibly in need of deobfuscation now that I think of it, but that's apparently a solved problem) that generates the type of code you mention, i.e. a way to find a copy of the compiler rather than the compiler's obfuscated malware output. But as I try to clarify that in this reply, I realize that the binary we would be trying to flag would probably be both of these things. So it occurs to me that a) your near-Turing-complete comment holds even if my original target wasn't communicated clearly, and b) if said copy of the compiler does already exist somewhere in the wild, it may well be picked up as a submitted sample based on an IOC anyway, since I have to assume the threat actor obfuscated their compiler binary by building it with a copy of itself. :-)
This would be challenging. It might be possible if the obfuscator leaves some sort of magic constant in the compiled binary that you could search for. Aside from that, there isn't really an easy way to search across binaries for, say, control flow or specific characteristics beyond things that every binary does (write output, open files, etc.).
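For the magic-constant case, the search itself is the easy part. A rough sketch in Python, assuming you already know a byte sequence to look for (nothing in the analysis says such a marker exists; `marker` and the function name are hypothetical):

```python
def find_marker_offsets(path, marker, chunk_size=1 << 20):
    """Return every file offset at which the byte string `marker` occurs.

    The file is read in chunks, and the last len(marker) - 1 bytes of each
    window are carried over so matches straddling a chunk boundary are found.
    """
    if not marker:
        raise ValueError("marker must be non-empty")
    overlap = len(marker) - 1
    offsets = []
    tail = b""
    base = 0  # file offset of the first byte of the current window
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            window = tail + chunk
            pos = window.find(marker)
            while pos != -1:
                offsets.append(base + pos)
                pos = window.find(marker, pos + 1)
            # The carried tail is shorter than the marker, so no match is
            # ever reported twice.
            keep = window[-overlap:] if overlap else b""
            base += len(window) - len(keep)
            tail = keep
    return offsets
```

You would run this over each file in a corpus and flag any with a non-empty result. The hard part is the one already mentioned: knowing a constant that survives obfuscation and is rare enough not to match everything.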
This is very cool. Can someone help me understand what's going on behind the scenes? What's their strategy? Their motivations? Are they targeting specific industries or nations for a reason?