We've been through several generations of exploit mitigations, starting with non-executable stacks, and, impressively, exploit developers found workarounds for each of them (although the workarounds often have requirements that aren't met in every vulnerability environment). In many cases I had the impression that the workarounds surprised the mitigation developers, because the latter had expressed a lot of confidence that software security was about to make a huge leap and memory safety violations would rarely be exploitable anymore.
What are the prospects for finding workarounds to CET too?
(I don't mean to argue that there's no benefit to these mitigations, or that some of them might not eventually stop whole classes of vulnerabilities. But I feel like their track record is not nearly as impressive as their inventors anticipated, so I wonder what informed opinion is on the eventual relevance or irrelevance of this one. Notably, "RIP ROP" seems like a somewhat ambitious claim, that a large amount of attack potential will be mitigated; how well justified is it?)
It's not totally true that attackers have found ways to bypass mitigations. In many cases the attackers require entirely new capabilities, or separate vulnerabilities, to get around a mitigation. And in some cases a mitigation makes an attack statistically unlikely to succeed, even if it can in principle.
Lots of mitigations do suck, and are a huge waste of time and effort, but quite a few are very significant. It's very rare that a bypass is a surprise, except when the mitigation is poorly thought out. Good mitigations start with a threat model.
An example is ASLR. Once in a while ASLR is pronounced 'dead' because an attacker with capabilities that ASLR does not try to defend against can bypass ASLR. For example, the attacker has arbitrary compute on the system with ASLR. No one who built ASLR was surprised by this.
It's kind of easy to say "ASLR is not resistant to infoleaks" but it's really quite another thing if the infoleak ends up coming from something like a microarchitectural sidechannel or even an undocumented proprietary processor extension, rather than "hurr durr here is a slid pointer, we hand these out like candy".
My point is that ASLR makes assumptions about the attacker's capabilities. All good mitigations do. If the attacker can run nearly arbitrary computations within the process's memory space, that is outside of the model that ASLR attempts to deal with.
For example, ASLR for a process that executes Javascript is probably not going to be as useful as ASLR for a process that receives network requests.
How so? As far as I understand, being able to leak an ASLR slide from JavaScript is considered to be a security bug in every browser engine, because they do not intentionally provide access to that information.
Whether the browser intends to provide that information or not, ASLR was not designed as a control against an attacker who can run near arbitrary code in the process.
There will probably be defeats and accidental gadgets in the first couple of releases, but this hardware technology has the potential to be better than any software-based mitigation with the same goals.
Here's a paper with a more detailed security analysis:
https://sci-hub.tw/10.1145/3337167.3337175
Incredible how C and C++ have managed to keep the security industry busy.
Not that other languages don't have logical errors that might lead to security exploits, but in what concerns memory corruption exploits, the mitigation list just keeps piling up.
Looking forward to how long ARM hardware memory tagging will hold on, after Intel's MPX failure.
At least so far Solaris SPARC ADI seems to hold on.
Most of the negative commentary is around the protection of the forward edges, while this blog is about the backward edges. The grsec post notes that implementing full support for the backward edges involves handling a number of special cases, but doesn't criticise its effectiveness if that work is done.
"As a reminder, Intel CET is a hardware-based mitigation that addresses the two types of control-flow integrity violations commonly used by exploits: forward-edge violations (indirect CALL and JMP instructions) and backward-edge violations (RET instructions)."
It's a mitigation for a software exploitation technique called Return Oriented Programming (ROP). The mitigation is referred to as 'Control Flow Integrity' (CFI).
Essentially an attacker who has the ability to exploit the first stage of a vulnerability will be able to stitch together "gadgets" from the program to build up a second stage of the exploit.
Control flow integrity, to my understanding, applies a validation or restriction of the program's call graph. This limits the attacker's ability to just stitch up their own arbitrary call graph. There are 'forward edge' protections (calling a function) and 'reverse edge' protections (ret). But of course there are more ways to control the flow of a program, as this document discusses, like longjmp.
I won't try to get more detailed as I'm not an expert. Hopefully this will help you find more information.
I'll add on since this is the most informative post so far (and I've written a static binary re-writer to add shadow stack protection to an existing binary).
A shadow stack is a limited subset of the call stack that only stores return addresses. In normal operation, every time your compiled program makes a function call, it stores the return address on the main call stack (modulo certain compiler optimizations) so that when the called function returns, your program can resume executing directly after the point at which it called the function.
With a shadow stack, when a function is called, the return address is copied to a separate "shadow" stack as well as the call stack. When the called function returns, the return addresses on the two stacks are compared and the program fails if they differ.
In new Intel microprocessors, the shadow stack is implemented in hardware. The numerous corner cases require software support that the article describes.
Can you provide some details on your binary rewriter to add shadow stack support? Was this a pure software approach, or was it designed to take advantage of the support in new intel microprocessors? Do you have a write up of or can you give a quick overview of your methodology? Is the source code published somewhere?
No, it was proprietary code, and it wasn't for an Intel processor. It was a pure software approach, but the particular (embedded) environment made it harder to attack the shadow stack itself.
I had a pretty cool optimization that I don't think anyone's figured out yet. Oh well. That's the downside of software-as-trade-secrets.
Judging by the title, it helps avoiding ROP: "Return-oriented programming is a computer security exploit technique that allows an attacker to execute code in the presence of security defenses such as executable space protection and code signing." (Wikipedia)
Agree re: canaries, but when I learned about ROP I was told that ASLR typically is not employed on the text segment (due to lack of position independence), which is why ROP effectively acts as a bypass for ASLR on the stack / heap and why we need things like control flow enforcement. Is this not the case or no longer the case?
Gcc these days compiles with -pie (Position Independent Executable) by default. This makes the text section position independent and able to be relocated, like a shared library.
You are correct that the main TEXT section used to typically not be position independent.
Windows uses relocations, not PIC, to enable different load addresses. That means the image in memory has its self references patched by adding the difference between compiled in load address and runtime load address. System DLLs can still share code with one another as long as they share the same load address in different processes for that reboot of the operating system.
Historically EXEs were either linked without relocations or had relocations stripped. They were always loaded first so ended up where they wanted, no relocation necessary. But /dynamicbase flag to linker opts in to setting a bit in the PE header and retaining relocations, so the EXE can be loaded elsewhere.
TL;DR: Windows supports ASLR on both executables and dynamic libraries.
There are limitations in sharing text segments in different processes, and extra memory usage mapping in text segments so relocations can be, uh, relocated, but once running there is no additional performance overhead. The runtime image is effectively "self-modified code" by the OS loader, patched to final addresses. No PIC register, no indirect references.
Right. You've described the performance impact: a ton of relocations hurts startup time, and the modified memory reduces sharing, costing additional memory. If startup time does not matter and memory is free, sure, it has no costs.
And how about performance impact? The mitigations that have been done in software recently came with an ugly performance cost (just not as ugly as the vulnerability). Is there any speculation about what this one is going to cost?
CET has two parts, a forwards edge protection (indirect jumps and calls like those necessary to execute a C++ virtual function, a Go interface function, or a Rust trait function), and a backwards edge protection to protect against Return Oriented Programming (overwriting the return address to attacker chosen code).
If I recall correctly, Windows will only use the backwards edge protection, since they already have a superior technology for forwards edge protection (CFG and XFG). The backwards edge protection has an impact of 1.65% according to that paper. Forwards edge protection had no impact.
I would argue that the learning is circular, as exploit development learns from mitigations just as much as mitigation development is informed by exploits ;)