This change makes me sad, not because it isn't brilliant work - it is - but because this kind of brilliant work is unlikely to move the needle in the real world. I can't use this RNG because it isn't FIPS validated. I can't sponsor getting it FIPS validated because the cryptography it uses isn't even FIPS compatible. It wouldn't make it past the cursory skim of a reviewer. That says more about FIPS than it does about this work, but it still means that it's a non-starter for many libraries and applications that end up having US Federal Government users ... which is to say that basically everything important gets pushed away from benefiting from work like this.
Separately, I'm also a little sad that there are no distinct RNGs for secret and non-secret output. It is an indispensable layer of defense to have separately seeded RNGs for these use-cases. That way if there is an implementation flaw or bug in the RNG that leads to re-use or predictability, it is vastly harder for that flaw to be abused. s2n, BouncyCastle, OpenSSL, and more user-space libraries use separately seeded RNGs for this reason and I don't think I could justify removing that protection.
On the other hand, there's the FIPS-accelerationist perspective, which suggests that as more and more modern cryptography is mainstreamed irrespective of FIPS's silly requirements, FIPS will itself gradually become untenable, leaving us all better off.
I'm not a cryptographer so disregard my opinions as you please, but I really like OpenSSH's approach. They just implement whatever the cryptographer community thinks is the best approach at the moment and disregard NIST and other authorities. For example they already have a post quantum key exchange as one of the default algorithms.
I'm not a cryptographer either and I certainly trust OpenSSH developers and actual cryptographers a lot more than outdated government standards. As far as I'm concerned, what these people say is the standard and they're the people I look for when I want to learn something about cryptography from websites, LWN articles, mailing list discussions and other such sources. I've learned a lot reading about getrandom on LWN, plenty of knowledgeable people involved in that work.
This sounds like wishful thinking. FIPS cryptography is involved, often by legal requirement, in major parts of American data processing and in important parts of society.
Banks and government data processors alike are often forced to use things with FIPS’ lesser security. FIPS is where NSA is known to have sabotaged cryptography. NIST is still disgraced by the tip of the backdoor/sabotage iceberg. Yet we are still stuck with NIST, and indeed with various lesser standards which we can generally just call FIPS.
It is lame and sure, we should build better things. We should ignore FIPS where possible. We probably agree there, but it seems unreasonable to ignore all the systems which cannot be made better by intentional limitation.
Ignoring FIPS doesn’t change that many important systems do use FIPS’ cryptographic constructions. It would be nice if the U.S. government wasn’t actively sabotaging the security of standards with backdoors. It would also be nice if the U.S. government didn’t require anyone at all to use FIPS. Too bad we can’t have nice things in many important sectors.
Acceleration won’t fix the systemic issues here. Ignoring the systematic failure of (intentionally) weak FIPS standards will only further create division. Non-compliance will sideline reasonably secure modern systems in important contexts.
We shouldn’t need to fix FIPS, but what other alternative will help users whose data is protected by FIPS cryptography in the FIPS legally mandated contexts? Ignoring it isn’t going to change the law, ignoring it won’t secure those systems. OPM leadership probably really wished they had ignored FIPS. That wishful thinking won’t repair the damage to national security done by just one NSA/NIST FIPS backdoor being exploited in that single prominent example case.
The way I've seen at least some standards structured, you could take a FIPS entropy source and XOR it with a non-FIPS entropy source and from a security perspective (disregarding consideration of complexity introducing bugs or side channel attack considerations) you'd be no worse off than using one entropy source on its own, but would potentially gain the benefit of not having to trust a single standard/implementation. Linux and BSD kernels already use this approach with CPU-supplied entropy sources which the kernels do not just inherently trust (there is no way to verify the implementation).
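A minimal sketch of the XOR combination described above. The two sources are stand-ins (both are just `os.urandom` here), and the name `xor_mix` is made up for illustration. The "no worse off" property assumes the two inputs are independent: a source that could observe the other one before producing its own output would break that assumption.

```python
import os


def xor_mix(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings from two entropy sources."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def mixed_seed(nbytes: int = 32) -> bytes:
    # Stand-ins: in a real deployment these would be two different sources,
    # e.g. a validated hardware source and the kernel RNG.
    fips_like = os.urandom(nbytes)
    other = os.urandom(nbytes)
    return xor_mix(fips_like, other)
```

If either input is uniformly random and independent of the other, the output is uniformly random as well, which is the property the comment relies on.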
For encryption, you could have a FIPS-compatible TLS connection over a non-FIPS Wireguard tunnel and again be no worse off (excluding performance etc reasons), whilst gaining the benefit of not having to trust one implementation exclusively.
Are there any standards that you are aware of that would prohibit mixing FIPS and non-FIPS entropy and encryption techniques as described above?
CPUs are “Enabled” and listed as a successful operation by NSA. Standards which are implemented in hardware and software are similarly backdoored. Protocols are designed to leak enough information to exploit various things at various levels. The new push to post-quantum without using mandatory hybrid constructions that include something we at least strongly believe to be (contemporarily) secure means that we should expect problems at literally every level from hardware to the newest key exchange mechanisms.
Can we theoretically build a baroque machine that combines something we trust with something we explicitly assume is compromised? Sure, and I like XOR as much as the next cryptographically knowledgeable person.
Should we? No, absolutely not. More importantly, will everyone who is required to use FIPS? Certainly not. Is it even allowed by auditors? Doubtful but maybe you can become one and bless your own solution. It’s still not a solution for everyone, not even close.
At the point where we start to even try, might we ask ourselves why we tolerate NSA and NIST doing this to the American public? Why don’t we understand what they planned to do when someone like you proposed things like you have done? After all, there is no question that NSA was playing the long game - they just believed their own NOBUS propaganda. Whoops.
Are you implying that the OPM breach happened because of a successful exploit of an NSA backdoor in a FIPS 140-2 approved cipher? Do you have any evidence to substantiate that?
My understanding of the OPM breach is that it was a classic example of poor access control combined with phishing.
Here’s one of the academics from the papers I linked above explaining his “pet” theory: https://mobile.twitter.com/matthew_d_green/status/1433476266... The papers explain more in detail, and the ACM publication seems reasonable to characterize as peer reviewed for some value of that depending on your impression of ACM.
You asked for “any evidence” and those papers explain the how, while Wired explains the what and reporting on the Project BULLRUN clearly lays out the why. It is undeniable that U.S. government sabotage to benefit NSA surveillance is involved in deployed Juniper devices. It is also widely discussed that those sabotaged devices were used by OPM. It is understood by many reasonable people in the know that OPM was hacked by Chinese intelligence and that they likely did this by compromising at least one Juniper device. See https://www.bloomberg.com/news/features/2021-09-02/juniper-m... for the Chinese link, for one example.
Bloomberg says:
“The NSA introduces an algorithm called Dual Elliptic Curve...”
“...which it urges the National Institute of Standards and Technology to approve.”
“The Pentagon—NSA’s parent agency—insists that Juniper Networks use the algorithm as a condition for some contracts. Juniper adds it to NetScreen’s ScreenOS software.”
“This introduces an alleged backdoor that could be used to spy on select NetScreen customers, which include telecommunications providers and government agencies.
Juniper includes two tweaks that it says neutralize the vulnerability.”
There are other sources which investigated this topic.
You don’t have to connect the dots, or believe the reprint from several reputable journalists or academics but I would appreciate you acknowledging that it’s not something that I made up. Many reasonable people have concluded that the NSA sabotage played some role in the OPM hack as a result of the link to Juniper.
The attack against OPM leveraged Juniper bugs, which clearly exceeds your claims. I acknowledged that your claim is at least the bare minimum possible. However there is ample evidence that it goes much much further. At the time this happened, people pushed back that it was even plausible, and the paper represents a better investigation than I will be able to provide on a message board.
What I hope we can also agree on is that a complete lack of transparency means we cannot fairly adjudicate this in public to completion. Juniper, the FBI and the NSA, and indeed NIST, have not provided a full analysis to the public or even to the relevant IG as far as I understand things.
FIPS does not impinge on me in any way whatsoever - that's my real world and that's the real world for a darn sight larger number of people than ... Americans.
I note your secret/non-secret discussion for RNGs and that seems to be a good idea - many non-FIPS standards also have a dichotomy like that or something more complicated. However, separately seeded pools reduce the entropy available (IANACryptographer); that's a trade-off of some sort that I'm not qualified to assess.
This work is done by people I respect and I will evaluate it according to the standards I have to adhere to. None of those standards is FIPS. I'm not quite in a cargo cult state but I have to take a certain amount of things on trust when it comes to this stuff!
> I can't use this RNG because it isn't FIPS validated. I can't sponsor getting it FIPS validated because the cryptography it uses isn't even FIPS compatible. It wouldn't make it past the cursory skim of a reviewer. [...] but it still means that it's a non-starter for many libraries and applications that end up having US Federal Government users
But how do these user-space libraries get the entropy for their RNG? If they read it from /dev/random or /dev/urandom or call getrandom(), it's the exact same algorithm.
To put it another way: even though this is running in user mode, it's not a user-space library; it's part of the Linux kernel, which happens to be running in user mode. If for some reason you need to make getrandom() use FIPS algorithms, you already have to patch the kernel, and when doing so you'd patch this code to match. Because this code is the getrandom() system call, which like a couple of other "fast" system calls like gettimeofday(), is implemented partially in user mode and partially in kernel mode.
Usually those libraries seed from a FIPS-validated hardware entropy source. AES-NI is a common one, it's validated on some chips, but other hardware vendors have others (we have a hardware RNG in the AWS Nitro System for example). FIPS specifies a handful of DRBG algorithms that can then be used. AES-CTR-DRBG is the most popular. That's definitely not the exact same algorithm as the ChaCha/Blake constructions in Jason's work. It's all rather carefully constructed and then checked as part of the validation process. This is probably the most common reason for userspace RNGs in cryptographic software.
And some standards (like Common Criteria) actually require you to bring your own FIPS-validated CSPRNG, which effectively makes a userspace CSPRNG unavoidable.
I wasn't aware that FIPS required your _entropy source_ to be validated in order for your library to be validated, though. BoringSSL, for example, just reads from RDRAND, getrandom(2), or /dev/urandom. Maybe the OS/CPUs it's certified for all have FIPS-validated entropy sources?
Correct; after the newer seeding policies have taken effect, many have punted to the OS/app for seeding. See https://csrc.nist.gov/CSRC/media/projects/cryptographic-modu... -- the entropy input is plaintext via API, but still needs to be from an approved source via 90B I believe, so likely means their Android kernel is also certified or they maintain something like JitterEntropy for that.
The older e.g., 36xx series Google cert predates that requirement iirc, when you could seed from a non-FIPS kernel.
> That says more about FIPS than it does this work
It would be nice to have companies collectively reject FIPS. I realize that that's a pipe dream, but we do it with patents. Lots of companies pool their patents to defend against trolls on the condition that none of those companies sue each other over patent violations.
If a bunch of companies were like "We won't require you to be FIPS, and we won't get FIPS" or something that'd be cool. Unfortunately, that can't happen with FIPS specifically as it's government mandated, but idk, maybe SOC2 or something?
Many organisations need standards, they need formally defined things to point at and say 'make me one of those'. It doesn't always have to be the best, it doesn't always have to be the easiest, but without standards you have a moving target. You might say that a particular library is great and so we'll just specify that. But the problem there is you get into dependency hell and maintenance issues. When you can point to specifications that define behaviour and algorithms, rather than code, you reduce those issues.
Folk who've not had a good experience of standards tend not to appreciate the benefits they bring and yet so much in your life to this point has been defined by work done on standards. Proprietary standards and trends are what tends to lead towards closed ecosystems that don't benefit consumers.
FIPS is a sign that it’s hard for other people (except the Chinese in the Juniper/OPM case) and possible, likely easy, for NSA.
No American should mind, they obviously have your best interests in mind. Unless you filled out an SF-86 form that was stolen from the nice folks at OPM. Whoops.
For lower-stakes streams of random data, such as unguessable patterns in video games (or perhaps in Monte Carlo simulations for a Chess or Go engine), CSPRNGs are a common solution.
You bite off a chunk of data from the RNG and use it as the seed for a sequence based on a cryptographic hash, relying on the non-repeating qualities of these algorithms to give you a convincing random data stream that is not necessarily suitable for generating cryptographic keys.
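A rough sketch of that idea, assuming SHA-256 in counter mode as the "sequence based on a cryptographic hash". The function name `hash_stream` is invented for this example, and, as the comment says, this is meant for lower-stakes streams, not for generating keys.

```python
import hashlib
import os


def hash_stream(seed: bytes, nbytes: int) -> bytes:
    """Expand a seed into nbytes by hashing seed || counter repeatedly."""
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:nbytes])


# Bite off one chunk from the system RNG, then stay in userspace.
seed = os.urandom(32)
pattern = hash_stream(seed, 1000)
```

The same seed always yields the same stream, so the only call into the system RNG is the initial `os.urandom(32)`.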
The problem here, if there is one, is that if you implemented your RNG source as just reading from a file descriptor, the subsequent bite of that proverbial sandwich gets the entire tomato slice and most of the lettuce. If the language you picked already abstracts /dev/random, then it's just a couple lines of code difference for you. If not, well then welcome to the wide world of cryptography APIs, friend.
That's not really a fault of Linux or the 'real world', that's government entities intentionally holding themselves back. I'm not even saying that's a terrible thing - gov't work is a good place to require audited security options! - but that's the tradeoff they've chosen.
This looks like an excellent idea.
I will implement support for it in my libc immediately when it's available in a release kernel.
Currently userspace has incentive to roll their own RNG stuff.
This removes that, which is good for everyone. The less incentive you give people to write code that has already been written by other, more experienced people, the better.
I would go even further and export the kernel ciphers via vDSO.
Then user space could rely on those ciphers being optimized for the host CPU and side channel free instead of everybody bringing their own crypto primitives.
I don't think there is a good reason why gnupg and openssl would bring different crypto primitives.
Doing it through the vDSO has performance advantages because it elides a lot of the syscall overhead. This work makes the crypto stack more advantageous than it was previously.
This way may actually have advantages over vDSO. Maybe you can set up IV and key with the kernel and then let the kernel do the crypto without having to have them in user space memory anymore. That would be a way to reduce risk in crypto applications, as long as you can prevent an attacker who has taken over the crypto app to retrieve the keys from the kernel. Maybe seccomp can help here.
Slightly more verbose: it is an ELF library that is mapped into every process, containing kernel code that runs (due to app calls) in userspace. Most often it works (like in the case of gettimeofday) by reading values from another shared memory segment mapped between the kernel and user space. Getting the time just involves carefully ordered reading from that shared memory.
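To illustrate what "carefully ordered reading" can look like, here is a single-threaded toy model of a sequence-counter (seqlock style) reader. It is an assumption that this matches the pattern meant here; the class and function names are invented, and real vDSO code does this in C with memory barriers, which Python cannot express.

```python
class SeqlockClock:
    """Toy shared page: a sequence counter plus the data it protects."""

    def __init__(self):
        self.seq = 0
        self.sec = 0
        self.nsec = 0

    def write(self, sec, nsec):
        self.seq += 1  # odd value: an update is in progress
        self.sec = sec
        self.nsec = nsec
        self.seq += 1  # even value: the update is complete


def read_time(clock, between_reads=None):
    """Read (sec, nsec) consistently, retrying if a write intervened."""
    while True:
        start = clock.seq
        if start & 1:
            continue  # writer is mid-update, try again
        sec, nsec = clock.sec, clock.nsec
        if between_reads is not None:
            between_reads()  # test hook: simulates a concurrent writer
        if clock.seq == start:
            return sec, nsec
```

The reader never takes a lock: it only checks that the counter was even and unchanged around its reads, and starts over otherwise.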
The vDSO functions are called using the C compiler's ABI for the platform. It's completely optional infrastructure that offers higher performance, normal system calls can still be used.
The address of the vDSO shared object is passed by the kernel to the program through the auxiliary vector, a list of key-value pairs. It's located on the stack right after the environment vector. The key for the address of the vDSO is AT_SYSINFO_EHDR.
There's more useful data in there too: system page size, CPU capabilities, loaded program's own ELF header and entry point locations and even its file name, user and group IDs, some bytes of random data. In most cases glibc will be the consumer of this data but it's perfectly possible to make use of it ourselves.
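A small sketch of reading the auxiliary vector ourselves on Linux, via /proc/self/auxv rather than walking the stack. The helper names are invented; the AT_* values are the ones from the Linux headers, and the code assumes the reader has the same word size as the process being inspected.

```python
import struct

AT_NULL = 0           # end-of-vector marker
AT_PAGESZ = 6         # system page size
AT_RANDOM = 25        # address of 16 random bytes
AT_SYSINFO_EHDR = 33  # address of the vDSO ELF header


def parse_auxv(raw: bytes, word_size: int = struct.calcsize("P")) -> dict:
    """Parse raw auxv bytes into a {key: value} dict of integers."""
    fmt = "=QQ" if word_size == 8 else "=II"
    entry = 2 * word_size
    usable = raw[: len(raw) - len(raw) % entry]  # drop any partial entry
    entries = {}
    for key, value in struct.iter_unpack(fmt, usable):
        if key == AT_NULL:
            break
        entries[key] = value
    return entries


def read_own_auxv() -> dict:
    """Linux only: the kernel exposes this process's auxv as a file."""
    with open("/proc/self/auxv", "rb") as f:
        return parse_auxv(f.read())
```

On a Linux system, `hex(read_own_auxv()[AT_SYSINFO_EHDR])` should give the address where the vDSO is mapped in the current process.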
Making providing high-quality randomness without compromises a first-class OS feature seems like a great idea. Especially because it reduces the chances of using a bad/incorrectly seeded/inadequately seeded/cloned from snapshot userspace cryptographic PRNG, and of using a non-cryptographic PRNG for cryptographic stuff.
I'm not a kernel expert, so I don't know if vDSO is the right implementation, but the idea seems sound. Make it harder for people to make mistakes that break security!
On many operating systems, including macOS and Windows, the only ABI-stable interface is a userland-to-userland interface. Application code loads a shared library vended by the system and calls functions from that library like open() or CreateFileW(), in userland. These functions in turn are usually thin wrappers around system calls with equivalent argument lists – but not always, and even when they are, it's only an implementation detail. Trying to call system calls directly without going through the wrappers risks incompatibility with future OS versions, e.g. [1].
On Linux, traditionally, the userland-kernel interface itself is ABI-stable. The userland code can be fully custom and doesn't even need to support dynamic linking. Syscall numbers and arguments are fixed, and application code can perform its own syscall instructions. You can then layer something like glibc on top of that, which provides its own syscall wrapper functions with a corresponding stable (userland-to-userland) ABI, but that's separate.
The vDSO has always been a step away from that. It's userland code, automatically mapped by the kernel, that provides its own system call wrappers. Applications are still allowed to make system calls manually, but they're encouraged to use the vDSO instead. Its original purpose was to allow certain functions such as gettimeofday() to be completed in userland rather than actually performing a syscall [2], but it's been used for a few other things. It's worked pretty well, but it does have the drawback that statically linked binaries no longer control all of the code in their address space. This, for instance, caused a problem with the Go runtime [3], which expected userland code to follow a certain stack discipline.
Anyway, this patch seems to me like a significant further step. Not just putting an RNG into the vDSO, which is more complicated than anything the vDSO currently does, but also essentially saying that you must use the vDSO's RNG to be secure (to quote the RFC, "userspace rolling its own RNG from a getrandom() seed is fraught"), and explicitly choosing not to provide stable APIs for custom userland RNGs to access the same entropy information.
I don't think that's necessarily a bad thing. It's not that complicated, and to me, macOS' and Windows' approach always seemed more sensible in the first place. But it's a step worth noting.
Just for the sake of clarity: the claim is that you must use getrandom, somehow, to be secure, or maybe more specifically "you are almost certainly not secure if you roll your own CSPRNG". Even before the vDSO, I think one would have made the same claim about getrandom.
> Just for the sake of clarity: the claim is that you must use getrandom, somehow, to be secure, or maybe more specifically "you are almost certainly not secure if you roll your own CSPRNG".
Right; more accurately stated, you must use the vDSO's RNG to securely get random numbers quickly.
> Even before the vDSO, I think one would have made the same claim about getrandom.
Because of VM forks. But an alternative would be to expose VM fork events more explicitly to userland, so custom RNGs can take them into account.
And I'm not sure that fixing the RNG, without providing a generic way for userland code to take forks into account, is enough to avoid all vulnerabilities. After all, there will always be a window between generating random numbers and using them. As a dumb example, suppose a DSA implementation first picks a random nonce, then reads the data to be signed from some external source. If the VM forks in between those actions, and the two forks choose different data to be signed, you're in trouble no matter how good the RNG is.
I said it's a dumb example because there's an obvious fix (picking the nonce after the data to be signed). But as a nonexpert, I'd be kind of surprised if there wasn't some cryptographic protocol where forking midway through operation is inherently insecure, and the only solution is to abort and retry. Heck, couldn't that apply to TLS?
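To make the "you're in trouble" part of the DSA example concrete, here is a toy demonstration of why two signatures over different data with the same nonce leak the private key. Only the signing equation s = k^-1 (h + x*r) mod q is modeled; the modulus Q and the way r is derived are simplifications chosen for this sketch, not real DSA parameters.

```python
Q = 2**127 - 1  # a prime, used as a toy group order


def sign(x: int, k: int, h: int):
    """Toy DSA-style signature of hash h with private key x and nonce k."""
    r = pow(5, k, Q)  # stand-in for DSA's r = (g^k mod p) mod q
    s = pow(k, -1, Q) * (h + x * r) % Q
    return r, s


def recover_key(h1: int, sig1, h2: int, sig2) -> int:
    """Recover the private key from two signatures sharing one nonce."""
    r, s1 = sig1
    _, s2 = sig2
    # s1 - s2 = k^-1 * (h1 - h2), so k = (h1 - h2) / (s1 - s2)
    k = (h1 - h2) * pow((s1 - s2) % Q, -1, Q) % Q
    # s1 * k = h1 + x * r, so x = (s1 * k - h1) / r
    return (s1 * k - h1) * pow(r, -1, Q) % Q


private_key, nonce = 0xC0FFEE, 0x1234567
fork_a = sign(private_key, nonce, h=111)  # one VM fork signs one message
fork_b = sign(private_key, nonce, h=222)  # the other fork signs another
recovered = recover_key(111, fork_a, 222, fork_b)
```

Here `recovered` equals `private_key`, which is why a nonce chosen before a fork is dangerous regardless of how good the RNG that produced it was.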
Actually, I see that Jason submitted an earlier patch [1] that does provide explicit fork events. I don't know whether this vDSO patch is meant as a replacement or a complement.
i think that's mostly scientific computing where you want the ability to control the RNG and even intentionally use deterministic seeds for reproducibility.
i think if the kernel is going to provide secure random numbers (which seems like a good idea), it should be through a (new) specific system call that fails unless a hardware entropy facility is available. performance seems like a secondary goal, where the primary is ensuring that people are using the right thing to generate keys and such.
There is always the option to reseed a userspace PRNG with getrandom() regularly. This makes the userspace PRNG safe and more versatile than getrandom().
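A minimal sketch of that option, assuming a byte budget as the reseed trigger and keyed BLAKE2b in counter mode as the generator. Both choices, and the class name, are illustrative rather than anything proposed in the thread. On Linux, `os.urandom` is backed by getrandom(). This sketch does not handle fork or VM snapshots, and it does not erase old keys.

```python
import hashlib
import os


class ReseedingGenerator:
    """Userspace generator that fetches a fresh key every reseed_after bytes."""

    def __init__(self, reseed_after: int = 1 << 16, entropy=os.urandom):
        self._entropy = entropy
        self._reseed_after = reseed_after
        self._reseed()

    def _reseed(self):
        self._key = self._entropy(32)  # fresh key from the system RNG
        self._counter = 0
        self._produced = 0

    def read(self, n: int) -> bytes:
        out = bytearray()
        while len(out) < n:
            if self._produced >= self._reseed_after:
                self._reseed()
            block = hashlib.blake2b(
                self._counter.to_bytes(16, "big"), key=self._key, digest_size=64
            ).digest()
            self._counter += 1
            take = block[: n - len(out)]
            out += take
            self._produced += len(take)
        return bytes(out)
```

A time-based trigger would work the same way: replace the byte budget check with a comparison against a monotonic clock.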
This is what the patch does. It does not handle the case of VM resume yet. Fork safety is achieved through a generic mechanism that any userspace generator could use. The advantage the vDSO has is that it can assume that MADV_WIPEONFORK is implemented, but that's about it.
The vDSO-based approach is certainly interesting because the kernel can know exactly where randomness caches are located and zap them as needed (on fork, periodically, after VM resume). But if entire pages of memory are dedicated to buffers anyway, MADV_WIPEONFORK is sufficient for now.
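A sketch of using MADV_WIPEONFORK (Linux 4.14+) from userspace as a fork detector: a private anonymous page holds a marker byte, and the kernel hands the child a zeroed page. `ForkCanary` is a made-up name; `mmap.madvise` needs Python 3.8+, and the pid comparison is only a fallback for systems where the flag is unavailable.

```python
import mmap
import os


class ForkCanary:
    """Reports whether we are running in a fork child of the creator."""

    def __init__(self):
        self._pid = os.getpid()  # fallback signal only
        self._page = None
        wipe = getattr(mmap, "MADV_WIPEONFORK", None)
        if wipe is not None:
            # fileno -1 gives an anonymous mapping; it must be private.
            page = mmap.mmap(-1, mmap.PAGESIZE, flags=mmap.MAP_PRIVATE)
            try:
                page.madvise(wipe)
                page[0] = 1  # marker that a fork will wipe to 0
                self._page = page
            except OSError:
                page.close()  # kernel does not support the flag

    def forked(self) -> bool:
        if self._page is not None:
            return self._page[0] == 0
        return os.getpid() != self._pid

    def rearm(self):
        """Call after reseeding in the child so later forks are detected."""
        self._pid = os.getpid()
        if self._page is not None:
            self._page[0] = 1
```

A userspace generator would check `forked()` before producing output and reseed (then `rearm()`) when it returns True. Keeping the generator state itself in such a page means the child can never reuse it.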
And it's wrong. If you initialize your own PRNG properly with multiple reseeds you saturate the entropy pool of your PRNG and subsequent reseeds are only relevant for ratcheting and state compromise proofing.
This assumes you aren't starved for entropy on initialization. If you are, it would imply a constrained environment and you are better off using getrandom() then.
There is no such thing as being "starved for entropy", once you've hit whatever threshold you require for considering your RNG to be seeded in the first place.
If your PRNG state is x bits and you initialize it with fewer than x bits of entropy, you are starved.
Since you cannot know how many bits of entropy getrandom() would give you and when it itself is reseeded with fresh entropy, you usually have your userspace PRNG sample getrandom() for some amount of time, after which it is considered initialized.
There is no magic. If entropy got injected in between getrandom() calls you get that entropy - it does not matter if the entropy is cryptographically scrambled and 'merged' into the kernel PRNG state.
All you have to do is 'merge'/'absorb' multiple getrandom() results over some time t into your userspace PRNG, with t big enough to allow for multiple reseeding events by the kernel. You are in effect getting samples of the new entries into the kernel entropy pool indirectly by doing this. Sampling too frequently is wasteful, however, as you are just going to be sampling the kernel CSPRNG with the same entropy in its state most of the time.
Doing the above is actually the only way to properly seed your userspace CSPRNG if your userspace CSPRNG has a state larger than the kernel CSPRNG (the kernel's is 512 bits, I believe). Otherwise you will be working with 512 bits of entropy as an absolute maximum even if your CSPRNG is capable of holding more, at least until reseeding (both the kernel and the userspace PRNG).
It's kind of wild, yea. I'd rather not do it. But if it's between unsafe userspace implementations and this thing, this thing is better. Maybe people will decide they don't care about hyperspeed card shuffling or whatever else. But if they do, this is an attempt to provide it safely.
I guess my biggest concern here is the notion that vDSO is going to manage the state in user space, if I understand correctly. That seems like a big footgun.
If I call the getrandom system call, and it succeeds, I am (pretty much) guaranteed that the results are properly random no matter what state my userspace program might be in.
With vDSO, it seems we lose this critical guarantee. If a memory corruption occurs, or my process’s memory contents can be disclosed somehow (easier to do against a userspace process than against the kernel!), I don’t have truly random numbers anymore. Using a superficially similar API to the system call for this seems like a really bad idea.
> If a memory corruption occurs, or my process’s memory contents can be disclosed somehow (easier to do against a userspace process than against the kernel!), I don’t have truly random numbers anymore.
A mitigating factor might be that if your process memory leaks, then the secrets generated leak anyway, no matter what generated them, so maybe not as large of a difference. But of course a generator leaking means future secrets potentially leak too. I suppose frequent reseeds could mitigate this, just as they do for potential leaks in the kernel.
But anyway, I agree that compromising application memory is somewhat more possible than compromising kernel memory, though both obviously happen.
If you have memory corruption in your process, what makes you confident your program state will let you do something useful with the randomness you get back from getrandom()?
I guess my concern is with “silent” memory corruption, e.g. someone putting in a “bzero(state, …)” by accident and winding up with deterministic randomness. Sure, they could also just as well do a “bzero(randombuf, …)” before using it but that’s much easier to detect (and in my head, somewhat harder to do by accident).
Silly mistakes like the Debian randomness bug come to mind - a program can be totally well-behaved even in the face of a glaring entropy failure, in a way that’s hard for developers to detect.
I guess? I mean, I see "something overflowed on the stack and into my randomness buffer" as being similarly common and about as undetectable. That's not to say we shouldn't invest in making APIs that are harder to misuse even if you hold them incorrectly, but I'm not sure the benefits are very compelling here.
The article mentions this case too. getrandom() on my system seems to return the required amount of random bits to perform a shuffle of a deck in less time than my clock seems to have precision for; … that's … too slow?
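For a sense of scale, a quick sketch of the deck-shuffle case: how many random bits a shuffle needs at minimum, and a Fisher-Yates shuffle driven by the system RNG through Python's `secrets` module. The function names are invented for this example.

```python
import math
import secrets


def bits_for_shuffle(n: int) -> int:
    """Minimum whole bits needed to pick one of n! orderings uniformly."""
    return math.ceil(math.log2(math.factorial(n)))


def secure_shuffle(items):
    """Fisher-Yates shuffle using the OS-backed CSPRNG for every draw."""
    deck = list(items)
    for i in range(len(deck) - 1, 0, -1):
        j = secrets.randbelow(i + 1)  # uniform in [0, i]
        deck[i], deck[j] = deck[j], deck[i]
    return deck


deck = secure_shuffle(range(52))
```

`bits_for_shuffle(52)` is 226, so a single deck needs less than 32 bytes of randomness in principle. The `randbelow` calls consume more than that because each draw is sampled separately.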
There are cases where you want tons of random numbers (e.g. Monte Carlo) and the line between "good enough" and "disastrously bad" is often unclear. Providing cryptographic random numbers is the only possible API that's both safe and generic.
As the post says, it's worth entertaining the idea of having the kernel provide a blessed way for userspace to do that, though I admit I've never personally seen a scenario where RNG was truly the bottleneck. But it'd still be nice to kill all the custom RNGs out there.
Don't you always want a reproducible random sequence for such simulations? I.e., you use getrandom for the initial seed only, record it, and do the rest of your RNG state in userspace code?
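A short sketch of that workflow, assuming Python's `random.Random` (a Mersenne Twister, not a cryptographic generator) as the userspace RNG. The helper name `new_run` is made up.

```python
import os
import random


def new_run(seed=None):
    """Return (seed, rng); draw the seed from the system RNG if none is given."""
    if seed is None:
        seed = int.from_bytes(os.urandom(16), "big")
    return seed, random.Random(seed)


# First run: draw a seed, record it alongside the results.
seed, rng = new_run()
samples = [rng.random() for _ in range(5)]

# Later: replay the exact same sequence from the recorded seed.
_, replay = new_run(seed)
replayed = [replay.random() for _ in range(5)]
```

`samples` and `replayed` are identical, so a surprising simulation result can be reproduced from the logged seed alone.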
It's a nice property, but a lot of people skip it because of the tradeoffs. I'm also sure there are lots of use cases I'm not aware of where you don't want reproducibility.
Why not just have the kernel map a page containing random bytes, that it rewrites with newly seeded random bytes when needed? Then userspace CSPRNGs could use that as a basis for their own reseeding.