“RNG broken for last 4 months” (freebsd.org)
276 points by svoyage on Feb 17, 2015 | 158 comments



Note that this is FreeBSD-CURRENT only, i.e. the development branch. This does not affect FreeBSD releases.


Word on Twitter is that there are some pretty major companies on -CURRENT.


Then they deserve what they get.

Honestly, this is probably some kind of confusion over a company or two that has -CURRENT as an upstream to their own fork of the OS because they are reliant on some kind of new changes in the system that aren't in -STABLE.

Even if that's the case, you'd hope they'd be taking extra special precautions considering what they are doing.


I agree with you in principle but I think in practice you'd need to give a little bit of leeway here because 'CURRENT' is a very suggestive name. And it does not suggest 'Do not use for production'. It rather suggests the opposite.


When I read "CURRENT" I assumed that was the label for the most current stable release. I would expect NIGHTLY or ALPHA/BETA for something that shouldn't be used for important things.


I think you should learn and read about what you're downloading and running in production (given your hypothetical example). It's basic literacy. For example, would you go to Linus's github: https://github.com/torvalds/linux and just start downloading that and try running it verbatim in production because "oh! it's Linus's branch! Linus makes Linux. It is Linux!" and then complain because it's not an actual distribution?

Because that's how your comment reads.

The linux development kernel is called "mainline". Would you want them to change it to Nightly or Alpha?


Yes, people should read the manual.

And FreeBSD should name things better. Those two things are not mutually exclusive.


I agree you should read and understand what you're downloading and running in production. And I'd like to think that if I were in a position such that I was making those decisions I'd not make the mistake of assuming CURRENT meant current stable release. I'm just saying, reading the post I originally had assumed that's what current meant.


If you give inaccurate information and then correct it in the manual, you still gave inaccurate information.


If you didn't read the manual before putting something in production, you still didn't do your job properly.


If there is one group that categorically does not read the manual then it is computer people. And if they do read the manual it is because something didn't work as expected.


That's not entirely wrong. On the other hand, if you can't tell whether the OS you deployed in production has, say, security support, maybe you're not the right person to administer it.


I would not state it as broadly as "computer people."

A lot of computer people are indeed that way. But many others are not.

Invariably, when something goes wrong, the group that doesn't read the manual visits forums or chats to seek help from the group that does.


This probably comes from overconfidence. If you're pretty sure you can handle whatever comes up, it's easy to see reading documentation in advance as a potential waste of time.


Who gave you inaccurate information? Assuming CURRENT means stable is just plain assuming.


Principle of Least Astonishment. The names of things should not mislead a reasonable user as to their purpose.

It is still the responsibility of the user to educate themselves on the things they use; but it's also the responsibility of the designer to make things straightforward and intuitive. I'm okay with blaming both of them, but don't pretend that the designer has no responsibilities here.


I can understand the terminology may be confusing to someone new to FreeBSD, but the handbook clearly documents the meaning of -CURRENT and -STABLE, and explains that -CURRENT is the "bleeding edge" of FreeBSD development.


"EDGE" may be a better term in that case... meaning it may cut you... there's risk... "CURRENT" to me brings a thought of, okay, it's current and up to date (safe from known exploits).


If you go here: https://www.freebsd.org/where.html

There is RELEASE, CURRENT, STABLE. If you choose one without knowing the difference as to what those titles mean, you deserve whatever comes next.


In my experience with open source projects, STABLE should be first on the download list. RELEASE should really be labeled release-candidate, with stable being considered the release.


RELEASE is the release, -STABLE is the minor branch for bugfixes and minor new features that don't break backwards-compatibility ("stable ABI"). -CURRENT is, well, current - it's the tree the developers are working on, and is the current state of FreeBSD, where releases are archives at a particular point (more-or-less).

I suspect much of this goes back to when development wasn't quite so public, so these are names that are meaningful to the developers, not the public.


Three key words there - "In my experience". Well now you've experienced a project that doesn't fit your current model of the world. Should those of us who have RTFM and experienced different vocabularies adjust to your model or should you recalibrate?


It doesn't really matter what anyone thinks a name means based on their prior experience. If they didn't RTFM I've no sympathy, particularly if that user is depending upon security features within the OS. Users, particularly corporates, have to take responsibility for their own situation and if they got caught out by this because they didn't understand or bother to take the time to read what -CURRENT is then I've no respect for them in any technical capacity.


So, then we should label the unstable, may-crash-at-any-time release 'USETHISONE' and the stable one 'POSTALPHA'.

Really, I don't get this attitude of 'if they can't be bothered to read the manual they get what they deserve'. It's everybody's loss to have more compromised machines on the net because compromised machines allow bad elements to gain a foothold and from there it can get much worse in a hurry.

So you treat this as a communications problem and reduce the potential of error by properly labelling your releases (this costs $0) instead of telling those that got bitten by it that they 'got what they deserve'.

Whether you have respect for them or not doesn't enter into it, it's a security issue, not a respect issue.


CURRENT has meant MAYCRASH in BSD for... 25 years now?

I don't know how suggestive it is to someone who doesn't use FreeBSD, but if you as much as download and install it, there's no way you don't stumble into at least one "-CURRENT is not what you want for production" fine print.

This isn't a communication, security or respect issue, it's a competence issue. You don't just install an OS on a server and "somehow" not know you're installing a development version!


Lance Leventhal is one of my favorite authors of machine language books, he wrote a series of books on old school microprocessors, each of which followed a fixed template and taught you the basics of yet another processor in a familiar setting.

One of the lines from those books that stands out very clearly 25 years after reading them last: "If a variable is tallying the number of horses name it 'nhorses' not 'qdogs'".

Names matter. So if 'CURRENT' means 'MAYCRASH' name it so.


-CURRENT is called CURRENT because it reflects the CURRENT development efforts and their CURRENT state. Of course it may crash. When it stops crashing, it's going to be released and that's how it will be turned into a RELEASE.

There is literally (and I'm using the word in its proper sense, not as hyperbole) no way you end up with FreeBSD on a server without knowing this. No one just installed it "by mistake" in a production environment or mistook it for a STABLE version.


Many projects use "head" to refer to their development branch. That's equally as non-meaningful, yet nobody complains about that.


"Head" is very commonly used because it is from the language of revision control systems. CVS, Subversion, Perforce, Git, Mercurial, and probably others all refer to the latest checkin as the "head". It has a specific meaning to a much larger group of people than just FreeBSD users.


-CURRENT machines shouldn't be "on the net" in any meaningful way unless you've calculated the risks and are willing to accept them.

I don't get this attitude of people who think projects should fit their model of how things should be named. Many of us have no problem understanding the labelling, it's really fairly intuitive. We also have no problem understanding alternative schemes used in other projects. Just because it's different to what you're used to doesn't make it wrong.

Whether or not I have respect for them matters quite a lot if they're trying to market security related products. Basing themselves on -CURRENT and not understanding the potential consequences speaks volumes about that organisation's or team's competency in their chosen field.


Given that... wouldn't -DEV or -EDGE or -UNSTABLE make more sense?


No they don't. It's reasonable to trust that FreeBSD is providing decent RNG. Even in the dev branch. I mean, why would they not?

Don't blame the user for upstream's mistake.

Do FreeBSD developers (I mean, the ones who had nothing to do with this but also use the dev branch) "deserve" it, too?

In general, do you always blame the victim for someone else's mistakes?

edit: I am not a BSD user. I am reading now that the dev branch is purposely crippled anyway and known to be terribly unstable. If that's the case, maybe it is idiotic for non-developers to use it. I'll leave this comment here instead of deleting it so nobody else says the same thing. In the meantime, I'm glad to be using a rolling release Linux distro that always has bleeding-edge software without ever having any problems.


> bleeding-edge software without ever having any problems

Did you seriously just suggest that Arch never has any issues? How did you manage to achieve that, because I gave up on Arch due to it breaking at least once a fortnight. The "bleeding edge" is called that for a reason: you're likely to bleed when using it.


I run arch with the testing repos on both my work and home desktops, and the normal repos on one of my servers, and I've had updates break things exactly once between them (broadcom NIC wouldn't come up on my home desktop), and the fix was just a matter of rolling back 1 kernel version. So I guess YMMV?


I very rarely have any issues with Arch. Like, maybe once a year. And they're always easy to fix.

One big difference is I don't run a desktop environment (kde, gnome, or whatever crap is out there now). I suspect that makes things a lot less likely to break when I do updates.


It is also terribly stupid for software and application developers to run -CURRENT (as a base/primary platform) because it's for the development of the OS itself.

The closest linux analogy is Linus's unstable git branch or maybe the redhat rawhide https://fedoraproject.org/wiki/Releases/Rawhide - even that has warnings:

from the Rawhide wiki: "Not recommended for production systems

We do not recommend that you run Rawhide as your primary production operating system. Instead, we suggest you could install and run Rawhide:

    As a live environment only
    In a virtual machine (VM) instance
    On a secondary system
    On a multiboot system, alongside a stable release of Fedora or another operating system 
This allows you to test Rawhide without any impact to your day-to-day workflow. "

-CURRENT is the freebsd version of that: you know that it's going to be where things are being torn out and put back in on an hour-to-hour basis.

The group of people using -CURRENT is akin to the group of kernel maintainers in linux-land.

I do admit though - the rawhide warning is more verbose and clearer than the FREEBSD-CURRENT warnings. That's one part that I can see the community improving. I might even write some verbiage myself for contribution.


>It is also terribly stupid for software and application developers to run -CURRENT because it's for the development of the OS itself.

Yeah, and that's why you should use it, to be sure your app works ok with the upcoming changes to the core OS.


Not necessarily. If you want to know if your app works with upcoming FreeBSD release versions, you actually care - and you would know to check

https://www.freebsd.org/releng/#freeze

and

https://www.freebsd.org/relnotes/CURRENT/relnotes/article.ht...

and

https://wiki.freebsd.org/WhatsNew/FreeBSD11

and be subscribed to the -CURRENT mailing list - which the development documentation indicates as mandatory when using a -CURRENT build.

Then you would know whether the compiler suite is being torn out (remember? FreeBSD switched from GCC to LLVM/Clang - yes - that was done in the last -CURRENT development cycle)

So your software might not even build because of certain base OS issues. BUT you would know all that already - and you would run -CURRENT in a VM that you blow away and rebuild as needed.

Or in continuous integration like jenkins. https://wiki.freebsd.org/201405DevSummit/Jenkins

BUT NO - you would use it in testing only. You wouldn't use it as a main platform.

Unless you don't really care - then it's on you.

Many devs run 10.1 and then run -CURRENT in VM's or on a spare laptop or machine for this purpose.


> No they don't. It's reasonable to trust that FreeBSD is providing decent RNG. Even in the dev branch. I mean, why would they not?

Because it's the dev branch and they could have any number of good reasons for disabling, removing, or temporarily breaking the RNG while they're rewriting or refactoring it.

It's very hard to have sympathy for people who may be using this in production. You have to go out of your way to get -CURRENT, and the download page clearly says it's aimed at "developers and bleeding-edge testers only." That's why the article is a message on the mailing list and not a security advisory somewhere.

By definition, the dev branch of any software project is unstable and not suitable for production.

> In the meantime, I'm glad to be using a rolling release Linux distro that always has bleeding-edge software without ever having any problems.

I think any long time user of Debian's unstable branch (almost equivalent to -CURRENT) would disagree with that statement.


Regarding your edit, you mean like FreeBSD -STABLE?


It is pretty well explained in the docs: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/cu...

If they're confused on that, they probably have other issues to worry about.


Sometimes you end up depending on stuff that isn't in -STABLE yet. Sometimes it is edgy new stuff that comes with a risk, but fixes for existing functionality also fall in this category.

Living with broken software isn't a compromise a lot of users make.


>Then they deserve what they get

Yeah, because it could totally not slip through to a release... Like, you know, Heartbleed et al...


Well they should be on stable, that is what stable branch is for.


It seems reasonable to want to keep a system around to test upcoming releases on. It does not seem reasonable to run that in production.


The -CURRENT is absolute bleeding edge, it is what FreeBSD 11 is going to be in about two years. Are you sure you're not mistaking it with -STABLE which is the bleeding edge of 10.x?

I'm more inclined to believe that the major companies are running on -STABLE which is still somewhat risky, but at least typically everything that is there was somewhat tested.


It's entirely possible that a lot of people don't know what -CURRENT really means if they're new to FreeBSD. They could be on 10.1 and think that that means that they're on -CURRENT, and thus affected by this bug, but they're not.


That is pretty stupid of them.


Names/sources/anything?


Apparently NetFlix follows -CURRENT.


Even if someone says they do, I don't believe it. As someone who runs -CURRENT on vm's and physical hardware, it's not stable enough to do anything aside from development and testing on it, and it's not meant to be stable, on purpose.

Unless Netflix enjoys filesystem wipes, file corruptions, and extremely reduced performance due to kernel level debugging flags that are enabled by default (and often can't be disabled due to ongoing development efforts), then they are not running -CURRENT. Remember, -CURRENT runs slowly because of debugging code!!


There's a lot of public evidence of NetFlix's use of FreeBSD and their desire to generally stay up-to-date. I'm not aware of any credible statement that they're actually on -CURRENT though, and even if they were at some point they may not have updated since then.

They are heavily involved in FreeBSD development and I'm sure they're keeping on top of developments in -CURRENT, but I believe they are not using affected versions anywhere in production.


You can find the following slides from BSDCan 2013 that state Netflix runs FreeBSD 9.

https://people.freebsd.org/~scottl/Netflix-BSDCan-20130515.p...


Define "follow". My company "follows" development but does not run production servers with it.


I found only an unverified but specific claim in an ars forum thread that they run current on their CDN: http://arstechnica.com/civis/viewtopic.php?f=2&t=1262749&sta...


This was a comment I pulled off my Twitter timeline. Hence the apparently.


Same.


Netflix has said publicly that they run FreeBSD 10 stable. Here's a slide deck from 2013 that says they were running FreeBSD 9 at the time.


I guess you have to be aware of the trade offs when choosing a platform. I hope they were.


Right, which means that there will be a disproportionate number of FreeBSD developers running it. These developers have commit access to the source tree, including -STABLE and -RELEASE.


Approximately the worst possible vulnerability, arguably worse than kernel RCE, because a kernel RCE requires an active attacker to intersect with the vulnerable host while it's still there; the broken RNG will leak secrets that will be usable retroactively.

Ouch.


Well, no. It was caught in the CURRENT branch -- i.e. latest-and-greatest snapshots for developers. Users who use this branch are supposed to accept some breakage here and there.

Security requires a threat model. Something can only be "the worst possible vulnerability" if it causes a large amount of damage consistent with the threats which are in the model. The threats that you're thinking of are not usually contained within the models of cutting-edge-dev-systems, which will be protected by firewalls etc. on intranets.

There's really a need in both cases for "an active attacker to intersect with the vulnerable host while it's still there." A broken RNG in such circumstances is much, much weaker than kernel RCE.


Or as Juli Mallett put it on Twitter, "If you can deal with FreeBSD -CURRENT crashing, you can deal with the RNG being b0rked for a few months."

This is indeed a serious issue, and I'm disappointed it was in the source tree for four months. But -CURRENT is the development branch, and is not supported by the FreeBSD security team, or supported for use in production.

From https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/c...:

FreeBSD-CURRENT is made available for three primary interest groups:

Members of the FreeBSD community who are actively working on some part of the source tree.

Members of the FreeBSD community who are active testers. They are willing to spend time solving problems, making topical suggestions on changes and the general direction of FreeBSD, and submitting patches.

Users who wish to keep an eye on things, use the current source for reference purposes, or make the occasional comment or code contribution.

FreeBSD-CURRENT should not be considered a fast-track to getting new features before the next release as pre-release features are not yet fully tested and most likely contain bugs. It is not a quick way of getting bug fixes as any given commit is just as likely to introduce new bugs as to fix existing ones. FreeBSD-CURRENT is not in any way “officially supported”.


This bug is much, much worse than a crash.

I can understand the sentiment that you should be prepared for even horrible bugs like this if you're running -CURRENT, but "if you can deal with crashing, you can deal with this" is completely wrong. Instability is a completely different beast from potentially generating easily crackable crypto keys.


In my opinion, it's much much better that this was corrected! If there were to be any bug of any kind, -CURRENT is the best place in freebsd to have it.

I'm not sure what you're trying to argue for/against unless you are looking at things from a pure number theory/computer science research point of view? What would people do then? Create perfect flawless code that never malfunctions even as they write it? Code that never has bugs does not exist.

Perfect algorithms don't exist either.

-CURRENT is exactly the place for this because RNG bugs exist in reality.

If you read emaste's post above, this bug "potentially generates easily crackable crypto keys" for the safe to a monopoly game that holds monopoly money.

It's a development kernel.


The only thing I'm arguing here is that dealing with crashes does not imply you can deal with severe crypto compromises. In short, that the above-quoted tweet is completely bogus. That's all.


How are "severe crypto compromises" any worse than crashes and potential data corruption and business interruption?


Because it poisons data produced on that machine in an invisible way.

Normally you could use a system like this on a trial basis and keep the results. For example, you might run some image processing on it. You'd want to make sure the output was good, but having done so you could then use those images for real-world tasks without worry.

But replace "image processing" with "generate a private key" and now you're deeply screwed.

Well, now we know. Don't generate crypto keys on pre-release OSes that you're going to use for anything important. That's reasonable. But I don't think many people would have thought that way before.

And of course pre-release OSes aren't the only place the danger lies. Debian had a similar problem that made it to releases and stayed undetected for a couple of years. The point is just that "silently break all your crypto" is a much worse vulnerability than mere crashes, or even corruption.


A crashing test environment is no real danger, a compromised test environment could be the source of a lot of trouble (stealing code deployed to it, ...). Of course, it shouldn't be exposed enough to be successfully attacked in the first place...


Any ssh key generated with a bad RNG could be easily crackable, as an example.


I agree that this is much worse than a crash if you care about your keys, and I'm not trying to suggest otherwise.

My point is that -CURRENT may be rather unstable at any given time, and is explicitly not supported. One should not be using it in a production deployment where sensitive key material may exist.


There's a point that's even more germane: if you take the money and dev time that would be required to build an exploit to hose your system, you could probably for a comparable sum pay your programmers to obfuscate a new zero-day vulnerability into the FreeBSD kernel on the premise that it's an experimental feature.

In that regard the 'crashability' of the branch is indeed more significant than a random-numbers exploit: the crashability of the branch is a mnemonic for its general permissiveness towards unaudited (or not-sufficiently-audited) code. When you remember that your adversaries can inject code into the OS, that changes your threat modeling.


Huh? "This crashes, therefore I bet it has backdoors." Does anybody really think like that? What happens when the crashes get fixed and the not crashing backdoor turns into -release?


It makes sense if you look at it as a correlation, not as causation. The point is that one cause (less review of experimental kernel features), which is known to underlie the "this crashes" problem, can also potentially cause the "it could have back-doors" problem.

Of course, as you say, not-crashing back-doors can make it into releases, too. Open source's generic promise that more eyes will make bugs shallow is not a guarantee.


It's a weird rhetorical position you're putting me in, where to agree with you I'd have to simultaneously accept that -CURRENT isn't widely deployed (reasonable!) and that a broken kernel RNG isn't a game-over flaw (not so reasonable!).


The reason that the latter point is reasonable is, it trivially isn't a game-over flaw for systems which do not have game-overs.

What we know about every system that installed FreeBSD-CURRENT is that the systems administrators at the time fully accepted an operating system:

1. That is not in any way "officially supported". (FreeBSD's words, not mine.)

2. that may for short periods of time "not be buildable."

3. that "is not a quick way of getting bug fixes as any given commit is just as likely to introduce new bugs as to fix existing ones".

4. that is much weaker in guarantees than the FreeBSD-STABLE branch, which expressly disclaims, "one should not blindly track FreeBSD-STABLE. It is particularly important not to update any production servers to FreeBSD-STABLE without thoroughly testing the code in a development or testing environment."

If someone has signed off on these topics, then there is no such thing as "game over". The server isn't important enough for "game over". If it is, then the security vulnerability was not the broken RNG but tracking FreeBSD-CURRENT in the first place.


Suppose a developer generated an ssh key while running -current and shared /home with -stable. Then the vulnerability would long outlast the use of -current.


It's not that it's not widely deployed - it's that -CURRENT is deployed by people who have been warned that it could fail at any moment because it is in constant development. It's a development tree, sometimes it doesn't even boot! Sometimes on svn upgrade (and recompile) it can hose your filesystem partitions and that's expected. Forget about the RNG - when someone is working on filesystem code it corrupts files and you lose all your data! I can say this with a straight face because that's what it's for.

I say it's a game-ON plan because thank goodness it got caught in -CURRENT now - that's the way the development process is supposed to work.

I myself run -CURRENT 2 ways - one is a sandbox development physical box that has no access to any of my other servers (I use it to try ports to see if they will still work on -CURRENT), and the 2nd way is in a VM on my laptop to fire it up to see if it boots from time to time.


> A broken RNG in such circumstances is much, much weaker than kernel RCE.

How do you reach this conclusion? If nobody runs -current, then the damage associated with a -current RCE must also be scaled down by a similar factor, no?

On the other hand, a -current test system which later gets upgraded to the next release for production will still have busted keys on it, but would have had the RCE fixed.


This is rationalistic.

I guess if my house is burgled because I left the door unlocked, you'd say I'm blameless as long as my "threat model" assumed that any burglars always come in through the window.


A more appropriate house analogy would be getting burgled by someone walking through your not yet existent living room wall of your half constructed house. The security flaw then is not that you're missing a wall, as it's not reasonable to expect walls to be put up instantaneously, but that: a) you left valuables at a construction site or b) you didn't have a fence around the construction site


I mean, yes.

What you're trying to point out is that sometimes the real threat model (i.e. the security concerns list of the stakeholders) is not the same as the explicit modeling they do of threats. And that's actually very common: HTTPS, for example, is based on a model of eavesdropping threats which only applies to a small case of real-world security problems (public WiFi), models a lot of threats which haven't been so important (ISP malfeasance, DNS hijacking) and the resulting certificate system may be worse for some bigger matters of network security (protecting from government eavesdropping, for example) when compared to something like SSH.

But HTTPS is a good security standard for what it does, and we don't claim that HTTPS sucks just because a lot of people have malware on their PCs which can trivially transmit their traffic to untrusted third parties. Rather, it's your responsibility to revise your explicit threat models to accord with your own implicit expectations. Your clients' students are suddenly upset that your client was storing Social Security Numbers on your servers, which could be read by the public via your API? You have hereby discovered that your "explicit" threat model, which left the front door unlocked, was failing to live up to an "implicit" threat model which you had. You now need to either punt the problem off to the client ("we're not storing your sensitive info, our service is not appropriate for that!") or update your explicit model ("we're re-securing our systems and assuming that everything is potentially sensitive, so new permissions systems will guard access").

Running and continuing to run -CURRENT when you know that occasionally the front door's locks can fall off for any or no reason, means that you've accepted this reality.


This is a way to discourage and scare off people from running -CURRENT. That's why FreeBSD releases, in general, are much buggier and have many more regressions than OpenBSD, because on OpenBSD, something like this in -current would be as big of a deal as it making it into a -release, and everyone (not just the developers) is actually encouraged to run -current to make sure any kind of bugs would be caught in time.

BTW, links to the referenced revisions:

http://svnweb.freebsd.org/changeset/base/273872

http://svnweb.freebsd.org/changeset/base/278907


In my opinion this is a great thing to expose people to the freebsd development process! It's a big enough deal to get people to understand just what -CURRENT is, and hopefully it makes people want to build it and start fixing bugs!


How bad is the randomness here? Totally predictable? Predictable if you don't have an external entropy source (e.g. a hardware RNG)? Predictable if you don't have user input (e.g. keyboard typing etc)?


I'm not an expert, but it looks to me like if randomdev_init_reader isn't called then read_random will use dummy_random_read_phony which in turn calls random(9), a linear congruential generator.

http://fxr.watson.org/fxr/source/dev/random/randomdev.c?im=3...


If it does use an LCG, then it is totally broken from a cryptographic standpoint: http://security.stackexchange.com/q/4268
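
A minimal sketch of why, assuming a generator that hands out its entire state as output. The constants are the classic Park-Miller "minimal standard" values, picked for illustration; I have not checked them against what random(9) actually uses.

```python
# Illustrative LCG. The constants are the Park-Miller "minimal standard"
# values and are an assumption, not taken from the FreeBSD source.
M = 2**31 - 1
A = 16807

def lcg_next(state):
    """Advance the generator one step; the output is the whole state."""
    return (A * state) % M

def predict(observed, count):
    """Given one observed output, return the next `count` outputs."""
    out = []
    state = observed
    for _ in range(count):
        state = lcg_next(state)
        out.append(state)
    return out

# The "victim" seeds the generator with a value the attacker never sees.
secret_seed = 123456789
victim = []
state = secret_seed
for _ in range(6):
    state = lcg_next(state)
    victim.append(state)

# The attacker sees only the first output and reproduces the rest.
guessed = predict(victim[0], 5)
print(guessed == victim[1:])  # prints True
```

Real generators may truncate or mix their output, which makes recovery more work, but it does not turn an LCG into a CSPRNG.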


Why doesn't this cause a crash? Compatibility?


There are some places early in the kernel boot where "random" numbers are needed but they don't need to be truly unpredictable. A similar approach is used for time: If part of the kernel boot process looks at the clock but we don't have a real clock running yet, just return the values "0", "1", "2", etc. as the time since what really matters is that they are monotonically increasing.


How catastrophic is this compared to the old Debian weak key issue?


It sounds about the same, with differences being that it would affect anything that uses kernel randomness (Debian's problem "only" affected code that used OpenSSL's CSPRNG) and it didn't make it into a release (Debian's was out in the wild for a couple of years before being discovered).

In short, I think it's a bit worse if you're actually vulnerable, but vulnerable systems will be much more rare.


If this ever actually got out into -RELEASE or -STABLE then it'd be catastrophic, but -CURRENT is literally the branch devs are actively committing to, so it's actually a non-issue.


Folks don't seem to note the torn out walls, men with jackhammers, blowtorches, rivet guns, nor do they catch sight of the enormous, bold hazard signs.

For some, it's a gleeful traipse into the 5m deep concrete pour, briefly regretted. I salute those brave souls! Why they do what they do may never be known, even to those who partake, but they do it with conviction! And, it's a damn fine thing, if you ask me!


I actually wonder how serious this is. I know that if the RNG wasn't seeded properly the random numbers will be somewhat predictable. But what exactly do you need to know to predict them? The time of seeding? The time of the number generation? How would one obtain either for an SSH key, and use it to break the key? I am genuinely curious.


Would a broken RNG be a risk to leak private keys? As in, even if the key was generated on a safe kernel, merely using the key to encrypt or sign data on the broken kernel would compromise the keys?

This article mentions DSA private keys and poor RNGs: http://rdist.root.org/2010/11/19/dsa-requirements-for-random...


Not sure about the keys themselves if they've already been generated, but even after a key is generated, a lot of situational stuff involves random numbers (session keys, initialization vectors, nonces, etc), and lesser variation in these can make it easier to attack traffic with known content patterns.


This is a very good point, one that I missed in my followup on the FreeBSD-CURRENT mailing list. See also https://www.imperialviolet.org/2013/06/15/suddendeathentropy...


Yes, even with correctly generated keys if you perform DSA signatures with a broken RNG you risk reusing a k value which will leak your private key.


You don't even have to re-use a k value - if the attacker can guess your k value, one signature is enough to recover your private key.
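
Both cases fall straight out of the textbook DSA signing equation s = k^-1 (h + x*r) mod q. A rough sketch in Python, using toy parameters that are purely illustrative (p = 23, q = 11, g = 4 are far too small for real use, and the function names are made up for this example):

```python
# Toy DSA parameters (hypothetical, far too small for real use):
# Q divides P - 1, and G has order Q modulo P.
P, Q, G = 23, 11, 4


def sign(x, k, h):
    """Textbook DSA signature of hash h with private key x and nonce k."""
    r = pow(G, k, P) % Q
    s = (pow(k, -1, Q) * (h + x * r)) % Q
    return r, s


def key_from_known_nonce(r, s, h, k):
    """Solve s = k^-1 (h + x*r) mod q for x when k is known or guessed."""
    return ((s * k - h) * pow(r, -1, Q)) % Q


def nonce_from_reuse(s1, h1, s2, h2):
    """Two signatures sharing k give s1 - s2 = k^-1 (h1 - h2) mod q."""
    return ((h1 - h2) * pow(s1 - s2, -1, Q)) % Q


x, k = 7, 3                # private key and nonce, both secret
r1, s1 = sign(x, k, 5)     # signature over hash value 5
r2, s2 = sign(x, k, 9)     # second signature reusing the same k

# Case 1: the attacker guesses k and needs only one signature.
print(key_from_known_nonce(r1, s1, 5, k))          # prints 7

# Case 2: k is unknown but was reused across two signatures.
k_found = nonce_from_reuse(s1, 5, s2, 9)
print(key_from_known_nonce(r1, s1, 5, k_found))    # prints 7
```

This needs Python 3.8 or later for the modular inverse via `pow(a, -1, q)`.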


Reading the comments here is somewhat frustrating. Even the most recent ones seem to miss the significance of the RNG bug being in the -CURRENT branch only.

They found a bug, and it will be fixed. The fact that this bug was found in -CURRENT is a good thing. This is a pre-alpha, not-for-production release. The development model is working as intended, no?


I think the tide of blaming people running CURRENT and attempting to dismiss the issue instead of addressing the extremely serious vulnerability is troubling, since even in the limited Venn diagram of people I know that run FreeBSD, one of them matches this template:

1. STABLE prod fleet

2. CURRENT dev workstation

3. TLS certificate private keys in prod generated on CURRENT workstation to be signed by authority

4. Potentially vulnerable TLS keys existing in STABLE fleet

You folks can twist this to try to deflect severity by pointing to CURRENT, but in the real world, it existing at all for four months is extremely serious and must be taken seriously. My production TLS keys were created on my OS X laptop, because I don't often have to think about a compromised random number generator and this is a tradeoff I make in my own life.

I guess if I truly cared about my TLS keys, I should have made them on a copy of airgapped Warty Warthog and run them through a shitload of random analysis tools before shipping them off to be signed. My bad.


You might want to ask them again if 2 is true. In reality, I highly doubt they really use what you described in point 2. (or it's a miscommunication).

Unless they recompile world all the time (a nontrivial amount of work and a waste of time if you are not actively developing the OS itself) - CURRENT is actually much much slower because of debugging flags that are enabled by default. Some of these flags make the kernel panic upon certain types of errors and drop into kernel debugging mode allowing the examination of kernel dumps. I can't fathom why anyone would generate real security certificates in this kind of environment.

You can recompile with the WITNESS (run-time lock order verification; it incurs a performance loss but is useful for catching lock-order problems) and INVARIANTS (run-time assertion checks and tests for kernel data structures) options off, but then you would have to spend time recompiling the kernel instead of working on your software.

https://www.freebsd.org/doc/en/books/developers-handbook/ker...

Note that CURRENT is packaged as snapshot binaries without binary upgrade capability - which means that once you install "one day's" -CURRENT the only 2 ways to upgrade are a) download the new iso and reinstall the entire OS or b) check out the freebsd source code via svn and recompile the system (often world changes as well so you need to do both kernel + userland).

If 2 is actually true, then unless the person you know is being paid to work on FreeBSD features, or is validating and porting FreeBSD to a different hardware platform like ARM or PS4 or an embedded device like a medical imaging device or a router (e.g. Junos), they are wasting a lot of their employer's time and foot-gunning themselves massively in security.

If you show them this post, ask them what they are doing with -CURRENT?

I am open to learning new things so I'd really like to know!


I spent way too much time typing this reply. I should just join the freebsd development team and learn the answer to my own question.

Thinking about this again - yah - the poster might be referring to some kind of workflow like this:

production machines are platforms that run -STABLE

there is some kind of device, embedded or otherwise, that they keep locked up somewhere in a lab. It could be the new xxyz multicore switching fabric [imaginary name] that is running a new version of a BSD-OS variant that's undergoing verification testing - they run hardware development and need -CURRENT's capabilities for debugging the system itself. It generates some keys that will be distributed into the pool of machines running -STABLE so that in the future, when this new variant comes into the market, there will be "pre-seeded" keys for compatibility (i.e. older versions of systems will be able to interact via signed certs with the new system).

Since FreeBSD is BSD licensed, there can be any number of things people are doing with it without anyone else knowing - so maybe to give the benefit of the doubt, I can envision a workflow that needs -CURRENT as a workstation / dev platform.

I think one weakness in my thinking is that INVARIANTS/WITNESS/kdb/ddb can be enabled on -RELEASE and -STABLE distributions as well! Why not just do everything on -STABLE even if you need kernel debugging?

If they need -CURRENT for new hardware support, it shouldn't be too hard to figure out from the svn log and the rolling release notes. It's kinda fun trying to reverse engineer the job of the thread parent's acquaintance!


> I guess if I truly cared about my TLS keys

Or a tested, non-alpha, security team verified edition of FreeBSD?

I mean come on, we are not saying that you should only generate keys from cosmic rays and personal messages from $DEITY. Only that you do it on a production-ready OS. Is that really too much to expect?

On a side-note. Unless your devs are involved in the development of FreeBSD, why are they on -CURRENT?


Maybe I'm crazy or have spent too much time in banking instead of in hip startups or something, but I'm struggling to understand why you'd want to do private key generation on a workstation, especially a workstation running the unstable version of an OS (which might break at a moment's notice), instead of a stable production environment, ideally a fairly isolated one dedicated to the task.


I am pretty sure they are taking it seriously. If you did #3, then that is on you. It's a bloody bleeding edge version where nothing, NOTHING is guaranteed. If you generated your keys on a bleeding edge version and then moved them to STABLE, lordy you should be fired.


It is very good that the bug was found and fixed in -CURRENT, but it's also important not to be dismissive of the issue. It is unfortunate that some recent posters are clearly confusing -CURRENT with the stable branch from which releases are cut. However, this is a very serious flaw that existed in the tree for some time, and is indeed a big concern in the FreeBSD developer community.


In my mind, no, it is not. But it's been a long time since I've contributed code to a *BSD.


Can you describe in what way it's not working as intended?


No, I couldn't do so two days ago, unfortunately. Sorry for any confusion.


As a bit of a tangent: Does this happen semi-regularly because proving "randomness" is so difficult? Is there a system which can, with enough data, show that something is at least nth degree of randomness or is such a thing impossible mathematically?


I would guess so, yes. You can never prove that something is random, merely that it behaves randomly "enough". There are lots of test suites[1][2] you can run that will test that a generator is uniform and doesn't follow any patterns or have any other statistical weaknesses, but you can never prove that something is fully random without knowing the process that generates those numbers. A statistically sound RNG might just output the digits of pi, which would be non-repeating, uniform and pass any randomness test, but if you knew which digit the RNG started from and how many it has output so far you could easily predict the next one. Likewise, if you're given a large enough sample you can search the digits to figure out the state of the RNG.

[1]: dieharder: http://www.phy.duke.edu/~rgb/General/dieharder.php

[2]: http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP80...


Statistical tests exist, but when it comes to cryptographic security, they are virtually useless. A statistical test attempts to examine a stream of bits and determine if they "look random" from a narrow, naive point of view, often predefined ahead of time.

On the other hand, a cryptographic adversary actively seeks to break the RNG, which may include tricks that statistical tests simply do not account for. As an example, the Mersenne Twister passes many statistical tests, but after observing 624 consecutive 32-bit outputs (for the common MT19937 variant), an intelligent adversary (with some math) can predict all future outputs of the RNG. That is not something a simple test can uncover.
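
For the curious, the Mersenne Twister attack fits in a few lines. This is a sketch against Python's own `random` module, which uses MT19937; for that variant the number of outputs needed is 624 consecutive 32-bit values. The helper names are made up for this example. Each output is a reversible "tempering" of one state word, so undoing the tempering on 624 outputs rebuilds the whole state.

```python
import random


def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def undo_left_shift_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x


def untemper(y):
    """Reverse MT19937's output tempering to get the raw state word."""
    y = undo_right_shift_xor(y, 18)
    y = undo_left_shift_xor(y, 15, 0xEFC60000)
    y = undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = undo_right_shift_xor(y, 11)
    return y


def clone_from_outputs(outputs):
    """Build a generator that continues the stream after 624 outputs."""
    # Index 624 tells the generator to regenerate its state on next use.
    state = tuple(untemper(o) for o in outputs) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))
    return clone


victim = random.Random()  # seeded from the OS; seed unknown to the attacker
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_from_outputs(observed)
print(all(clone.getrandbits(32) == victim.getrandbits(32)
          for _ in range(1000)))
```

The same stream would pass the usual statistical batteries without complaint, which is the point being made above.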

So, essentially, if your RNG fails statistical tests, it is totally unworthy of any consideration at all from a cryptographic standpoint. If it passes all the selected statistical tests, good for it---all that means is that it might not be totally broken.

That isn't to say that statistical tests are without value. If they were in the build checking process, they could spot when the RNG has failed catastrophically, sometimes. They could not spot when cryptographic problems arise, though.


There are fairly standard statistical tests for randomness. They can't be 100% perfect (if it is truly random you can expect some results that don't look random occasionally) but with enough runs you can be sure to an acceptable degree.

It can't be beyond the wit of man to add such tests to automatic build and regression tests if the project has such a process in place, though it would potentially slow that process down depending on how you define the "acceptable degree" and therefore how aggressively you test.


Proving "randomness" is impossible, akin to finding the shortest program to reproduce a larger string (Kolmogorov complexity). Strings are Kolmogorov Random when they can not possibly be compressed any further by any program: Kolmogorov Random strings simply have no predictable information left to use for further compression. But Kolmogorov Complexity (KC) can not be computed, since that causes a paradox, much like "A description of the natural number that can not be described in less than fifty English words".

But we can of course talk about a degree of randomness, like we can also try to approach KC. Good randomness is all about unpredictability (given the first half of a random string, can you use that to predict the second half?), but that does not mean that proper randomness should be void of identifiable patterns (such as "00000000111111111" in a random binary string). Such orderly-looking patterns do appear in proper randomness, because the absence of those patterns would make the randomness more predictable, not less.

You can measure the level of randomness with statistical methods [1], compression [2], visual methods [3] and die-hard tests [4].

[1] A simple method is the chi-square test.

[2] Compression ratio tells us something about the randomness. Random data can not be compressed by everyday-use compressors. That no one claimed the money for the challenge to compress RAND's digits in a binary file tells us something about the rigor that team had in coming up with random numbers. The more you can compress a string, the more order it contains and the more predictable it is.

[3] You can plot random points inside a circle. After a lot of points are added, you should see no patterns and a properly, evenly spaced circle. Another method is Moiré patterns: take a field of random noise, copy it, slightly rotate it, and overlay. Non-random patterns will become more visible. But these patterns are visible without Moiré rotation too when using very basic PRNGs like the standard Python random library.

[4] The programmer's way of brute-forcing a lot of simulation runs to see if the PRNG works as expected: http://en.wikipedia.org/wiki/Diehard_tests
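
A quick sketch of methods [1] and [2] in Python, using only the standard library. The function names and sample inputs are made up for illustration. The chi-square statistic here is over byte frequencies (255 degrees of freedom, so values somewhere around 255 are what uniform data tends to give), and the compression check uses zlib as the "everyday-use compressor".

```python
import os
import zlib
from collections import Counter


def chi_square_bytes(data):
    """Chi-square statistic of byte frequencies against a uniform distribution."""
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected
               for b in range(256))


def compression_ratio(data):
    """Compressed size divided by original size (about 1.0 means incompressible)."""
    return len(zlib.compress(data, 9)) / len(data)


samples = {
    "os.urandom": os.urandom(256 * 400),
    "all zeros": bytes(256 * 400),
    "counter": bytes(range(256)) * 400,
}
for name, data in samples.items():
    print(name, round(chi_square_bytes(data), 1),
          round(compression_ratio(data), 3))
```

Note what the counter does: its byte histogram is perfectly flat, so the chi-square statistic is exactly 0, which is itself suspicious, and it compresses to almost nothing. Neither check says anything about whether an adversary can predict the stream.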


I’m curious about that too. How could you unit/regression test `rand`?


Statistical testing seems like it ought to work. However, I believe there's also the issue of actual uses of the RNG; it's not enough to test the kernel implementation, one must also test that libraries are using the implementation correctly (i.e. not using a constant or completely predictable seed), etc.

So it's not a trivial problem, because (among other reasons) nondeterministic, statistical testing is not well-understood in the testing culture.


Further complication: many RNGs will produce output as a function of their seed state (e.g., rc4 or chacha20). That output will look really good, even with weak seeds. You'll have a hard time detecting that two chacha20 streams were seeded with gettimeofday() for instance, unless you happen to check the exact time used.
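
A small sketch of that failure mode in Python. The assumptions are loud ones: SHA-256 in counter mode stands in for rc4 or chacha20, the seed is a whole-second Unix timestamp, and the attacker knows roughly when the machine booted. None of this is the actual kernel code. The output looks fine statistically, yet the seed falls to a trivial search.

```python
import hashlib


def stream(seed, nbytes):
    """SHA-256 in counter mode, standing in for a stream cipher used as a PRNG."""
    out = b""
    counter = 0
    while len(out) < nbytes:
        block = seed.to_bytes(8, "big") + counter.to_bytes(8, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:nbytes]


def recover_seed(sample, earliest, latest):
    """Try every whole-second timestamp in [earliest, latest] as the seed."""
    for candidate in range(earliest, latest + 1):
        if stream(candidate, len(sample)) == sample:
            return candidate
    return None


boot_time = 1424131200          # hypothetical seed: a Unix time in February 2015
leaked = stream(boot_time, 16)  # e.g. a nonce or IV visible on the wire
found = recover_seed(leaked, boot_time - 3600, boot_time + 3600)
print(found == boot_time)       # prints True
```

A two-hour window is only 7201 candidates. A microsecond-resolution seed widens the search by a factor of a million, which is still well within reach.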


You use statistics. For example, take the birthday paradox problem and check if the collisions are higher than expected.


eh, I'm not sure that works. A random number generator of "return number++" will pass your birthday test.
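
Here is a sketch of that in Python, with made-up sample sizes. It counts colliding pairs among n draws from a space of m values and compares against the expected n(n-1)/(2m). A check that only looks for too many collisions lets the counter through; a two-sided check would flag it for having too few.

```python
import random
from collections import Counter


def colliding_pairs(values):
    """Number of unordered pairs of equal values in the sample."""
    return sum(c * (c - 1) // 2 for c in Counter(values).values())


def expected_pairs(n, space):
    """Expected colliding pairs for n uniform draws from `space` values."""
    return n * (n - 1) / (2 * space)


N, SPACE = 20_000, 1_000_000
rng = random.Random(0)  # fixed seed so the run is repeatable
uniform = [rng.randrange(SPACE) for _ in range(N)]
counter = [i % SPACE for i in range(N)]  # the "return number++" generator

print("expected:", round(expected_pairs(N, SPACE), 1))
print("uniform: ", colliding_pairs(uniform))
print("counter: ", colliding_pairs(counter))
```

With these numbers roughly 200 colliding pairs are expected, and the counter produces none at all.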


I just gave an example. I never intended to say you just need this sole solution.

It's also a good lead for a further search query on the topic. Almost all test sets include a birthday test ;) For example this page: https://sites.google.com/site/astudyofentropy/background-inf...


assert(rand() != dummy_random_read_phony())


If dummy_random_read_phony() is rand(), that test will always pass. Almost every implementation of rand() will never return the same result twice in a row.


Coming from just the data, I'd say impossible. It would be like achieving infinite compression on any pseudorandom data. There's no practical way to search through all possible short algorithms that could generate it.

With knowledge of the algorithm, you can do a lot more.


Reminds me of this old Dilbert cartoon...

http://dilbert.com/strip/2001-10-25


Bears repeating: this does not affect stable versions of FreeBSD. FreeBSD-CURRENT is the development version.


I fail to see how a bug in a development branch that nobody should be using in production is worth discussing, from a security incident perspective. Perhaps ways of catching that category of bug in an automated way would be an interesting topic.


Does that mean that current/boot time was used as seed i.e. there was no seed, or just that it was something weaker than normally used?


There is much less entropy than there should be, but more than none.


Would Dieharder have caught this?


Probably, yes. Upthread, someone looked at the code and suggested that it fell back on a linear congruential generator as an RNG. If that is true, I'd expect the Dieharder tests to fail it.


Boy I wish this was up higher, thanks!

http://en.wikipedia.org/wiki/Diehard_tests


Maybe FreeBSD would do well to rename their 'CURRENT' branch to something that suggests less strongly that you need it to keep up with the times. I'm aware of the difference between 'CURRENT' and 'STABLE' but I think that name is at a minimum suggestive enough that people might fall for it (and probably have).

Something like 'DEVELOPMENT' or 'NOTFORPROD' instead of 'CURRENT'?


As far as facts go, there is no indication that whoever is using -CURRENT in production is doing so out of ignorance. I've only seen hearsay so far and I'm guessing that number is very small and comprises actually knowledgeable people that know the risks.

Additionally, it's stamped everywhere [1], if one cares to look, that CURRENT is unsupported, bleeding edge, buggy, "will not build sometimes", etc. Do we really want to modify a development process that has been in use for over a decade because of a few clueless/extremely-gifted people?

Another possibility is that this person is a FreeBSD jedi, well aware of the risks and payoffs. In that case, this bug is no surprise and she/he is prepared to act on it and regenerate some keys, review commit logs (if a developer), and look for signs of intrusion etc.

I still think this is non-news and a poor attempt to use HN to spread fear and trigger useless discussions. The usual drama show.

[1] https://www.google.com.br/search?q=freebsd+current


The linux kernel development branch is called "mainline" - people know that distributions take upstream changes and integrate them into their distributed kernels, and do not use "mainline" kernels directly.

Do you want linux to change to "DEVELOPMENT" or "NOT FOR PROD" instead of "mainline"?

It's a matter of staying literate and actually reading what the docs say.


https://svnweb.freebsd.org/changeset/base/278922

"This does not effect programs that directly used /dev/random or /dev/urandom."

openssl should use /dev/random for key generation, so keys generated by openssl are not affected?


Is one of the main PRNGs on FreeBSD actually arc4random? I.e. RC4?


Libc arc4random still uses rc4. The kernel code is actually kind of tangly. I think it still uses rc4 for explicit arc4random calls, but I'm not certain exactly what comes out of /dev/random.


/dev/random on FreeBSD uses Yarrow: http://en.wikipedia.org/wiki/Yarrow_algorithm


The big question: who made the change that introduced this major security hole? This may be an attack, and one that's traceable.

Name?


Burn the witch! Burn the witch!


You mean burn the possible government infiltrator. From now on, _all_ changes at the crypto layer should have double plus eyes looking at every change (and I don't mean the five-eyes).


Yes, I'm very interested in this too. A bug like this could very well be an attack. It could also be an honest mistake, crypto is hard after all. I'm hoping someone will research this and provide a nice write-up.


FYI: CURRENT means master/head/pre-alpha or whatever you would call it.


Would this affect Bitcoin private keys generated with this kernel?


It says it would affect openssl, which Bitcoin Core used up until the last update. IDK about other wallets.


It's Debian knocking on the door, they want their RNG back!


Will the individual who broke it get the credit?


... and Whatsapp uses FreeBSD extensively. Could this affect the end-to-end encryption that Whatsapp uses?


It wouldn't be end-to-end encryption if WhatsApp's servers could influence the results.


Does WhatsApp run on -CURRENT? That seems unlikely.


There was a talk + slides about WhatsApp FreeBSD usage [1, 2].

At around 23:50 in the video, Rick Reed talks about slide #17; they say they are running 9.1 - 9.3, and looking at 10.1 (not in a hurry as 9 works for them).

[1] https://www.youtube.com/watch?v=TneLO5TdW_M

[2] http://www.slideshare.net/iXsystems/rick-reed-600-m-unsuspec...


-CURRENT is not a single version number - it refers to a development branch. If someone says CURRENT is version xx, then they don't really know what they are talking about - because while CURRENT may become a versioned release down the line, that is very different from what -CURRENT represents (which is a named tree in source control that people develop against)

All of those freebsd versions you refer to (9, 10) are part of the production release cycles and thus are not -CURRENT.

If whatsapp uses 10.1, then they are NOT using -CURRENT


Yeah, I think we are both thinking the same thing. I put the links, and version numbers, in support of tptacek's thought that they were not using -CURRENT.


It shouldn't affect end-to-end encryption, because those keys should only be generated at the endpoints (i.e. the users' devices, probably not running FreeBSD). It could affect TLS, etc, however.


> should only be generated

Theoretically, WhatsApp could generate the keys and forward them to clients, and you could still plausibly call it "End to End"


oh god, hopefully not


I doubt they're on CURRENT.



