OpenSSH security advisory (openssh.com)
184 points by 0x0 on Nov 8, 2013 | 88 comments



Tested against the latest version of:

* Arch Linux -- Version match: OpenSSH_6.3p1, OpenSSL 1.0.1e 11 Feb 2013 (Supports AES-GCM)

CentOS 5.10 -- OpenSSH_4.3p2, OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008 (No AES-GCM support)

CentOS 6.4 -- OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010 (No AES-GCM support)

Debian 6 (squeeze) -- OpenSSH_5.5p1 Debian-6+squeeze4, OpenSSL 0.9.8o 01 Jun 2010 (No AES-GCM support)

Debian 7 (wheezy) -- OpenSSH_6.0p1 Debian-4, OpenSSL 1.0.1e 11 Feb 2013 (Supports AES-GCM)

Fedora 18 (Spherical Cow) -- OpenSSH_6.1p1, OpenSSL 1.0.0-fips 29 Mar 2010 (Supports AES-GCM)

* Fedora 19 (Schrödinger’s Cat) -- Version match: OpenSSH_6.2p2, OpenSSL 1.0.0-fips 29 Mar 2010 (Supports AES-GCM)

openSUSE 12.3 (Dartmouth) -- OpenSSH_6.1p1, OpenSSL 1.0.1e 11 Feb 2013 (Supports AES-GCM)

Ubuntu 10.04.4 LTS (Lucid) -- OpenSSH_5.3p1 Debian-3ubuntu7, OpenSSL 0.9.8k 25 Mar 2009 (No AES-GCM support)

Ubuntu 12.04.3 LTS (Precise) -- OpenSSH_5.9p1 Debian-5ubuntu1.1, OpenSSL 1.0.1 14 Mar 2012 (Supports AES-GCM)

Ubuntu 12.10 (Quantal) -- OpenSSH_6.0p1 Debian-3ubuntu1, OpenSSL 1.0.1c 10 May 2012 (Supports AES-GCM)

* Ubuntu 13.10 (Saucy) -- Version match: OpenSSH_6.2p2 Ubuntu-6, OpenSSL 1.0.1e 11 Feb 2013 (Supports AES-GCM)


I found your comment somewhat cryptic; the asterisks mean it's affected, right?

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=729029

"AES-GCM support was introduced in 6.2, so oldstable and stable should be fine (from http://www.openssh.com/txt/release-6.2):"

Debian 7 (wheezy) -- OpenSSH_6.0p1 Debian-4, OpenSSL 1.0.1e 11 Feb 2013 (Supports AES-GCM)

If AES-GCM was introduced in 6.2, did someone patch 6.0 to support AES-GCM? I can't reconcile your list with the statement in the bug report otherwise. Could you explain?

I can't see how AES-GCM could have been introduced in 6.2 when your list has many versions < 6.2 that support AES-GCM.


I _think_ he's talking about whether or not OpenSSL supports AES-GCM.

e.g. Debian 7 has a version of OpenSSL that supports AES-GCM, but OpenSSH isn't one of the affected versions.


Arch Linux -- OpenSSH_6.4p1, OpenSSL 1.0.1e 11 Feb 2013 (Supports AES-GCM)


As a side topic, the level of auditing of this protocol is quite awesome. The OpenBSD people deserve full respect.


OpenBSD people? OpenSSH people!


No. It's developed in-house by the OpenBSD project.


[deleted]


No, there are at least three of us (myself, Markus and Darren are the most regular committers) and we all work on other stuff in OpenBSD with varying frequency.


If I may ask...

How much time per day do you guys spend working on OpenBSD, and how much of that is spent on auditing?


Same people.


From the diff:

  -	newkey = xmalloc(sizeof(*newkey));
  +	newkey = xcalloc(1, sizeof(*newkey));

The only change is that the allocated memory is zeroed, right? (Just wondering if I'm missing something.)


Nope. That's all.

The newkey struct in turn contains a bunch of OpenSSL context structures, and one of these includes a cleanup callback pointer for the MAC (message authentication code) in use. For most ciphers this was being initialised later but AES-GCM provides message integrity without the need for an external MAC and in this case the MAC context was being left uninitialised.

When the newkeys struct was later being cleaned up as part of a rekeying operation, the cleanup callback was called. Its address was whatever happened to be on the heap when this allocation occurred.

It certainly appears exploitable using standard techniques, though it may be complicated somewhat by OpenSSH's privilege separation architecture.
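
To make the mechanism concrete, here is a rough sketch of the pattern (the struct and function names are hypothetical, not the actual OpenSSH code): a heap-allocated struct carries a cleanup function pointer that only some cipher paths initialise, and the teardown path calls it when it looks non-NULL.

    /* Hypothetical sketch of the bug pattern; not the real OpenSSH structs. */
    #include <stdlib.h>

    struct mac_ctx {
        void (*cleanup)(struct mac_ctx *);  /* set only for non-AEAD ciphers */
    };

    struct newkeys_like {
        struct mac_ctx mac;
        /* ... cipher contexts, key material, etc. ... */
    };

    static void free_newkeys_like(struct newkeys_like *nk)
    {
        /* With plain malloc, nk->mac.cleanup holds whatever garbage was on
         * the heap, so this call can jump to an attacker-influenced address. */
        if (nk->mac.cleanup != NULL)
            nk->mac.cleanup(&nk->mac);
        free(nk);
    }

    int main(void)
    {
        /* The fix: calloc zeroes the struct, so an unused callback stays NULL. */
        struct newkeys_like *nk = calloc(1, sizeof(*nk));
        if (nk == NULL)
            return 1;
        free_newkeys_like(nk);
        return 0;
    }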


Thanks.

Is calloc(1, size) considered idiomatic for zeroing memory? While I haven't used C much lately, back when I did I don't think I realized this, and could see myself at least considering reverting the change during "code cleanup".


Yeah, it's fairly common. calloc zeroes, malloc doesn't, and you shouldn't make a significant change like that during code cleanup. I think it's somewhat more common to see malloc followed by memset, as the intent is way more clear to a reader than just calloc by itself.
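
For anyone rusty on C, the two idioms being compared look like this (a minimal sketch; the struct is made up):

    #include <stdlib.h>
    #include <string.h>

    struct session_keys {
        unsigned char key[32];
        void (*cleanup)(void);
    };

    int main(void)
    {
        /* Idiom 1: calloc zeroes the allocation as part of the call. */
        struct session_keys *a = calloc(1, sizeof(*a));

        /* Idiom 2: malloc followed by an explicit memset; same end state,
         * but the intent to zero is spelled out for the reader. */
        struct session_keys *b = malloc(sizeof(*b));
        if (b != NULL)
            memset(b, 0, sizeof(*b));

        free(a);
        free(b);
        return 0;
    }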


calloc can be faster because it has internal knowledge of whether the memory is already zero.


Hmm.. would it make sense to replace all calls of xmalloc with xcalloc as a preventive measure?


I don't think it is advisable to blindly replace like that in a widely used, audited and tested system. You need to think about more than just the bits in memory, for example timing attacks. And who knows, your "fix" could even expose some other latent bug. It's not clear cut.


I don't get this. How could leaving memory uninitialized be better than zeroing it? Using calloc could be redundant in places, in that the allocated object might get initialized some other way right after allocation. But I don't see how it could ever be wrong. (There might have been zeros in that memory already anyway.) There are languages that have no way to allocate uninitialized memory -- are you going to say they aren't appropriate for crypto algorithms?

And besides, exposing a latent bug is a good thing, no?


There are a couple of concrete worries I'd have:

* If you malloc something very large, on Linux at least (I don't know about the BSDs), it doesn't necessarily get allocated; memory is only requested from the system when you actually access it. It's a valid pattern to, say, malloc a large amount of memory, pass it to your own allocator as a pool, and use what you need from it. If you go and zero all that memory, then you've accessed it all, resulting in much increased memory usage and maybe OOM kills. (You can argue this is an issue with the Linux overcommit system, not with malloc, but that's the system we have today and we'd need to account for it.) A rough sketch of this behaviour follows after the list.

* Slow code is a deterrent for people deploying crypto. If it takes 5 milliseconds for an HTTP response and 50 for an HTTPS handshake, people are going to be hesitant about deploying HTTPS, or do something silly like use it to protect only the initial login. So you do need to worry about the time that your crypto takes as an operational concern for deployment, and balance that off against hardening and paranoia. Favoring either extreme too much results in less overall security.

* There are very few languages that are truly "appropriate" for crypto algorithms. Even C only kind of counts, and you have to be very careful about how you write it. You ideally need promises from the language that certain operations are constant-time and that the optimizer is going to do what you want and neither more nor less. Most of the languages I can think of that only allocate you zeroed memory also include risks on the order of garbage collection being triggered in the middle of your algorithm depending on what your secret data is, so that makes them definitely unsuitable. And most of the non-toy crypto libraries in higher-level languages end up being bindings of carefully-tuned C libraries, and in that sense _do_ quasi-unintentionally expose a way to allocate uninitialized memory, by going through the C allocation routines.
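
The overcommit behaviour from the first bullet, sketched (Linux-specific, sizes arbitrary, error handling minimal):

    /* Rough illustration of lazy allocation under Linux overcommit. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        size_t pool_size = (size_t)1 << 30;  /* 1 GiB pool */

        /* With default overcommit settings this usually succeeds and costs
         * almost no physical memory: pages are only backed when touched. */
        unsigned char *pool = malloc(pool_size);
        if (pool == NULL)
            return 1;

        /* Zeroing the whole pool would touch every page, forcing the kernel
         * to back it with real memory and possibly inviting the OOM killer:
         *
         *     memset(pool, 0, pool_size);
         */
        puts("pool reserved, but mostly not yet backed by physical memory");

        free(pool);
        return 0;
    }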


I think that if I were a maintainer of something like OpenSSL, I would want the rule to be that any use of plain malloc, as opposed to calloc, should be commented. A simple "malloc ok" would suffice if the reason was obvious; for example, the very next lines of code initialized the object to something other than zero. If the reason were more subtle, a longer explanation should be provided. I do agree that performance concerns can be relevant (though zeroing a small block of memory that you're about to write into anyway is very cheap).

The caveat I would want everyone to be aware of is that if the object is a struct that is initialized member-wise, malloc can still be dangerous, because if a member is subsequently added to the struct, one would have to manually review all points where the struct type is allocated to add the initialization for the new member. So I would approve of malloc in such a case only if there were only one place in the program that allocated that struct type.
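
A hedged illustration of that caveat (the struct and its fields are invented for the example):

    #include <stdlib.h>

    /* Suppose 'retries' is added later: every member-wise initialiser has to
     * be found and updated, or it silently carries heap garbage. */
    struct conn_opts {
        int port;
        int timeout_ms;
        int retries;        /* newly added member */
    };

    static struct conn_opts *make_opts_malloc(void)
    {
        struct conn_opts *o = malloc(sizeof(*o));
        if (o == NULL)
            return NULL;
        o->port = 22;
        o->timeout_ms = 5000;
        /* o->retries forgotten: left uninitialised */
        return o;
    }

    static struct conn_opts *make_opts_calloc(void)
    {
        /* calloc gives every present and future member a defined zero value. */
        struct conn_opts *o = calloc(1, sizeof(*o));
        if (o == NULL)
            return NULL;
        o->port = 22;
        o->timeout_ms = 5000;
        return o;
    }

    int main(void)
    {
        free(make_opts_malloc());
        free(make_opts_calloc());
        return 0;
    }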

I'm talking about the crypto code itself. If you've written your own allocator, I agree that different rules apply for the allocator code.


"Most of the languages I can think of that only allocate you zeroed memory also include risks on the order of garbage collection being triggered in the middle of your algorithm depending on what your secret data is, so that makes them definitely unsuitable. "

You do realize that even a C program running on a modern kernel can be preempted? That is, unless you are running it as a high-priority real-time process on a real-time hardened kernel and OpenSSL is giving you hard-real-time guarantees.


Yes, and there are certainly cache-timing attacks between processes (and even VMs) on the same physical hardware. That said, if you're disciplined enough to avoid all its awfulness, C is one of the least-bad languages for writing crypto; other things that compile directly to native code and offer C-spirited APIs for allocation and the like might be better. Having no allocation API that doesn't zero returned buffers is somewhat correlated with being high-enough level that it's a more-bad language to write crypto in.

In other words, there may not be a good language, but some languages (or runtimes, really) are certainly worse than others.


But, task preemption is most likely not correlated with the contents of secret data. It's pauses that correlate with different values in secret data that allow data to leak via timing attacks.


Remember when Debian's OpenSSL was reduced to issuing a predictable small number of keys instead of using the entire keyspace? It's because someone added a patch that in effect initialized memory that was supposed to be uninitialized.


I had forgotten, but I just read up on it [0]. I haven't actually studied the code, but it appears to me from this post that your characterization is not quite correct. They didn't just add code to initialize the memory; according to [0], it would actually have been correct to do so (see the second sentence under "Links and Notes"). Instead they did something slightly different from that, which was wrong.

[0] http://research.swtch.com/openssl


That's only half correct. The problem is that they reinitialized memory.


Yes. I believe the bug arises from some code like:

    if (newkey->callback != NULL) {
        newkey->callback();
    }

If the newkey structure was not zeroed, the callback would have held some contents from a previous use, the NULL check would have passed, and some unintended memory location would have been called.


The C standard doesn't require such code to be correct, even with calloc, because the NULL pointer is not necessarily represented by the zero bitpattern. While it will work "everywhere", calloc + NULL check is not semantically correct C.


The C standard defines NULL as 0:

"An integer constant expression with the value 0, or such an expression cast to type void * , is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function."

(Edit: the asterisk.)


Yes, but that's not what calloc does. It doesn't return memory with all values 0 (how could it? It doesn't know what structure you're allocating); it returns all bits zero.
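
A small sketch of the distinction (struct name invented):

    #include <stdlib.h>

    struct key_ctx {
        void (*cleanup)(void);
    };

    int main(void)
    {
        /* calloc promises all-bits-zero... */
        struct key_ctx *a = calloc(1, sizeof(*a));

        /* ...whereas assigning 0/NULL promises a genuine null pointer, whose
         * representation the standard does not require to be all-bits-zero. */
        struct key_ctx b;
        b.cleanup = NULL;

        /* On mainstream platforms both checks behave identically, but only
         * the second is guaranteed by the C standard itself. */
        if (a != NULL && a->cleanup == NULL) { /* relies on the common ABI */ }
        if (b.cleanup == NULL)                { /* guaranteed */ }

        free(a);
        return 0;
    }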


While I really appreciate learning from a good language-lawyering, and also acknowledge that there are many places where subtle interpretation issues have led to crashing code or even exploitable malfunctions, in practice the issue about different representations of Zero or the NULL-pointer are indeed mood.

I'll just be flamboyant now and claim that on all currently relevant platforms calloc'ing the memory of all supported numerical and pointer types will give you the numerical value 0 and also the NULL pointer.

Can someone name any CPU or microcontroller currently being produced that violates that statement? (e.g. uint64_t x; calloc(1,sizeof(x)) and void *p; calloc(1, sizeof(x)) or maybe memset(x,'0',sizeof(x))... and still x != 0 and p != NULL)? Even for gross misalignment? What about DSP platforms that don't have a usable byte-addressable memory? GPUs?


I agree. Note the fix used even depends on calloc setting a null pointer. The OpenBSD team generally makes a few other assumptions as well. But the comment I replied to was claiming the C standard itself makes these guarantees. That's wrong and, I think, slightly dangerous. It's fine to make assumptions, but one must always know what they are and not confuse them with guarantees.

Note that your memset call is totally bogus. '0' is not 0. You may have meant '\0', but plain 0 is perfectly legible. :)


I confirm that my inline C-code was written pretty sloppily... But obviously you clearly understood my intention :-)


> or the NULL-pointer are indeed mood

Just FYI, the word you probably mean is 'moot'.


Why didn't that crash? I understand the cleanup routine is always called and certainly someone tested the AES-GCM provider by just running it once.


Probably because the memory allocated just happened to contain only NUL bytes where that specific cleanup function pointer is stored whenever it was run in testing.

The point is that a targeted malicious attack would try and ensure that those bytes were not NUL and that it ended up calling malicious injected code on the target server.


Yeah, I've found a few bugs like this myself where the memory in question happened to be NULL, until some unrelated change elsewhere in the toolchain meant that it wasn't anymore.


It seems odd to see a commit in this day and age with no unit test. When you make a new newkey struct the clean-up callback should be NULL.


We committed a regression test fix separately.


That's the fun with C: the malloc call in a unit test for this would quite likely have happened to zero-initialize the struct anyway.


Yep. You may get different results based on the code path taken, the compiler, and (maybe?) the OS. Most of the time the value at that memory location will probably be a zero byte (or a zero word/dword), but sometimes it may not.


Debian Stable seems to be unaffected: https://security-tracker.debian.org/tracker/CVE-2013-4548


Note that testing/jessie and sid are vulnerable. Not that anybody should run a server with testing, but still :)


PoC? Seriously, with all the complex OS memory-management access control, isolation, and randomization features (like the security features implemented by OpenBSD), writing a working exploit for this would be a real work of art.


Linux by default doesn't do randomization features, at least not ArchLinux, the most security-aware distro (sarcasm). Ubuntu doesn't do that either.

What do you mean by memory management access control?

As ssh contains or references code to open/read/write sockets, that's what I would do: return-oriented programming, or whatever it's called, to use the functions already defined/within scope and memory to open a reverse shell.


Looks like Ubuntu by default has randomization via ASLR and other protections as well.

https://wiki.ubuntu.com/Security/Features


We ran into this exact issue in my virus class. The lab machines running Scientific Linux would not randomize the library addresses despite ASLR being on. In contrast Ubuntu not only did the proper layout randomization but also had gcc compile with stack smash protection by default.


"This vulnerability is mitigated by the difficulty of pre-loading the heap with a useful callback address and by any platform address-space layout randomisation applied to sshd and the shared libraries it depends upon."

Right. But in order for ROP exploitation (say, chaining some libc function calls) to work you'd still have to manipulate the stack arguments in some fancy way; also, it's not really clear how trivial "pre-loading the heap" is, since it's a post-authentication-stage bug as the advisory mentions. Of course this is just speculation; digging into the source code might change the perspective :)


Yes, it is quite comforting that this is post-authentication, so in most cases no big deal. Just tough luck for shared accounts.

I guess most people don't run sshd as root or with extra capabilities either, so that minimizes damage too. Another reason to not run ssh on port 22: no root, no special caps needed.


All my servers run sshd as root, including the FreeBSD ones. Is that ok? Or do you mean that sshd drops privileges for the child after forking?


In the past I've had this in my sshd_config files:

  Ciphers aes256-ctr
  MACs hmac-sha2-512
This is one recommended way for forward secrecy https://github.com/ioerror/duraconf/blob/master/configs/sshd...

Unfortunately my favored Android SSH client (juiceSSH) couldn't handle that so I had to change to:

  Ciphers aes256-ctr,aes256-cbc
  MACs hmac-sha1
which is rather unfortunate, but still turned out to have been a good thing in light of this vulnerability (I'm running 6.2 on all my machines because of the DoS vulnerability in earlier versions).


    2. Affected configurations
    
    OpenSSH 6.2 and OpenSSH 6.3 when built against an OpenSSL that supports AES-GCM.
What would be a good/simple way to test if your system is one of these? How can I check which ciphers my build of OpenSSL supports?

Edit: The following command returns nada on all my systems. I guess that puts me in the clear?

    $ openssl --version 2>&1 | grep gcm


What I did is try to use the cipher: ssh -c aes256-gcm@openssh.com (host)

If you get "no cipher found", you're clear. If you can connect, you can disable the cipher (while keeping others) with this line in your sshd_config:

Ciphers aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc

Source: http://www.openssh.com/txt/gcmrekey.adv


"Unknown cipher type" for me, but close enough that it seems to mean the same thing.


I think this is a better test:

    $ openssl enc --help 2>&1 | grep gcm
    -aes-128-ecb   -aes-128-gcm   -aes-128-ofb
    -aes-192-ecb   -aes-192-gcm   -aes-192-ofb
    -aes-256-ecb   -aes-256-gcm   -aes-256-ofb

(Tested on ArchLinux with OpenSSL 1.0.1.e-5)


Thanks!

I've never understood why so many applications pipe help output to STDERR.

It's not an error... I asked for the output! shrug


Traditionally applications will print usage (which is usually the same as --help) if arguments are missing or invalid flags are given. This is clearly an error condition, and it would be very bad to print the output to stdout in that case. --help will simply call usage(), so there's no difference in behavior. I think it would be feature bloat / overkill to have programs print usage to stdout or stderr depending on how it's invoked, even if it's easy to implement.


It's not even one extra line of code. usage() should take the output stream as a parameter.
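
Something like this is all it takes (a sketch; option handling trimmed to the bare minimum):

    #include <stdio.h>
    #include <string.h>

    /* Write usage to whichever stream the caller decides is appropriate. */
    static void usage(FILE *out, const char *prog)
    {
        fprintf(out, "usage: %s [-v] [-o file] input\n", prog);
    }

    int main(int argc, char **argv)
    {
        if (argc > 1 && strcmp(argv[1], "--help") == 0) {
            usage(stdout, argv[0]);  /* help was asked for: stdout, exit 0 */
            return 0;
        }
        if (argc < 2) {
            usage(stderr, argv[0]);  /* invocation error: stderr, nonzero */
            return 2;
        }
        /* ... real work would go here ... */
        return 0;
    }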


Even so, is it really advantageous to have "foo --help" send usage info to stdout, but for "foo -?" to send "Invalid switch '-?'" followed by usage info to stderr?

More importantly, usage information doesn't conform to the structure of ordinary program output, and can therefore be annoying or even dangerous if inadvertently treated as such, so it really, really does belong on stderr for any program whose stdout might reasonably be used as input to another program.


That's a weird, conceptual argument (and program output is not strongly typed). I just want "program --help | less" to work, especially when there's a lot of options.


I appreciate your point, but I still disagree.

> usage information doesn't conform to the structure of ordinary program output

This is weak. Pretty much every command line program will have multiple flags that change the output in some profound way. Why should --help get special treatment?


Mark Jason Dominus did a quick-and-dirty survey on where usage messages should go: http://blog.plover.com/Unix/usage.html

Most people think stderr (so as not to muck up pipelines when you get the invocation syntax wrong). But then there is also the cleverer option of using stdout if and only if the help is explicitly requested, which seems to be justifiable, popular in the survey, but (I think) not much implemented.



Uh. No.

Spiped just 'pipes' an SSH connection from one point to another. It's essentially a very thin VPN.

But this bug is exploited via authenticated users. If you're using ssh keys (...you are, aren't you?) you would basically already need a valid SSH login to use this vuln; if nobody but you has an ssh key login, you're safe. This vulnerability may still affect you - even with spiped - depending on who has an ssh key login to your box.

(And this thinking that you can just 'wrap another layer of encryption/abstraction' around a problem is why we can't have nice things.)


> this vulnerability might permit code execution with the privileges of the authenticated user

Yet another vote for secure authentication, which typically means requiring use of a private key and disabling root login.


Not sure what you're thinking about here, since the vulnerability is post-authentication?

A bug like this could be pretty devastating for github and bitbucket type setups where everyone in the world is using a shared, restricted "git" UNIX user authenticating with a private key, for git ssh push and pull.


This bug is problematic for all those situations where the ssh protocol is not used for telnet-style command line, but as a transport.

e.g. Git uses ssh as a transport protocol, but GitHub (and similar platforms) don't support direct user login.


> What about git-shell over ssh ?

I guess that gives the same problem. The exploitation allows executing arbitrary code, as you would do by launching commands from, say, bash.

I don't know git-shell, but I guess (from the manpage) it restricts the allowed commands. Exploiting the bug would allow a malicious user to execute a command instead of git-shell. A good example of such a command would be /bin/bash.


What about git-shell over ssh ?


Hopefully this clears it up for you:

https://news.ycombinator.com/item?id=6696070


I'd rather put this as a vote for proper OS-level privilege management/access control. You shouldn't need to trust OpenSSH to limit user privileges.


The authentication method used is completely orthogonal to this bug.


> completely orthogonal

I was explaining the benefits of layers of security. This issue is an issue only to the degree that your server does not trust authenticated users, because it can only be exploited by authenticated users.

The main thing I wanted to see when I saw the OP was 'do I need to update SSH on my personal server yesterday'... and I'm not worried about it at this point.


Perhaps you have a good guide on how to implement this?


Doesn't help in this case, but in your /etc/ssh/sshd_config (or similar) change, or add if missing:

    PubkeyAuthentication yes

    ChallengeResponseAuthentication no
    PasswordAuthentication no
    PermitRootLogin no
Make sure you can log in using your key and su/sudo from that unprivileged user before applying those changes.


Thanks, done that.


Just google it; it's in the manual as well.

But the gist of it is: in your sshd_config, set PermitRootLogin no, Protocol 2 (to disable protocol 1), PubkeyAuthentication yes, and PasswordAuthentication no.

For this specific vulnerability those will not do, as this exploit is post-authentication; whether password or pubkey doesn't matter.

Hence, you can add Ciphers blowfish-cbc, which will allow only that cipher to be used.

Also add MACs hmac-sha2-512 to allow only that MAC and no weaker ones.



That's for the previous release. Look at the current errata:

http://www.openbsd.org/errata54.html

In places where -current is not an option, look into M:Tier binpatches and their `openup` utility:

https://stable.mtier.org/

http://www.mtier.org/index.php/solutions/apps/openup/


So that page isn't updated for the affected versions? Is that because they only "support" (release patches for) the newest release? I was under the impression they "supported" the two latest releases...


http://www.openbsd.org/errata53.html

The page you linked to is not updated as often or as thoroughly. Errata ('Patches' link on the main page) is the page you want. Sorry for the confusion.


Thanks, I thought it looked kind of empty. It's a little confusing though..

Also, maybe a warning about the unsupported version could be in order?


I had never heard of this GCM mode for block ciphers. Do any distros package OpenSSL with this mode activated by default?


Quoting Matthew Green

"the only people who hate GCM are those who've had to implement it. You see, GCM is CTR mode encryption with the addition of a Carter-Wegman MAC set in a Galois field. If you just went 'sfjshhuh?', you now understand what I'm talking about. Implementing GCM is a hassle in a way that most other AEADs are not. But if you have someone else's implementation -- say OpenSSL's -- it's a perfectly lovely mode."

From an interesting read

http://blog.cryptographyengineering.com/2012/05/how-to-choos...


It's https://en.wikipedia.org/wiki/Galois_Counter_Mode and IIRC it was added to OpenSSL in the 1.0.1 release.

Don't let my hazy recollection be a substitute for checking your platform though :) (see some good recipes in this thread)


GCM is one of the authenticated-encryption modes, which means that in addition to encrypting the data, it also generates an authentication tag. It is somewhat like using a plain-encryption mode like CTR or CBC and also using a MAC like HMAC, but a single algorithm, so there's less complexity. Other authenticated-encryption block cipher modes include OCB and EAX.

(Most of these are actually "authenticated encryption with associated data", which also let you add some cleartext data to the authentication tag without encrypting it.)

I believe GCM support is new as of OpenSSL 1.0.
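
For the curious, this is roughly what the combined encrypt-and-tag operation looks like through OpenSSL's EVP interface (a sketch against the 1.0.1-era API; error checking is omitted and the all-zero key/IV are for illustration only):

    #include <openssl/evp.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned char key[32] = {0}, iv[12] = {0};  /* demo values only */
        unsigned char pt[] = "attack at dawn";
        unsigned char ct[sizeof(pt)], tag[16];
        int len = 0, ct_len = 0;

        EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
        EVP_EncryptInit_ex(ctx, EVP_aes_256_gcm(), NULL, NULL, NULL);
        EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_SET_IVLEN, (int)sizeof(iv), NULL);
        EVP_EncryptInit_ex(ctx, NULL, NULL, key, iv);

        EVP_EncryptUpdate(ctx, ct, &len, pt, (int)(sizeof(pt) - 1));
        ct_len = len;
        EVP_EncryptFinal_ex(ctx, ct + ct_len, &len);
        ct_len += len;

        /* The authentication tag falls out of the same operation; no separate
         * HMAC pass is needed, which is the point of an AEAD mode. */
        EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_GET_TAG, (int)sizeof(tag), tag);
        EVP_CIPHER_CTX_free(ctx);

        printf("%d ciphertext bytes plus a %d-byte tag\n", ct_len, (int)sizeof(tag));
        return 0;
    }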


It's used in TLS 1.2, so I'd assume any distros using OpenSSL 1.0.1 (which added TLS 1.2 support) would have it enabled.




