Cryptographic Agility

tptacek · on May 24, 2016

I mostly agree with AGL, who is smarter than me, but I might take it a step further.

Negotiation is the most dangerous feature in cryptography. Nobody has ever gotten it right (I mean that literally).

Therefore, in the same sense that Dan Bernstein used to have a design goal of "don't parse": "don't negotiate".

Instead of having protocols that negotiate features, we should have protocols that have simple versions, and the only negotiation that should ever occur is "is this protocol in the set of allowable versions for my local policy".

If a version gets something wrong, like using implicit IVs for CBC, or using a hilariously broken cipher like RC4, the whole protocol should version. RC4 should take you from TLS 1 to TLS 2. Lucky13 should take you from TLS 2 to TLS 3. Want to switch everyone to ChaCha/Poly? TLS 4 it is.
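A minimal sketch of what "the only negotiation is a local policy check" could look like. Everything here is hypothetical: the `VERSIONS` table, the `accept` function and the primitive names attached to each version number are illustrative assumptions, not anything from a real TLS spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtocolVersion:
    number: int
    key_exchange: str
    cipher: str


# Hypothetical registry: each version hardcodes exactly one set of primitives.
# Changing any primitive means minting a new version number.
VERSIONS = {
    1: ProtocolVersion(1, "RSA", "RC4"),
    2: ProtocolVersion(2, "ECDHE-P256", "AES-CBC"),
    3: ProtocolVersion(3, "ECDHE-P256", "AES-GCM"),
    4: ProtocolVersion(4, "X25519", "ChaCha20-Poly1305"),
}


def accept(peer_version: int, local_policy: frozenset) -> ProtocolVersion:
    """Return the fixed primitive set for peer_version, or refuse.

    The only decision made is set membership in the local policy.
    There is no fallback and no per-feature choice.
    """
    if peer_version not in local_policy or peer_version not in VERSIONS:
        raise ValueError(f"version {peer_version} not allowed by local policy")
    return VERSIONS[peer_version]
```

Under this sketch, deprecating a broken primitive is just removing a number from `local_policy`, e.g. `accept(4, frozenset({3, 4}))` succeeds while `accept(1, frozenset({3, 4}))` raises.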

djcapelis · on May 24, 2016

Yep. So maybe we should try and ship this as a variant of TLS where we strip out a lot of the negotiation and extra features and just say "TLS2 means you use X, Y and Z."

I think it'd have a shot at being better than what we've got now. And frankly, better than what TLS 1.3 is doing.

yuhong · on May 25, 2016

Unfortunately, the way it works in reality means adding ciphersuites is easier than adding new versions.

tptacek · on May 25, 2016

"Ciphersuites" simply should not be a thing. Each version of the protocol should hardcode one set of crypto primitives.

wolf550e · on May 25, 2016

What if people want AES-GCM on hardware with AESNI and CHACHA20-POLY1305 on all other hardware? Should those be two protocol versions that are both accepted by everyone and neither is planned to be deprecated? What about Ed25519 vs. Ed448?

What if something was deployed with (SHA256 P-256 ECDHE-ECDSA-AES128-GCM), which Google promised to support for a long time, and now people want to switch to djb-blessed X25519-Ed25519-AES256-GCM, will the older ciphersuite be an old protocol version that everyone keeps supporting for >10 years?

tptacek · on May 25, 2016

My take? Tough. Use ChaCha/Poly. It's a better design for other reasons (mostly not related to TLS, but still).

Same with curves. Nobody really benefits from selectable 25519 vs 448.

djcapelis · on May 26, 2016

sgtm. Let's go build an Internet.

api · on May 25, 2016

That's why I've dragged on implementing certain more advanced crypto in ZeroTier. Right now it's boring DJB hipster crypto with no state and I like it like that. Not much bug surface area.

yuhong · on May 25, 2016

That being said, ciphersuite negotiation with weakened crypto is often even worse. The way TLS does it means that weakening the asymmetric one is worse than weakening the symmetric one. This is why OpenSSL disabling EXPORT1024 in 2006 was a bad idea.

wmf · on May 24, 2016

Yeah, negotiating the version up front seems to have worked for HTTP/2 which I guess is a success story as these things go.

kordless · on May 25, 2016

Negotiation and parsing have costs: computing costs, and loss-of-data costs that bring the risk of losing stored value. On the other hand, aversion to loss of stored value leads to loss of wisdom.

Slap some crypto-currency payments on the risks you expose via APIs and charge for the risk.

e12e · on May 25, 2016

"Let's just consider symmetric ciphers for a moment. Because everyone wants them to be as fast as possible, BoringSSL currently contains 27 thousand lines of Perl scripts (taken from OpenSSL, who wrote them all) that generate assembly code just in order to implement AES-GCM."

Yikes. Is that... the state of the art of compiled/machine code AES-GCM implementations? I guess I'll look to NaCl for an AEAD construct...

agl · on May 25, 2016

Yep, that's it. Different platforms need slightly different assembly and, although I wouldn't choose Perl for it, it's not fundamentally that different from any other text-to-text transform.

Handwritten asm still outperforms compiler output by enough to warrant its use in hot functions like AES-GCM.

(NaCl also uses a text-to-text transform system to output asm, although it's written in C and does register allocation too. The important aspect of NaCl is djb & friends' attention to quality.)

pbsd · on May 25, 2016

While some of NaCl's kernels are written using qhasm, qhasm itself is not part of the build process of NaCl. This means that those functions cannot be run on, say, Windows, because the calling convention (and the assembly syntax) is different. qhasm also does not have programming capabilities, so one would need a lot of code duplication to write an SSE2, SSSE3, AVX2, etc. variant of the same function. While OpenSSL's perlasm is an abomination, it does produce "portable" assembly output, and there is no obvious superior option to accomplish that. This is something sorely missing from our collective toolchains.

astrange · on May 25, 2016

The nasm/yasm macro packages in x264 source can do all this for you and more, across i386/x86-64 sysv/win64.

The Win64 versions are a little less optimal IIRC, but to be fair Win64 is dumb.

btrask · on May 25, 2016

It sounds like this perlasm is the state of the art in maximally efficient high level languages. (Half joking.)

I feel like if you took something like qhasm and extended it to be a full language rather than sort of a hack, you would end up with something really beautiful, like Forth minus the crazy and plus both standard assembly and high level syntax. It could be a better C than C.

e12e · on May 25, 2016

I'm probably missing something, but looking at:

https://boringssl.googlesource.com/boringssl/+/master/crypto...

and in particular:

https://boringssl.googlesource.com/boringssl/+/master/crypto... and https://boringssl.googlesource.com/boringssl/+/master/crypto...

I don't see how this is any improvement on just writing plain assembler? Apart from the perl-style comments, is this any cleaner than say, some code compiled with yasm?

[ed: I guess what I'm missing is that there are probably more similarities between the x86 flavours and ARM flavours -- but I still think one would need to look at, and understand, the generated assembly for each platform in order to audit the code. So I still wonder a bit about what the gain is.]

wolf550e · on May 25, 2016

Can the Perl output in different syntaxes for different assemblers?

technion · on May 25, 2016

This is one of the reasons I'm a very strong proponent of LibreSSL.

Much of the code they stripped is just about impossible to audit.

People talk about writing less C because it's so hard to do right. How much faith do you really have in "many eyes" finding issues in Perl generated assembly?

throwaway7767 · on May 25, 2016

> This is one of the reasons I'm a very strong proponent of LibreSSL.

http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/lib/libssl/src/...

Looks like LibreSSL has the exact same perl-generating-assembler code.

bdamm · on May 25, 2016

Pure crypto functions are highly deterministic. If the output is correct then all you need to worry about is side-channel attacks, and to analyze that you don't need code audits at all.

e12e · on May 25, 2016

I thought one of the key difficulties in analysing (and writing) good crypto implementations was making sure that they are both efficient and that there are no memory allocations or branches that are key-dependent. While it may be easy to prove an implementation insecure (find an example where there are obvious timing issues), isn't it still a matter of code analysis to show an implementation is actually secure?
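To make "branching that is key-dependent" concrete, here is a small sketch using tag comparison rather than AES-GCM itself. `leaky_equal` and `constant_time_equal` are hypothetical names; `hmac.compare_digest` is the Python standard library's comparison intended to resist timing analysis.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Functionally correct, but the branch depends on secret data."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            # Early exit: running time depends on where the first mismatch is.
            return False
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Same result, using the stdlib comparison designed to avoid that leak."""
    return hmac.compare_digest(a, b)
```

Both functions return identical results for every input, so checking outputs alone cannot tell them apart. The difference only shows up by reading the code or by measuring the running implementation.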

bdamm · on June 2, 2016

I am just saying that you don't need code analysis to find out that there's a side channel attack (e.g. through memory allocation analysis, power analysis.) In fact, I believe actual physical testing is the way to prove security of an implementation, given the subtle bugs that can appear in optimizing compilers, runtimes, downstream packagers, etc.

tomsmeding · on May 25, 2016

Is assembly then not hard to do right?

technion · on May 25, 2016

I'm not going to pretend to have an authoritative answer on whether Perl generated assembly is easier to get right than assembly, but I'll say this: The number of people who are skilled in assembly is small, and it's made even smaller when what you are looking for is where it intersects with those skilled in Perl.

yuhong · on May 25, 2016

One of the worst being the SSLv2 code.

caf · on May 25, 2016

The headline undersells the article - it has some good advice with general applicability to protocol design, not just crypto protocols.

Animats · on May 25, 2016

Crypto is special. You do not want to have both ends negotiate down to an insecure cryptosystem. SSL/TLS has at times been persuaded to do just that via MITM attacks. In most other systems, you want interoperability if at all possible.

jsnell · on May 25, 2016

I'm not convinced you actually want interoperability at all costs in most places.

Just this year I've bumped into two separate security-related networking devices which were stripping out all TCP options except the MSS from the SYN packet [0]. So no window scaling and no selective acknowledgements. This is absolutely crippling to TCP.

In one case the middlebox was next to an FTP server used for transferring huge files all over the world. Just taking that box out of traffic would have given a 5x-10x speedup on those connections. The other was in an LTE core network; in that environment just losing window scaling cuts the maximum throughput to maybe 20Mbps. A bit of a problem when your marketing is promising 100Mbps.

If TCP stacks didn't negotiate the connection settings like that, crap like this wouldn't get deployed. It would be obvious that something is horribly wrong. Now they get deployed, and nobody realizes that this particular box is why the network is currently working like crap. (Until somebody looks at a trace of the traffic, and notices that the TCP options are straight from the '80s).

And TCP is supposed to be the canonical success story for Postel's law!

[0] Why would anyone do such a horrible thing? AFAIK in both devices this behavior was linked to SYN-flood protection. So it might have been some kind of a horrible SYN-cookie implementation that could not encode any interesting TCP options.
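For a rough feel of why losing window scaling caps throughput at around the 20Mbps mentioned above: a single TCP connection can have at most one receive window in flight per round trip, and without the window scale option the window field tops out at 65535 bytes. The 25 ms round-trip time and the scale shift of 7 below are assumptions chosen for illustration, not figures from the networks described.

```python
MAX_UNSCALED_WINDOW_BYTES = 65535  # 16-bit TCP window field, no scaling


def max_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection throughput: one window per RTT."""
    return window_bytes * 8 / rtt_seconds / 1e6


assumed_rtt = 0.025  # 25 ms, an assumed round-trip time

# Without window scaling: roughly 21 Mbit/s at this RTT.
unscaled = max_throughput_mbps(MAX_UNSCALED_WINDOW_BYTES, assumed_rtt)

# With an assumed window scale shift of 7 the ceiling is 128 times higher.
scaled = max_throughput_mbps(MAX_UNSCALED_WINDOW_BYTES << 7, assumed_rtt)

print(f"unscaled: {unscaled:.1f} Mbit/s, scaled: {scaled:.1f} Mbit/s")
```

The bound gets worse as the round trip gets longer, which fits with the intercontinental FTP transfers being hit hard too.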

caf · on May 25, 2016

Sure, but the observation that an extensibility mechanism that isn't regularly used will tend to rust shut is very widely applicable.

yuhong · on May 25, 2016

"TLS has a second, major extension mechanism which is a series of (key, value) pairs where servers should ignore unknown keys"

And even that used to be broken too, requiring fallback. There used to be several versions of the SSLv3 spec out there, and the first time this requirement was mentioned was in an errata document.
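A sketch of the "ignore unknown keys" rule for that extension mechanism, assuming the usual TLS wire layout: a 2-byte total length, then repeated entries of 2-byte type, 2-byte length and data. `parse_extensions` and the tiny `KNOWN_EXTENSIONS` table are hypothetical, though the two type numbers in it are the registered ones for those extensions.

```python
import struct

# Extensions this hypothetical server understands (type number -> name).
KNOWN_EXTENSIONS = {
    0: "server_name",
    16: "application_layer_protocol_negotiation",
}


def parse_extensions(block: bytes) -> dict:
    """Parse an extensions block, keeping known entries and skipping the rest."""
    if len(block) < 2:
        raise ValueError("missing extensions length")
    (total,) = struct.unpack_from("!H", block, 0)
    if total != len(block) - 2:
        raise ValueError("extensions length does not match block size")

    known = {}
    offset = 2
    while offset < len(block):
        if len(block) - offset < 4:
            raise ValueError("truncated extension header")
        ext_type, ext_len = struct.unpack_from("!HH", block, offset)
        offset += 4
        if offset + ext_len > len(block):
            raise ValueError("truncated extension data")
        data = block[offset:offset + ext_len]
        offset += ext_len

        name = KNOWN_EXTENSIONS.get(ext_type)
        if name is not None:
            known[name] = data
        # Unknown types are skipped on purpose: malformed framing is an
        # error, but an unrecognised key is not.
    return known
```

The point of the sketch is the last branch: a server that raises on an unrecognised type, instead of skipping it, is the kind of implementation that forces clients into fallback.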