Rustls Outperforms OpenSSL and BoringSSL (memorysafety.org)
154 points by jaas 4 months ago | 44 comments



'We'd also like to thank Intel for helping with AVX-512 optimizations for aws-lc-rs recently. This was an important part of achieving our performance goals.'

Testing was done on an Intel processor with frequency scaling disabled, which will adversely affect non-AVX-512 code more than AVX-512 code, since AVX-512 has limited boost headroom available anyway. I'm pretty sure this is not a totally fair comparison; tuning the box for each solution to give optimal performance, rather than tuning it to give your own solution an advantage, would be more realistic.

However, I'm not knocking it; it sounds like a great achievement, and it'll spur the other solutions on to improve their implementations, which is a win all round.


Note that the AVX-512 code we're referring to is the code that Intel also contributed to OpenSSL.

As a side-note, I believe the CPU we tested this on does not suffer from the AVX-512 power limits reported with earlier AVX-512 parts. https://travisdowns.github.io/blog/2020/08/19/icl-avx512-fre... seems to confirm that.


~That page is the first I've heard of license-based downclocking. I know there's no ethical reason not to do it, and it's similar to fusing a higher/lower performance chip out of the same base design, and free-market etc.

But it just makes me sad.~

Edit: Based on this comment [0] and replies, it appears I've misunderstood what 'license' means. My apologies.

[0] https://news.ycombinator.com/item?id=24218310


Ah, ok. So the frequency locking was to reduce jitter on the performance tests? If so, this makes sense.


It's also fairly normal to do this. Google benchmark [1] even warns you about this.

[1] https://github.com/google/benchmark


Is it really reasonable to lock your TLS web transfer to a specific CPU thread? Not sure it actually does make sense.

Would be nice to see the non-AVX512 results.


> Is it really reasonable to lock your TLS web transfer to a specific CPU thread? Not sure it actually does make sense.

Only if you want good performance. If you're doing a lot of networking, you want your userland socket servicing pinned to the same CPU that the kernel is using for that socket. Which is easiest to achieve if you cpu pin nic queues and server threads. (If you want really good performance, you might want to skip userland with sendfile + kTLS or nic TLS, or maybe skip the kernel with userland networking)
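
For illustration, a minimal sketch of the thread-pinning half of that, assuming Linux and the libc crate (NIC queue affinity would be configured separately, e.g. via /proc/irq/*/smp_affinity):

    // Minimal sketch: pin the calling thread to one CPU so it stays on
    // the core that services the NIC queue. Linux-only; uses the libc crate.
    fn pin_current_thread(cpu: usize) -> std::io::Result<()> {
        unsafe {
            let mut set: libc::cpu_set_t = std::mem::zeroed();
            libc::CPU_SET(cpu, &mut set);
            // pid 0 means "the calling thread"
            if libc::sched_setaffinity(0, std::mem::size_of::<libc::cpu_set_t>(), &set) != 0 {
                return Err(std::io::Error::last_os_error());
            }
        }
        Ok(())
    }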

For a benchmark, cpu pinning and tight control of the system is a clear prerequisite; you want to maximize repeatability, and user threads bouncing around cpu threads leads to less repeatable results.


If you really need good performance, aren't you going to be running far more sockets than you have network adapters?


My one and only beef with Rustls is its inability to support some legacy crypto standards that aren't web-safe but are necessary for replacing OpenSSL in some cases (i.e., server-to-server, database SSL, etc.).

The project is the best one for use on the internet with modern SSL standards, however.


How so? What standards do you need support for?


As a library vying to replace OpenSSL, the same set of suites as OpenSSL.

I'm no longer blocked on this particular issue, which I filed on behalf of my work at Deno, but they aren't interested in adding less-secure suites that may be required by certain server configurations yet are still appropriate for traffic that isn't general web use.

https://github.com/rustls/rustls/issues/1607

At some point I had a list of suites required to connect to some older versions of MySQL/Microsoft SQL Server, but again, no longer blocked.
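
For context, going the other direction — restricting rustls to an explicit suite list — is straightforward; a rough sketch against a rustls 0.23-era API (from memory, so details may differ):

    // Sketch: a provider trimmed to a single suite. You can only subtract
    // from what the provider ships; legacy suites it doesn't implement
    // (the complaint above) can't be added this way.
    use rustls::crypto::{ring, CryptoProvider};
    use rustls::CipherSuite;

    fn trimmed_provider() -> CryptoProvider {
        let mut provider = ring::default_provider();
        provider
            .cipher_suites
            .retain(|s| s.suite() == CipherSuite::TLS13_AES_256_GCM_SHA384);
        provider
    }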

For server-to-server use where I don't control one end of the equation, I stick with the OpenSSL crate. If there are potentially older servers in the mix, I'm OK with using rustls as a backend for things like reqwest, but it'll be OpenSSL for servers for now.
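
Concretely, the backend split ends up looking something like this (assuming reqwest built with its rustls-tls cargo feature):

    // Sketch: rustls as reqwest's TLS backend for client traffic, while
    // server-side code keeps using the openssl crate as described above.
    let client = reqwest::Client::builder()
        .use_rustls_tls()
        .build()
        .expect("building HTTP client");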

I understand the philosophy, but rustls is never going to be an OpenSSL drop-in until this approach changes.

Semi-related: I now avoid native-tls because macOS + Gatekeeper + weird JAMF configurations make that library completely unreliable in the wild.


More accurately: primitives from the aws-lc library (written in C and assembly, with tests in C++) outperform the OpenSSL and BoringSSL implementations they are based on, on some platforms.


I'm super proud of the work that the aws-lc team have been doing. Insanely powerful optimizations on many platforms (not least Graviton!) ... and those optimizations are formally generated or formally verified (see https://github.com/awslabs/s2n-bignum and https://github.com/awslabs/aws-lc-verification for directly related work) and also make massive improvements to the constant-timeness of the operations, which is important for mitigating side-channels.

I suspect most of the team would tell anyone "We have to write this in Assembly and C, but you don't have to! Rust is what we prefer to see at the application layer."


Well, that is true of any compiled language that still happens to have some of its parts written in either C or C++ instead of being fully bootstrapped.


This has been a deliberate design choice, because these primitives typically have to be constant-time and are full of tricks to avoid CPU side channels. It's very delicate code that is dangerous to rewrite.

However, TLS still involves a lot of code that isn't pure low-level cryptography: numerous protocol and certificate parsers, the CA store interface and chain validation, networking, protocol state handling, etc.


“Rustls is a memory safe TLS implementation with a focus on performance.”

If the other commenter was right, then what they’re saying is that people seeing a Rust TLS stack outperform non-Rust stacks might assume critical operations were written in memory-safe Rust. Then, that the post was implying memory-safe Rust is fast even with low-level operations. That maybe they could use Rust to replace C/C++ in other low-level, performance-critical routines. Then, they find out the heavy-lifting was memory-unsafe code called by Rust.

It does feel misleading if a reader thought Rust was replacing ASM/C/C++ in the low-level parts. I mean, even the AI people are getting high performance by wrapping unsafe accelerator code in Python. So, what does that prove?

In these situations, I might advertise that the protocol engine is in memory-safe code while the primitives are still unsafe. Something like that.


The lowest-level routines need to be written mostly in assembly to have constant-time execution (which is a difficult task even in assembly, given the complexity of modern CPUs). None of Rust, C, or C++ can guarantee constant-time execution, and all three have aggressive optimisers that can remove "useless" code that is meant to defend against side-channel leaks.
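
To make the "useless code" point concrete, here is the classic example: a comparison written to run in constant time, which an optimizer is still free to rewrite into an early-exit loop, since timing isn't part of the language's observable behavior (hence the retreat to assembly, or hints like core::hint::black_box):

    // Sketch: constant-time equality by accumulating differences rather
    // than returning at the first mismatch. Nothing in Rust/C/C++
    // semantics guarantees the optimizer preserves the timing behavior.
    fn ct_eq(a: &[u8], b: &[u8]) -> bool {
        if a.len() != b.len() {
            return false;
        }
        let mut diff = 0u8;
        for (x, y) in a.iter().zip(b.iter()) {
            diff |= x ^ y;
        }
        // black_box discourages (but cannot rule out) shortcut codegen
        core::hint::black_box(diff) == 0
    }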

However, there's more to TLS than just the lowest-level primitives. There's parsing, packet construction, certificate handling, protocol logic, buffer management, connection management, etc. These things can be written in safe Rust.


That’s memory-safe Rust mixed with unsafe assembly. The Rust should block many errors that would exist in an unsafe stack. There’s definitely benefits even if the whole program is no longer memory safe.

It’s also the same strategy I would have used, except maybe attempting extra verification of the assembly. It’s one of the best choices with today’s tools. There’s work on constant-time compilation and certification, but I don’t know its maturity.


It's a post about a memory safe TLS stack outperforming the dominant memory unsafe TLS stack, with metrics. The only observation that detractors are making is that, as with virtually every TLS stack, the lowest-level cryptography primitives are done in assembly. Ok, and?


Because, by definition, it’s not a memory-safe TLS stack at that point. Security is only as strong as its weakest link. If critical components aren’t memory safe, we don’t usually call the whole thing memory safe, or claim it’s in a memory-safe language, without clear qualifiers.

The detractors are talking about how they’re marketing or describing it. They want the memory safe and Rust labels to only be used for memory safe and purely-Rust programs. That’s fair.

Outside the marketing, the stack is good work. I’m grateful for all the TLS teams do to keep us safer.


I am switching to Zig after writing Rust professionally for 5+ years, but this take doesn't make any sense: having a small amount of unsafe primitives is not the same as having all of your code unsafe. Higher-level logic code especially can have a lot of mistakes, and the low-level primitives will very likely be written by more experienced and careful people. This is the whole point of Rust, even if it's questionable whether it reaches it. The title only says rustls beats the other libraries, which is objectively true, so I don't see what is misleading here.


> this take doesn't make any sense: having a small amount of unsafe primitives is not the same as having all of your code unsafe

I've been arguing this for years. It makes the area you need to review more tightly much smaller, which makes it way easier to find bugs in the first place. I sometimes wonder whether 'unsafe' was the right choice of keyword, because to people who don't understand the language, it conveys the sense that Rust doesn't help with memory safety at all.

I've written a bunch of Rust, and rarely needed to use unsafe. I'd say less than 0.1% of the lines written.

Aside from that, unsafe Rust still has a lot more safety precautions than standard C++. It doesn't even deactivate the borrow checker. [1]

[1] https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html
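
A tiny example of what [1] means in practice: an unsafe block only unlocks a few extra operations, and ordinary borrow checking still applies inside it:

    fn main() {
        let x: u32 = 42;
        let p = &x as *const u32;
        // Dereferencing a raw pointer is one of the few extra powers
        // `unsafe` grants:
        let y = unsafe { *p };
        // The borrow checker still runs inside unsafe blocks; e.g. two
        // overlapping `&mut` borrows are rejected there just the same.
        println!("{y}");
    }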


In the past, safe vs. unsafe meant whether invariants were preserved in all executions of your code. Was your code type- and memory-safe by default in all situations? Was there a guarantee? If so, it was safe. If it could break that guarantee or step outside the type system, it was “unsafe.”

Note: I don’t know enough about Rust to tell you how they should label it.

Another thing you might find interesting is external verification of unsafe modules: you build static analyzers, verifiers, etc. that can prove the absence of entire categories of bugs, especially memory-safety bugs. It’s usually done for small amounts of code. You run that on the code that doesn’t have language-enforced memory safety.

Another technique is making a verified reference implementation that’s used to confirm the high-performance implementation. Their interfaces and structure are designed to match. Then, automated methods for equivalence checking verify that the unsafe code matches the safe code in all observed cases. The equivalence might be formal and/or driven by test generators.
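
A toy version of that equivalence-checking idea, where the "fast" function stands in for a hypothetical optimized or unsafe implementation:

    // Sketch: a slow, obviously-correct reference checked against a
    // faster implementation over many generated inputs.
    fn rotl_ref(x: u32, n: u32) -> u32 {
        // n restricted to 1..=31 so both shifts are well defined
        (x << n) | (x >> (32 - n))
    }

    fn rotl_fast(x: u32, n: u32) -> u32 {
        x.rotate_left(n) // stand-in for the optimized version
    }

    #[test]
    fn matches_reference() {
        for seed in 0u32..1_000_000 {
            let x = seed.wrapping_mul(2_654_435_761); // cheap input generator
            let n = 1 + (seed % 31);
            assert_eq!(rotl_ref(x, n), rotl_fast(x, n));
        }
    }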

You can also wrap the unsafe code in safe interfaces that force it to be used correctly; see the sketch below. I imagine rustls does this to some degree. Projects like miTLS go further, enforcing specific security properties during interactions between verified and unsafe code.
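
That wrapping pattern, sketched; the extern function here is made up for illustration, not rustls's or aws-lc's actual binding:

    // Sketch: one safe function is the only doorway to the unsafe
    // primitive, and its signature enforces the invariants the C or
    // assembly side assumes. `primitive_seal` is hypothetical.
    extern "C" {
        fn primitive_seal(key: *const u8, msg: *const u8, msg_len: usize,
                          out: *mut u8) -> i32;
    }

    pub fn seal(key: &[u8; 32], msg: &[u8]) -> Option<Vec<u8>> {
        let mut out = vec![0u8; msg.len() + 16]; // ciphertext + 16-byte tag
        // SAFETY: all pointers come from live slices; `out` is sized for
        // the ciphertext plus the tag this primitive is assumed to append.
        let rc = unsafe {
            primitive_seal(key.as_ptr(), msg.as_ptr(), msg.len(), out.as_mut_ptr())
        };
        (rc == 0).then_some(out)
    }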

The last thing to consider is abstraction gap attacks. If you mix languages or models, the behavior of one can make the other unsafe just because they work differently, especially in how the compiler structures or links them. This led to real vulnerabilities in Ada code that used C, due purely to the interactions rather than the C code itself. Although previously checked by eye, there’s a new field called secure compilation (or abstract compilation) trying to eliminate these integration vulnerabilities.

Lastly, if it’s not too bad for performance, some have sandboxed the unsafe code, with the interfaces checking communication in both directions. Techniques used include processes (seL4), segments (GEMSOS), capabilities (CHERI), and physical separation (FPGA coprocessors). It’s usually performance-prohibitive to separate crypto primitives like this. Coprocessors can have verified crypto and be faster, though. (See Cryptol-generated VHDL.)


There’s no disagreement between us on the value of using mostly memory safe code. I’ve advocated it here for years.

I also promoted techniques to verify the “unsafe” portions using different verification methods, with some kind of secure linking to avoid abstraction gap attacks.

The detractors were complaining about changing the definition of memory-safe code. It used to mean code in a language immune to classes of memory safety errors: if the code compiles, those errors probably can’t occur. A guarantee.

The new definition they’re using for this project includes core blocks written in a memory-unsafe language that might also nullify the safety guarantees in the other code. When it’s compiled, you don’t know whether it will have memory errors or not. That contradicts what’s expected of memory-safe code.

So, people were objecting to it being described as memory-safe Rust when it includes blocks of memory-unsafe, non-Rust code. There are projects that write the core, performance-critical blocks in safe languages. There are also those making crypto itself safer, like Galois’ Cryptol or SPARK Skein. Using the right terminology helps users know what they’re getting and lets reviewers do apples-to-apples comparisons.

For this one, they might say it’s “mostly safe Rust with performance-critical blocks written in unsafe assembler for speed” (or whatever else is in there). The high-security community has often talked like that. Instead of hurting perception, it makes suppliers more trustworthy and users more educated about well-balanced security.


> Title only says rustls beats the other libraries which is objectively true so don’t see what is misleading here.

You are correct.

Although, communication has two parts: sending and receiving.

An application named “rustFoo” is automatically advertising for Rust, and a title like “RustFoo is faster than Foo” implies, for many readers, that “Rust is faster than <probablyC>”.


And.... It raises the very pointed question as to WHY they are getting better performance when all the performance-critical code is written in C/assembler in the Intel library. It seems inconceivable that 75% of the CPU profile isn't being spent in the Intel crypto library. In which case, big fat so what?

The question is: are they cheating?

Could it possibly be that they have (somewhat suicidally) chosen to force the AVX-512 execution path, when more reasonable implementations have decided that it's not really worth risking halving the performance of EVERY OTHER TASK ON THE ENTIRE COMPUTER in order to use AVX-512 for a performance gain that isn't going to matter except in the very tiniest slice of use cases: big iron running on the edge with dozens (hundreds?) of gazillo-bit/s network adapters, doing nothing but streaming TLS connections? Plus the fact that on previous-generation CPUs you'd have to lock your TLS encryption code to a particular CPU core, which is also a Really Bad Thing To Do for a TLS transfer.

I rather suspect it's entirely that.

Even on latest-generation Intel CPUs it's not clear whether using AVX-512 for TLS is a sensible choice. AVX-512 still drops the processor frequency by 10% on latest-gen CPUs, so every core on the entire CPU would have to be spending 80% (60%?) of its time running TLS crypto code in order to realize an actual benefit from the AVX-512 crypto code.
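
As a back-of-envelope check on that break-even claim (illustrative numbers only; the real penalty and speedup depend heavily on the part and the workload):

    // Sketch: Amdahl-style break-even. With a clock penalty p applied to
    // everything and a speedup s on the crypto fraction c of total work,
    // relative throughput is (1 - p) / ((1 - c) + c / s).
    fn main() {
        let p = 0.10; // assumed uniform AVX-512 frequency penalty
        let s = 1.5;  // assumed AVX-512 speedup on the crypto itself
        for c in [0.25, 0.50, 0.75, 1.00] {
            let ratio = (1.0 - p) / ((1.0 - c) + c / s);
            println!("crypto fraction {c:.2}: {ratio:.2}x overall throughput");
        }
    }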

That's what and.


> OpenSSL and its derivatives, widely used across the Internet, have a long history of memory safety vulnerabilities with more being found this year. It's time for the Internet to move away from C-based TLS.

Seems like a cheap shot, considering Rustls's default cryptography is implemented using a fork of OpenSSL's libcrypto.

Of course, there's nothing wrong with writing memory-safe TLS atop C and assembly primitives. But to say that OpenSSL causes memory safety vulnerabilities, without being clear that aws-lc-rs uses FFI to call down into AWS-LC, which is based on libcrypto from OpenSSL and BoringSSL, seems disingenuous.


Most OpenSSL vulnerabilities are in TLS itself and in format processing (X.509, PKCS), and the vulnerabilities that do implicate libcrypto tend not to implicate constructions Rustls would use.


> tend not to implicate constructions Rustls would use.

Ah, so if it's just a question of identifying and using the "good" C code, it really makes me wonder what Rust is actually adding here.


It's replacing the bad C code. Not all C code is equivalently easy or difficult to write.


The ciphers and hashes from OpenSSL have almost always been good C code. I'm sure there have been issues with variable runtimes leaking information, but memory safety won't be a cipher problem.

The protocol code and the X.509 code from OpenSSL haven't always been great. Rust providing memory protection on that is a nice thing.

There's certainly a question of how that makes for significantly more performance; handshake performance is usually dominated by crypto performance, and the same goes for data transfer... so if the crypto is coming from the same library, it's unexpected to see such a big change (10%+ in most graphs, by my eye?).

Seems like it would be interesting to take one of these and really dig into it. That said, I know OpenSSL 3 was having some performance issues in some applications because of new locking behaviors; I don't know if BoringSSL took those or not.


Maybe the good C code isn't C code at all; it's actually Perl scripts! https://github.com/dot-asm/cryptogams


From what I can tell, most if not all of the security issues that have plagued OpenSSL et al. have been in the code around the cryptographic primitives, implementing the protocols, rather than in the primitives themselves, which generally are very self-contained.

That said, the performance speaks for itself here.


Prefix Note: I am a Rust Nerd

This problem exists all the way down in Rust's crypto libraries. The openssl crate just uses bindings, and ring uses BoringSSL code, which is again C under the hood.

The only real Rust-only crypto project is RustCrypto, but they got a bit too clever with traits and generics. Also, the project is pretty undocumented.
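
For reference, typical RustCrypto usage goes through the generic Digest trait; a minimal sketch assuming the sha2 crate:

    // Sketch: hashing via RustCrypto's trait-based API. Everything is
    // generic over the Digest trait, which is both the flexibility and
    // the complexity the parent comment is pointing at.
    use sha2::{Digest, Sha256};

    fn main() {
        let digest = Sha256::digest(b"hello world");
        for byte in digest {
            print!("{byte:02x}");
        }
        println!();
    }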


It seems that RustCrypto performance is not competitive according to these benchmarks: https://jbp.io/graviola/


Cryptography functions are much safer than parsers or complex protocol logic.


AWS-LC was forked from libcrypto, but it’s also (partially) been formally verified.

Comparing it to OpenSSL is a bit much.


libcrypto is a core component of the OpenSSL toolkit, and its source tree is part of the OpenSSL repo: https://github.com/openssl/openssl/tree/master/crypto


Are the improvements due to AWS-LC? I.e., what about a test of that without the Rust wrapper?


TLS != ciphers


A comparison to https://en.wikipedia.org/wiki/LibreSSL would also be nice.


LibreSSL is OpenSSL with a coat of paint (a sane interface) and improved documentation.

They are doing good, solid work, but I would not expect any dramatic improvements in security, and seeing as the OpenBSD project values correctness over speed, very likely a small hit in the speed department.


LibreSSL has also focused on removing code viewed as actively bad (poor/problematic reimplementations of stdlib features) or as less valuable or not aligned with OpenBSD's goals (e.g., FIPS support).

It's probably worth benching just to see the impact of these changes - the fork happened around the same time as BoringSSL, which as these graphs show has quite different perf characteristics.


Will Rustls support ECH? I would like the ability to hide the real server name in the SNI handshake to HAProxy.



