A Formal Verification of Rust's Binary Search Implementation

millstone · on July 22, 2016

Here's the Rust binary search source: https://github.com/rust-lang/rust/blob/27e766d7bc84be992c8dd...

I don't know much about formal methods. I was surprised by this:

> Indeed, if you like at the prelude, you may notice that currently there’s no modelling of overflows – all integer types are unbounded on the Lean side

But integer overflow is one of the most common bugs in binary search [1]. I don't mean to attack this work, but what good are formal methods if they still allow failures like this? Honest question.

> In any case, overflow checking is not very interesting for the implementation at hand because every integer in there is trivially bounded by self.len()

Hmmm. See line 317 or 321: base + head.len() is not "trivially" bounded by self.len(), at least not to me.

[1]: https://research.googleblog.com/2006/06/extra-extra-read-all...

lilyball · on July 22, 2016

If you read the Rust code you linked, you'll notice that it's written very differently than the classic binary search. Specifically, it doesn't keep track of low/high indices, which means it doesn't have the `(low + high) / 2` step that your googleblog.com article says is the source of integer overflow.

> Hmmm. See line 317 or 321: base + head.len() is not "trivially" bounded by self.len(), at least not to me.

Sure it is. Just look at that next line, `s = &tail[1..];`. When head.len()+1 is added to base, we then discard head and the first element of tail, so we just discarded head.len()+1 elements. It should be fairly obvious that if we always add the number of discarded elements to base, then base can never exceed the number of discarded elements, and similarly that we can never discard more elements than self.len(). Therefore, base can never exceed self.len().

millstone · on July 22, 2016

Hey, there's more than one way to overflow an integer!

Trivially true things cannot depend on "fairly obvious" observations. In fact you cannot reason locally here: you must understand the whole code to convince yourself that it does not overflow. Consider an off-by-one error, like replacing line 318 with:

   s=&tail[0..];

and now base will overflow.

Anyways the point of formal methods is to give yourself something better than "read carefully" to convince yourself of correctness.

derkha · on July 22, 2016

Well, that change would also break the proof in general :) . But I see your point.

I've written about overflow checking some more below, but what you'd really want for that is some solid support of subtypes or refinement types - it's not just integers that have to become bounded, but also most data structures. Lean isn't quite there yet, unfortunately, but this should change in the near future with its new focus on powerful automation.

ArkyBeagle · on July 23, 2016

I believe ( (low>>1) + (high>>1) + ((low&1) & (high&1)) ) renders the desired result sans overflow.

Gankro · on July 22, 2016

I imagine a basic proof amounts to: the result (base) is monotonically increasing (even in intermediate calculations) and the length of the slice fits in a usize, so if the correct result is always produced with infinite precision there's no concern for overflow (correct results are bounded by the len).

Similarly `s.len() >> 1` is a strictly decreasing value by division, so no concern for underflow.

millstone · on July 22, 2016

I like that. That's a powerful argument!

For it to work, it requires that every number be returned as part of the result, and that all calculations are monotonically increasing, as you said. Most algorithms won't have those invariants, but perhaps you can tweak them until they do. Once you have, you can extend an infinite-width proof to a finite-width proof.

seanwilson · on July 22, 2016

> I don't mean to attack this work, but what good are formal methods if they still allow failures like this? Honest question.

It's research and not a production product: they're likely concentrating on problems that are original and interesting to the research community instead of looking at problems that the community knows are already solvable.

euyyn · on July 22, 2016

Not "trivially", sure :) but it's not hard to prove either.

Animats · on July 22, 2016

Nice. But you have to check for integer overflow.

Unwinding assignments into a single-assignment notation is common in verification. (We did it automatically 30 years ago in the Pascal-F verifier.) Rust is expressive enough that you can do that in Rust itself, which is convenient. It's not "functional" that's important for this; it's single-assignment. Each variable in the proofs can have only one meaning, because the proofs have no notion of time. (Well, there's temporal logic, but you don't need to go that route.)

Whether you should trust the compiler is always an issue. When we did this, we verified at the stack-machine level. At that level, operations are like "pop two operands from stack, add in 16 bit integer mode, push result". At that level, semantics are relatively simple. If you have the compiler's dictionary available, you can give names to all those things being pushed and popped, much as a debugger does.

derkha · on July 22, 2016

By calling the verification 'simple' myself, I concede that it may also be _simplistic_ in parts. Perhaps I should have emphasized that this project is mostly about algorithmic correctness currently. A bug in the compiler backend would not be a bug in the algorithm, and neither would I call a function unexpectedly panicking on more than 2^63 elements an algorithmic bug. And if you compile your Rust program with overflow checks (default in debug mode), the verification can still guarantee partial correctness: If it does not panic due to some overflow, the result is correct.

You're correct in that what I'm constructing is some form of Dynamic Single Assignment. But together with the eradication of mutable references (and perhaps other effects in the future), it felt better to me to describe the transformation from the purity aspect.

ktRolster · on July 22, 2016

Whether you should trust the compiler is always an issue.

The answer is no.

Animats · on July 23, 2016

True. It's also easier to let the regular compiler front end do a lot of the work. Verification can occur after templates have been instantiated, macros have run, types have been determined, structure field offsets have been assigned, and syntax errors have been detected. This gets rid of most ambiguities of working at the source code level.

In the early days of program verification it was popular to work at the source code level, because automatic efforts were trying to emulate what humans were doing by hand. Early programming verification was trying to emulate what mathematicians did. Efforts were made to describe languages axiomatically at the source level. Now that's recognized as both very hard and likely to lead to errors.

mikejmoffitt · on July 22, 2016

This would have been better written in Ru-

Oh.

kbenson · on July 22, 2016

I am a Rust proponent, and I approve of this in a self deprecating humorous way. Well done, sir.

kartD · on July 22, 2016

I actually wanted to know about the reverse operation. Can we get some formal verification and modeling software to generate Rust(or C) code?

When I used Uppaal in my grad school course, I kept messing up when I tried to hand translating my code from the model.

seanwilson · on July 22, 2016

> I actually wanted to know about the reverse operation. Can we get some formal verification and modeling software to generate Rust(or C) code?

Haven't tried it but this claims to be able to extract Rust programs from Coq proofs:

https://github.com/pirapira/coq2rust

derkha · on July 22, 2016

There is a PR for a Lean backend producing both C++ and Rust: https://github.com/leanprover/lean/pull/1090 Note, however, that this is purely focused on executing Lean code, not integrating with other C++/Rust sources.

nickpsecurity · on July 23, 2016

My first thought was why Lean vs Isabelle/HOL or Coq given all they have that can be leveraged? And selectively depending on how challenging or different you want your project.

derkha · on July 23, 2016

I actually started the project in Isabelle: https://github.com/Kha/electrolysis/tree/isabelle/thys. Automation was nice, translating Rust traits not so much. You really want them to be type classes, which Isabelle only supports for the single-parametric case.

As of right now, the project would probably be easier in Coq. But I'm confident that Lean 3 will eventually feature more powerful automation (and no Ltac). It is, after all, being developed by Leonardo de Moura, one of the creators of Z3.

kartD · on July 23, 2016

All of these look really interesting, I'll take a look at time. Thanks everyone!

fmap · on July 22, 2016

Very cool project with a great name!

If you haven't looked at it already, you may be interested in Arthur Charguéraud's work on program verification using characteristic formulas (http://www.chargueraud.org/softs/cfml). CFML is a similar tool for ocaml verification using Coq.

derkha · on July 23, 2016

I have heard of it, it features a nice application of Separation Logic. Though not needing that kind of logic at all does feel even better! There is also an extension of an extension of CFML to asymptotic complexity analysis, after which I may model my own analysis: http://gallium.inria.fr/blog/formally-verified-complexity-wi...

hannesm · on July 23, 2016

This is similar to my experience verifying imperative code (as part of http://itu.dk/research/tomeso): rewrite code as pure as possible (learning that mutation is premature optimisation), then try to specify and prove it.

Aside, if we're reinventing the wheel anyways, why not in a proper style? There's no reason to not deploy purely functional code (+some effectful shims). Think https://mirage.io https://nqsb.io -- rust has way too much mutation and unsafe builtin already...

vvanders · on July 22, 2016

Very cool. Love the part about lifting mutable borrows into purely functional components.

14113 · on July 23, 2016

On the topic of suggestions for future work, and discussion of time complexity proofs, would any of the techniques presented here [1] be useful? As far as I understand their techniques would be quite applicable (and similar to what you've already looked at) e.g. a comparison counting monad etc.

[1] http://dl.acm.org/citation.cfm?id=1573604 also available at [2]

[2] http://www.cs.ru.nl/~james/2008-TYPES/final.pdf