That's very nice. It's surprisingly similar to the Pascal-F verifier I worked on in 1983. I recently found the sources for that, and since they were approved for release, I put them on Github.[1] (It's not runnable yet; passes 1 and 2 now work, but pass 4, which needs to be ported to a more modern LISP, doesn't.) The approach is amazingly similar - automatic range and bounds verification, entry and exit assertions, pure functions can be used in assertions, additional lemmas can be proven.
Here are some Pascal-F examples that passed the verifier.[2] "bubble.pf" is a bubble sort. "circle.pf" is a circular buffer. Everybody who does verification work does those. Note how close these are to the similar Dafny examples. The terminology is different but equivalent; Dafny uses "requires" and "ensures" where Pascal-F used "entry" and "exit". Dafny uses "old(f)" to talk about the old value (at function entry) of a variable, while Pascal-F uses "f.old". Both systems have lemmas, proved recursively with some extra effort by the user.
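The requires/ensures correspondence can be sketched with runtime-checked contracts. This hypothetical Rust snippet only mimics at run time what Dafny and Pascal-F prove statically; the saved copy plays the role of Dafny's "old(f)" / Pascal-F's "f.old":

```rust
// Runtime-checked sketch of a Dafny-style contract (hypothetical;
// a real verifier discharges these checks at compile time).
fn increment_all(v: &mut Vec<i32>) {
    // requires / entry: no element may overflow on increment
    assert!(v.iter().all(|&x| x < i32::MAX));
    let old = v.clone(); // snapshot standing in for old(f) / f.old

    for x in v.iter_mut() {
        *x += 1;
    }

    // ensures / exit: every element is exactly one more than before
    assert!(v.iter().zip(old.iter()).all(|(new, old)| *new == *old + 1));
}

fn main() {
    let mut v = vec![1, 2, 3];
    increment_all(&mut v);
    println!("{:?}", v); // [2, 3, 4]
}
```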
A more useful Pascal-F example is "engine1.pf", which is a simple engine control program for an auto engine. This has concurrency and real time constraints. Proof goals include proving that the spark is fired exactly once per cylinder time, and that if the engine stops rotating, the fuel pump is cut off.
Past this level, the next problem is scaleup. What do you want to prove about a large program? These simple verifiers don't address that. However, this sort of thing is really good at preventing buffer overflows and other things which break the program in a low-level way.
I'd like to see Rust have Dafny's level of proof capability. This would help deal with "unsafe" code where the programmer claims Rust's invariants are preserved. Most of those situations just require proving that some assertion holds both before and after some data manipulation, even though there are moments when it doesn't hold. Verification technology is quite capable of that, and has been for a long time.
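The "holds before and after, but not during" shape can be illustrated with a small unsafe routine (a hypothetical example, not from any real Rust verifier). The debug_assert! lines mark what a Dafny-level tool would prove statically; mid-loop, two slots are in mid-swap and the exit property does not yet hold:

```rust
// In-place reverse via raw pointers. Invariant: the slice holds the
// same elements at exit as at entry (checked here only via a weaker
// consequence, the element sum, to keep the sketch cheap).
fn reverse_in_place(v: &mut [i32]) {
    let sum_before: i64 = v.iter().map(|&x| x as i64).sum();
    let n = v.len();
    let p = v.as_mut_ptr();
    for i in 0..n / 2 {
        // SAFETY: i and n-1-i are in bounds and distinct since i < n/2.
        unsafe { std::ptr::swap(p.add(i), p.add(n - 1 - i)) };
    }
    // exit assertion: same multiset of elements (weaker sum check)
    debug_assert_eq!(v.iter().map(|&x| x as i64).sum::<i64>(), sum_before);
}

fn main() {
    let mut a = [1, 2, 3, 4];
    reverse_in_place(&mut a);
    println!("{:?}", a); // [4, 3, 2, 1]
}
```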
(Having 3000x as much compute power as we had in 1982 helps a lot.)
Porting old Berkeley Pascal to Free Pascal wasn't too bad. Other than clashes over new reserved words and conversion to modern I/O, that went well.
The LISP part, though, is not going well. Porting clever 1970s Stanford AI Lab macros written on the original SAIL machine to modern Common LISP is hard. Anybody with a knowledge of MACLISP want to help? The code is all on Github. If you're into this, try to convert the macros "defsmac" and "defmac" in this file [1] to clisp. Submit a pull request.
At least now both languages can do POSIX I/O. We had some implementation-specific C glue code in the original just to talk to pipes. That can go away.
Looking to the future, Dafny level verification could easily deal with two of the big unsafe areas in Rust - partially initialized arrays (used for collections that can grow), and backpointers. Those have clear invariants, easy proofs, and the proofs can be done by an automatic prover. You don't need a labor-intensive interactive system like Coq.
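The partially-initialized-array case has exactly the kind of clear invariant described above. Here's a hypothetical minimal fixed-capacity vector (not real standard-library code) whose safety argument is the invariant "len <= N, and slots 0..len are initialized":

```rust
use std::mem::MaybeUninit;

// Hypothetical growable buffer over uninitialized storage. The
// invariant a Dafny-level prover could discharge automatically:
//   len <= N, and buf[0..len] are initialized.
// (A real implementation would also need Drop; omitted here.)
struct FixedVec<T, const N: usize> {
    buf: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> FixedVec<T, N> {
    fn new() -> Self {
        // SAFETY: an array of MaybeUninit needs no initialization.
        Self { buf: unsafe { MaybeUninit::uninit().assume_init() }, len: 0 }
    }

    fn push(&mut self, x: T) {
        assert!(self.len < N); // preserves len <= N
        self.buf[self.len].write(x);
        self.len += 1; // invariant restored: one more slot initialized
    }

    fn get(&self, i: usize) -> Option<&T> {
        if i < self.len {
            // SAFETY: i < len, so buf[i] is initialized by the invariant.
            Some(unsafe { self.buf[i].assume_init_ref() })
        } else {
            None
        }
    }
}

fn main() {
    let mut v: FixedVec<i32, 4> = FixedVec::new();
    v.push(10);
    v.push(20);
    println!("{:?} {:?}", v.get(1), v.get(3)); // Some(20) None
}
```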
The Dafny approach has the nice property that it's all right there in the program source code. There's no separate verification data to be kept in sync. We did that in Pascal-F, too, but many later systems have the user working in a completely different mathematical notation for verification purposes. This is a huge turnoff for programmers.
For those of you not familiar with Haskell, it's worth pointing out that types and proofs are equivalent (with caveats). What this means in practice for Dafny is that anything you can do in it can equally be expressed in a language like Idris (with footnotes). Equally, it means that the trickiness encountered in heavily typed functional programming languages is carried over directly to working with program verifiers. At that point, it's more a case of what's easier to grok and work with.
I mention this, because I see a lot of people who dismiss something like Idris out of hand saying they want correctness checking, not realizing they're basically equivalent.
> For those of you not familiar with Haskell, it's worth pointing out that types and proofs are equivalent (with caveats). What this means in practice for Dafny is that anything you can do in it can equally be expressed in a language like Idris (with footnotes).
For those of you not familiar with Brainfuck, it's worth pointing out that it's Turing complete and hence equivalent to Haskell (with caveats). What this means in practice for Haskell is that anything you can do in it can equally be expressed in a language like Brainfuck (with footnotes).
Footnote: The programming and proving styles of Dafny and Idris are (roughly) as similar as the programming styles of Haskell and Brainfuck.
[What I am trying to say is this: Just because two systems do "verification" and are hence "equivalent", it's silly to suggest that they are pragmatically similar.]
The main advantage of program verifiers (e.g. Dafny) is that there's no (or very little) type/term division. A runtime function `+` is the same as the proof/mathematical function `+`, runtime numbers are mathematical numbers, etc. (it gets a bit more complicated with equality AFAIK). As a result, in Haskell you have to reimplement the whole (Peano) integer arithmetic again in the type system to prove e.g. array bounds checks, whereas Dafny automatically maps between proofs and code.
I think that Haskell is slowly doing away with this (they're unifying the term and proof language). I think that Idris completely unifies the two. The only difference that remains then is that Dafny uses an automated theorem prover (Z3), whereas in Haskell/Idris you have to do proofs by yourself.
In short: no. It's exactly as cumbersome. If you look at the other comments you'll see plenty of people that found it hard to work with. Equally, if you look at Idris mailing lists you'll find people who love it.
Another way of looking at it is which is easier to grok: imperative or functional flow.
It's complicated. Basically, verifying programs is hard and pretty cumbersome regardless of the formalism, and the differences in the theory don't matter too much for most things people do. Having said that, making practical use of dependent types is currently behind proof tools for direct logic (in terms of automated provers, model checking etc.). What people sometimes do is start with direct logic expressed as a contract (say, like in Dafny, Java with JML or Ada SPARK), run automated tools on those (SMT solvers), and if the solvers fail, generate obligations for a logic-based interactive prover or a dependent-type one, at which point the differences matter even less. But working completely within a dependently typed language is currently not recommended unless you just want to learn or play a bit, or unless you're an expert. Dependent types are just not yet "production ready".
I had to use this for an algorithms class while getting my CS master's. Very, very tough to use for anything beyond simple experiments. It has confusing syntax and incomplete documentation. But when you finally get your program to execute, you can be sure it's perfect! (Supposedly)
What were the things that were confusing? I tried it out and found it much more intuitive than Coq (especially in the proofs). The keywords made sense, and so did the error messages. That said, I come from a more math-oriented background, and I've played around with this sort of thing for a while now. So yeah, I'm honestly interested in hearing your perspective!
From what I remember, the assignment was to write a function that implemented quick sort (or something else moderately complex) and getting it to run and satisfy the proof checker was exhausting.
Wow, the feedback you got on that forum has a pretty rare level of detail and helpfulness. Confusing syntax, perhaps, but the community is on point for sure!
Does Dafny not provide any macros for their invariant declarations and ensures clauses? The QuickSort example has a lot of the same invariants again and again, and most of them boil down to "I swear that the integer values are valid indices into the array".
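For comparison, in a language with a macro layer you can factor out such a repeated clause. This is a hypothetical Rust runtime-assertion macro, not anything Dafny actually provides, just to show the shape of the abstraction being asked for:

```rust
// Hypothetical: bundle the recurring "these are valid indices"
// invariant into one macro instead of repeating a clause per index.
macro_rules! valid_indices {
    ($arr:expr, $($i:expr),+) => {
        $( assert!($i < $arr.len(), "index out of bounds"); )+
    };
}

fn swap3(a: &mut [i32], i: usize, j: usize, k: usize) {
    valid_indices!(a, i, j, k); // one line instead of three clauses
    a.swap(i, j);
    a.swap(j, k);
}

fn main() {
    let mut a = [1, 2, 3];
    swap3(&mut a, 0, 1, 2);
    println!("{:?}", a); // [2, 3, 1]
}
```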
What are the competitors to this language? Can someone provide a list? I'd like to start learning a provable language, but I'm not sure Dafny is the one I should start with.
Per this reddit comment [1], the languages supporting some kind of compile-time constraint-checking/proving include Why3, Dafny, Fstar, Coq, Agda, and Idris.
Also, some languages have been extended to support it: LiquidHaskell, Frama-C, and Spark/Ada. I also recall some Java extension but can't find it atm.
Per the Wikipedia page for Whiley[1] the languages akin to Dafny are mentioned when explaining what Whiley is: “The primary purpose of such a tool is to improve software quality by ensuring a program meets a formal specification. In other words, to help identify and eliminate software bugs in the programs being developed. Whiley follows many attempts to develop such tools, including notable efforts such as SPARK/Ada, ESC/Java, Spec#, Dafny, Why3 and Frama-C.”
I'm thinking there must be a reason they do not mention F*, Coq, Agda, and Idris, perhaps because they are proof assistants[2] rather than automatic provers of programs with formal specifications. See also Isabelle and others. Regarding Idris (which you mention but is not on that page) and Epigram (which is noteworthy and again not on that page), neither are production ready and the latter is unmaintained.
> I'm thinking there must be a reason they do not mention F* , Coq, Agda, and Idris
The ones they do mention are all imperative languages with added deductive verification in a Hoare calculus style (think loop invariants). The ones you mention are functional languages with a different style of proof (think inductive proofs).
Of course there are overlaps between the two styles (notably in Why3), but there are pronounced differences between them.
Coq is the one I use most. I've heard of Lean as well though.
Those are both proof assistants, not theorem provers. Dafny uses Z3 to attempt to automatically prove theorems about your code. Lean or Coq (for the most part) require that you write the proofs manually in a special language they provide, and they will tell you whether or not your proof has errors.
In general these languages lie on a spectrum from more automatic to more manual. The automatic ones potentially take more time and are more limited, but require less of you; the manual ones such as Lean or Coq are more expressive but harder to use, because you have to provide all of the proofs.
I should say all this with the caveat that I'm only learning these things now so I've probably said something wrong. Please correct me if that is the case.
It is a gradual range from untyped (assembly) to proof-anything (Isabelle, Coq). The more interesting question is: Where are the sweet spots?
The current mainstream (Java, C++, etc.) form a big cluster "sweetrange" of roughly equivalent type safety. Rust with its ownership type system goes one step towards more proofs. ATS is even further. Agda and Idris want to be usable programming languages while providing as much proof as possible. Isabelle can generate code, but is also used as just a theorem prover.
> Rust with its ownership type system goes one step towards more proofs.
Does it really, though? The ownership system only talks about ownership, which you only need to reason about due to the artificial choice of avoiding garbage collection. So yes, you get closer to proofs, but only to proofs of properties that are not necessary to prove for Java, say.
You might use the ownership and lifetime system to derive must-not-alias guarantees that might help with some verification tasks, but I'm not sure how much it would help.
I'd love to see a deductive verification system for Rust, but I wouldn't expect it to be the silver bullet for verification.
Author of that project here; it's the other way around - I'm using the absence of aliasing to turn mutable Rust code immutable, which I can then embed in Lean. But you're right that ownership (i.e., affine types) per se doesn't help.
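The mutable-to-immutable idea can be illustrated in miniature (this is only a sketch of the general principle, not the project's actual translation scheme): because `&mut` guarantees no aliasing, a mutating loop can be read as a pure fold over successive accumulator values, which is the form a prover like Lean works with directly.

```rust
// Imperative version: mutation of a local accumulator.
fn sum_mut(v: &[i64]) -> i64 {
    let mut acc = 0;
    for &x in v {
        acc += x; // each step consumes the old acc, yields a new one
    }
    acc
}

// The aliasing-free functional reading of the same loop: the mutation
// becomes a chain of immutable values threaded through a fold.
fn sum_fold(v: &[i64]) -> i64 {
    v.iter().fold(0, |acc, &x| acc + x)
}

fn main() {
    println!("{} {}", sum_mut(&[1, 2, 3]), sum_fold(&[1, 2, 3])); // 6 6
}
```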
There's the so-called 'B Method', which seems very similar to this on the programming language side. I haven't checked the proof language so that may be very different.
I've got a two-parter on my blog giving a glimpse of B's programming language[1] and proof language[2].
Perfect Developer [1] is another one not yet mentioned. You write and verify your "specifications" (program-parts to be verified) using a language called "Perfect" and generate C++, C#, or Java from that (SPARK Ada generator in progress).
For those interested, Java has a mature specification language called JML, with tools similar to those provided by Dafny[1] and many more. In addition to SMT solvers, you can use interactive proofs for obligations that are not automatically discharged[2][3], or use the specifications to generate concolic tests[4].
Unless..... "As you type in your program, the verifier constantly looks over your shoulders and flags any errors." .... it's a feature being highlighted =)
Assuming it's anything like the Emacs mode for Coq, they're probably just using an editor/editor plugin that replaces the operators/keywords with the mathematical symbols.
I had the same argument with the Stanford Pascal Verifier people around 1980. They were using the SAIL PDP-10, with keyboards with extra characters and extra shift keys TOP and META. Assertions were written with mathematical notation but code was written in ASCII. We stuck to ASCII and Pascal operators for Pascal-F.
Today, everybody has Unicode output, and all the math symbols are potentially typeable, but it drives programmers nuts if they have to enter them. If you're going to do a verification system, you have to design it for programmers, not mathematicians, or it won't be used. This is a serious problem with formal methods. The formalism is a means to an end, not an end in itself. The end goal is bug-free code.
Dafny for Rust has potential. One big problem is where to put the assertions inside terse functional notation, such as a word-wrap routine in Rust written as a single iterator chain.
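As a hypothetical stand-in (not the original snippet), a terse Rust chain shows the problem: there is no statement boundary to hang an assertion on until the chain is broken into named steps.

```rust
// One-liner: nowhere to assert the precondition n > 0, or the
// postcondition that the result never exceeds n words.
fn first_words(text: &str, n: usize) -> Vec<&str> {
    text.split_whitespace().take(n).collect()
}

// Broken into steps, the pre- and postconditions have a home.
fn first_words_checked(text: &str, n: usize) -> Vec<&str> {
    assert!(n > 0); // precondition
    let words: Vec<&str> = text.split_whitespace().take(n).collect();
    assert!(words.len() <= n); // postcondition
    words
}

fn main() {
    println!("{:?}", first_words_checked("a b c d", 2)); // ["a", "b"]
}
```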
The screenshot shows what looks like the Emacs mode, which actually just masks the ASCII you type with Unicode characters to make it look nice. In practice you're just writing plain ASCII: https://github.com/houli/verification/blob/master/strings3.d...
[1] https://github.com/John-Nagle/pasv [2] https://github.com/John-Nagle/pasv/blob/master/src/Examples