That's very nice. It's surprisingly similar to the Pascal-F verifier I worked on in 1983. I recently found the sources for that, and since they were approved for release, I put them on Github.[1] (It's not runnable yet; passes 1 and 2 now work, but pass 4, which needs to be ported to a more modern LISP, doesn't.) The approach is amazingly similar - automatic range and bounds verification, entry and exit assertions, pure functions can be used in assertions, additional lemmas can be proven.
Here are some Pascal-F examples that passed the verifier.[2] "bubble.pf" is a bubble sort. "circle.pf" is a circular buffer. Everybody who does verification work does those. Note how close these are to the similar Dafny examples. The terminology is different but equivalent; Dafny uses "requires" and "ensures" where Pascal-F used "entry" and "exit". Dafny uses "old(f)" to talk about the old value (at function entry) of a variable, while Pascal-F uses "f.old". Both systems have lemmas, proved recursively with some extra effort by the user.
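The requires/ensures correspondence can be sketched with runtime-checked contracts. This hypothetical Rust snippet only mimics at run time what Dafny and Pascal-F prove statically; the saved copy plays the role of Dafny's "old(f)" / Pascal-F's "f.old":

```rust
// Runtime-checked sketch of a Dafny-style contract (hypothetical;
// a real verifier discharges these checks at compile time).
fn increment_all(v: &mut Vec<i32>) {
    // requires / entry: no element may overflow on increment
    assert!(v.iter().all(|&x| x < i32::MAX));
    let old = v.clone(); // snapshot standing in for old(f) / f.old

    for x in v.iter_mut() {
        *x += 1;
    }

    // ensures / exit: every element is exactly one more than before
    assert!(v.iter().zip(old.iter()).all(|(new, old)| *new == *old + 1));
}

fn main() {
    let mut v = vec![1, 2, 3];
    increment_all(&mut v);
    println!("{:?}", v); // [2, 3, 4]
}
```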
A more useful Pascal-F example is "engine1.pf", which is a simple engine control program for an auto engine. This has concurrency and real time constraints. Proof goals include proving that the spark is fired exactly once per cylinder time, and that if the engine stops rotating, the fuel pump is cut off.
Past this level, the next problem is scaleup. What do you want to prove about a large program? These simple verifiers don't address that. However, this sort of thing is really good at preventing buffer overflows and other things which break the program in a low-level way.
I'd like to see Rust have Dafny's level of proof capability. This would help deal with "unsafe" code where the programmer claims Rust's invariants are preserved. Most of those situations just require proving that some assertion holds both before and after some data manipulation, even though there are moments when it doesn't hold. Verification technology is quite capable of that, and has been for a long time.
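The "holds before and after, but not during" shape can be illustrated with a small unsafe routine (a hypothetical example, not from any real Rust verifier). The debug_assert! lines mark what a Dafny-level tool would prove statically; mid-loop, two slots are in mid-swap and the exit property does not yet hold:

```rust
// In-place reverse via raw pointers. Invariant: the slice holds the
// same elements at exit as at entry (checked here only via a weaker
// consequence, the element sum, to keep the sketch cheap).
fn reverse_in_place(v: &mut [i32]) {
    let sum_before: i64 = v.iter().map(|&x| x as i64).sum();
    let n = v.len();
    let p = v.as_mut_ptr();
    for i in 0..n / 2 {
        // SAFETY: i and n-1-i are in bounds and distinct since i < n/2.
        unsafe { std::ptr::swap(p.add(i), p.add(n - 1 - i)) };
    }
    // exit assertion: same multiset of elements (weaker sum check)
    debug_assert_eq!(v.iter().map(|&x| x as i64).sum::<i64>(), sum_before);
}

fn main() {
    let mut a = [1, 2, 3, 4];
    reverse_in_place(&mut a);
    println!("{:?}", a); // [4, 3, 2, 1]
}
```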
(Having 3000x as much compute power as we had in 1982 helps a lot.)
Porting old Berkeley Pascal to Free Pascal wasn't too bad. Other than clashes over new reserved words and conversion to modern I/O, that went well.
The LISP part, though, is not going well. Porting clever 1970s Stanford AI Lab macros written on the original SAIL machine to modern Common LISP is hard. Anybody with a knowledge of MACLISP want to help? The code is all on Github. If you're into this, try to convert the macros "defsmac" and "defmac" in this file [1] to clisp. Submit a pull request.
At least now both languages can do POSIX I/O. We had some implementation-specific C glue code in the original just to talk to pipes. That can go away.
Looking to the future, Dafny level verification could easily deal with two of the big unsafe areas in Rust - partially initialized arrays (used for collections that can grow), and backpointers. Those have clear invariants, easy proofs, and the proofs can be done by an automatic prover. You don't need a labor-intensive interactive system like Coq.
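The partially-initialized-array case has exactly the kind of clear invariant described above. Here's a hypothetical minimal fixed-capacity vector (not real standard-library code) whose safety argument is the invariant "len <= N, and slots 0..len are initialized":

```rust
use std::mem::MaybeUninit;

// Hypothetical growable buffer over uninitialized storage. The
// invariant a Dafny-level prover could discharge automatically:
//   len <= N, and buf[0..len] are initialized.
// (A real implementation would also need Drop; omitted here.)
struct FixedVec<T, const N: usize> {
    buf: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> FixedVec<T, N> {
    fn new() -> Self {
        // SAFETY: an array of MaybeUninit needs no initialization.
        Self { buf: unsafe { MaybeUninit::uninit().assume_init() }, len: 0 }
    }

    fn push(&mut self, x: T) {
        assert!(self.len < N); // preserves len <= N
        self.buf[self.len].write(x);
        self.len += 1; // invariant restored: one more slot initialized
    }

    fn get(&self, i: usize) -> Option<&T> {
        if i < self.len {
            // SAFETY: i < len, so buf[i] is initialized by the invariant.
            Some(unsafe { self.buf[i].assume_init_ref() })
        } else {
            None
        }
    }
}

fn main() {
    let mut v: FixedVec<i32, 4> = FixedVec::new();
    v.push(10);
    v.push(20);
    println!("{:?} {:?}", v.get(1), v.get(3)); // Some(20) None
}
```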
The Dafny approach has the nice property that it's all right there in the program source code. There's no separate verification data to be kept in sync. We did that in Pascal-F, too, but many later systems have the user working in a completely different mathematical notation for verification purposes. This is a huge turnoff for programmers.
For those of you not familiar with Haskell, it's worth pointing out that types and proofs are equivalent (with caveats). What this means in practice for Dafny is that anything you can do in it can equally be expressed in a language like Idris (with footnotes). Equally, it means that the trickiness encountered in heavily typed functional programming languages is carried over directly to working with program verifiers. At that point, it's more a case of what's easier to grok and work with.
I mention this, because I see a lot of people who dismiss something like Idris out of hand saying they want correctness checking, not realizing they're basically equivalent.
> For those of you not familiar with Haskell, it's worth pointing out that types and proofs are equivalent (with caveats). What this means in practice for Dafny is that anything you can do in it can equally be expressed in a language like Idris (with footnotes).
For those of you not familiar with Brainfuck, it's worth pointing out that it's Turing complete and hence equivalent to Haskell (with caveats). What this means in practice for Haskell is that anything you can do in it can equally be expressed in a language like Brainfuck (with footnotes).
Footnote: The programming and proving styles of Dafny and Idris are (roughly) as similar as the programming styles of Haskell and Brainfuck.
[What I am trying to say is this: Just because two systems do "verification" and are hence "equivalent", it's silly to suggest that they are pragmatically similar.]
The main advantage of program verifiers (e.g. Dafny) is that there's no (or very little) type/term division. A runtime function `+` is the same as the proof/mathematical function `+`, runtime numbers are mathematical numbers, etc. (it gets a bit more complicated with equality AFAIK). As a result, in Haskell you have to reimplement the whole (Peano) integer arithmetic again in the type system to prove e.g. array bounds checks, whereas Dafny automatically maps between proofs and code.
I think that Haskell is slowly doing away with this (they're unifying the term and proof language). I think that Idris completely unifies the two. The only difference that remains then is that Dafny uses an automated theorem prover (Z3), whereas in Haskell/Idris you have to do proofs by yourself.
In short: no. It's exactly as cumbersome. If you look at the other comments you'll see plenty of people that found it hard to work with. Equally, if you look at Idris mailing lists you'll find people who love it.
Another way of looking at it is which is easier to grok: imperative or functional flow.
It's complicated. Basically, verifying programs is hard and pretty cumbersome regardless of the formalism, and the differences in the theory don't matter too much for most things people do. Having said that, making practical use of dependent types is currently behind proof tools for direct logic (in terms of automated provers, model checking etc.). What people sometimes do is start with direct logic expressed as a contract (say, like in Dafny, Java with JML or Ada SPARK), run automated tools on those (SMT solvers), and if the solvers fail, generate obligations for a logic-based interactive prover or a dependent-type one, at which point the differences matter even less. But working completely within a dependently typed language is currently not recommended unless you just want to learn or play a bit, or unless you're an expert. Dependent types are just not yet "production ready".
I had to use this for an algorithms class while getting my CS master's. Very, very tough to use for anything beyond simple experiments. It has confusing syntax and incomplete documentation. But when you finally get your program to execute, you can be sure it's perfect! (Supposedly)
What were the things that were confusing? I tried it out and found it much more intuitive than Coq (especially in the proofs). The keywords made sense, and so did the error messages. That said, I come from a more math-oriented background, and I've played around with this sort of thing for a while now. So yeah, I'm honestly interested in hearing your perspective!
From what I remember, the assignment was to write a function that implemented quick sort (or something else moderately complex) and getting it to run and satisfy the proof checker was exhausting.
Wow, the feedback you got on that forum has a pretty rare level of detail and helpfulness. Confusing syntax, perhaps, but the community is on point for sure!
Does Dafny not provide any macros for their invariant declarations and ensures clauses? The QuickSort example has a lot of the same invariants again and again, and most of them boil down to "I swear that the integer values are valid indices into the array".
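For comparison, in a language with a macro layer you can factor out such a repeated clause. This is a hypothetical Rust runtime-assertion macro, not anything Dafny actually provides, just to show the shape of the abstraction being asked for:

```rust
// Hypothetical: bundle the recurring "these are valid indices"
// invariant into one macro instead of repeating a clause per index.
macro_rules! valid_indices {
    ($arr:expr, $($i:expr),+) => {
        $( assert!($i < $arr.len(), "index out of bounds"); )+
    };
}

fn swap3(a: &mut [i32], i: usize, j: usize, k: usize) {
    valid_indices!(a, i, j, k); // one line instead of three clauses
    a.swap(i, j);
    a.swap(j, k);
}

fn main() {
    let mut a = [1, 2, 3];
    swap3(&mut a, 0, 1, 2);
    println!("{:?}", a); // [2, 3, 1]
}
```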
What are the competitors to this language? Can someone provide a list? I'd like to start learning a provable language, but I'm not sure Dafny is the one I should start with.
Per this reddit comment [1], the languages supporting some kind of compile-time constraint-checking/proving include Why3, Dafny, Fstar, Coq, Agda, and Idris.
Also, some languages have been extended to support it: LiquidHaskell, Frama-C, and Spark/Ada. I also recall some Java extension but can't find it atm.
Per the Wikipedia page for Whiley[1] the languages akin to Dafny are mentioned when explaining what Whiley is: “The primary purpose of such a tool is to improve software quality by ensuring a program meets a formal specification. In other words, to help identify and eliminate software bugs in the programs being developed. Whiley follows many attempts to develop such tools, including notable efforts such as SPARK/Ada, ESC/Java, Spec#, Dafny, Why3 and Frama-C.”
I'm thinking there must be a reason they do not mention F*, Coq, Agda, and Idris, perhaps because they are proof assistants[2] rather than automatic provers of programs with formal specifications. See also Isabelle and others. Regarding Idris (which you mention but is not on that page) and Epigram (which is noteworthy and again not on that page), neither are production ready and the latter is unmaintained.
> I'm thinking there must be a reason they do not mention F* , Coq, Agda, and Idris
The ones they do mention are all imperative languages with added deductive verification in a Hoare calculus style (think loop invariants). The ones you mention are functional languages with a different style of proof (think inductive proofs).
Of course there are overlaps between the two styles (notably in Why3), but there are pronounced differences between them.
Coq is the one I use most. I've heard of Lean as well though.
Those are both proof assistants, not theorem provers. Dafny uses Z3 to attempt to automatically prove theorems about your code. Lean or Coq (for the most part) require that you write the proofs manually in a special language they provide, and they will tell you whether or not your proof has errors.
In general these languages lie on a spectrum from more automatic to more manual. The automatic ones potentially take more time and are more limited, but require less of you; the manual ones such as Lean or Coq are more expressive but harder to use, because you have to provide all of the proofs.
I should say all this with the caveat that I'm only learning these things now so I've probably said something wrong. Please correct me if that is the case.
It is a gradual range from untyped (assembly) to proof-anything (Isabelle, Coq). The more interesting question is: Where are the sweet spots?
The current mainstream (Java, C++, etc.) form a big cluster "sweetrange" of roughly equivalent type safety. Rust with its ownership type system goes one step towards more proofs. ATS is even further. Agda and Idris want to be usable programming languages while providing as much proof as possible. Isabelle can generate code, but is also used as just a theorem prover.
> Rust with its ownership type system goes one step towards more proofs.
Does it really, though? The ownership system only talks about ownership, which you only need to reason about due to the artificial choice of avoiding garbage collection. So yes, you get closer to proofs, but only to proofs of properties that are not necessary to prove for Java, say.
You might use the ownership and lifetime system to derive must-not-alias guarantees that might help with some verification tasks, but I'm not sure how much it would help.
I'd love to see a deductive verification system for Rust, but I wouldn't expect it to be the silver bullet for verification.
Author of that project here; it's the other way around - I'm using the absence of aliasing to turn mutable Rust code immutable, which I can then embed in Lean. But you're right that ownership (i.e., affine types) per se doesn't help.
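The mutable-to-immutable idea can be illustrated in miniature (this is only a sketch of the general principle, not the project's actual translation scheme): because `&mut` guarantees no aliasing, a mutating loop can be read as a pure fold over successive accumulator values, which is the form a prover like Lean works with directly.

```rust
// Imperative version: mutation of a local accumulator.
fn sum_mut(v: &[i64]) -> i64 {
    let mut acc = 0;
    for &x in v {
        acc += x; // each step consumes the old acc, yields a new one
    }
    acc
}

// The aliasing-free functional reading of the same loop: the mutation
// becomes a chain of immutable values threaded through a fold.
fn sum_fold(v: &[i64]) -> i64 {
    v.iter().fold(0, |acc, &x| acc + x)
}

fn main() {
    println!("{} {}", sum_mut(&[1, 2, 3]), sum_fold(&[1, 2, 3])); // 6 6
}
```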
There's the so-called 'B Method', which seems very similar to this on the programming language side. I haven't checked the proof language so that may be very different.
I've got a two-parter on my blog giving a glimpse of B's programming language[1] and proof language[2].
Perfect Developer [1] is another one not yet mentioned. You write and verify your "specifications" (program-parts to be verified) using a language called "Perfect" and generate C++, C#, or Java from that (SPARK Ada generator in progress).
For those interested, Java has a mature specification language called JML, with tools similar to those provided by Dafny[1] and many more. In addition to SMT solvers, you can use interactive proofs for obligations that are not automatically discharged[2][3], or use the specifications to generate concolic tests[4].
Unless..... "As you type in your program, the verifier constantly looks over your shoulders and flags any errors." .... it's a feature being highlighted =)
Assuming it's anything like the Emacs mode for Coq, they're probably just using an editor/editor plugin that replaces the operators/keywords with the mathematical symbols.
I had the same argument with the Stanford Pascal Verifier people around 1980. They were using the SAIL PDP-10, with keyboards with extra characters and extra shift keys TOP and META. Assertions were written with mathematical notation but code was written in ASCII. We stuck to ASCII and Pascal operators for Pascal-F.
Today, everybody has Unicode output, and all the math symbols are potentially typeable, but it drives programmers nuts if they have to enter them. If you're going to do a verification system, you have to design it for programmers, not mathematicians, or it won't be used. This is a serious problem with formal methods. The formalism is a means to an end, not an end in itself. The end goal is bug-free code.
Dafny for Rust has potential. One big problem is where to put the assertions inside terse functional notation, such as a word-wrap routine in Rust written as a single iterator chain.
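As a hypothetical stand-in (not the original snippet), a terse Rust chain shows the problem: there is no statement boundary to hang an assertion on until the chain is broken into named steps.

```rust
// One-liner: nowhere to assert the precondition n > 0, or the
// postcondition that the result never exceeds n words.
fn first_words(text: &str, n: usize) -> Vec<&str> {
    text.split_whitespace().take(n).collect()
}

// Broken into steps, the pre- and postconditions have a home.
fn first_words_checked(text: &str, n: usize) -> Vec<&str> {
    assert!(n > 0); // precondition
    let words: Vec<&str> = text.split_whitespace().take(n).collect();
    assert!(words.len() <= n); // postcondition
    words
}

fn main() {
    println!("{:?}", first_words_checked("a b c d", 2)); // ["a", "b"]
}
```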
The screenshot shows what looks like the Emacs mode, which actually just masks the ASCII you type with Unicode characters to make it look nice. In practice you're just writing plain ASCII: https://github.com/houli/verification/blob/master/strings3.d...
[1] https://github.com/John-Nagle/pasv [2] https://github.com/John-Nagle/pasv/blob/master/src/Examples