TLA+: design, model, document, and verify concurrent systems

gralx · on May 3, 2019

I've been learning TLA+ for the past two months, hoping I could use it as a formal language for software design generally and not just for concurrent systems. I'm enamored with the idea of using math formalisms to describe a state machine.

But now that I've gone through Lamport's (very charming) video tutorials at OP's link and studied some of the specifications at

https://github.com/tlaplus/Examples/tree/master/specificatio...

I'm less sure now that I can. I'll keep plugging away, but if anyone here has any suggestions or offers any alternatives, I'm all ears.

pron · on May 3, 2019

I think TLA+ is a great fit for what you want to do. What problems are you facing?

BTW, while state machines are certainly the recommended approach for TLA+ specifications (and for very good reasons), there are others, which may be useful in some cases. For example, here I have a rule-based specification of Tic-Tac-Toe: https://pron.github.io/files/TicTacToe.pdf Instead of describing a state machine, the specification is a conjunction of state machines in a style known as "behavioral programming".

Also, you may benefit from looking at the TLA+ subreddit (https://old.reddit.com/r/tlaplus/) for various posts on TLA+ and formal methods in general, and maybe get some ideas.

sktrdie · on May 3, 2019

I can't believe finally someone mentions behavioral programming here on HN. I'm a huge proponent of it and have written some in the context of React https://lmatteis.github.io/react-behavioral/

How does it compare with TLA+ though? I've tried reading through your document, but I can't seem to understand how the interweaving is done and the blocking.

pron · on May 3, 2019

Behavioral programming is built on top of a paradigm called synchronous programming, developed by David Harel (who later invented behavioral programming), Amir Pnueli and later Gérard Berry and others. The mathematical formalism underlying synchronous programming (in the same way that the lambda calculus underlies functional programming) is temporal logic, introduced into computer science by Pnueli (for which he received the Turing Award). TLA+ is based on a logic called TLA, the Temporal Logic of Actions, which Lamport designed to fix what he saw as some shortcomings in Pnueli's approach. So there's the connection.

Harel and Pnueli wanted a formalism that makes it easy for both humans and machines (formal verification) to understand. Formal verification with model checking is, of course, also central in behavioral programming. I guess you could say that synchronous programming (and so behavioral programming, which is a kind of SP) is related to TLA+ in a similar way to how Haskell is related to, say, Agda.

sktrdie · on May 3, 2019

Interesting I see. I thought BP was built on top of Live Sequence Charts - a visual formalism for describing specs that can also be executed.

Thanks for the info. I'll have to dive a bit more into TLA+. Even though I still don't see how the request/wait/block is implemented in TLA+... I mean you still need an even selection mechanism for it or not?

hwayne · on May 3, 2019

In the TicTacToe spec, we have

    Next2 ==
      /\ turn' = turn + 1            \* A
      /\ current' = Opponent         \* B
      /\ ∃i,j∈1..N:                  \* C
           /\board[i,j] = Empty      \* C.1
           /\ board'[i,j] = current  \* C.2

For `Next2` to be true, the square [i, j] has to be empty AND in the following state, that same square must have whichever mark was `current` in it. Since `current` switches between the two each step of the behavior, this forces each player to wait until their turn comes around.

pron · on May 3, 2019

> I thought BP was built on top of Live Sequence Charts

Yes, but LSC are also based on temporal logic, as were Harel's Statecharts (perhaps the first synchronous programming language).

> I mean you still need an even selection mechanism for it or not?

Not sure what you mean by "even." TLA+ is nondeterministic, and the formulas can serve as rules to restrict that nondeterminism (where simple logical conjunction is used to compose the rules). Also, nothing is really "implemented" in TLA+, as it's not a programming language. It's a formal specification language that describes the behavior of discrete systems.

anewhnaccount2 · on May 4, 2019

I'm pretty sure the (here much maligned) Eve programming language was an example of this. It was posted a few times before its demise. I'm also a fan of the paradigm, particularly Eve's relational/behavioral style.

pron · on May 4, 2019

Not quite behavioral (which is a very specific thing), but Eve was certainly synchronous. It was one of the two languages (the other being Céu) that brought synchronous programming from the niche of safety-critical realtime systems to more mainstream software. IMO, it was the most theoretically-advanced programming language around.

gralx · on May 3, 2019

> I think TLA+ is a great fit for what you want to do.

I'm very happy to hear that.

> What problems are you facing?

Generally, the problem of formally specifying a program in set theory and first-order logic syntax before choosing implementation details, such as the language to program it in.

I have to give more thought to how I would describing specific problems. I've started with a program I want to write that I've already modelled informally. I'll edit this post later to explain it.

pron · on May 3, 2019

First, just to be precise, system dynamics are specified in TLA+ with TLA, which is a linear-time temporal logic (albeit one that seeks to minimize temporal reasoning); set theory is used "only" for the data part.

Second, the whole point of a high-level specification is that it is not dependent on a programming language, and allows you to choose the level of detail that suits you. One of the things that set TLA+ apart is that it allows for very convenient refinements, i.e. you can specify the same system multiple times, at different levels of detail, and show that a more detailed specification indeed implements a more abstract one.

Perhaps looking at examples (which you can find on the subreddit I linked and in the sidebar) will give you a better feel for how TLA+ is used for precisely the goal you have in mind. The Practical TLA+ book[1] (which uses PlusCal) also has good examples.

[1]: https://www.apress.com/gp/book/9781484238288

gralx · on May 3, 2019

Fantastic, pron - thanks for the helpful replies.

I had a look at Practical TLA+ a few weeks back, but because its first example also specifies a concurrent system, and because most of it is written in PlusCal syntax, not in TLA+, I decided to put it off. I'll give it another look on your recommendation. I'll also check out the subreddit you linked to.

The idea of starting with a general high-level specification before adding detail makes a lot of sense.

As for specific problems, I have in mind a program that should be simple to specify. At its heart, it tests whether each numeric value in an input tuple X lies between values at the same index in tuples A and B, so that if

    ∀x ∈ X : a_i ∈ A ≤ x_i ≤ b_i ∈ B

then the program outputs "true", and otherwise the program outputs "false". Because both outputs are allowed,

    True ≜  ∧ ∀x ∈ X : ∧ x_i ≥ a_i ∈ A
                       ∧ x_i ≤ b_i ∈ B
            ∧ Output′ = true

    False ≜ ∧ ∃x ∈ X : ∨ x_i < a_i ∈ A
                       ∨ x_i > b_i ∈ B
            ∧ Output′ = false

covers all states. Some of my syntax is not TLA+, as you can see, so that's the specific problem I'm up against right now.

hwayne · on May 3, 2019

Yo, author of PT here. I'd say that given your use case, the book _probably_ won't be that useful to you. Among other things, I intentionally avoided teaching refinements. You'd probably be happier starting with either _Specifying Systems_[1] or the _Hyperbook_[2], particularly the latter's proof track.

If you still have access to a copy of PT, I'd recommend reading chapter 7 (specifying and implementing algorithms) to see if that helps at all. The code is in PlusCal, but the general approach transfers to pure TLA+.

[1]: https://lamport.azurewebsites.net/tla/book.html [2]: https://lamport.azurewebsites.net/tla/hyperbook.html

EDIT: in terms of writing it as TLA+, you probably want something like

    EXTENDS Sequences
    IndicesBetween(X, A, B) ==
      \A i \in 1..Len(X):
        /\ A[i] <= X[i]
        /\ B[i] >= X[i]

Assuming `Len(X) <= Min(Len(A), Len(B))`.

gralx · on May 3, 2019

Thanks, Hillel! The TLA+ formulation looks good. I was so happy to get your advice that I bought a copy of Practical TlA+ from Apress. Even if it doesn't help with this particular project, I'm sure it will be useful down the line.

Specifying Systems has been my go-to reference alongside Lamport's video tutorial. Thanks for suggesting Chapter 7 of the Hyperbook - I never thought to look there.

EDIT: I misread: Chapter 7 refers to the chapter in Practical TLA+, not to the Hyperbook's proof track.

sethwm · on May 3, 2019

Have you looked at PlusCal [1][2]? It's more natural for many software developers and compiles down to TLA+.

[1] https://en.wikipedia.org/wiki/PlusCal

[2] https://lamport.azurewebsites.net/tla/p-manual.pdf

gralx · on May 3, 2019

Yes, I did. I prefer math/logic syntax, and I often like to work with pen and paper, and those are my main reasons for preferring TLA+ syntax over PlusCal. The author of Practical TLA+ says PlusCal is innately less expressive.

Even so, pron elsewhere recommends reading Practical TLA+ for other reasons, so I think I'll learn PlusCal anyway. I appreciate the suggestion.

pron · on May 3, 2019

I have mixed feelings about PlusCal. I, too, prefer TLA+ (except for low-level concurrent algorithms, or maybe algorithms best specified with long series of sequential steps), but for many programmers PlusCal can feel more familiar (which may be misleading; specification is not programming, and when you specify in TLA+ it also looks different from programming, but PlusCal looks a lot like programming, but it isn't). I think that those who are not threatened by mathematical notation should start with TLA+, and those who are should start with PlusCal. But there is value in the examples of specifications of the book regardless of the notation used. The central principles of specification are always the same.

hwayne · on May 4, 2019

One of these days I gotta write an in-depth discussion of my teaching logic. The reasons I have for starting with PlusCal are fairly complex, and there are also disadvantages to it, too. But I still think that for many engineers, it's easier to learn PlusCal and then TLA+ than to go straight to TLA+.

tonyarkles · on May 4, 2019

Lamport agrees with that as well, from what I recall going through his videos. PlusCal for most engineers, TLA+ for people who like to think in math.

And just another Thank You. Your material helped me on my journey too, even though I ended up favouring TLA+ personally.

raister · on May 3, 2019

Have you tried considering UPPAAL?

gralx · on May 3, 2019

No - first I've heard of it. Just looked at its Wikipedia page. Color me intrigued.... Thank you!

yingw787 · on May 3, 2019

Raymond Hettinger just have a talk at PyCon an hour ago about formal reasoning and solvers. There’s a lot of polish in the talk, with a whole ReadTheDocs site with runnable Python code with solvers written from scratch. The slides aren’t out yet publicly or privately, but I would highly recommend going through the examples when the PyCon committee releases the information. The way he puts it makes it much less intimidating than I thought it would be.

Also Raymond is a very nice guy :D

camel_Snake · on May 3, 2019

PyCon usually uploads talks within a day or two of their presentation. There should be a PyCon 2019 YouTube channel already.

manaskarekar · on May 4, 2019

PyCon 2019 https://www.youtube.com/channel/UCxs2IIVXaEHHA4BtTiWZ2mQ

billfruit · on May 4, 2019

One thing that bothered me while trying to learn the TLA+ is the two different but equivalent forms or syntax for writing it: a mathematical formula/expression form and code/program form. This topic is difficult as it, and having to deal with not one but two equivalent forms of saying one thing is perhaps a bit hard on a learner. I wish if there was a tutorial of TLA+ that ditched the whole mathematical formula notation, and taught only the code form.

pron · on May 4, 2019

They're equivalent in the sense that PlusCal, the pseudo-code-like language compiles to TLA+, the mathematical notation. The book Practical TLA+ teaches pretty much only PlusCal. Eventually, though, to get the full power, even those who start with PlusCal will need to learn at least some TLA+.

bayareanative · on May 4, 2019

Interesting. I've worked concurrent systems in VHDL and Verilog (but not SystemVerilog) stacks, and those using C, C++, Haskell, OCaml, Ruby, Go, Erlang and Rust.

I wonder if this approach could formally verify systems like seL4.

Here's an approach to verifying concurrent systems using coq:

https://www.sciencedirect.com/science/article/pii/S157106610...

pron · on May 4, 2019

The main difference between Coq and TLA+ is that Coq is a research tool, intended and designed for use by researchers, while TLA+ is intended as an industry tool, and designed for use by engineers. They are essentially equivalent in their power when it comes to specifying and verifying software (although Coq is more suitable for proving general mathematics), but Coq is used almost exclusively by researchers. The learning curve is very different (one could get productive to the point they specify real nontrivial systems after learning TLA+ within a few days or a couple of weeks, while with Coq this takes many months), and TLA+ has much more automation.

hwayne · on May 4, 2019

You might be interested in TLAPS, the TLA+ proof system: https://tla.msr-inria.inria.fr/tlaps/content/Home.html

There's also this essay on proving TLA+ specifications in Isabelle: https://davecturner.github.io/2018/02/12/tla-in-isabelle.htm...

hcnews · on May 3, 2019

This keeps getting re-shared periodically on HN. I wonder what's HN policy around re-sharing same/similar content.

Tomte · on May 4, 2019

If it did not get "attention" (which I interpret as 10+ karma, but that's my interpretation), it may be resubmitted.

If the last "attention" is older than a year, resubmitting is also okay.

The first part is in the FAQ, the second part has been said by the mods multiple times.

lolptdr · on May 4, 2019

I searched and found several instances of a similar link, but HN doesn't stop me from posting it. I'm guessing there's some decay algorithm or it's not checking sub-domains or deep-links.