Adding design-by-contract conditions to C++ via a GCC plugin

josephg · on Jan 1, 2023

Using invariants like this is now my preferred programming style. Almost all of my complex container classes have a dbg_check() method which checks all the internal invariants hold, and panics if not. When testing and debugging, I’ll add calls to dbg_check() after each mutation of my data structure. And then I write a fuzzer which exercises my API in a loop. Each iteration, I check that my invariants still hold.

The nice thing about this sort of invariant checking is that it makes it very fast and easy to narrow in on bugs. The crash happens right after a change which made the data invalid, not later when the invalid data is accessed.

And a few dozen lines of fuzz testing can find an extraordinary quantity of subtle bugs. It’s remarkable. Devastating for the ego, but remarkable.

Edit:

Simple dbg_check implementation example: https://github.com/josephg/diamond-types/blob/3eb48478fd879e...

And here's a simple fuzz tester: https://github.com/josephg/jumprope-rs/blob/318e87d3aae1b2a0...
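
For readers who would rather see the shape of this in C++ (the language of the linked post) than follow the Rust links, here is a minimal sketch. It is not the code from either repository: `SortedSet`, `dbg_check()` and `fuzz_once()` are hypothetical names, and the container is deliberately trivial. The fuzzer takes a seed so a failing run can be replayed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <random>
#include <set>
#include <vector>

// Hypothetical container: a set of ints stored as a sorted vector.
class SortedSet {
public:
    bool insert(int value) {
        auto it = std::lower_bound(items_.begin(), items_.end(), value);
        if (it != items_.end() && *it == value) return false;  // already present
        items_.insert(it, value);
        return true;
    }

    bool erase(int value) {
        auto it = std::lower_bound(items_.begin(), items_.end(), value);
        if (it == items_.end() || *it != value) return false;  // not present
        items_.erase(it);
        return true;
    }

    std::size_t size() const { return items_.size(); }

    // Invariant: elements are strictly increasing (sorted, no duplicates).
    // Aborts immediately if the invariant is broken.
    void dbg_check() const {
        for (std::size_t i = 1; i < items_.size(); ++i) {
            if (!(items_[i - 1] < items_[i])) std::abort();
        }
    }

private:
    std::vector<int> items_;
};

// Applies `steps` random operations, checking invariants after each mutation
// and comparing against std::set as a reference model.
// Returns false on the first disagreement with the model.
bool fuzz_once(unsigned seed, int steps) {
    std::mt19937 rng(seed);  // seeded, so any failure can be reproduced
    std::uniform_int_distribution<int> value_dist(0, 31);
    std::uniform_int_distribution<int> op_dist(0, 1);
    SortedSet sut;
    std::set<int> model;
    for (int i = 0; i < steps; ++i) {
        const int v = value_dist(rng);
        if (op_dist(rng) == 0) {
            if (sut.insert(v) != model.insert(v).second) return false;
        } else {
            if (sut.erase(v) != (model.erase(v) == 1)) return false;
        }
        sut.dbg_check();
        if (sut.size() != model.size()) return false;
    }
    return true;
}
```

Running `fuzz_once(seed, 1000)` over a range of seeds is the whole test. The small value range (0 to 31) is a deliberate choice so that inserts and erases collide often.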

jandrewrogers · on Jan 2, 2023

I do something very similar, and as you say it is highly effective. I don't recall ever finding a bug later in code that was properly tested this way. It is not an inexpensive type of testing when done well but I think it is a pragmatic way of producing code that can approach the robustness of formal verification in practice with a lot less work, so the ROI is high.

thesuperbigfrog · on Jan 2, 2023

>> I think it is a pragmatic way of producing code that can approach the robustness of formal verification in practice with a lot less work, so the ROI is high.

This is correct.

Design by Contract techniques help to ensure that code is robust and works as expected.

It is good that C++ is getting these tools through compiler plug-ins.

The Ada programming language added support for Design by Contract in Ada 2012:

https://learn.adacore.com/courses/intro-to-ada/chapters/cont...

The same language syntax used to define preconditions, postconditions, and invariants in Ada paved the way for SPARK to prove program correctness:

https://learn.adacore.com/courses/intro-to-spark/chapters/03...

Hopefully Design by Contract can be included in the C++ standard itself in the future.

pjmlp · on Jan 2, 2023

It almost made it into C++20 and was dropped at the last minute. I doubt it ever gets that close again unless everyone agrees, and even then it is at least a decade away.

josephg · on Jan 2, 2023

> It is not an inexpensive type of testing when done well

I find the ROI is much higher than most forms of unit testing. Fewer lines of testing code finds more bugs. And with invariant checking and a seeded random number generator, you can reproduce any failures and (usually) find the bug pretty quickly. They do take more brain cells per line of testing code though, especially at first.

This sort of testing also addresses my biggest frustration with unit testing. Unit tests have diminishing returns as you add more tests. Normally by the time I have "enough" unit tests, it becomes exhausting to refactor my code because of the giant pile of tests I need to rewrite. Fuzz tests are much easier to update.

(That said, fuzz testing can't replace unit tests entirely. Especially when you have a lot of methods in your API).

_a_a_a_ · on Jan 2, 2023

Good as DBC/assertions are, it absolutely does not "approach the robustness of formal verification in practice" especially for potentially large states.

Also DBC/assertions are of little value unless you provide test code to exercise that asserted section, so that's more work, unless you're ok with the assertion triggering by the user after you've shipped it.

Also comprehensively asserted code (with assertions switched on / compiled in) can absolutely crawl because preconditions that formal verification ensures have to be repeatedly checked at runtime - that's your assertion being run ten thousand times instead of being known-true in the code.

It is very good, I use it extensively, but it is not even comparable to formal proofs. Nowhere near.

pjmlp · on Jan 2, 2023

Besides the usual sources like Eiffel, one Microsoft book actually did it for me, showing off the various uses of it on their products.

"Code Complete: A Practical Handbook of Software Construction"

So for those on Windows systems, a good way to do DBC for the last 20 years has been the various ASSERT_ variants, especially for MFC,

https://learn.microsoft.com/en-us/visualstudio/debugger/c-cp...

And the SAL annotations for security invariants,

https://learn.microsoft.com/en-us/cpp/code-quality/understan...

gavinray · on Jan 2, 2023

Do you use the Cargo "contracts" crate for Design-by-Contract style invariants, the one that plugs into Facebook's MIRAI prover thing?

I always thought this was super neat:

https://crates.io/crates/contracts

https://github.com/facebookexperimental/MIRAI/blob/main/exam...

  [dependencies]
  mirai-annotations = ...
  contracts = {version = "XXX", features = ["mirai_assertions"]}

josephg · on Jan 2, 2023

No I haven't seen that!

On its own, I don't see much point in "contracts" annotation-style syntax - it looks like a complex solution to a problem I don't have. What's so wrong with assert!() that we need to invent syntax?

MIRAI looks wild though. Thanks for sharing!

intelVISA · on Jan 2, 2023

That's a really solid approach, as much as OO architecture in excess (Java) is grim I think it has a lot of merit for testing as hopefully your code is already organized to benefit from a lil fuzz.

Hoping to try the new C++ contracts one day(tm).

throwaway9870 · on Jan 2, 2023

"Writing Solid Code" pushes this type of checking and I have been using it for at least 20 years with similar effectiveness as you. I don't understand why such a simple and pragmatic method is so overlooked.

_a_a_a_ · on Jan 2, 2023

I can only agree, and Hoare's work on assertions goes back to the 80's (or even earlier?). We're in a bad place as devs when the bar is so pathetically low yet we still choose to crawl beneath it.

gavinray · on Jan 2, 2023

Invariants + property-tests/fuzz-tests (anything that does random generation/permutation) is a nuclear weapon

Great stuff, thank you for sharing =D

SoylentYellow · on Jan 1, 2023

Do you have an example of this you can share?

josephg · on Jan 2, 2023

Updated the comment.

Animats · on Jan 1, 2023

I thought "contracts" were dropped out of C++20.[1]

Entry and exit assertions, and invariants, are powerful, especially when coupled with a proof of correctness system. But they're a tough retrofit.

Invariants have some issues. The general idea is that the invariant has to be true when control is outside the object. This was once the core idea of the object concept, although it's been somewhat forgotten. A big issue is, when does control enter and exit the object? What if you call a public member function from inside the object? Did you re-enter? What if you call out of an object to something that calls back in? What about recursion? What if another thread enters? Objects need clarity on the inside/outside issue for invariants to work. This is quite possible but a tough retrofit.

Invariants need syntax for talking about arrays and parts thereof. You need quantifiers, or something like them. Lambdas?

From back when I did this sort of thing, decades ago, a simple SAT solver can eliminate the need to check over 90% of assertions at run time. So you want something like that, rather than trying to check everything at run time. Otherwise, nobody will keep the checks turned on.

All this is quite do-able. Most attempts to do it have suffered from academic overreach - the technology is pushed by people in love with formal methods, and the result is too complicated for routine use.

[1] https://www.reddit.com/r/cpp/comments/cmk7ek/what_happened_t...

gavinray · on Jan 1, 2023

  > I thought "contracts" were dropped out of C++20.[1]

You're right, but (I'm not an authority, someone from /r/cpp would be better to speak here probably) I think after the Kona meeting recently they were back on the charter for next iteration

Specifically, GCC 13 has had support for Contracts in main since November:

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ea63396f6b08f8...

  > A big issue is, when does control enter and exit the object? What if you call a public member function from inside the object? Did you re-enter?

If you check the generated code from the end of the blogpost (see the bit that shows the Ghidra disassembly) or the notes section, the pattern used in the plugin is:

   T function() {
     check_invariants(); // unless we're in constructor
     result = function_body();
     check_invariants(); // unless we're in a destructor
     return result;
   }

This does mean that a single member call can cause multiple cascading calls to check_invariants()

If "this.foo()" calls "this.bar()" which calls "this.baz()", etc.

But it's not avoidable, because in this open-world assumption we assume any member call can potentially mutate the state of the member.
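
To make the cascade concrete, here is a hand-written sketch of the same wrap-the-body pattern in plain C++. This is not the plugin's generated code: `Counter`, `checked()` and `checks_run()` are hypothetical names, and the constructor/destructor exceptions from the pseudocode are left out.

```cpp
#include <cassert>

class Counter {
private:
    // Counts how many times the invariant check ran, so the cascade is visible.
    void check_invariants() {
        ++checks_run_;
        assert(value_ >= 0);
    }

    // Wraps a member function body with a check before and after it.
    template <typename Body>
    auto checked(Body body) -> decltype(body()) {
        check_invariants();
        auto result = body();
        check_invariants();
        return result;
    }

    int value_ = 0;
    int checks_run_ = 0;

public:
    // foo() calls bar(), which calls baz(); each one is wrapped independently.
    int foo() { return checked([this] { return bar() + 1; }); }
    int bar() { return checked([this] { return baz() + 1; }); }
    int baz() { return checked([this] { return ++value_; }); }

    int checks_run() const { return checks_run_; }
};
```

A single call to `foo()` on a fresh `Counter` runs the check six times: two per nested member call.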

scatters · on Jan 2, 2023

If you're calling back into user code, you need to reestablish invariants first.

As for threading, C++ distinguishes mutating from non mutating (const) methods. Usually, any number of threads can call const methods simultaneously, but only one thread can have access if mutating methods are called. This may require external synchronisation.

The optimizer is often capable of inferring when invariants are maintained. One controversy is whether it should be permitted to use contracted invariants as axiomatic.

nly · on Jan 2, 2023

The compiler can't reliably make assumptions about whether const methods in C++ cause mutation, because both const_cast and aliasing are permitted.
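
A small sketch of both escape hatches, using a hypothetical `Stack` class. Every method below is `const`, yet two of them change observable state.

```cpp
#include <cassert>

class Stack {
public:
    explicit Stack(int* shared_count) : shared_count_(shared_count) {}

    int size() const { return top_; }

    // Aliasing: a const method cannot reseat the pointer member,
    // but it can still write through it.
    void touch() const { ++*shared_count_; }

    // const_cast: a const method can strip const from `this`.
    // Writing is only defined behaviour when the object itself
    // was not declared const.
    void corrupt() const { const_cast<Stack*>(this)->top_ = -1; }

    bool invariant_holds() const { return top_ >= 0; }

private:
    int top_ = 0;
    int* shared_count_;
};
```

Calling `touch()` and then `corrupt()` through a `const Stack&` that refers to a non-const object changes the shared counter and breaks the invariant, which is why "const means the invariant is untouched" cannot be assumed in general.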

layer8 · on Jan 2, 2023

> What if you call a public member function from inside the object? Did you re-enter?

It's best to program as if every call to a public function is from client code (i.e. has the same pre/postconditions/invariants). If you need different semantics, it's straightforward to forward to a private or protected function to distinguish the two cases. This is similar to the Non-Virtual Interface (NVI) pattern.
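
One way that split might look, as a sketch with hypothetical names (`Ledger`, `deposit_unchecked`): public functions check the invariant on entry and exit, and internal work goes through a private function that does not.

```cpp
#include <cassert>
#include <vector>

class Ledger {
public:
    // Public entry points: every call is treated as coming from client code.
    void deposit(int amount) {
        check_invariants();
        deposit_unchecked(amount);
        check_invariants();
    }

    void deposit_all(const std::vector<int>& amounts) {
        check_invariants();
        // Internal work uses the private function, so the object is never
        // "re-entered" through its public interface mid-operation.
        for (int a : amounts) deposit_unchecked(a);
        check_invariants();
    }

    int balance() const { return balance_; }
    int checks_run() const { return checks_run_; }

private:
    void deposit_unchecked(int amount) {
        entries_.push_back(amount);
        balance_ += amount;
    }

    // Invariant: the cached balance equals the sum of all entries.
    void check_invariants() const {
        ++checks_run_;
        int sum = 0;
        for (int e : entries_) sum += e;
        assert(sum == balance_);
        (void)sum;  // avoids an unused-variable warning when NDEBUG is defined
    }

    std::vector<int> entries_;
    int balance_ = 0;
    mutable int checks_run_ = 0;
};
```

With this layout `deposit_all` costs two checks regardless of how many amounts it processes, instead of two per element.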

Animats · on Jan 2, 2023

When did you leave the object? There should be a check of the invariant as control leaves the object. At what moment does this happen?

The F# people struggled with this. In proof systems, you check invariants at exit, and get to assume them true at entrance. So you need to be very clear about entrance and exit.

layer8 · on Jan 2, 2023

Whenever you return from a public member function. You might wish for more flexibility, but it's a simple rule that's not difficult to work with.

Karliss · on Jan 2, 2023

> I thought "contracts" were dropped out of C++20.

C++ compilers often implement C++ features which haven't been fully standardized yet, either to prepare for stuff that is very likely to be in the next version, or as experimental features so that people can try them out, find potential issues with the proposal, and come to a conclusion about what exactly should be standardized.

Contracts were not dropped from C++20 because people didn't want the feature, but at least partially because people had very different opinions about what exactly they should do (runtime checks, documentation for library users, optimization annotations giving permission for UB if input requirements aren't satisfied, hints for optional static analyzers).

apankrat · on Jan 1, 2023

Tangentially related -

Few years ago started using asserts as an ad-hoc documentation mechanism for invariants and then also started shipping them in production builds. When triggered, asserts grab the stack trace, write it into the application log and give an option of sending a bug report. And then they shut down the program.

Was scary at first, but the initial pain is absolutely worth it.

This flushed out hundreds of absolutely crazy edge cases, improved code quality and stability tremendously. It also forced writing cleaner code to begin with and sped up debugging while in development. Now have about 3K asserts in 250 KLoC code base. Can't recommend this practice strongly enough.
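
A rough sketch of what an always-on assert could look like in C++. This is an assumption about the general shape, not the commenter's code: `ALWAYS_ASSERT`, `FailureHandler` and `failure_handler()` are hypothetical names, and stack-trace capture and bug-report submission are left as a comment because they are platform specific.

```cpp
#include <cstdio>
#include <cstdlib>

// Signature of the function called when an assertion fails.
using FailureHandler = void (*)(const char* expr, const char* file, int line);

// Default behaviour: log the failed expression and shut the program down.
inline void default_failure_handler(const char* expr, const char* file, int line) {
    std::fprintf(stderr, "assertion failed: %s (%s:%d)\n", expr, file, line);
    // A production handler would capture a stack trace and offer to send
    // a bug report here, before terminating.
    std::abort();
}

// The active handler; replaceable so tests can observe failures.
inline FailureHandler& failure_handler() {
    static FailureHandler handler = default_failure_handler;
    return handler;
}

// Unlike assert(), this is not compiled out when NDEBUG is defined.
#define ALWAYS_ASSERT(cond)                                         \
    do {                                                            \
        if (!(cond)) failure_handler()(#cond, __FILE__, __LINE__);  \
    } while (0)
```

The handler is swappable so that tests can replace the abort with something observable. In a shipped build the default handler stays in place.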

civopsec · on Jan 2, 2023

I looked at this thread in the morning. Then I waited the whole workday to get home and read anecdotes like yours.

This is seriously cool. I never understood why people don’t talk more about this style of programming.

TazeTSchnitzel · on Jan 2, 2023

I believe Rust had some form of contract enforcement in its early days, but they eventually removed it because it had too many problems. I wish I could find the post about it, but Google is useless these days.

mihaigalos · on Jan 2, 2023

The contracts crate [1] supports this via macros.

I actually prefer the invariant section above the definition, but do admit the pre and post in C++ look better.

[1] https://docs.rs/contracts/latest/contracts/

kaashif · on Jan 1, 2023

I think this is very interesting:

> The working code for insertion of the check_invariants statements was a stroke of dumb luck after 15 hours, from one of the suggestions given by Github Copilot.

GitHub Copilot coming in handy when documentation is lacking? Will widespread use of tools like this reduce the incentive, over time, to write good documentation? Something to think about.

gavinray · on Jan 2, 2023

I prototyped much of this code using ChatGPT and Copilot. I won't go so far as to say, "I couldn't have done it without you, ma!", but it certainly would have taken me much longer.

In terms of learning the actual API, they were worth their weight in gold.

One trick to use is that Copilot can read up to (apparently) 20 other tabs that you have open. So, I will open the source code containing the API methods I need in my other tabs and close everything else.

  > The working code for insertion of the check_invariants statements was a stroke of dumb luck after 15 hours, from one of the suggestions given by Github Copilot.

What is funny here is that it suggested something radically different. I was going down the rabbit hole I chronicled here:

https://stackoverflow.com/questions/74964153/gcc-gimple-c-ap...

One strategy I use in unfamiliar territory is to repeatedly generate the set of 10 Copilot suggestions and see if anything sticks out. I had gone through maybe 3-4 rounds when it spits out one that looks nothing like what I currently have.

I try that suggestion, and it works. I shake my head. What the fuck. Alright, I'll take it.

It wanted to do a regular C-style call that had no member namespace, which by sheer luck happens to work due to how C++ lookup works.

sgerenser · on Jan 2, 2023

> One trick to use is that Copilot can read up to (apparently) 20 other tabs that you have open. So, I will open the source code containing the API methods I need in my other tabs and close everything else.

:shocked pikachu: This can’t possibly be true? How would it even do that, and unless it was off by default, that seems like a huge privacy problem. Edit: never mind, I was reading tabs as “browser tabs” not “editor tabs.”

WalterBright · on Jan 2, 2023

I added dbc to C++ back in the early 90s.

https://www.digitalmars.com/ctg/contract.html

gavinray · on Jan 2, 2023

The inspiration was taken directly from D, I credit as much in the README =)

https://github.com/GavinRay97/gcc-invariant-plugin#gcc-desig...

  > Inspired wholly by the D Programming Language's invariant feature.

How I wish D had caught on as the popular language rather than C++.

badtuple · on Jan 2, 2023

Does anyone have an example of a codebase that uses Design By Contract to an extreme extent? You really only get a sense of the power of patterns like this when you see that power abused. I've greatly enjoyed writing 2 very small codebases in that style...but that doesn't mean I did it well or that it'd hold up after years of maintenance from a revolving door of developers.

kgeist · on Jan 1, 2023

What's the advantage of invariants over unit testing? Seems like there must be lot of overhead at runtime.

>I was ending up with garbage, and not realizing it until I had visualized it with Graphviz!

>Imagine if we had invariants that we could assert after every property change to the tree.

Have you tried writing unit tests? What didn't work about them that you decided to try invariants?

Someone · on Jan 1, 2023

Not the writer, but it seems obvious to me. It’s a luxury version of assert. You leave them enabled in debug builds to get better bug reports from testers.

Those testers may hit cases you forgot to write unit tests for.

Of course, you can also forget to write invariants, or write invariants that are less tight than they should be, but I think it often is easier to write invariants than to write exhaustive unit tests.

Firstly, writing “p should always be a prime” is clearer than writing “if p is a prime, and you call f, p should be a prime afterwards”, and secondly, invariants can apply to multiple methods that you, otherwise, would have to write separate tests for (“foo keeps p a prime”, “bar keeps p a prime”, “bar keeps p a prime when called with a null argument”, “baz keeps p a prime if it throws an exception”, etc)

Also, invariants, IMO, can be way better documentation than unit tests.

Finally, invariants leave open the possibility of using a theorem prover to (dis)prove that they hold.

nimish · on Jan 1, 2023

They are declarative vs imperative and sufficiently smart tooling exists such that they can be checked and enforced statically, see Liquid Haskell and "refinement" typing for an example. Formal verification starts to become a possibility -- enforcing both conformance and documentation.

Tests are fine too but more tedious work.

ThePadawan · on Jan 2, 2023

> What's the advantage of invariants over unit testing?

I studied Eiffel in university under Bertrand Meyer, and here's his (probably unique) point of view.

If someone hands you their library code and their unit tests, you have to understand their unit tests - which involves understanding why they chose the values that they did. There's also nothing stopping someone from testing the internals of their library as opposed to the publicly exposed behavior.

With contracts, imagine that to understand their library, you only see the method declarations with accompanying contracts. You don't see how HashSet.Add(x) is implemented, but you do see "if old(this.Contains(x)) then this.Count == old(this.Count)".

You don't have to see a test where this is tested with 5 example values, on an empty set, on a large set, whatever. You can rid yourself of that cognitive load by thinking, as sibling points out really well, declaratively over imperatively.
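
C++ has no `old` keyword, so the closest hand-written equivalent of the HashSet.Add postcondition quoted above is to capture the old values before mutating. A sketch, with `IntSet` as a hypothetical stand-in for the HashSet:

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

class IntSet {
public:
    void add(int x) {
        // Capture the "old" state before the mutation.
        const bool old_contains = contains(x);
        const std::size_t old_count = count();

        items_.insert(x);

        // Postconditions: x is present afterwards, and the count is unchanged
        // if x was already present, otherwise it grew by exactly one.
        assert(contains(x));
        assert(old_contains ? count() == old_count
                            : count() == old_count + 1);
        (void)old_contains;  // avoid unused-variable warnings under NDEBUG
        (void)old_count;
    }

    bool contains(int x) const { return items_.count(x) != 0; }
    std::size_t count() const { return items_.size(); }

private:
    std::unordered_set<int> items_;
};
```

The difference from Eiffel is that here the contract sits inside the implementation, whereas the point being made above is that contracts belong next to the declaration where a library user can read them.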

westurner · on Jan 1, 2023

icontract is one implementation of Design by Contract for Python; which is also like Eiffel, which is considered ~the origin of DbC. icontract is fancier than compile-time macros can be. In addition to Invariant checking at runtime, icontract supports inheritance-aware runtime preconditions and postconditions to for example check types and value constraints. Here are the icontract Usage docs: https://icontract.readthedocs.io/en/latest/usage.html#invari...

For unit testing, there's icontract-hypothesis; with the Preconditions and Postconditions delineated by e.g. decorators, it's possible to generate many of the fuzz tests from the additional Design by Contract structure of the source.

From https://github.com/mristin/icontract-hypothesis :

> icontract-hypothesis combines design-by-contract with automatic testing.

> It is an integration between icontract library for design-by-contract and Hypothesis library for property-based testing.

> The result is a powerful combination that allows you to automatically test your code. Instead of writing manually the Hypothesis search strategies for a function, icontract-hypothesis infers them based on the function’s [sic] precondition

charcircuit · on Jan 1, 2023

This isn't a replacement for unit tests. This is an alternate syntax to using asserts. You would still want unit tests to see if those asserts are being violated.

balfebs · on Jan 1, 2023

The post assertions do look very similar to a unit test, but the pre assertions seem really useful; it can sometimes be difficult to know every code path that leads to your function, and though tools exist for this, assertions on inputs help you catch errors arising from unusual conditions.

This seems like it’s mostly syntactic sugar for assertions, keeping them at the interfaces of the function (in and out).

It can also be sometimes useful to have these conditions right there alongside the implementation and not just somewhere else in your unit tests.

josephcsible · on Jan 1, 2023

Because this leads the way to being able to verify that your functions will act as expected in all cases, rather than just for the ones that you thought about when you were writing unit tests.

jjice · on Jan 1, 2023

I believe I've read about some languages with invariants using them at compile time to verify the value meets the invariant, if possible. For example, let's say a function's invariant guarantees it returns an even integer, and then we pass that into a function that only accepts odd negative integers, it could catch that during compile time. To me, that's the coolest case for invariants.

Again, I believe that this is a PL research topic, but I'm not super well versed in it, so take that with a grain of salt.

henrikeh · on Jan 2, 2023

Ada / SPARK can do that. Ada has had some form of that analysis since the first version from 1983.

https://learn.adacore.com/courses/intro-to-spark/chapters/03...

JonChesterfield · on Jan 2, 2023

Requires dependent types. And yeah, research is probably a fair characterisation.

pstrateman · on Jan 1, 2023

It's more likely that they'll get updated correctly when the surrounding code is changed.

Unit testing requires someone to enforce they get updated.

loeg · on Jan 1, 2023

The syntax is cool, but you can always check and assert invariants and contract semantics without the special syntax.

ghotli · on Jan 1, 2023

Invariant vs unit tests aside. Doesn't this being a gcc plugin make it inherently unattractive? Wouldn't want to have to rip that code out if I wanted the code to behave on a non-gcc toolchain.

Otherwise, seems cool if that doesn't matter / will never matter in the lifetime of the codebase

gavinray · on Jan 1, 2023

  > Doesn't this being a gcc plugin make it inherently unattractive?

I'm the author, and even I think so. I'm more of an LLVM fan myself (though I can't not mention David Malcolm's work on the GCC Static Analyzer).

(Ideally it wouldn't be a plugin at all, it'd be a language feature. We got Contracts and left out the most useful contract of them all)

Originally, I started it as a Clang plugin, thinking that I could also implement support for the Contracts "[[pre]]" and "[[post]]" specification on top (or at least some minimal implementation of it).

This turned out to be a lot more work.

If people would like to use this from Clang, even without support for regular Contracts, I will publish a compatible Clang plugin.

I think at some point there was support for Contracts in Clang, maybe longer term I'll try to get them working again? (I've no experience here)

https://github.com/arcosuc3m/clang-contracts

This fellow wrote a whole ~200 page thesis on this just as recently as 2018, such a shame for it to go to waste =/

https://e-archivo.uc3m.es/bitstream/handle/10016/29231/TFG_J...

msla · on Jan 1, 2023

> Wouldn't want to have to rip that code out if I wanted the code to behave on a non-gcc toolchain.

I can only imagine some companies would literally never have the problem of switching to a new compiler toolchain.

balfebs · on Jan 1, 2023

That is a concern, but this makes invariants something the compiler can reason about more easily, since they happen at function boundaries and are distinct from regular code, unlike “vanilla” assertions.

yjftsjthsd-h · on Jan 1, 2023

> Wouldn't want to have to rip that code out if I wanted the code to behave on a non-gcc toolchain.

I wonder if it can be wrapped in #ifdef ?

gavinray · on Jan 2, 2023

Yeah sure, there's nothing preventing you from using "constexpr" so it's compile-time evaluated and inlined:

(Please no more macros, it's 202- I mean, 2023!)

  static constexpr bool DEBUG = true;

  [[demo::invariant]] [[gnu::used]]
  void check_invariants()
  {
    if constexpr (!DEBUG) return;

    assert(top >= 0 && top <= MAX_SIZE);
  }

celrod · on Jan 2, 2023

Note that `assert`s are disabled if you define the macro `NDEBUG`, e.g. https://godbolt.org/z/hMWo8KM7q

CMake defines the macro in release builds: https://github.com/Kitware/CMake/blob/e1eacbe2c522a8bf9a82af...

Would be nice to have a non-macro solution for controlling behavior at configure time, but the `NDEBUG` macro is basically already your `DEBUG` constexpr.
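
One way to bridge the two, sketched under the assumption that the build already sets `NDEBUG` for release: derive a `constexpr` flag from the macro once, then branch on it with `if constexpr` everywhere else. `kChecksEnabled` and this `Stack` are hypothetical names. It needs C++17.

```cpp
#include <cassert>

// Translate the NDEBUG macro into a constant exactly once.
#ifdef NDEBUG
constexpr bool kChecksEnabled = false;
#else
constexpr bool kChecksEnabled = true;
#endif

struct Stack {
    static constexpr int MAX_SIZE = 8;
    int top = 0;

    void check_invariants() const {
        // The branch is resolved at compile time. When checks are disabled
        // the body is discarded, so expensive checks placed here cost nothing.
        if constexpr (kChecksEnabled) {
            assert(top >= 0 && top <= MAX_SIZE);
        }
    }

    bool push() {
        if (top == MAX_SIZE) return false;
        ++top;
        check_invariants();
        return true;
    }
};
```

The macro still exists, but it is confined to one place instead of being tested at every call site.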

loeg · on Jan 2, 2023

Better to define a macro that evaluates to the contract/invariants on GCC and empty on other compilers, rather than copying the ifdef/else/endif logic around to every usage.
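
A sketch of that single macro. `CLASS_INVARIANT` and `USE_INVARIANT_PLUGIN` are hypothetical names: the second is assumed to be defined by the build (for example with `-D`) only when GCC is run with the plugin loaded, so that GCC builds without the plugin do not see an unknown attribute.

```cpp
#include <cassert>

// Expands to the plugin's attributes on GCC with the plugin enabled,
// and to nothing on every other toolchain.
#if defined(__GNUC__) && !defined(__clang__) && defined(USE_INVARIANT_PLUGIN)
#define CLASS_INVARIANT [[demo::invariant]] [[gnu::used]]
#else
#define CLASS_INVARIANT
#endif

class Stack {
public:
    static constexpr int MAX_SIZE = 4;

    bool push(int value) {
        if (top == MAX_SIZE) return false;
        data[top++] = value;
        return true;
    }

    int size() const { return top; }

    // With the plugin this is called automatically around member functions;
    // without it, this is an ordinary function that must be called by hand.
    CLASS_INVARIANT
    void check_invariants() { assert(top >= 0 && top <= MAX_SIZE); }

private:
    int data[MAX_SIZE] = {};
    int top = 0;
};
```

The code then compiles unchanged elsewhere, at the price of the invariant only being checked automatically on the GCC build.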

loeg · on Jan 2, 2023

It's a reasonable concern, but you could wrap it in macros that selectively enable it only on GCC.

MuffinFlavored · on Jan 1, 2023

The B-Tree invariant example in the unit test seems like a bunch of assertions. Feels very similar to unit tests? Am I missing something?

Assert at runtime or assert at unit test time, it's assertions all the way down? Not sure the benefit of some weird @decorator-like syntax to achieve 5-10 lines of assertions?

gavinray · on Jan 1, 2023

Contracts and Invariants don't replace unit tests, they're complementary

This is the big mix up with DbC in general I think. Think of it more like sanity-tests and assertions that help ensure your program isn't in some haywire state.

A good example with the BTree is that the invariants ensure the properties are enforced at runtime, but you still need tests to exercise those properties.

These sorts of invariants mix really well with property-style tests:

You write some declarative specification for a set of boundaries on inputs/outputs, and then subject your classes to the generated inputs. If your tests pass, then all your invariants have held and you know at the bare minimum, the data structures are doing what you intend them to do.

Above that you can do load-testing and subject the invariants to scale-factors.

MuffinFlavored · on Jan 2, 2023

andreareina · on Jan 2, 2023

Design by Contract

renox · on Jan 1, 2023

If I remember well(), the idea behind contracts is that they can be enabled at runtime, so very different from UT.
Also different from assertions in that 'class' invariants apply to every public method of the class.

: All I know about 'Design by Contract' I learned it in the (very good IMHO) book Object Oriented Software Construction by Bertrand Meyer (he made the Eiffel language).

unsafecast · on Jan 2, 2023

You missed a backslash before your asterisks :)

vlovich123 · on Jan 1, 2023

Very similar but I suspect most people don’t use asserts properly so this is more of an opinionated approach that makes it more difficult to misuse.

skocznymroczny · on Jan 2, 2023

D has had support for pre/post contracts and invariants for a long time, but I haven't seen them used in many projects. I think it's one of those niche uses, or something that sounds like it'd be fun to have but in the end ends up being annoying to maintain.

_a_a_a_ · on Jan 2, 2023

Irritating they are, but utterly, utterly worth it if you care about quality.

kokonoko · on Jan 1, 2023

Creating a separate gcc plugin for this seems unnecessary if I am not misunderstanding something. Just ensure the invariant function is called at every method, in debug builds maybe. You could of course change the object properties without using a method but then you have bigger problems.

Volt · on Jan 1, 2023

> Just ensure the invariant function is called at every method

You make this sound way easier than I would expect it to be in the general case.

gavinray · on Jan 2, 2023

They're suggesting you write the output of the codegen shown in the image here:

https://gavinray97.github.io/static/images/ghidra-check-inva...

Yeah you totally could, and before writing this plugin it's what I was doing, haha. As there is a human involved, it is very error prone though, and boy is it tedious!

loeg · on Jan 2, 2023

You add something like:

  DCHECK(myInvariants());

to your method bodies. Sure, it makes it possible to forget, but it's not that hard to use.

int_19h · on Jan 2, 2023

A big part of DbC is that contracts are part of the public interface, not the implementation. This becomes especially important once you mesh them with Simula-style OOP, as derived classes need a way for overridden methods to widen or narrow the inherited contract.

rurban · on Jan 2, 2023

Such assertions are nice, but easier done with normal assert(). No need for a plugin.

Much better is formal verification though, which I do on all my containers. cbmc is your friend.

zelphirkalt · on Jan 2, 2023

In other languages adding something like this is possible from within the language in basically any syntax you want. No need for some plugin on another layer like GCC.