Potted Programming Language #2: B
In the previous post in this series I talked about Ada. Having hopefully whetted your appetite for safety-critical programming, this time I’m going to squeeze a little further down the rabbit hole and talk about a language called B, which is part of a broader set of tools and techniques known as the B method.
Safety-critical programming may not be a mainstream topic, but you are undoubtedly an end user of safety-critical systems, and would presumably like them to keep working as they should, so that planes don’t fall out of the sky and the software running the world’s stock exchanges doesn’t accidentally wipe out your pension, to name but two of the undesirable consequences when they don’t.
Safety-critical programming could also be a good thing for AI, but more on this after I’ve introduced B…
Genus: B is used for formal specification and refinement. A formal specification is a high-level (usually mathematical, but please don’t run away yet) description of a program, and a refinement is a version of the specification that brings it a bit closer to actual code. The idea is that you start with a high-level specification that captures all the important behaviour of the program in a well-thought-out and consistent way, and then you iteratively refine it down to code, each time using proof checkers to be sure that your refinement is consistent with the specification it came from. Eventually out plops a pile of code that’s guaranteed to behave as the original specification says it should.
Ancestry: B followed on from a formal specification language called Z. Unlike Z, which is full of weird mathematical symbols, B can be typed on a conventional keyboard.
Appearance: Here’s a simple example of a B specification for a program that sums the integers between start and end, generated by Google’s Gemini LLM1:
SET nat = { n | n ∈ INTEGER, n >= 0 }
VAR start, end : nat
INVARIANT start ∈ nat ∧ end ∈ nat ∧ start <= end
OPERATION calculate_sum(start : nat, end : nat) RETURNS INTEGER
PRE start ∈ nat ∧ end ∈ nat ∧ start <= end
THEN
VAR sum, i : nat
INITIALISE sum := 0, i := start
WHILE i <= end DO
sum := sum + i
i := i + 1
END
RETURN sum
END
END
The business logic occurs between the THEN and END statements, and is implemented much as it would be in any language, i.e. it just loops through the numbers between start and end and sums them up (so calculate_sum(3, 5), say, would return 3 + 4 + 5 = 12). Nothing to see so far.
The more interesting, B-specific, parts occur elsewhere. The first notable part is:
SET nat = { n | n ∈ INTEGER, n >= 0 }
One thing to know about B is that for much of the specification and refinement process, it is common to describe and reason about behaviour using mathematical sets, rather than the kind of concrete data structures programmers are used to. Here it introduces a set called nat to represent the natural numbers (which is the posh term for the non-negative integers, i.e. zero upwards) and it uses set builder notation to define what these are. If you’ve not come across this notation before, the part in curly brackets just says that nat is “the set of all numbers which are integers and are greater than or equal to zero”.
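Incidentally, this is also where the earlier point about keyboards comes in. The snippet above uses mathematical symbols, but in the ASCII notation accepted by classical B tools, ‘:’ stands for set membership and ‘&’ for conjunction, so (as a rough sketch, not tied to any particular tool) the same definition might be typed something like this:
nat = { nn | nn : INTEGER & nn >= 0 }
(The doubled-up identifier nn is just a common B convention.) In practice you often wouldn’t need to define this at all, since B comes with built-in sets such as NAT and NATURAL for the natural numbers.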
Later, the set nat gets used in two other lines of the specification:
INVARIANT start ∈ nat ∧ end ∈ nat ∧ start <= end
PRE start ∈ nat ∧ end ∈ nat ∧ start <= end
These parts are known as invariants and preconditions, and they play an important role in proving that the program will remain complete and consistent in its behaviour. Preconditions say what state a program should be in before an operation (such as a function or procedure) can be carried out, and invariants say what must always hold true, both before and after every operation.
In this case, they say the same thing as each other, expressed using set notation and a bit of Boolean logic: start and end both need to be non-negative integers, and start can’t be bigger than end. This means that the calculate_sum operation can’t accidentally be called with negative numbers or with start being larger than end, neither of which would make sense in practice. In a more complex program with multiple operations, the invariant and the preconditions would not be the same: each operation would have its own particular precondition, with the invariant capturing fundamental guarantees that none of the operations is allowed to violate.
Invariants and preconditions are a bit like unit tests on steroids, but unlike unit tests (and software testing in general2) they can be used to prove that a program always satisfies its specification, not just that it behaves well on the cases you happened to test. And this is why formal methods like B are important: they’re the only option we have for demonstrating, beyond a shadow of a doubt, that a program does what its specification says it should.
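To give a feel for how these pieces fit together in practice, here’s a rough sketch (mine rather than Gemini’s, and not tied to any particular tool) of how a specification along these lines might look as a classical B abstract machine, written in ASCII notation. At this level you can state what the result should be directly, as a mathematical sum, and leave the looping to a later refinement:
MACHINE Summing
OPERATIONS
  /* return the sum of the integers from start to end inclusive */
  total <-- calculate_sum(start, end) =
  PRE start : NAT & end : NAT & start <= end
  THEN
    total := SIGMA(ii).(ii : start..end | ii)
  END
END
Here SIGMA is B’s notation for a mathematical summation, and the precondition plays exactly the role described above.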
Pests: With practice, writing specifications and refinements is not as hard as you might think. The more challenging part is proving that each refinement is consistent with the specification before it. Fortunately this can mostly be done using automated proof checkers, but proof checkers can’t check everything3, so sometimes you have to do the proofs by hand, which is a barrier for the majority of programmers.
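To make that slightly more concrete, here’s a sketch (again mine, and simplified) of what an implementation refining the summing machine above might look like. In B, a loop has to carry its own invariant, plus a variant (a quantity that decreases on every pass, guaranteeing termination), and it’s clauses like these that generate the proof obligations the tools then try to discharge:
IMPLEMENTATION Summing_i
REFINES Summing
OPERATIONS
  total <-- calculate_sum(start, end) =
  VAR sum, ii IN
    sum := 0 ;
    ii := start ;
    WHILE ii <= end DO
      sum := sum + ii ;
      ii := ii + 1
    INVARIANT
      /* sum holds the total of everything below ii so far */
      ii : start .. (end + 1) &
      sum = SIGMA(jj).(jj : start .. (ii - 1) | jj)
    VARIANT
      end + 1 - ii
    END ;
    total := sum
  END
END
The obligations here include showing that the loop invariant holds before the first pass, that each pass preserves it, and that when the loop finally exits the value handed back in total matches the mathematical sum promised by the abstract machine.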
Flowering period: B was planted back in the 80s. Whilst not seen on many window ledges (yes, remember the gratuitous plant theme for this series), it’s become a big player in the rail industry. Famously, it was used to formally specify and refine software used in the driverless Paris metro lines, proving that trains wouldn’t unintentionally4 mangle passengers. A similar process was later used for most of the driverless railways around the world. So, next time you don’t get crushed in the vicelike grip of a train’s doors, you can thank the B method.
Okay, that’s B in a very small nutshell… or, erm, plant pot.
All well and good, but why is this relevant? Chances are you’re not in the business of writing safety-critical software, nor have any great desire to be. However, many of you probably do have an interest in machine learning, or AI more generally. This is an area in which solutions often don’t work the way they’re supposed to, and in which the consequences of them not working correctly are becoming increasingly pronounced. And in this regard, perhaps there’s scope for AI practitioners to learn from the decades of experience accumulated in safety-critical programming.
As an example of where existing safety-critical techniques might be applied, consider AI tools and frameworks. These form a part of pretty much all AI/ML implementations, but they’re often open-source projects, developed by teams of programmers working loosely together and not necessarily following well-defined processes. Which is to say, they often contain bugs. Given the increasing importance of these tools in delivering the ML infrastructure of our lives, there’s a case to be made for developing them in a more safety-critical fashion. And the B method could be one approach to doing this, especially as people working in ML/AI are relatively comfortable with maths.
This is all maybe wishful thinking, but in a world increasingly dominated by LLMs and Python, it’s good to be aware that there are more dependable ways of doing software development. However, I’m not going to bang on about this any more, and in the next post in this series I’ll move on to something decidedly less safety-critical.
1. Using an LLM to produce a formal specification is a crazy thing to do and goes against everything safety-critical development stands for, but I’m not going to write it myself ;-P
2. See my previous post Mr Bates vs The Post Office vs Software Engineering for a bit more about the limitations of software testing.
3. This is closely related to the limits of computation, which I talked about in a previous post, Turing machines and the limits of computation.
4. Or intentionally, presumably, unless the penalties for fare dodging are really severe.