Steve Loughran: Build tools as Proof Engines

Someone has put up a thoughtful post on whether you can view make as a proof system. I thought about it and have come up with a conclusion: maybe in the past —but not any more.

As background, it's worth remembering

I did "write the book on Ant", which is a build system in which you declare the transformational functions required to get your application into a state which you consider shippable/deployable, a set of functions which you define in a DAG for the Ant execution engine to generate and order from and then apply to your environment
Early last year, in my spare time, I formally specified the Hadoop FS APIs in Spivey's Z notation, albeit in a Python syntax.
In YARN-913 I went on to specify the desired behaviour of the YARN registry as a TLA+ specification.

And while I don't discuss it much, during my undergraduate work on Formal Specification of and implementation Microprocessors, I was using Gordon's HOL theorem prover. The latter based on Standard ML, just to fend off anyone who believed me when I was claiming not to understand functional programming earlier this week. I didn't meant it, it just entertained the audience. Oh, and did I mention I'm happy to write Prolog too?

This means that (a) I believe I understand about specifications, (b) I vaguely remember what a proof engine is, and (b) I understand how Prolog resolves things, and specifically, why "!" is the most useful operator when you are trying to write applications.

Now I note that the author of that first post, Bob Atkey, does not only has a background of formal logic, SML and possibly even worked with the same people as me, his knowledge is both greater and more up to date than mine. I just have more experience of breaking builds and getting emails from jenkins telling me this.

Now, consider a software project's build

A set of source artifacts, S, containing artifacts s1..sn
A declaration of the build process, B
A set of external dependencies, libraries, L.
A build environment, E., comprising the tools needed for the build, and the computer environment around them, including the filesystem, OS, etc.

The goal is to reach a desired state of a set of redistributables, R, such that you are prepared to ship or deploy them.

The purpose of the build system, then, is to generate R from S through a series of functions applied to (S, L) with tools T within the environment E. The build process description, B, defines or declares that process.

There's many ways to do that; a single line bash file cc *.c && cc *.o could be enough to compile the lexical analyser example from Ch02 of the dragon book.

Builds are more complex than that, which is where tools like make come in.

Make essentially declares that final list of redistributables, R, and a set of transformations from inputs artifacts to output artifacts, including the functions (actions by the tools) to generate the output artifacts from the input artifacts.

The make runtime looks at what artifacts exist, what are missing, and what are out of date somehow builds a chain of operations that are hypothesised to produce a complete set of output artifact whose timestamp in the filesystem is later than that of the source files.

It is interesting to look at it formally, with a rule saying that to derive .o from .c, a function "cc -c" is applied to the source. Make looks at the filesystem for the existence of that .o file, its absence or "out-of-dateness" and, if needed applies the function. If multiple rules are used to derive a sequence of transformations then make will build that chain then execute them.

One interesting question is "how does make build that series of functions, f1..fn, such that:

R = fn(fn-1(fn-2(fn-3..f1(S, L)

I believe it backchains from the redistributes to build a series of rules which can be applied, then runs those rules forwards to create the final output.

If you view the final redistributables as a set of predicates whose existence is taken as truth, absence as false, and all rules are implies operators the define a path to truth, not falsehood (i.e. we are reasoning over Horn Clauses, then yes, I think you could say "make is a backward chaining system to build a transformation resulting in "truth".

The problem is this: I'm not trying to build something for the sake of building it. I'm trying to build something to ship. I'm running builds and tests repeatedly, from my laptop on my desk, from my laptop in a conference auditorium, a hotel room, and a train doing 190 mph between Brussels and London. A that's all just this week.

Most of the time, those tests have been failing. There are three possible (and non-disjoint) causes of this

(a) the null hypothesis: I can't write code that works.
(b) a secondary hypothesis: I am better at writing tests to demonstrate the production code is broken than I am at fixing the production code itself.
(c) as the external environment changes, so does the outcome of the build process.

Let's pretend that (a) and (b) are false; that I can actually write code that works first time, with the tests intended to show that this condition is not met being well written and correct themselves. Even if such a case held, my build would have been broken for a significant fraction of the time it was this week.

Here's some real examples of fun problems, for "type 3 fun" on the Ordnance Survey Fun Scale.

Build halting as the particular sequence of operations it had chosen depended on maven artifacts which were not only absent, but non-retrievable from a train somewhere under the English Channel.
Connection Reset exceptions talking to an in-VM webapp from within a test case. A test case that worked last week. I never did find the cause of this. Though I eventually concluded that it last worked before I installed a critical OS.X patch (last weeks, not this week's pending one). The obvious action was "reboot the mac' —and lo, it did fix it. I just spent a lot of time on hypotheses (a) and (b) before settling on cause (c)
Tests hanging in the keynote sessions because while my laptop had an internet connection, GET requests against java.sun.com were failing. It turns out that when Jersey starts some servlets up, it tries to do a GET for some XSD file in order to validate some web.xml XML document. If the system is offline, it skips that. But if DNS resolves java.sun.com, then it does a GET and blocks until the document is returned or the GET fails. As as the internet in keynote was a bit overloaded, tests just hung. Fix: edit /etc/hosts to put java.sun.com == 127.0.0.1 or turn off the wifi.
A build at my hotel failing as the run crossed the midnight marker and maven decided to pull down some -SNAPSHOT binaries from mvn central, which were not the binaries I'd just built locally, during the ongoing build and test run.

What do all these have in common? Differences in the environment of the build, primarily networking, except case (4), which was due to the build taking place at a different time from previous builds.

Which brings me to a salient point

The environment of a build includes not only the source files and build rules, it includes the rest of the Internet and the connections to it. furthermore, as the rule engine uses not just the presence/absence of intermediate and final artifacts as triggers for actions, time is an implicit part of the reasoning.

You made could make explict that temporal logic, and have a build tool which look at the timestamp of newly created files in /tmp and flagged up when your filesystem was in a different timezone (Oh, look ant -diagnostics does that! I wonder who wrote that probe?) But it wouldn't be enough, because we've reached a point in which builds are dependent upon state external to even the local machine.

Our builds are therefore, irretrievably nondeterministic.

Steve Loughran

2015-04-18

Build tools as Proof Engines

No comments:

Post a Comment