
Testing

“Program testing can be used to show the presence of bugs, but never to show their absence!”

― Edsger W. Dijkstra

Every serious discussion about testing absolutely must begin with this quote. It is like the priming of a religious ritual.

It also captures the essence of software engineering - that complexity is so overwhelming that we cannot even know how many bugs there are at any given moment.

How

Write everything down. The visual system is the most powerful one; it primes the brain better. Write down examples (expectations) of what a function does in the “corner cases” (especially the “empty” or “nothing” ones), to check your expectations. Write these as unit tests - particular assertions (statements of fact). Then write down, as distinct assertions (tests), the specification (equations) which states the “laws” and invariants - equational reasoning at work. This is how non-bullshit TDD meets poor-man's executable specs in systematic program design (of Kiczales).
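A minimal sketch of this workflow in Haskell, assuming QuickCheck as the testing framework; the function under test (reverse) and the chosen law are just an illustrative example:

  import Test.QuickCheck

  -- Corner-case expectations, written down as particular assertions (facts).
  unitTests :: Bool
  unitTests = and
    [ reverse ([] :: [Int])    == []           -- the "empty"/"nothing" case
    , reverse [1 :: Int]       == [1]          -- the one-element case
    , reverse [1, 2, 3 :: Int] == [3, 2, 1]    -- an ordinary case
    ]

  -- The "law" (an equation), written down as an executable specification.
  prop_reverseDistributesOverAppend :: [Int] -> [Int] -> Bool
  prop_reverseDistributesOverAppend xs ys =
    reverse (xs ++ ys) == reverse ys ++ reverse xs

  main :: IO ()
  main = do
    print unitTests                               -- True when every fact holds
    quickCheck prop_reverseDistributesOverAppend  -- the law, checked on random inputs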

Why

A modern aeroplane performs a self-test before takeoff to check whether anything has broken since the last landing. Pure rationality.

One asks other people to modify one's code. How does one know they did not break anything? At the very least, a self-test should pass.

Everyone thinks that he is smarter than “other people” and therefore will not break the code. This is a naive, self-serving bias, and it means that one must double-check everything after every change.

Static type checking in a strongly typed language (the ML family) is the first line of testing: it catches inconsistencies at the type level.

Defensive programming (checking preconditions with assertions) is the second.
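A minimal sketch of such a precondition check in Haskell, using Control.Exception.assert; the function safeDiv and its precondition are made up for illustration:

  import Control.Exception (assert)

  -- Precondition: the divisor must be non-zero.
  -- 'assert' returns its second argument when the condition holds and
  -- aborts with a location-annotated error otherwise (unless compiled
  -- with -fignore-asserts).
  safeDiv :: Int -> Int -> Int
  safeDiv x y = assert (y /= 0) (x `div` y)

  main :: IO ()
  main = do
    print (safeDiv 10 2)    -- 5
    -- print (safeDiv 10 0) -- would fail the assertion at run time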

Static typing + sum types + testing frameworks make modifications easier and more confident. Modifications, including those made by other people, are essential.
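A hedged sketch of what sum types buy here (the Lookup type and its cases are made up for illustration): once the alternatives are spelled out, the compiler, with -Wall, warns about every pattern match that a later modification has left incomplete.

  {-# OPTIONS_GHC -Wall #-}

  -- A made-up domain: the possible outcomes of a lookup.
  data Lookup a
    = Found a
    | Missing
    | Denied String            -- the reason for the denial
    deriving (Show)

  describe :: Lookup Int -> String
  describe (Found n)  = "found: " ++ show n
  describe Missing    = "nothing there"
  describe (Denied r) = "denied: " ++ r
  -- Adding a new constructor to Lookup makes GHC warn about every
  -- now-incomplete pattern match, i.e. every caller that must be updated.

  main :: IO ()
  main = mapM_ (putStrLn . describe) [Found 42, Missing, Denied "no rights"]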

The law of large numbers

There are, however, “natural experiments” (exactly as they occur in Nature) which can give us some confidence. This, by the way, is the same heuristic which allows animals to avoid poisonous food or water.

When, let's say, a certain version of clang is used to produce the official builds (available for millions to download) of a vastly complex project such as Google Chrome, and the traffic of bug reports is low with no mention of segfaults, may we conclude that this particular version of clang is correct and stable?

How so? Because even though we obviously cannot do proper coverage testing for each and every code path (which would not be enough anyway), millions of users go through these code paths eventually. If there are no complaints, then we may assume that not just the browser itself but also the underlying compiler works well.

This is a hypothetical example, but it generalizes very well. Find the most common distro on the major public cloud platforms (that would be one or another version of Ubuntu) and you will see a stable set of packages with dependencies resolved: the particular version of the Linux kernel together with a set of applied vendor patches, the compiler used to build this kernel for the most common architecture (with default optimizations), and so on.

This is how testing works in the large. It is the same notion as the number of hours a certain model of aircraft has flown so far. It does not guarantee anything, but it shows that the designs and implementations are stable.

Formal verification

One definitely would not try to formally verify all of one's code, even in a pure functional language - it is just too time-consuming and requires too much highly focused effort (your math must be solid, otherwise it is a waste of time).

There is, however, a poor man's formalism that comes for free, provided you know what you are doing.

Haskell has a set of unique properties which qualify it as a system of logic. Its intermediate representation (IR) is essentially an implementation of System F-omega (which is a system of logic).

Being pure, it has certain properties, such as that its expressions can be evaluated mechanically, with paper and pencil, by pure substitution (enhanced with the notion of frames of an environment). It is as pure as the algebra of polynomials (or any algebra).
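A tiny illustration of that substitution model; the functions and the written-out reduction are just an assumed example:

  -- Evaluation by pure substitution, as one would do it with paper and pencil.
  square :: Int -> Int
  square x = x * x

  sumOfSquares :: Int -> Int -> Int
  sumOfSquares a b = square a + square b

  --   sumOfSquares 3 4
  -- = square 3 + square 4    -- substitute the body of sumOfSquares
  -- = (3 * 3) + (4 * 4)      -- substitute the body of square
  -- = 9 + 16
  -- = 25

  main :: IO ()
  main = print (sumOfSquares 3 4)   -- 25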

When one expresses one's problem domain and the related domain logic (yes, DDD and TDD) in pure Haskell, one produces a set of pure expressions which, after type-checking and testing with specialized frameworks, are weakly verified (remember, Haskell, like logic, is purely declarative and timeless - once a pattern is captured, it is valid forever).

So, write your logic in Haskell; it will also be an order of magnitude fewer LOC, if you know what you are doing and reduce everything to the most fundamental forms (data dominates!) and the related standard idioms.
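A hedged sketch of that idea with a made-up domain (a balance that must never go negative) and QuickCheck standing in for the “specialized frameworks”; all names are illustrative:

  import Test.QuickCheck

  -- The domain invariant: a Balance is never negative.
  newtype Balance = Balance Int
    deriving (Show, Eq)

  -- Smart constructor: the only way to obtain a Balance.
  mkBalance :: Int -> Maybe Balance
  mkBalance n
    | n >= 0    = Just (Balance n)
    | otherwise = Nothing

  -- A domain operation that must preserve the invariant.
  withdraw :: Int -> Balance -> Maybe Balance
  withdraw amount (Balance n) = mkBalance (n - amount)

  -- The "law": whatever withdraw returns, the invariant still holds.
  prop_withdrawKeepsInvariant :: Int -> NonNegative Int -> Bool
  prop_withdrawKeepsInvariant amount (NonNegative n) =
    case withdraw amount (Balance n) of
      Nothing          -> True
      Just (Balance m) -> m >= 0

  main :: IO ()
  main = quickCheck prop_withdrawKeepsInvariant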


Author: <schiptsov@gmail.com>

Email: lngnmn2@yahoo.com

Created: 2023-08-08 Tue 18:38

Emacs 29.1.50 (Org mode 9.7-pre)