The vast majority of developers have heard about program analysis techniques but often don’t have a good intuition about how these techniques work. In this series of articles, I will provide a brief overview of the major techniques and hopefully clarify some of their capabilities and limitations.

Program analysis is a field of computer science interested in techniques that infer properties of program executions. These two elements, properties and program executions, are the inputs of a program analysis technique and are the objects I will go over in this article. Let’s dive in and start with the executions!

Consider the following piece of text:

function factorial(n) {
if (n === 0)
return 1;

return n * factorial(n - 1);
}

To call this string a program, we need to specify which programming language we are working with. This is similar to natural languages where there are words called false friends or noncognate interlingual homographs which are written in the same way but have different meanings. Some amusing examples are the words sin (meaning “without”in Spanish) or spot (meaning “ridicule” in Dutch).

Let’s assume this is a JavaScript program. Once we fix the programming language (and the version), we can use a parser to check whether the string is valid or well-formed according to the syntax specification. If this stage succeeds, we say that we have a program in that JavaScript version.

However, fixing the programming language is not enough to have a precise understanding of what the program can do. Everyone has heard of cases where programs behave differently depending on the architecture, compiler/interpreter, or other execution environment aspects.

This understanding (which is known as the program semantics)can only be achieved when one at least fixes a particular implementation of the language, commonly known as an interpreter. This interpreter is nothing more than another program that receives a program (like the one above) as input and outputs a value corresponding to the result of the program. In case of JavaScript, examples of such implementations are the V8 or the SpiderMonkey engines.

The main goal of a program analyser is to infer properties about the program executions generated by running the interpreter on this program.

Developers do this task on a daily basis with testing. For example, when I copy-paste the program above into the Chrome console, I can observe that factorial(3) = 6 . From this observation, I can infer at least two properties of this execution: 1) that it terminates and 2) that its result is 6. In testing, developers enforce this type of observations with instructions like:assert(factorial(3) === 6).

In practice, developers don’t have the time to test all the expected observations. Here, a program analyser can help a developer (a lot!!) in two ways:

  1. You can specify multiple tests concisely. For example, you can use a program analyser to verify if 0 < n1 < n2 , then factorial(n1) < factorial(n2) .
  2. You can use a program analyser to find bugs in your program. For example, the program analyser can be used to discover a new test case for which it is convinced that the result is not intended by the developer. A useful example here would be factorial(-1) .

The first task is usually referred to as property checking or verification, while the second one as bug finding. Note that these properties or bugs are specific the V8 engine version that I’m using — it could be wrong for me to generalise for all JavaScript engines.

From the perspective of program analysis, a programming language is just a formal specification of an interpreter.

A first differentiator factor in program analysis is what representation of this specification it uses. This choice impacts how the technique represents a program execution and ultimately the kind of properties or bugs it can reason about. When a technique uses the actual interpreter (like the V8 engine) to execute the program, we say that it is a dynamic analysis — otherwise it is a static analysis. It is as simple as that!


To end this article, I want to briefly go over the concepts of false positive and false negative and how it connects to soundness and completeness using the two uses cases above: verification and bug finding.

Photo by Claire Anderson on Unsplash

I like to compare a program analyser to a witness during a trial testimony. Typically, in a trial, there are the events that actually happened and are being discussed, and then there is the interpretation of these events by the witness. So, we have the analyser in the witness stand being bombarded with questions by a lawyer. In very intimidating terms, some of these questions end with the yes or no? formula and others start with To the best of your knowledge.

In normal circumstances, there are some events in which the interpretation is faithful to what actually happened and others where the witness’s mind is filled with uncertainty or just plain ignorance. The lawyer knows this basic fact even if he does not know the questions which the witness will faithfully answer I don’t know. The existence of such questions is what we mean by incompleteness of the analysis. For example, in an attempt to verify the property above about factorial , the analyser might come back with reasonable I don’t know answer. Obviously, a witness that answers I don’t know to every question is just a waste of resources.

Now, lawyers can be tricky. Sometimes, they attack the witness’s character and demonstrate that it is not reliable. They do that by showing that an answer given by the witness is factually incorrect — for example by showing a video of a particular incident.

However, when this happens, the opposing councillor is clever enough to try to clarify whether this incorrect answer was a plain lie or a mistake induced by the other lawyer. If it turns out that it was a lie, we could say that the witness is not sound of mind — the analyser is unsound. In our two uses cases, the analyser would be unsound if:

  1. It answered yes after checking the property above — this is not true because on my Chrome console0 < 171 < 172 but factorial(171) === factorial(172) ;
  2. It suggested that factorial(3) would yield an unexpected result.

This leaves us with the case where the opposing councillor needs to demonstrate that the incorrect answer was a genuine mistake coming from the ignorance of witness. It is easy to see how this can happen.

When a lawyer knows a particular question that the witness would answer I don’t know if asked directly and he/she knows something about how much the witness believes, it’s easy to get a potentially incorrect yes or no answer. Let’s look at an example:

  1. Lawyer: To the best of your knowledge, is it possible that there is another accomplice? Witness: Yes.
  2. Lawyer: Since there might have been another set of footprints, we can’t exclude the possibility that there is another accomplice? Witness: No.

In both cases, the answer to the last question can be factually incorrect if it turns out that there is no such accomplice. The honest mistake in the first case is known as false positive and in the second case as false negative.

To know how to connect false positives and false negatives to soundness (and completeness) we need to understand what constitutes a lie in our scenario. This definition is related to the intention of the analysis. Lets look at a couple of examples:

  1. If the intention is to check if a property holds, the analyser can either output yes (and provide a proof that it is case) or simply say it can’t prove it. If the output is yes and it turns out that it was not the case, this is a lie — a false positive means that the analyser is unsound.
  2. If the intention is to find a bug in a program, the analyser can either output a buggy execution or simply say that it couldn’t find any. Again, if the bug candidate provided by the analyser turns out to be wrong, the analyser is unsound.

In these examples, the analyser could have false negatives — bugs and proofs of properties could be missed by the analyser. A notable example of this kind of analysers are type checkers.

An aspect that can be confusing is the fact that the intention of a lot of (static) analysers is to check if a particular state is not observed during the execution of the program. If the analyser says that the state is observed when it is not, this is a false positive and not a lie — remember that the intention is to show that the state is not observed. This is why there is much talk of false positives. A lie in this context occurs when the analyser says that the state is not observed but there is an execution where it can be observed — a false negative means the analyser is unsound.

I hope this clarifies why some program analysers are unsound if they have false positives and others are unsound if they have false negatives.


In the next story, I will go into the details and differences of the main dynamic and static analyses techniques. I will also explain some of the reasons why in practice, program analysers are unsound. Stay tuned!