
Reading the fascinating (and highest-voted) question on SO, Why is it faster to process a sorted array than an unsorted array?, made me wonder about compiler code correctness.

For example, the answer states that:

Intel Compiler 11 does something miraculous. It interchanges the two loops...

How does a compiler programmer know when it's OK to interchange loops?

And, in general, do they use mathematical proofs to demonstrate conclusions?

How does a compiler programmer know that their compiler will generate correct code? How do they test their conclusion? Do they have to write a test suite that runs the compiler, and checks that the generated code is correct?

B Seven

5 Answers


How does a compiler programmer know when it's OK to interchange loops?

The compiler runs a series of checks on the code to determine whether it is safe to interchange the loops. For example, if the code doesn't fully inline, it probably won't be able to interchange the loops. If the code modifies a volatile variable, it won't interchange the loops. If the code stores values that are computed in previous loop iterations, the compiler won't interchange the loops. If none of these conditions is triggered, so the compiler can be sure the transformation is safe, it can interchange the loops.
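For illustration, here is a minimal C sketch of my own (not code from any compiler) of the conditions listed above: a nest with no loop-carried dependence that can be interchanged, a volatile store that blocks the transformation, and a loop-carried dependence that blocks it.

```c
/* Safe to interchange: each element is written exactly once and no iteration
 * reads a value produced by another iteration, so visiting the (i, j) pairs
 * in a different order cannot change the result. */
void scale(double a[64][64], double k)
{
    for (int i = 0; i < 64; ++i)
        for (int j = 0; j < 64; ++j)
            a[i][j] *= k;
}

/* Not interchanged: the stores are volatile, so their exact order is part of
 * the program's observable behaviour and must be preserved. */
void fill(volatile int a[64][64])
{
    for (int i = 0; i < 64; ++i)
        for (int j = 0; j < 64; ++j)
            a[i][j] = i + j;
}

/* Not interchanged: iteration (i, j) reads the value stored by the earlier
 * iteration (i - 1, j + 1); swapping the loops would run the reader before
 * the writer and change the result. */
void sweep(double a[64][64])
{
    for (int i = 1; i < 64; ++i)
        for (int j = 0; j < 63; ++j)
            a[i][j] = a[i - 1][j + 1] + 1.0;
}
```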

And, in general, do they use mathematical proofs to demonstrate conclusions?

No. They just work out an optimization and a set of conservative tests to ensure that the optimization is safe. Over time, they develop more optimizations and more sophisticated algorithms to detect when an optimization is safe even in cases where it's less obvious.

How does a compiler programmer know that their compiler will generate correct code?

They do the best they can. Occasionally they make mistakes. People submit bug reports, and they fix them.

How do they test their conclusion? Do they have to write a test suite that runs the compiler, and checks that the generated code is correct?

They absolutely do use test suites. When a bug is detected in GCC, a test is specifically added to the test suite to make sure that bug is fixed and not re-introduced.
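Such a regression test is typically just a small program plus DejaGnu `dg-` directives that tell the harness how to build and run it. The test below is a hypothetical sketch in that style (the exact directives and directories vary by test category), not an actual test from GCC's suite:

```c
/* { dg-do run } */
/* { dg-options "-O2" } */

/* Hypothetical regression test: run a loop nest that a (made-up) illegal
   interchange would miscompile, and abort if the computed value is wrong.  */

extern void abort (void);

int a[4][4];

int
main (void)
{
  int i, j;

  for (i = 1; i < 4; ++i)
    for (j = 0; j < 3; ++j)
      a[i][j] = a[i - 1][j + 1] + 1;

  /* a[3][0] depends transitively on a[2][1] and a[1][2]; an illegal
     interchange would compute 1 here instead of 3.  */
  if (a[3][0] != 3)
    abort ();
  return 0;
}
```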

David Schwartz
  • Passing tests is no guarantee of correctness. Correctness is a mathematical concept and I'm not aware of any modern compilers using proofs to ensure correctness. I posted a link in my answer to this question. – James Jul 14 '12 at 02:30
  • @James: There are compilers that use proofs to ensure correctness, however, they're a long way from being able to compile real-world code in a language as complex as C or C++. And that's with no attempt whatsoever put into optimizing the resulting machine code at all. – David Schwartz Jul 14 '12 at 06:48
  • @DavidSchwartz this answer is simply wrong. "They just work out an optimization and a set of conservative tests to ensure that the optimization is safe." That statement is false. The algorithms found in the book by Muchnick (as an example) are mathematically sound methods of optimization which will never change the meaning of the code being optimized. It's apparent that you aren't familiar with these algorithms, and just made up an answer. – Heath Hunnicutt Apr 15 '15 at 20:58
  • "For example, if the code doesn't fully inline, it probably won't be able to interchange the loops." -- this is fiction. – Heath Hunnicutt Apr 15 '15 at 21:04

How does a compiler programmer know when it's OK to interchange loops?

When the modification doesn't alter the program's observable behavior as defined by the language standard, that is, when the change doesn't go against the standard itself.

For example, the C and C++ standards say in a few places that the order of evaluation of function arguments and sub-expressions is unspecified. This grants the compiler the freedom to generate code that evaluates them in any order it sees fit. If your program depends on a particular order, it's not conforming to the standard, and you have no right to blame the compiler for "breaking" it.
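A small self-contained program makes the point concrete; both possible output orders are correct compilations of it:

```c
#include <stdio.h>

static int traced(const char *name, int value)
{
    printf("%s evaluated\n", name);
    return value;
}

static int add(int a, int b)
{
    return a + b;
}

int main(void)
{
    /* The standard does not say which argument is evaluated first, so a
       compiler may print "first" before "second" or the other way around;
       code that relies on one particular order is broken, not the compiler. */
    int r = add(traced("first", 1), traced("second", 2));
    printf("sum = %d\n", r);
    return 0;
}
```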

Compilers may, and often do, use code analysis, logic, and math, with all those theorems, to optimize code.

In practice, testing shows whether or not the compiler did the right job.

Alexey Frunze

Compilation is a complicated process. The program is structured as a graph and the compiler tries its best to 'optimise' this graph according to the rules that the developers have come up with.

However, there's no guarantee that the generated code is anywhere near 'optimal'. There has been research into so-called 'super-optimisers' that try to generate truly optimal code using automated proof engines... i.e. they can answer questions such as 'Is there a way to compile this algorithm such that it takes less than X cycles?'. Denali is one such super-optimiser I've read about. The technique is easier for some architectures than others. The downside is that these super-optimisers can take hours, if not days, to compile a simple routine, which is unacceptable for most people.
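To give a flavour of the kind of equivalence such a tool has to establish, here is a hand-written sketch (not output of Denali): a branch-free candidate that a super-optimiser would only be allowed to substitute after proving it matches the straightforward version for every possible 32-bit input.

```c
#include <stdint.h>

/* What a programmer writes: the sign of x, with branches. */
int32_t sign_branchy(int32_t x)
{
    if (x > 0) return 1;
    if (x < 0) return -1;
    return 0;
}

/* A branch-free candidate of the kind a super-optimiser searches for.
   Before substituting it, the tool must prove (e.g. via an automated
   equivalence query) that it returns the same value as sign_branchy()
   for all 2^32 inputs. */
int32_t sign_branchless(int32_t x)
{
    return (x > 0) - (x < 0);
}
```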

James

One of the main sanity tests for any compiler written in its own language is to get it to compile itself and then use the resulting new compiler to compile itself again. The two resulting compilers should be identical, modulo timestamps.

user207421

Summary

Ok so I agree with some of the answers, but I think people are underestimating how rigorous and conservative compilers are in their transformations and optimisations. Long story short: although optimal code generation is indeed a sort of black art, and 99% of the time heuristics are used rather than proofs (someone talked about the Denali super-optimiser), correct code generation is, in fact, mathematically sound, and it is designed to be so. The fact that there are bugs is due to the enormous complexity of compiler codebases. As a side note, an automatic theorem prover can have bugs, even though every part of it is mathematically sound, so the fact that compilers have bugs doesn't mean they are random collections of rewrite rules.

Now, to answer your questions in turn:

How does a compiler programmer know when it's OK to interchange loops?

There is a subfield of compiler theory called Dependence Analysis that deals with questions like that. There are entire books on the topic, but for your particular question: the compiler will automatically figure out the iteration space of the loop (if the bounds are computable, see NB at the bottom), it will figure out dependencies between the instructions of the loops (there are different kinds of dependencies), it will compute the distance and direction vectors of the loop, and then figure out if it's legal to do the interchange. The Wikipedia article on this just gives the different types of dependencies but has links to other subtopics, and also cites the book I mentioned.
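Here is a hypothetical worked example (not taken from any particular compiler) of that distance/direction-vector reasoning:

```c
#define N 64

/* The write in iteration (i, j) produces a[i][j]; the read of a[i-1][j+2]
   consumes the value produced in iteration (i-1, j+2).  Subtracting the
   producer's iteration vector from the consumer's gives the distance vector

       (i, j) - (i - 1, j + 2) = (1, -2),   i.e. direction vector (<, >).

   Interchanging the loops swaps the components to (-2, 1), direction (>, <),
   which is not lexicographically positive: the consumer would run before the
   producer, so the interchange is rejected for this nest.  If every
   dependence had a direction like (<, <), (<, =) or (=, <), the swapped
   vectors would stay lexicographically positive and the interchange would
   be legal. */
void wavefront(double a[N][N], const double b[N][N])
{
    for (int i = 1; i < N; ++i)          /* outer loop index: i */
        for (int j = 0; j < N - 2; ++j)  /* inner loop index: j */
            a[i][j] = a[i - 1][j + 2] + b[i][j];
}
```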

And, in general, do they use mathematical proofs to demonstrate conclusions?

No, not in general. It also depends on what we mean by proof - do you mean using a theorem prover? The Denali paper uses a SAT solver, for example. The Dafny language and compiler, from Microsoft Research, bake verification of your code into the language via source-level annotations. The compiler verifies the correctness of your code, although working with the annotations is very difficult (I've fiddled with Dafny for a project at university and I speak from experience). Also, by "not in general" I mean not in gcc, not in the llvm family, and I would suspect (but I haven't checked) not in Intel's icc either. And again, we are talking here about C-like source to machine level compilation, nothing fancier.

How does a compiler programmer know that their compiler will generate correct code? How do they test their conclusion? Do they have to write a test suite that runs the compiler, and checks that the generated code is correct?

The other answers covered this extensively so there's not much more to say. I guess I can add that since compilers are really BEASTS in terms of complexity, testing them extensively is paramount. Also, I would like to reiterate that even in a mathematically sound piece of code (if we assume we have such notions formally defined) there could be bugs in the infrastructure that handles that code, so testing is always important.

Verdict

Compilers are very complicated. They need to be fast, to enable seamless edit-compile-debug developer loops, and also correct. The correctness of the code they produce is kind of "locally mathematically sound": every transformation that the compiler does must preserve the semantics of the original program. Improvements in speed are driven by heuristic algorithms, and such algorithms are good examples of black-art techniques in the sense that they give no proof that the resulting code will be faster. They also give no formal proof that it will be correct, but every transformation is designed to produce correct, semantics-preserving code. Finally, the hottest trend in compilers (I guess as everywhere else in Computer Science these days) is to use machine learning for optimisations, so we'll have even more knobs to tune :) yay

NB

Almost everything in the compiler "back end", i.e. the code generator, is either NP-complete or undecidable. In particular, determining whether a loop interchange is legal is undecidable in general, since loop bounds can be arbitrary expressions (say, function calls). However, if the bounds are computable (say, integers or something simple), then that's where Dependence Analysis kicks in.
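A sketch of the contrast, with a hypothetical mystery() function standing in for an arbitrary bound:

```c
extern int mystery(int i);   /* hypothetical: could compute anything */

/* Undecidable in general: the trip count of the inner loop depends on an
   arbitrary function, so reasoning about the iteration space requires
   reasoning about what mystery() does.  (Assume it returns values in range
   so the accesses stay in bounds.) */
void hard(double a[64][64])
{
    for (int i = 1; i < 64; ++i)
        for (int j = 0; j < mystery(i); ++j)
            a[i][j] = a[i - 1][j] + 1.0;
}

/* Computable bounds: the iteration space is known exactly, so dependence
   analysis can decide whether an interchange is legal. */
void easy(double a[64][64])
{
    for (int i = 1; i < 64; ++i)
        for (int j = 0; j < 64; ++j)
            a[i][j] = a[i - 1][j] + 1.0;
}
```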

baibo