
Table of contents

Flambda2 Ep. 5: The Semantics of Change

Date: 2026-02-13
Category: OCaml



An abstract prism refracts a single beam labeled “Program” into multiple rays representing different semantic dimensions such as correctness, performance, memory layout, and nondeterminism.


Welcome to a new episode of The Flambda2 Snippets!

Do you know how compilers evolve?

The F2S blog posts aim at gradually introducing the world to the inner workings of a complex piece of software engineering: The Flambda2 Optimising Compiler for OCaml, a technical marvel born from a ten-year-long effort in Research & Development and Compilation, backed by many more years of expertise in all aspects of Computer Science and Formal Methods.

Today, we are doing something a little different from usual. Indeed, all previous F2S posts dealt with some aspect of the inner functioning of the compiler, ranging from how particular optimisations are performed to front-facing features like Loopify, which allows our users to benefit from efficient tail-recursive calls.

The point of today's topic is not inherently technical; rather, it is about how people handle technicalities together. We are going to cover a broader subject: how IRs and their semantics evolve, and how their lifecycles are driven by social interactions and many other considerations.

We will illustrate today's topic by breaking down the context of this PR. We will eventually dive into the social questions that such lifecycles pose, but first we will introduce the technical context of language semantics and then elaborate on the specific technicalities of this PR.

We suggest that readers skip over the sections they are less interested in and use the Table of Contents to navigate the article.

The semantics of Intermediate Representations

So what do we mean by "semantics" in this context?

Semantics is what gives a program its meaning.

The simplest kind of semantics we can think of would be a way to describe the mathematical function that, for a given correct program, links its parameters to its results.

Obviously, one can see how this description alone is not enough. There are just so many more things one may want to say about a program's behaviour.

  • What happens when the program is incorrect?
  • What about side effects: printing, system or user interactions?
  • Does a program terminate, and does it do something when it does not terminate?
  • Non-determinism: is there a single possible result for a given program?
  • Partial programs: what can we say about a single function without a main?
  • Execution time, resource consumption: when does it terminate in practice? Will it blow the memory?
  • Precise memory layout: where are the bits in memory during execution?
  • and so on...

One may or may not want to specify one or several of these to different levels of rigor depending on the context.

Let's take Fibonacci as an example.

int fibonacci(int n) {
    if (n == 0)
        return 0;
    if (n == 1)
        return 1;
    return fibonacci(n - 1) + fibonacci(n - 2);
}

The most basic property of this function is that it should compute Fibonacci numbers.

Choosing to implement it in C, which has its own specification, already implies a lot of semantic properties. For example, using type int limits the range of representable values. On one hand, this guarantees that the function will return valid values for some sufficiently small parameters. On the other hand, this also guarantees that the function result will not be the expected mathematical result for larger arguments.

That is all the information you need as long as the function is called correctly in a non-buggy program.

You may have noticed that our function does not reject negative numbers, even though the Fibonacci sequence is not defined for negative indices. On paper, this is still a correct way to implement the function; however, the C compiler must decide what to do when the argument is negative. You could consider this a second level of semantics: the compiler's semantics is more precise than our mathematical model, and that is fine as long as the two are consistent with one another.

This specific function also consumes resources, the most obvious being stack memory. Computers have finite stack space, and recursive calls tend to consume it. This means the function could exhaust the stack in some specific contexts, for instance if it were called under a sufficiently deep stack. Looked at independently from that context, however, the behaviour is non-deterministic: even when called with a valid argument, it may fail to produce the correct result, even though it looks completely deterministic.

The naïve implementation we provided has exponential complexity. It is possible to write a function that computes the same results more efficiently, in linear time. Our simple initial semantics allows the compiler to substitute a more efficient implementation than the one we provided, or even a worse one.

When does it matter to establish the details of the semantics?

The previous section explained why it is important for programmers and language designers to agree on the semantics. It is also important for compiler developers to agree on the semantics of the different parts of the compiler.

Like many a French researcher in Formal Verification, we will resort to space crash analogies to illustrate our point.

See, in 1999, the Mars Climate Orbiter space probe crashed on the surface of the red planet. After a detailed analysis of the events that led to the crash, NASA discovered an unfortunate misunderstanding between its teams and their contractor's engineers: some software was expected to compute thrust in SI units but instead returned it in imperial units, which is off by a factor of 4.45... While at a smaller scale, compiler development also involves teams of people working independently, which makes specifying the different parts of the compiler necessary.

While everybody agrees that a specification is required, what to put in it is a different story. A strict specification helps everybody agree on what exactly is happening in the code, but it narrows the potential future evolutions. Moreover, compiling and running a program involves many moving pieces, and writing down precisely what happens (taking into account how a CPU really works, for instance) might be as obscure as reading the code of the compiler itself.

So, back to the OCaml compiler. As its developers, we want to transform programs while preserving semantics. When writing these transformations, knowing what is and isn't part of the semantics is important.

Here are a few examples that might appear counter-intuitive at first but that show that there is a large set of properties that might or might not be relevant.

  1. Resource consumption can be part of the expected specification

Let's take the example of Tail-Call Optimisation. The Wasm committee specified that TCO would not be permitted in Wasm because they wanted to enforce consistency between the different implementations: a program with too many recursive calls should overflow the stack on every platform indiscriminately.

  2. Memory layout can be part of the expected specification

When memory layout is not part of the specification, you have more ways to optimise a given program. However when debugging, having a reliable and fixed memory layout is necessary. This means that supporting debugging creates more complexity in the compiler code.

  3. Complexity can be part of the expected specification

In cryptography, some programs aim to be protected against side-channel attacks, one of which is execution time. These programs therefore rely on consistent execution time for the individual instructions.

Let's look at an example through this discussion on the LLVM mailing list. The mail describes a problem encountered by engineers writing cryptographic primitives: there were functions that they would have preferred to write in C, for the sake of simplicity, rather than in assembly. However, LLVM sometimes optimises a specific code pattern, short-circuiting some computations depending on the values involved. Simply put, this broke the constant-time property that these engineers needed, thus opening opportunities for side-channel attacks. That optimisation is generally a good choice for the majority of users, so removing it was out of the question. For the curious ones out there: this code is exposed to side-channel attacks. This example shows an instance of incompatible compiler specification and user requirements.
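To illustrate the kind of code at stake, here is a classic constant-time byte comparison. This is a sketch in the spirit of that thread, not the exact code from the mail: it accumulates differences with XOR/OR instead of returning early, so its running time does not depend on where (or whether) the inputs differ. The difficulty discussed on the list is precisely that a sufficiently clever compiler may rewrite such a loop back into an early-exit form.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare two byte buffers without data-dependent branches or early
   exits: the loop always runs `len` iterations, whatever the inputs.
   Returns 1 if the buffers are equal, 0 otherwise. */
int ct_equal(const uint8_t *a, const uint8_t *b, size_t len) {
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= (uint8_t)(a[i] ^ b[i]); /* accumulate any difference */
    return diff == 0;
}
```

A naive memcmp-style loop would return as soon as the first mismatching byte is found, leaking the mismatch position through timing.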

IR Evolution

An example of an arduous lifecycle that we wanted to cover was that of the fine details of Undef and Poison in LLVM.

This is probably one of the most complex instances of IR evolutions out there. LLVM is a huge codebase, developed by thousands of engineers, and almost everybody carries in their pocket some code compiled with it. Moreover, deciding what to do with Poison and Undef was fundamentally complex in itself.

For reference, see this presentation: Undef and Poison: Present and Future. It is one of the most complex IR changes ever undertaken, carried out over a long period of time. In hindsight, it clearly illustrates the very complex nature of designing an IR change in an already very large and widely used codebase. Despite a large group of developers, converging on a satisfactory solution was laborious.

Taking decisions about IR evolution can drastically change things for a significant portion of users. At that point, the discussion shifts towards formalising or specifying certain behaviours to get these suggestions adopted, and the subject becomes more social in nature.

In practice, developers have a mental representation of the semantics of intermediate languages. Obviously, each mental image differs from person to person, and they can even be incompatible with one another. However, they remain compatible with the implementation, and these incompatibilities are not observable until some edge case reveals them (a satellite crash, for example).

A mental image necessarily starts out naive, since it is generally based on what has been written about the semantics, while a complete semantics is usually too complex to be written down entirely. In a software project, if you don't write a proper semantics, the existing code becomes the only semantics, and that causes problems.

Faced with all these potential issues, there are two paths that compiler developers can take to evolve an IR while limiting the damage:

  1. Replicate and modify: Duplicate the piece of semantics, changing it slightly in problematic cases. This preserves the old behaviour but is complex because it requires correctly tracking all the places to change, as well as updating the mental models of developers who were using it before.

  2. Modify in place: Change the existing construct directly. This is more drastic and risky, but it doesn't add complexity to the language and is easier to track.

Conclusion

In this article, we explored the broad topic of IR semantics: what they are, why they matter, and how they evolve. From the Fibonacci example to the Mars Climate Orbiter, we've seen that semantics go far beyond the mathematical description of a program's behaviour. Whether it's resource consumption, memory layout, or execution time, specifying the right level of detail is both a technical and a social endeavour.

We also saw that evolving an IR's semantics is a delicate balancing act. The two strategies—replicate and modify, or modify in place—each come with their own trade-offs, and the choice depends heavily on the context and the people involved.

In the next article, we put these ideas into practice. We will examine a concrete case of IR evolution within the OCaml compiler: the transition from Generalized If to Strict Boolean If semantics in the Lambda intermediate representation, and see firsthand how both the technical and social dimensions play out.

Stay tuned, and thank you for reading!



About OCamlPro:

OCamlPro is an R&D lab founded in 2011, with the mission to help industrial users benefit from experts with a state-of-the-art knowledge of programming languages theory and practice.

  • We provide audit, support, custom developer tools and training for both the most modern languages, such as Rust, Wasm and OCaml, and for legacy languages, such as COBOL or even home-made domain-specific languages;
  • We design, create and implement software with great added value for our clients. High complexity is not a problem for our PhD-level experts. For example, we helped the French Income Tax Administration re-adapt and improve their internally kept M language, we designed a DSL to model and express revenue streams in the Cinema Industry, codename Niagara, and we also developed the prototype of the Tezos proof-of-stake blockchain from 2014 to 2018.
  • We have a long history of creating open-source projects, such as the Opam package manager, the LearnOCaml web platform, and contributing to other ones, such as the Flambda optimizing compiler, or the GnuCOBOL compiler.
  • We are also experts of Formal Methods, developing tools such as our SMT Solver Alt-Ergo (check our Alt-Ergo Users' Club) and using them to prove safety or security properties of programs.

Please reach out, we'll be delighted to discuss your challenges: contact@ocamlpro.com or book a quick discussion.