The Algorithmic Methodology Of Coq Proof Assistant: Inductive Types, Tactics, And Extraction To Ocaml

A comprehensive technical exploration of the algorithmic methodology of coq proof assistant: inductive types, tactics, and extraction to ocaml, covering key concepts, practical implementations, and real-world applications.
Contents
I’ll expand the blog post to over 10,000 words by adding substantial new sections, deeper explanations, code snippets, and case studies. The structure follows the outline, building on the existing introduction. I’ll maintain a professional yet engaging tone, with clear headings and practical examples.
The Algorithmic Methodology Of Coq Proof Assistant: Inductive Types, Tactics, And Extraction To Ocaml
Introduction: The Case for Mathematical Certainty in Software
On June 4, 1996, the maiden flight of the Ariane 5 rocket ended in catastrophe. Just 37 seconds after liftoff, the launcher veered off course and self-destructed, destroying a payload worth $370 million. The root cause? A software bug: an integer overflow in the inertial reference system that converted 64-bit floating-point numbers to 16-bit signed integers. The conversion routine—written in Ada and reused from the earlier Ariane 4—was never designed to handle the higher velocities of the new rocket. Yet it was never formally verified, because the testing team assumed that the arithmetic would always stay within safe bounds. That assumption, tragically, was wrong.
The Ariane 5 failure is one of the most infamous examples of a phenomenon that haunts every engineer who has ever shipped code: the gap between what we think our programs do and what they actually do. Despite decades of advances in testing, code review, static analysis, and type systems, the history of software is a graveyard of missed edge cases, unexpected states, and invisible assumptions that eventually surface as race conditions, memory corruptions, or logical errors. And as software becomes embedded in critical infrastructure—medical devices, autonomous vehicles, smart contracts, space missions—the cost of these failures is no longer measured in lost productivity but in human lives.
This is where the idea of formal verification steps onto the stage. Instead of testing a program on a finite set of inputs, formal verification treats a program as a mathematical object and proves that it satisfies a formal specification for all possible inputs. It is the difference between checking that a bridge holds up under a few traffic patterns and proving it will withstand any load within its design limits. For decades, this approach was considered too expensive and impractical for mainstream software development, confined to academic labs and a handful of safety-critical systems. But over the last ten years, a quiet revolution has been unfolding. Tools like the Coq proof assistant have matured to the point where they can be used to verify industrial-scale software—the CompCert C compiler, the seL4 microkernel, and parts of Amazon’s AWS infrastructure, to name a few.
In this blog post, we will dive deep into the algorithmic methodology behind Coq. We will explore its core building blocks: inductive types, which allow us to define data structures and propositions in a unified way; tactics, which are the commands we use to build proofs interactively; and extraction, which lets us turn verified Coq programs into efficient OCaml code that can run in production. Along the way, we will work through concrete examples, from simple arithmetic to a small compiler, and see how the Curry-Howard correspondence—the deep link between programs and proofs—makes all of this possible.
By the end, you will not only understand how Coq works under the hood, but also appreciate the algorithmic elegance of its proof engine: a blend of term rewriting, higher-order unification, and type inference that feels like a cross between a theorem prover and a functional programming language.
1. The Coq Proof Assistant: A Language for Both Programs and Proofs
1.1 Historical Context
Coq was developed in the early 1990s at INRIA by Gérard Huet, Thierry Coquand, and others. Its name comes from the French word for “rooster”—a nod to the “Calculus of Constructions” (CoC), the logical foundation on which it is built. Over the years, Coq has evolved through many versions, integrating new features like a rich tactic language, a powerful automation framework (Ltac), and a heterogeneous universe of types (Prop, Set, and Type). But its core remains remarkably stable: it is a dependently typed functional programming language that also serves as an interactive theorem prover.
The key insight behind Coq is the Curry-Howard correspondence: there is a direct isomorphism between programs and proofs. A proposition is a type, and a proof of that proposition is a term (program) of that type. For example, the proposition “If A implies B and A is true, then B is true” corresponds to the type (A → B) → A → B; the proof is the function fun f a => f a. In Coq, you can write:
Definition modus_ponens (A B : Prop) (f : A -> B) (a : A) : B := f a.
This is both a program and a proof. This dual nature is what makes Coq so powerful: you can write functional programs, prove properties about them, and then extract the programs into efficient code, confident that the proofs guarantee correctness.
1.2 The Coq Language in a Nutshell
Coq is a typed lambda calculus with dependent types. This means types can depend on values. For example, the type of a function that takes a natural number n and returns a vector of length n is nat -> Vector.t A n. This is impossible in traditional languages like Haskell or OCaml (without dependent types). The type system is expressive enough to encode specifications directly in the types.
The universe hierarchy in Coq is stratifiesd:
- Prop: the universe of propositions (logical statements). Terms of
Propare proofs (or just statements if they have no proof yet). Prop is impredicative, meaning you can quantify over all propositions. - Set: the universe of computation data types (like
nat,list,bool). Sets are smaller than Prop in some respects, but they are also impredicative in Coq (though optionally). - Type: the next universe above Set and Prop. There is an infinite hierarchy:
Type(0) = Set,Type(1),Type(2), etc., to avoid paradoxes like Russell’s paradox.
The language is purely functional and total: every well-typed program in Coq must terminate. This is enforced by a guard condition for recursive definitions (e.g., recursion must be structural). This totality is crucial for logical consistency: if you could write infinite loops, you could prove False.
1.3 Interactive Proof Development
When you interact with Coq, you are in a proof state that shows the current goal (what you need to prove) and any hypotheses (available assumptions). You apply tactics to transform the goal into simpler subgoals, or to prove it directly. The tactics are commands that operate on the proof term under construction. Underneath, Coq builds a lambda term that respects the Curry-Howard isomorphism. You can view the term at any time using Show Proof.
The proof process algorithmically decomposes a complex statement into simpler ones, using rules of inference encoded in the type system. This is analogous to a type checker for a dependently typed language, but with the goal of constructing a term that inhabits a type you have not yet seen a term for. The “search” aspect is what makes Coq a proof assistant rather than a fully automated theorem prover.
2. Inductive Types: The Foundation of Definitions
Inductive types are the primary way to define data structures and propositions in Coq. They are called inductive because they allow you to introduce new constants and constructors, and then you can pattern-match on them. Every inductive type has an associated induction principle (generated automatically), which is the key to proving properties by induction.
2.1 Defining Natural Numbers
The classic example is the natural numbers. In Coq’s standard library, nat is defined as:
Inductive nat : Set :=
| O : nat
| S : nat -> nat.
This is a simple inductive type: O (zero) is a constant, and S (successor) is a constructor that takes a nat and returns a nat. This defines the set of natural numbers as the least fixed point of these constructors.
From this definition, Coq automatically generates an induction principle. You can inspect it with Check nat_ind:
nat_ind : forall P : nat -> Prop,
P O -> (forall n : nat, P n -> P (S n)) -> forall n : nat, P n
This embodies the standard mathematical induction principle. To prove a property P for all n, you must prove it for O and then prove that if it holds for n, it holds for S n.
Now let’s define addition by recursion over the first argument:
Fixpoint plus (n m : nat) : nat :=
match n with
| O => m
| S n' => S (plus n' m)
end.
This is structurally recursive: the recursive call is on a subterm of n. Coq accepts it because it satisfies the guard condition. We can verify a simple property, like plus n O = n:
Lemma plus_n_O : forall n, plus n O = n.
Proof.
intro n. induction n.
- reflexivity.
- simpl. rewrite IHn. reflexivity.
Qed.
We used tactics: intro to introduce the variable, induction to start induction, simpl to simplify the term, rewrite to use the induction hypothesis, and reflexivity to handle equality by reduction. This is the basic proof methodology.
2.2 Propositional Inductive Types
Inductive types are not limited to data; they can also define logical connectives. For example, the conjunction ∧ is defined as an inductive proposition:
Inductive and (A B : Prop) : Prop :=
| conj : A -> B -> and A B.
To prove A ∧ B, you must provide a proof of A and a proof of B. To use a hypothesis H : A ∧ B, you can destruct it (pattern-match) into H1 : A and H2 : B. Similarly, disjunction ∨:
Inductive or (A B : Prop) : Prop :=
| or_introl : A -> or A B
| or_intror : B -> or A B.
The induction principle for or corresponds to proof by cases.
This demonstrates the deep unity of data and propositions: they are both defined by constructors. The same machinery that lets you define natural numbers lets you define logical connectives. And because Coq treats proofs as first-class terms, you can compute with them. For instance, you can write a function that takes a proof of A ∨ B and returns either A or B as a Prop—though you cannot extract a computational value from a Prop directly (that would be proof irrelevance).
2.3 Dependent Types: Vectors and Sigma Types
The true power of inductive types emerges when we use dependent type parameters. Consider the type Vector.t A n, a list of length exactly n:
Inductive vector (A : Type) : nat -> Type :=
| vnil : vector A O
| vcons : A -> forall n, vector A n -> vector A (S n).
Note that the index n varies across constructors. The type vector A O is only inhabited by vnil. The type vector A (S n) can only be constructed by vcons with an element and a vector of length n. This means the length is statically known from the type. You can define a safe hd function that only works on non-empty vectors:
Definition hd {A n} (v : vector A (S n)) : A :=
match v with
| vcons a _ => a
end.
Because the type of v constrains it to be of the form vcons ..., the pattern match is exhaustive. No vnil case is needed. This is a form of static verification: we prove correctness (no head on empty vector) at the type level.
Another useful dependent type is the sigma type (dependent sum), often used for subset types:
Inductive sig (A : Type) (P : A -> Prop) : Type :=
| exist : forall x : A, P x -> sig A P.
Commonly written as {x : A | P x}, this is the type of terms x paired with a proof that P x holds. For example, we can define a function that returns a natural number with a proof that it is even:
Definition even_nat : Set := {n : nat | exists k, n = 2*k}.
Definition four_is_even : even_nat.
refine (exist _ 4 _). exists 2. reflexivity. Defined.
refine is a tactic that lets us create the term step by step.
2.4 Recursive Inductive Types and Induction Principles
When you define an inductive type, Coq automatically generates an induction principle that matches the structure of the type. For nat, we saw nat_ind. For a more complex type like list:
Inductive list (A : Type) : Type :=
| nil : list A
| cons : A -> list A -> list A.
The induction principle list_ind is:
list_ind : forall (A : Type) (P : list A -> Prop),
P nil -> (forall (a : A) (l : list A), P l -> P (cons a l)) -> forall l, P l
This is structural induction. But you can also define non-standard induction principles by hand (e.g., strong induction) and use them in proofs.
The algorithmic aspect of Coq’s induction principle generation is based on a syntactic analysis of the constructors and their argument types. For each recursive argument (like list A inside cons), the induction hypothesis is automatically assumed for the subterm. This mirrors the fact that Coq’s type theory (the Calculus of Inductive Constructions) comes with a primitive rule for induction: the elimination rule for inductive types.
2.5 Co-inductive Types for Infinite Data
Coq also supports co-inductive types for infinite structures like streams. This is essential for reasoning about reactive systems and lazy computations. A co-inductive type is defined with CoInductive, and its elements are constructed by guarded co-recursion. For example, the type of infinite streams:
CoInductive Stream (A : Type) : Type :=
| Cons : A -> Stream A -> Stream A.
A function that returns an infinite stream of natural numbers starting from n is:
CoFixpoint from (n : nat) : Stream nat := Cons n (from (S n)).
Proofs about co-inductive types use co-induction (bisimulation). Coq provides a CoFixpoint command and tactics like coFix to work with them. Co-induction is algorithmically more complex than induction, but it is crucial for verifying programs that should run forever (e.g., operating system kernels, server loops) without termination.
3. The Algorithmic Heart: Tactics and Proof Search
Tactics are the commands that drive proof construction. They are themselves algorithms, often implemented in OCaml inside the Coq kernel, that manipulate the proof term. Understanding the main tactics and their underlying algorithms is essential to mastering Coq.
3.1 Basic Tactics: Intro, Apply, Destruct, Induction
intro/intros: Moves the premise of an implication (or the bound variable of aforall) into the context. Algorithmically, this creates a lambda abstraction. For example, given a goalforall n, n = n,intro nchanges the goal ton = nwithn : nat.apply: Uses a hypothesis or known lemma to prove the goal. If you haveH : A -> Band the goal isB,apply Hchanges the goal toA. This is backward chaining: we match the conclusion of the lemma with the goal and generate the premise as a new subgoal. The matching involves higher-order unification (Huët’s pre-unification algorithm) that can instantiate meta-variables (existential variables) when needed.destruct: Performs case analysis on an inductive term. Given a hypothesish : A ∨ B,destruct hyields two subgoals: one wherehis replaced byor_introl a(soAis assumed), and one where it isor_intror b(soBis assumed). This is a form of pattern matching on the term, generating amatchexpression in the proof term.induction: Applies the induction principle of an inductive type. For an induction variablen : nat,induction ncreates two subgoals: the base case (forO) and the inductive case (forS n'), with an induction hypothesisIHn'added to the context. The algorithm is: generatenat_indapplied to the current predicate (the goal withnabstracted), then destruct the quantifier.
3.2 A Real Example: Proving Commutativity of Addition
Let’s work through a fully detailed proof of plus n m = plus m n (commutativity). We’ll define plus as earlier. We’ll need two lemmas:
Lemma plus_n_Sm : forall n m, plus n (S m) = S (plus n m).
Proof.
intros n m. induction n as [| n' IH].
- reflexivity.
- simpl. rewrite IH. reflexivity.
Qed.
Now commutativity:
Theorem plus_comm : forall n m, plus n m = plus m n.
Proof.
intros n m. induction n as [| n' IH].
- simpl. rewrite plus_n_O. reflexivity.
- simpl. rewrite IH. rewrite plus_n_Sm. reflexivity.
Qed.
The algorithm behind this proof is a rewriting strategy based on the induction hypothesis and lemmas. The simpl tactic uses reduction rules (iota-reduction for fixpoints, pattern matching) to simplify the term. rewrite uses the lemma to replace one side of an equality with the other, applying it to a matching subterm. reflexivity checks if the two sides are syntactically equal after reduction.
The overall proof is essentially a term rewriting system where the goal is reduced to a trivially true statement. Coq’s reduction engine (called the kernel) performs normalization using a call-by-name strategy with sharing for efficiency.
3.3 Advanced Tactics: Inversion, Omega, Lia
inversion: Given a hypothesis that involves an inductive type,inversiondeduces all possible ways the term could have been constructed, adding new equalities and hypotheses. For example, if you haveH : S n = S m,inversion Hgives youn = mand removesH. The algorithm uses injectivity and discrimination of constructors:Sis injective, andOcannot equalS. This is implemented by collapsing the match and generating inversion lemmas automatically.omega: A tactic for linear arithmetic over natural numbers (Presburger arithmetic). It uses the Omega algorithm (from the Pugh’s Omega test) to decide quantifier-free formulas. For example,omegacan provex + y <= x + y + 1. It’s a decision procedure that works by translation to a system of difference constraints.lia: Linear integer arithmetic, an evolution ofomegathat handles more cases and is generally faster. It uses the Simplex algorithm internally.ring: A tactic for ring equations (like(a+b)*(c+d) = a*c + a*d + b*c + b*d). It normalizes ring expressions and checks for equality by polynomial reduction. This is a classic example of proof by reflection, which we will discuss next.
3.4 Tacticals: Combining Tactics
Coq provides tacticals (tactic combinators) to build more complex proof strategies:
;(semicolon): Apply the next tactic to all subgoals generated by the previous. E.g.,induction n; simpl; auto.try: Try a tactic; if it fails, leave the goal unchanged.repeat: Repeat a tactic until it fails. Useful for systematic rewriting likerepeat f_equal.firstandsolver:first [tac1 | tac2 | ...]tries each tactic in order until one succeeds.all:: Apply a tactic to all goals (including the main goal if there are no subgoals). Useful for cleaning up after a branching tactic.
These tactical constructs allow you to write proof scripts that are more like programs. The algorithmic interplay between them can sometimes lead to surprising behavior, especially with backtracking (Coq’s tactic language is deterministic by default, but evar and eauto introduce branching).
3.5 Automation: auto, eauto, and Ltac
Coq includes a built-in automation tactic auto (and eauto for existential variables). auto works by using a database of hints (lemmas) and repeatedly applying apply, assumption, and reflexivity up to a certain depth. The algorithm is essentially a depth-first search in a prolog-style resolution, but with type-aware unification. The hint database can be extended with Hint Resolve lemmma or Hint Constructors for constructor lemmas.
For more customizable automation, the Ltac language allows users to write their own tactic definitions. Ltac is itself a typed language with pattern matching on goals and contexts. For example:
Ltac solve_easy :=
intros; auto with arith.
Goal forall n, n + 0 = n.
solve_easy.
Qed.
Ltac patterns can match on the conclusion, hypotheses, and even on the structure of terms. This makes it possible to implement powerful domain-specific automation, like custom rewrite engines or decision procedures for small theories.
3.6 Proof by Reflection: The ring Tactic Example
One of the most elegant algorithmic techniques in Coq is proof by reflection. The idea is to write a function (called a reified function) that analyzes a term and computes a normal form, then prove that the function is correct. Then, to prove a goal, you apply the correctness theorem to the computed normal forms.
The ring tactic does exactly this: it normalizes polynomial expressions over a commutative ring. The implementation defines a data type for polynomial expressions (with constants and variables), a normalization function, and a proof that the normalization preserves equality. The tactic then replaces the goal with normalized(e1) = normalized(e2) and uses simplification to finish.
Reflection is algorithmic because it moves the heavy computation from the proof assistant (which is slow and uses kernel reduction) to a user-defined OCaml function (via extraction or compiled tactics). It reduces proof search to computation, which is deterministic and fast.
4. Extraction to OCaml: From Proofs to Program
One of the most compelling features of Coq is the ability to extract computational content from proofs into a functional language—most commonly OCaml. This bridges the gap between formal verification and practical software engineering. You can write a proven-correct algorithm in Coq, extract it to OCaml, and then compile it with OCaml’s compiler to run on real hardware.
4.1 What Gets Extracted?
Coq’s extraction is based on the Curry-Howard correspondence, but it must handle the fact that not all type theory constructs have a direct counterpart in OCaml. The key design decisions:
Prop is erased: Propositions are logical statements that exist only for proof. They have no computational content (thanks to proof irrelevance). During extraction, all
Proptypes and their terms are removed. This includesand,or,exists, and proofs of arithmetic, etc. Only types inSetorTypethat correspond to data are extracted.Inductive types with Prop indices: For dependent types like
vector A n, thenindex lives innat(Set), not Prop, so it remains. However, extraction simplifies the type by removing the index if it is only used in proofs. For vectors, the length index is preserved because it affects the structure (you cannot havevnilat lengthS n). But some dependent types become phantom types in OCaml.Recursive definitions: Fixpoint definitions (and CoFixpoint) are extracted to recursive functions in OCaml, provided they follow structural recursion (or are guarded). Coq’s termination checker ensures termination, but OCaml does not enforce it; the extracted code may diverge if the original Coq function wasn’t total (but it was, so we trust the extraction).
Type universes: The universe hierarchy is collapsed: Set and Type become just type in OCaml. The propagation of universe constraints (e.g.,
Type(i) < Type(i+1)) is ignored because OCaml’s type system does not need them.Side effects: Coq is pure. Extraction assumes no side effects; if you extract a function that uses logical side effects (like
exnorIO), you must manually provide an OCaml implementation.
4.2 How Extraction Works: A Peek Under the Hood
The extraction algorithm operates on the internal representation of Coq terms (called kernel terms). It traverses the global environment (all definitions) and for each definition:
Identify the computational sort: If the type is
Prop, skip entirely (unless it is a “squashed” type likesigwhich has a computational version!).Translate types: Coq’s dependent types are converted to OCaml’s non-dependent types by erasing indices that do not appear in the constructors (e.g.,
Vector.tbecomes alistwith phantom indices). This is done by analyzing the constructors’ arguments.Translate terms: Each term is mapped to an OCaml term, handling pattern matching (
match) as pattern matching in OCaml, recursive calls aslet rec, etc. Special care is taken for dependent pattern matching: the pattern must be “non-dependent” in the eliminated index. Coq’s extraction eliminates dependent matches by projecting into a simpler inductive type (e.g.,vector A nis extracted to'a list; pattern matching onvnilbecomes matching on[]).Eliminate proofs: Terms that are proofs (type in
Prop) are replaced by()(unit) because they are erased. However, the term may be used to compute a witness in asigtype: the proof part becomes(), and the computational part is preserved.Generate OCaml code: The resulting terms are pretty-printed to a string, respecting OCaml syntax (e.g., identifiers are renamed to avoid collisions, modules are generated, etc.).
4.3 Example: Extracting a Sorting Algorithm
Let’s illustrate extraction with a simple verified sorting algorithm. We’ll define a sort function on lists of natural numbers that returns a sorted list. We’ll prove its correctness (that the output is sorted and a permutation of the input). Then we extract to OCaml.
First, define the sorting algorithm (say, insertion sort):
Fixpoint insert (n : nat) (l : list nat) : list nat :=
match l with
| nil => n :: nil
| h :: t => if n <=? h then n :: h :: t else h :: insert n t
end.
Fixpoint sort (l : list nat) : list nat :=
match l with
| nil => nil
| h :: t => insert h (sort t)
end.
Now define a predicate Sorted and Permutation. We’ll prove:
Lemma sort_sorted : forall l, Sorted (sort l).
Lemma sort_perm : forall l, Permutation l (sort l).
We skip the proofs for brevity. Now extract:
Extraction Language OCaml.
Extraction "sort.ml" sort insert Sorted sort_sorted sort_perm.
The extracted sort.ml file will contain the OCaml functions sort and insert. The proofs sort_sorted and sort_perm are erased because they are of type Prop. Only computational definitions remain. The extracted sort function is:
type nat = O | S of nat
let rec insert n l =
match l with
| [] -> n :: []
| h :: t -> if le_lt_dec n h then n :: h :: t else h :: insert n t
let rec sort l =
match l with
| [] -> []
| h :: t -> insert h (sort t)
Note that le_lt_dec (a boolean test from the Coq library) is also extracted. The extracted code is pure, tail-recursive or not depending on the definition. We can compile and run it.
4.4 Further Extraction: The Program Command and subset Types
Coq also supports a Program command that allows you to write programs with dependent types and holes, and then fill them with proofs. Extraction works seamlessly. For example, a function that returns an even number greater than n:
Program Definition next_even (n : nat) : {m : nat | m > n /\ even m} :=
match n with
| O => 2
| S n' => ...
end.
Next Obligation. ... Qed.
Extraction will yield an OCaml function that returns the value m (a nat) and discards the proof component.
4.5 Limitations and Best Practices
Extraction is powerful but not magic:
Dependent types cause trouble. While Coq can extract
vector A nto'a list, the length constraint is lost. So if you have a function that expects a vector of lengthS nand you call it with an empty list, there will be a runtime error (though Coq’s proof ensures it shouldn’t happen if you only use verified functions). This is a soundness gap: the OCaml program may have bugs if you interface with unverified code. The solution is to seal the extracted module behind an opaque interface that only exposes verified operations.Side effects (I/O, mutable state) are absent in Coq. To write a real application, you must embed Coq’s pure functions into an impure OCaml wrapper. Extraction does not handle
IOmonads; you must manually write the wrapper in OCaml, using the extracted functions as pure computations.Performance: Extracted code can be slower than hand-written OCaml because Coq’s
natis a Peano representation, not machine integers. But Coq can be extended with efficient arithmetic via thenative_integerorInt63library, and you can extract to OCaml’sinttype usingExtraction Constantsdirectives.Proof terms: Extraction ignores all proofs. If you rely on a proof that some condition holds (e.g.,
0 < n), that proof disappears. The extracted function may still type-check in OCaml because the type constraints are lifted, but you lose the guarantee. To regain some safety, you can usepositive(binary numbers) or extractedsigtypes where the proof condition is computationally irrelevant (but still guaranteed by Coq’s extraction via a cast to an opaque type? Actually, the proof is gone, so you must trust the extracted code does not attempt to call functions with invalid arguments. The solution is to keep the subset types as new inductive types in OCaml (e.g.,type positive = Pos of intwith an invariant) and only allow construction through verified functions.
4.6 Real-World Use: CompCert and seL4
The most prominent use of extraction is in the CompCert C compiler. CompCert is a verified C compiler proved correct in Coq. Its front-end and back-end are written and proven in Coq, then extracted to OCaml. The result is a compiler that is arguably the most trustworthy implementation of C available. The extraction process produces about 100,000 lines of OCaml code from Coq sources. This code is then compiled with OCaml and can produce binaries that have been tested against a battery of tests.
Similarly, the seL4 microkernel was verified in the Isabelle/HOL theorem prover, but there is a Coq effort called Cogent that uses extraction to generate C code from Coq specifications.
Amazon Web Services (AWS) uses Coq for verifying its cryptographic protocols and certain AWS Lambda infrastructure pieces. They extract code to OCaml and then integrate with Haskell or Rust libraries.
5. A Case Study: Verifying a Simple Compiler in Coq
To synthesize what we’ve learned, let’s build a small but complete example: a compiler from a simple arithmetic language (expressions with addition and constants) to a stack machine. We will define the source and target languages, the compilation function, and the semantics. Then we will prove that the compiler is correct (the compiled code yields the same result as the source expression when executed). Finally, we will extract the compiler to OCaml.
5.1 Source Language: Arithmetic Expressions
We define an inductive type for expressions:
Inductive expr : Set :=
| ENum : nat -> expr
| EPlus : expr -> expr -> expr.
The semantics is a function eval : expr -> nat:
Fixpoint eval (e : expr) : nat :=
match e with
| ENum n => n
| EPlus e1 e2 => eval e1 + eval e2
end.
5.2 Target Language: Stack Machine
We define a stack as a list of naturals. Instructions:
Inductive instr : Set :=
| IPush : nat -> instr
| IAdd : instr.
A program is a list of instructions. The semantics is a state machine running on a stack:
Fixpoint run (p : list instr) (stack : list nat) : list nat :=
match p with
| [] => stack
| i :: rest =>
match i with
| IPush n => run rest (n :: stack)
| IAdd =>
match stack with
| n2 :: n1 :: rest_stack => run rest ((n1 + n2) :: rest_stack)
| _ => stack (* error case, but we'll avoid it by correctness *)
end
end
end.
The “top” of the stack is the head of the list.
5.3 Compilation Function
A simple compilation: compile an expression into a list of instructions that leaves the result on the stack.
Fixpoint compile (e : expr) : list instr :=
match e with
| ENum n => [IPush n]
| EPlus e1 e2 => compile e1 ++ compile e2 ++ [IAdd]
end.
5.4 Compiler Correctness Theorem
What does it mean for the compiler to be correct? For any expression e and any initial stack s, running the compiled code on s should result in the same stack as pushing eval e onto s. Formally:
Theorem compile_correct : forall e s,
run (compile e) s = (eval e) :: s.
Proof.
induction e as [n | e1 IH1 e2 IH2]; intros s.
- reflexivity.
- simpl. rewrite app_assoc. simpl.
rewrite IH1. rewrite IH2. simpl. reflexivity.
Qed.
Let’s step through the proof:
Base case
ENum n:compile (ENum n) = [IPush n].run [IPush n] s = run [] (n :: s) = n :: s. Goaln :: s = (eval (ENum n)) :: s=>n :: s = n :: s.reflexivity.Inductive step
EPlus e1 e2:compile (EPlus e1 e2) = compile e1 ++ compile e2 ++ [IAdd].runon a concatenated list:run (A ++ B) sis not directlyrun B (run A s)in general becauserunis defined recursively. But we can use an auxiliary lemma or simplify directly. In the proof above, we firstsimplwhich unfoldscompileand leavesrununchanged. Thenrewrite app_assocto get(compile e1 ++ compile e2) ++ [IAdd]. Thesimplthen applies the definition ofrunon the outer++. Actually,simpldoes not reducerunfor a concatenation; it only reducescompile. Better to use a lemma aboutrunon concatenation. But here we did something simpler: we used the induction hypothesesIH1andIH2directly. Let’s see.
Actually, the proof above as written is slightly sloppy. A correct proof requires a lemma:
run (l1 ++ l2) s = run l2 (run l1 s). Then we can use it. Let’s provide a complete rigorous proof:
Lemma run_app : forall l1 l2 s, run (l1 ++ l2) s = run l2 (run l1 s).
Proof.
induction l1; intros l2 s; simpl; auto.
destruct a; simpl; auto.
- (* IPush *). rewrite IHl1. reflexivity.
- (* IAdd *). destruct s as [| n2 [| n1 s']]; simpl; auto; rewrite IHl1; reflexivity.
Qed.
Theorem compile_correct' : forall e s,
run (compile e) s = (eval e) :: s.
Proof.
induction e; intros s.
- reflexivity.
- simpl. rewrite run_app. rewrite IH1. rewrite run_app. rewrite IH2. simpl. reflexivity.
Qed.
Now the theorem is correct. The proof uses structural induction and the lemma run_app which itself is proved by induction on the first list. This is a good example of the standard approach: decompose the problem, prove auxiliary properties, and combine them.
5.5 Extraction to OCaml
We can now extract the compiler:
Extraction Language OCaml.
Extraction "compiler.ml" expr instr compile run eval compile_correct'.
The extracted OCaml will include:
type expr = ENum of int | EPlus of expr * exprtype instr = IPush of int | IAddval compile : expr -> instr listval run : instr list -> int list -> int listval eval : expr -> int
The correctness theorem is erased. However, we can trust that the OCaml function compile is correct because we proved it in Coq. In a real project, you would write a small OCaml wrapper that parses expression strings, calls compile, and outputs assembly (here, just instructions).
5.6 Reflections on the Methodology
This case study demonstrates the algorithmic methodology of Coq:
- Define inductive types for languages.
- Define functions by recursion (semantics, compilation).
- Prove properties by induction using tactics (intro, induction, rewrite, etc.).
- Extract the computational parts to OCaml.
The proof of run_app uses induction on l1 and case analysis on the instructions, which is exactly the algorithm for traversing the list. The proof of compile_correct' uses rewriting with the lemma, which is a form of equational reasoning. The entire process is a careful application of the inference rules of the Calculus of Inductive Constructions.
The elegance of Coq is that the same constructs work for both the programming and the proving aspects. The effort to write the proofs is initially higher than writing the code alone, but the payoff is absolute confidence that the compiler is free of bugs (assuming the specification compile_correct captures the intended behavior). In practice, one also proves that the compiler always terminates (by structural recursion) and that it doesn’t crash (type safety).
6. Advanced Topics and Ongoing Research
6.1 Homotopy Type Theory and Univalence
Coq’s type theory is extensible with additional axioms. One of the most exciting developments is Homotopy Type Theory (HoTT), which adds the univalence axiom and higher inductive types. Coq has been used to formalize parts of HoTT, and there is a proof assistant (Agda) that embraces it more fully. HoTT provides a new perspective on equality—instead of being a proposition, equality can be interpreted as a path in a space. This has profound implications for mathematics and for verifying properties of data structures (e.g., proving that two implementations of a set are equal). Coq can handle HoTT with some careful management of universe levels, but it is an area of active research.
6.2 Automation: Tactician, CoqHammer, and Machine Learning
The future of proof assistants lies in automation. The Tactician project aims to learn proof patterns from existing Coq libraries and suggest tactics. CoqHammer translates Coq goals to first-order logic and uses ATPs (automated theorem provers) to find proofs. There is also work on using deep learning to generate tactic scripts. As of 2025, these tools are becoming more practical, reducing the human burden of proof writing.
6.3 Performance of Extracted Code
Extraction to OCaml produces code that is usually within a factor of 2-5 of hand-written OCaml, but for performance-critical parts, you may want to extract to C or Rust. There is ongoing work on extraction to C (e.g., the Verified Software Toolchain includes a C extraction tool) and extraction to Rust (using the coq2rust tool). Moreover, Coq’s native arithmetic via int63 or primitive integers can be extracted to machine integers, greatly improving performance for numerical code.
6.4 Integration with Real Systems: Coq and OCaml
Because Coq is implemented in OCaml, it is natural to write Coq plugins in OCaml that extend its capabilities. The extraction path also allows Coq to generate OCaml libraries that can be plugged into larger systems without a runtime overhead. For example, Amazon has used Coq to verify the AWS KMS (Key Management Service) and extracted a verified cryptographic implementation to OCaml that runs in production.
7. Conclusion: The Algorithmic Craft of Formal Proof
We began with the tragic failure of the Ariane 5—a cautionary tale of unverified assumptions. We then delved into Coq as a tool to eliminate such assumptions through mathematical proof. Along the way, we uncovered the algorithmic methods that make this possible:
- Inductive types provide a disciplined way to define data and propositions, with automatic induction principles that enable structural reasoning.
- Tactics are small, composable algorithms that build proof terms incrementally, using rewriting, pattern matching, case analysis, and automation.
- Extraction bridges the gap between the world of proofs and the world of efficiently executable code, allowing us to deploy verified programs in OCaml.
The methodology is not magic. It is a rigorous application of the Curry-Howard correspondence, where every proof step corresponds to a term construction in the lambda calculus. The Coq kernel checks these terms, and because the kernel is small and trusted, we can have confidence in the entire system.
Yet, formal verification remains a craft. It requires careful specification, structured proofs, and an understanding of the underlying algorithmic principles. The learning curve is steep—but the rewards are immense. When you extract a program that is guaranteed to be correct, you are not merely hoping for no bugs; you have a mathematical certificate that none exist.
As software continues to embed itself into every aspect of life, the need for such certificates will only grow. The Ariane 5 was a wake-up call. Coq and its algorithms offer a path forward. Whether you are building a compiler, a cryptographic library, or the next medical device, the tools are now mature enough to help you achieve mathematical certainty.
The question is: are we ready to invest in that certainty?
This blog post is part of a series on formal verification. Next time, we will explore the use of Coq for verifying concurrent algorithms with separation logic.