Intro to proof assistants
who transitioned into knowledge science, and I work every day with machine studying algorithms. I’m fascinated each by their obvious magic and by the arithmetic that underlies them. Pry open any machine studying library and also you’ll discover mathematical methods involving matrix decompositions, convolutions, Gaussian curves, and extra. These, in flip, are constructed on much more elementary axioms and guidelines, reminiscent of operate software and logic.
Throughout my journey to be taught these primitives, I collected an entire shelf of arithmetic books, lots of which now collect mud. I additionally enrolled on the Open College, the place I attended lessons over Zoom and realized about syntax and axioms, then find out how to mix them to construct theorems.
The subject was fascinating: axioms are like recreation items and authorized strikes on a board, whereas theorems are authorized performs, some extra attention-grabbing than others. Proving a theorem means unwinding the gameplay simply sufficient to persuade the gamers that it’s reachable, or disproving it by declaring an impossibility. For instance: “The 2 bishops are on white squares, and a bishop can by no means change to a sq. of a unique shade.”
However on a sensible degree, the programs had been tedious. I used to be given PDFs to unravel by hand after which submit scans of my work on-line. Whereas scribbling with pencil on paper, fighting the course assignments I had two insights:
- Constructing a proof is like programming.
- There’s no manner I can interact with arithmetic in these early century situations. I would like an IDE, with kind hints, syntax highlighting, and descriptive error messages, and interact with it interactively.
My ideas weren’t new. The primary thought known as the Curry-Howard correspondence, defined under. The latter is the objective of proof assistants reminiscent of Lean and their IDE extensions (often a VS Code extension).
Interactive proof assistants vs computerized theorem provers
Theorem checkers like Lean’s kernel can decide whether or not expressions are well-formed, if the steps are authorized, and if the ultimate conclusion is legitimate. Interactive proof assistants construct on them and also will make it easier to construct your proofs and counsel steps. Automated theorem provers (ATPs) try to search out the proof on their very own utilizing AI methods. Lean works as a proof checker and assistant, however it pairs properly with generative AI to operate successfully as an ATP.
Terence Tao streams AI-assisted proofs on his YouTube channel utilizing Lean 4 and GitHub Copilot. Many of the headlines within the type of “AI simply solved an x-years-old open downside” had been possible formalized with Lean (example 1, example 2, and 3). AI tasks and benchmarks reminiscent of LeanDojo, miniF2F, ProofNet, PutnamBench, FormalMATH, use Lean. OpenAI and Meta have educated fashions to make use of Lean, and DeepMind’s AlphaProof used it to earn a silver medalist at the International Mathematical Olympiad.
There’s a nice current corpus of guides and tutorials for Lean, in addition to a great neighborhood, but the target market doesn’t appear to be builders skilled in different programming languages. I used to be struggling to parse Lean’s syntax in my head, and as I realized, I seen I used to be strolling on unpaved territory. Hopefully the next sections will information you on this path.
The Curry-Howard correspondence
Let’s begin with a easy logic theorem, one thing you may need to show in an undergraduate logic class:
(P → Q → R) → (P ∧ Q → R)
The operator → denotes logical implication, however within the Curry-Howard correspondence between proofs and laptop packages it may be learn as a operate. ∧ is the logical “and.” In plain English, we’re asking whether or not this assertion is true: “If I’ve a operate that, given P, produces a operate from Q to R, then if I have already got each P and Q already, I can produce R.”
Notice:
(P → Q → R)is a “curried operate,” (Haskell Curry is right here accountable as properly), and the syntax is equal to(P → (Q → R)). In programming phrases, it’s a operate that returns one other operate, which may later be invoked asfoo(bar)(baz). It’s logically equal tofooreceiving eachbarandbazas a tuplefoo((bar, baz)), however that’s what we’re making an attempt to show right here. Likewise, your complete sentence may very well be written as(P → Q → R) → P ∧ Q → R.
Let’s think about this with a concrete instance: If I’ve a merchandising machine that given a coin (P) returns a cup of instantaneous noodles (Q → R), which given sizzling water (Q) returns noodles (R), than implies that I could make a system that given each a coin and sizzling water (P ∧ Q) as inputs, I can return noodles.

In the event you had been informed to show that that is potential as a programmer, you could be tempted to take action by writing a operate that conforms to this situation. In TypeScript we’d write:
Attempt it within the TypeScript playground, modifying something will trigger kind errors.
Written generically and simplified (playground):
We outlined a operate — our theorem — that receives a curried operate (P → Q → R) as a parameter and outputs a operate that receives a product kind P ∧ Q and outputs R, that’s, (P ∧ Q → R). This operate acts as a “witness” that the logic theorem is true. The proof appears sound and it even satisfies the categories that spell the unique theorem in TypeScript:
Let’s state this even stronger: The proof is sound as a result of it happy the sort. Propositions are sorts, and proofs are phrases of these sorts. We additionally noticed that conjunctions ∧ are product sorts / tuples, and that logical implications Q → R are features (q: Q) => R. The arrow is equal each syntactically and semantically.
Let’s implement this in Lean:
You may play with it in Lean’s playground.
instance is a strategy to outline nameless declarations solely to verify the categories (not like the key phrase theorem), and its kind (what follows the colon : as much as the definition :=) is our theorem. The proof is achieved when a definition “inhabits” this kind. As is trendy in purposeful programming, operate software is simply whitespace: essentially the most elementary operation is mapped to essentially the most fundamental operator within the language. P ∧ Q is a product kind / tuple, an its members are accessed through .left and .proper .
Discover that the sort and the implementation are almost an identical — the sort alone is sufficient to decide the code. In truth, given a sort signature, you need to use Loogle to seek for matching definitions in Lean’s library.
Different Curry-Howard Correspondences
| Logic/Lean | Kind kind | TypeScript Approximation |
|---|---|---|
| P ∧ Q | product kind | [P, Q] or { left: P; proper: Q } |
| P ∨ Q | sum kind | P | Q * |
| P → Q | operate kind | (p: P) => Q |
| True | unit kind | undefined, or null |
| False | empty kind | by no means |
| ¬P | operate to empty kind | (p: P) => by no means |
| P ↔ Q | pair of features | [(p: P) => Q, (q: Q) => P] |
| (P ∧ Q) → R | operate from product | (pq: [P, Q]) => R |
| P → Q → R | curried operate | (p: P) => (q: Q) => R |
* A greater sum kind could be a discriminated union: { tag: “left”; worth: P } | { tag: “proper”; worth: Q }
What you haven’t seen but
Dependent Varieties
Let’s have a look at the next, extra complicated, kind:
It means: “For all values (∀) of x which might be pure numbers, 0 is much less or equal to x.” the very first thing we discover is the Unicode mathematical symbols; these can both be auto-completed within the IDE with forall or just substituted with the ASCII key phrase forall.
This strikes us as not trying like a sort. The kind is dependent upon the worth of x — one thing that we often get from complicated validation libraries like Zod, not from static kind checkers like TypeScript. It’s a sort that “relies upon” on values. But Lean’s sorts don’t depend on runtime checks the best way Zod does; it’s nonetheless a sort, albeit a fancy one. It is a supply of confusion when studying Lean’s syntax.

Universes
Within the playground or IDE, we will verify the kind of phrases with #verify:
Now let’s verify the kind of Nat:
Nat is of kind Kind. However why cease there?
Kind is a sort of Kind 1, and so forth. This exhibits that there isn’t a tough syntactic distinction between sorts and phrases in Lean. Varieties are phrases which have their very own sorts, stacked like onion layers — these layers are known as ‘universes.’ 3 is a time period of kind Nat and Nat is a time period of kind Kind, Kind n has kind Kind (n+1), and so forth.
Lean should preserve monitor of universe ranges to keep away from the barber’s paradox: “A barber shaves everybody who doesn’t shave themselves, does the barber shave himself?” With kind universes, the barber is a process whose kind can solely function on components of a sort from a universe under it, it can not function components by itself degree, together with himself. This avoids the self-referential paradoxes that might trigger inconsistency. Inconsistency would imply we may show False to be true, and thru it show all propositions.
Propositions
Let’s strive a fair more durable kind:
In plain English: “Is that this true? For each pure quantity x, y, z, and n, given a proof that n > 2and x * y * z ≠ 0, we will produce a proof that xⁿ + yⁿ ≠ zⁿ”
That is Fermat’s Final Theorem, which dumbfounded mathematicians for 358 years after Fermat claimed his personal proof didn’t match the margins of the web page. The definition makes use of sorry which is a placeholder for an incomplete proof. It should compile however will inform you: declaration makes use of `sorry`. It looks like utilizing TypeScript’s any to silence its kind errors, however a reviewer would possibly discover it and ask you to repair the lint error. Offering this proof is your “objective.”
Let’s stroll again into one thing simpler:
That is ridiculous. How can I outline a time period whose kind is 2 + 2 = 4?
Let’s begin by trying on the kind:
The kind is a Prop, quick for proposition. A proposition is a press release that may be true or false. As we said above, within the Curry-Howard correspondence a proposition is a sort, and a proof is a time period or program of that kind. The proposition is true if the sort has “an inhabitant” and false if it doesn’t. Lean doesn’t “resolve” a proposition right into a Boolean true or false in a computational sense; we simply say it’s “true” or “false.” Utilizing Lean as a proof assistant means working principally with propositions.
Behind the scenes, Lean’s kind checker will normalize this kind for you into 4 = 4, however no additional. We simply want to supply a time period that inhabits this normalized kind. We do that with rfl, an entity constructor routinely imported from the Prelude package deal into your namespace.
This works. It’s declarative syntax for asserting that 2 + 2 = 4 is true by reflexivity. To grasp why this works, let’s verify the sort:
These are generics, which by conference use Greek letters. The curly brackets imply that Lean will infer them for you.
rfl.{u} is the title of the operate and its universe degree, (your layer of the categories onion). {α : Type u} means α is a sort at universe degree u, nothing to do with “sortable,” it’s simply of the “kind” u. {a : α} means a is a time period of kind α, and the kind of rfl is finally the proposition a = a. In our case, α is Nat and a is 4 , rfl has the sort 4 = 4 , matching the sort Lean normalized above. We’ve discovered an inhabitant for 2 + 2 = 4 proving the proposition by reflexivity.

Techniques
Earlier we solved our merchandising machine instance with operate software alone, however there are numerous extra attention-grabbing primitives in Lean. In Lean you can begin your proof with by which prompts “tactic” mode:
This allows you to construct your proof step-by-step, whereas the appropriate pane of your IDE (the InfoView) progressively exhibits your native assumptions and your objective, marked by ⊢. Techniques are a mixture of macros and syntactic sugar that resolve the proof state for the type-checking language kernel. They make it easier to assume analytically and in a declarative manner whereas tackling your objectives. It’s an imperative-style answer inside a purposeful language, operating inside a monad.
You’ll have to familiarize your self with techniques simply as somebody studying SQL does with analytical constructs like GROUP BY, PIVOT, and LEFT JOIN. Eradicating sorry will present the state and objectives within the InfoView:
We’ve one objective to show. Coin and HotWater are propositions. Noodles is an unknown of kind ?u.29, a random ID for a yet-unassigned universe degree. Our objective is to provide a proof of (Coin → HotWater → Noodles) → Coin ∧ HotWater → Noodles.
Let’s use the tactic intro to introduce an assumption into scope. It should routinely seize the enter kind (Coin → HotWater → Noodles) — one of many proofs I already “have” to work with:
The InfoView will refresh and present you the up to date native assumptions and objective:
Along with Coin, HotWater, and Noodles, we now have a neighborhood speculation (or assumption): noodleVendingMachine, a continuing of kind Coin → HotWater → Noodles (a curried operate). Our objective is now to provide Coin ∧ HotWater → Noodles.
Let’s introduce one other variable that can routinely seize the subsequent assumption, of kind Coin ∧ HotWater:
InfoView now exhibits:
Now we solely have to return Noodles utilizing our toolbox of techniques and the assumptions in native scope. We’ve a curried operate noodleVendingMachine and a product kind (tuple) coinAndHotWater. Let’s apply the Coin and the HotWater from coinAndHotWater to noodleVendingMachine:
The tactic precise serves as a return key phrase when now we have the precise kind to return. .left and .proper are the weather of the product kind / tuple coinAndHotWater, which we utilized in an effort to the curried operate noodleVendingMachine. Do not forget that whitespace is operate software.
InfoView exhibits:
This may be written generically as:
Satisfying our unique theorem (P → Q → R) → (P ∧ Q → R) above.
Some techniques assist decompose a objective into smaller obligations, reminiscent of intro, cases, constructor, apply, or refine. Others rewrite expression utilizing assumptions of equivalences, reminiscent of rw or simp. Different techniques normalize expression by computation or algebraic manipulation, reminiscent of rfl, and ring. And a few present focused automation: linearinth, omega, aesop. Seek the advice of the tactic reference.
Considering in techniques is a chic talent that may solely come from expertise. To reuse a earlier simile, it’s like a knowledge analyst pulling up the appropriate analytic features and effortlessly cleansing up, enriching, and summarizing tables — or a senior software program engineer piping collectively the appropriate Bash instructions or recognizing the cleanest design sample for the job.
Overarching Imaginative and prescient of Theorem Provers

Leibniz dreamed of a common formal language of thought (Characteristica universalis) paired with a mechanical calculator (Calculus ratiocinator) that might show all theorems and philosophical arguments by turning a crank. Hilbert requested, “is this possible?” — satisfied till loss of life that “there’s nothing that we will not know.” This was considered “the primary downside of mathematical logic” and “essentially the most common downside of arithmetic.” Gödel although about this and proved that some truths can’t be proven, whereas Church and Turing each independently proved that “often you’ll be turning that crank forever,” arguably beginning the sector of laptop science within the course of. To make use of the chess analogy from earlier, automated theorem proving is like looking the sport tree for successful positions, however arithmetic has an infinite board, so the search might by no means finish — some performs are “undecidable.” Nevertheless, we will nonetheless verify whether or not a transfer is authorized, and that is the place we at the moment are with proof assistants.
In 1956, Bertrand Russell wrote a letter concerning Logic Theorist, that had routinely proved theorems from Principia Mathematica, and which is usually described as the primary AI program:
“I want Whitehead and I had identified of this risk earlier than we wasted ten years doing it by hand. I’m fairly prepared to imagine that all the things in deductive logic will be finished by equipment.”
Conclusion
Lean has a steep studying curve, partly as a result of its syntax and its use of techniques, which it’s essential to be taught. Fighting Lean comes, nonetheless, with a reward: you’re studying mathematics. Varieties, universes, techniques, theorems, and proofs are usually not quirks of one more programming language however ideas that mathematicians and logicians struggled with for generations. They aren’t ineffective information to memorize however thought patterns as historical as Euclid and as everlasting because the theorems we proved.
When you’re acquainted with the basics, strive shifting on to extra superior matters with Mathematics in Lean, the place you’ll deal with issues in discrete arithmetic, linear algebra, differential calculus, and topology. Or delve deep into the foundations ranging from zero (actually) by way of Tao’s Lean companion to “Analysis I” Book. Mathlib, Lean’s collaborative library, already has 1.6 million strains of formalized arithmetic that you need to use to bridge to all branches of arithmetic.

