Exploring the Lean4 Language

As a tech enthusiast, I’ve always been interested in new concepts that new programming languages can provide us. My curiosity has often taken me into uncharted territories, which remain unexplored for most developers. Lean4 is one of these mostly unexplored technologies that brings new features and techniques to the development of specific applications. In this post, I’ll take you on my journey of discovering Lean4, starting from how I began to learn it and all the challenges and cool features that pure functional programming teaches us.

What is Lean4?

Lean4 is a relatively new, purely functional programming language for mathematicians to prove theorems. However, Lean4’s capacity extends far beyond the realm of mathematics. It’s a versatile and powerful language that features Dependent Types, Macros, FBIP, and many robust features for many applications.

My background

I started my journey on Lean4 with an extensive technical background in several programming languages, including Haskell and Idris. My prior experience served as a strong foundation, enabling me to transition between languages quickly. However, It’s essential to recognize that my journey in programming hasn’t always been so smooth.

The first dependently typed programming language I tried to learn was Idris, which is based on Haskell, and it was quite a challenging experience. The initial source of frustration was its syntax, which diverged significantly from any other programming language that I had previously encountered, i.e.

-- WARN: You don't need to understand it yet.

IntOrString : Bool -> Type
IntOrString True  = Int
IntOrString False = String

At first glance, the code resembles a mess, and I was unsure what this horizontal-oriented block of code was trying to say. Determined to discover more about this language, I asked people on a public discord server for functional programming about it. Soon, I made a friend called Heitor, who knew nothing about Idris, but he was working with Haskell on a company called Serokell. He guided me on my journey through this incredible language called Haskell using a book affectionately referred to as "The Purple Book of Haskell".

After hours of learning Haskell, I discovered the concept of an equational-style pattern match that resembles the Elixir
style of doing the same, i.e.

InvertPhrase (x: String) : String
InvertPhrase "yeah" = "noo"
InvertPhrase "noo"  = "yeah"
InvertPhrase _      = "?"

That corresponds to something like this in JavaScript:

let invertPhrase = (x) => {
    switch(x) {
        case "yeah": return "noo"
        case "noo": return "yeah"
        default: return "?"
    }
}

As I started to learn more about this pattern, that piece of code in Idris in the example earlier began to make perfect sense. As we saw, it essentially states that if the first argument, of the type Bool, evaluates to True, then it returns the type Int. Yes, in Idris, types can be manipulated as First Class Citizens. Conversely, if the argument is evaluated to False, it returns the type String. This feature quickly became one of my favorites, primarily due to the incredible possibilities it unlocked.

Lean4 has a couple of similarities to TypeScript since TypeScript is a language with a pretty powerful type system that allows us to make computations, parsers, and many other things at the type level of the application. However, unlike Lean4, TypeScript has a lot of unsoundness problems that would lead to proofs of false statements, making it not suitable for theorem proving. Despite that, we can say TypeScript has some kind of Dependent Types, which are building blocks of theorem proving in some languages like Agda, Coq, and Lean4, i.e.

type IntOrString<T extends boolean> = T extends true ? number : string;

const int: IntOrString<true> = 1;
const string: IntOrString<false> = "hello";

We can also make the Generic type depend on the value of a parameter, like the following:

type IntOrString<T extends boolean> = T extends true ? number : string;

function doSomeStuff<E extends boolean>(value: E): IntOrString<E> {
  return value ? 1 : "hello";
}

let int: number = doSomeStuff(true); // returns 1, that has type number
let string: string = doSomeStuff(false); // returns "hello" that has type string

This example shows how we can express the same code shown earlier but in a TypeScript. There are a lot of similarities between this and the Idris example, but one notable problem is that IntOrString here is a Type that receives T, and in Lean4 it’s simply a function that returns a type as first-class value.

One outstanding example is the printf function, which is challenging to give a type in mainstream languages like C without forcing the compiler to have hardcoded stuff to make it work. In Idris, however, this can achieved by creating a function that, given a String, generates a Type, i.e.,

printf "%d plus %d" 1 2        -- Compiles!
printf "%s and %f" "hello" 1.0 -- Compiles!
printf "%d" "incorrect"        -- Throws a type error, indicating that Int differs from String.

Unlocking the power of Monads

Later on, I was introduced to the concept of Monads. At that time, I tried to understand the cryptic phrase "A Monad is just a monoid in the category of endofunctors" which I still don’t understand till this day. However, after much practice with many types of Monad, I got the intuition to do it (at least in a programming context).

This revelation marked one of the most astonishing concepts I have ever encountered, opening doors to many possibilities. It empowers me to craft Domain Specific Languages with unconventional control flows, such as a Debug Monad capable of performing time travel, await/async, unix-pipe-like operations, and even some kind of dependency injection — all of this under a convenient piece of syntax that is the "do notation", i.e.

-- The Maybe Monad
option = do
    a <- doA
    b <- doB
    c <- doC
    pure (a + b + c)

In this example, we explore the Maybe Monad in Haskell. The Maybe Monad is a powerful construct that enriches our ability to represent the presence or lack of a value (just like null). For instance, a Maybe Int can be either Just or Nothing, however, we can use it in odd ways via the do notation.

In the code snippet, we showcase an odd case for the Maybe, if doA, doB, and doC all have the type Maybe Int, our monadic operations become particularly interesting. When we use the <- operator to extract values from within Just, if any of them happen to be Nothing, the entire expression gracefully collapses into Nothing.

For instance, with doA = Just 1, doB = Just 2, and doC = Just 3, we get a = 1, b = 2, c = 3, and the result is the sum of these values. However, if even one of the three functions returns Nothing, the computation stops immediately, and the entire expression becomes Nothing. This behavior offers a good and precise way to control things that are required to run some other computation.

One practical example of the Maybe Monad is managing required environment variables, as illustrated in the following snippet:

getHostAndPort = do
    host <- getEnvironment "HOST" -- This function returns `Maybe String` but with `<-` we can get the string from the insides
    port <- getEnvironment "PORT"
    pure (Config host port)

In this snippet, we cannot start the application without the host or the port so if one of them Nothing in the environment then the entire getHostAndPort function will return Nothing, otherwise it will return something like Maybe (Config "0.0.0.0" "8080") with the entire configuration. We can use this Nothing value to render some warning for the user or to default to some predefined state.

stdinPipe = do
    putStr "> "          -- Prints the promptLine "> "
    res <- getLine       -- gets a line from Stdin
    yield res            -- uses the `yield` function that pauses this pipe and send the "res" variable to another pipe
    stdinPipe            -- makes an infinite loop by recursion

stdoutPipe = do
    res <- await        -- Await for a message from a upper pipe
    putStr res          -- Prints the res variable
    stdoutPipe          -- makes an infinite loop by recursion

-- This is a echo program made by the composition of the stdinPipe and stdoutPipe.
-- Everytime the stdinPipe yields, the control is passed to the stdoutPipe that
-- gets the message through the `await` function. All of these constructions are functions, not special keywords.

echo = stdinPipe |. stdoutPipe

This code snippet illustrates the usage of the PipeT monad transformer that allows something like the unix-pipe abstraction through the function |..

This journey through Haskell took me almost six months, during which I also learned a lot about Idris. I embarked on various projects, including the creation of a language called Yurei that implements the Algorithm-W, a virtual DOM called Maka, and contributions to the primary package manager and a GL library.

Transitioning from Idris to Lean4

After some disappointments with the Idris community and the state of the package manager that days after being chosen as the principal package manager got paused because the author was focusing on a PhD, I started to move to other languages, and Lean4 was one of them.

Lean4 offered a lot of advantages not found in other theorem prover languages, including a focus on the programmer’s perspective (as opposed to solely caring about mathematicians), a revolutionary garbage collector that enables functional programs to perform like imperative languages and purely uses mutable constructions, and tactics that are not present in languages like Idris, Agda, and Kind.

The transition between Idris and Lean4 revealed significant differences between these two. The primary distinction was that Lean4 introduced the concept of "Functional But in Place" offering a new approach to functional programming. In contrast, Idris features Quantitative Types, which serve a similar purpose of tracking where a type was used or not exactly once.

Consider the following code, which shows the creation of an Array. In Idris, it should be used exactly once, so this code would give us an error saying that array is used twice.

The purpose of this is to track when the array is shared, if it’s used exactly once in compile-time then it’s safe to mutate it, and you will have the same behavior every time. In contrast, Lean will copy one of the arrays (with a push operation complexity of O(n)), while the other push operation will potentially run in constant time. These represent two different ways for using mutability while maintaining a purely functional approach.

createArray = do
  array <- newArray
  arr1 <- push array 1
  arr2 <- push array 2
  pure ()

While Idris gets inspiration from Haskell, Lean4 exhibits a syntax like traditional ML languages like OCaml. The difference is significant for me, as Lean4’s syntax contains a lot of sugars for the usual things that we do in functional programming languages. Below, we have an example of code in Lean4:

def a : Int := 2

def not : (x: Bool) -> Bool
  | true = false
  | false = true

In Idris, the equivalent code would be:

a : Int
a = 2

not : Bool -> Bool
not true = false
not false = true

In Lean4, there’s no need to repeat variable names, you can simply use a | instead.

It’s worth noticing that even with Idris has an LSP, it’s not as good as the Lean4 extension for Visual Studio Code. The Lean4 extension offers features such as inline computations, widgets for visual proofs, improved error reporting, and numerous other capabilities that a simple LSP does not offer. Despite being unstable, the Lean4 extension is one of the most impressive pieces of tooling that I have already seen.

Intrigued by Lean4’s potential, I embarked on a new journey, discovering new ways to use these incredible features in a fun way. In the end, what truly accelerated my learning was hands-on projects. I tried to do some projects involving parser generators that taught me considerably about Lean4’s meta-programming. I also read books on finite automata machines and tried to write the proofs in Lean4, which taught me a lot about tactics.

theorem Nfa.Nfa_ε.NotAcceptsEntries: ∀ {σ: Type} {x : σ} {xs: List σ}, Nfa.Accepts (x :: xs) Nfa.Nfa_ε → False
  | sig, x, xs, p => absurd p.proof $ by
    let f : fun y => y = (@Nfa.state sig 2 Nfa.Nfa_ε) = (@Nfa.Nfa_ε.initial sig) := rfl
    simp [Set.in,  Nfa.Nfa_ε.LeadToFinal]
    intro x2
    apply Exists.elim
    . assumption
    . intro a
      rw [f, Nfa.Nfa_ε.LeadToFinal]
      simp [End, Nfa_ε]
      intro and
      let f := Eq.trans (Eq.symm and.left) and.right
      contradiction

Here is a proof about a non-deterministic finite automata not accepting entries when it’s empty. It’s easier to read for mathematicians than the usual proofs of Idris and Agda because of the style. It explicits the introduction of new variables, contradictions, rewrite rules, simplification, and assumption.

Beyond formal things, I learned how to use it in practical things like a HTTP Response that only lets us return a set of status codes like in:

-- This example compiles!
def someRoute : Response [Status.ok, Status.not_found] :=
    Response.ok

-- This example does not compiles because bad gateway is not in the list specified in the response type
def routeThatDoesNotCompiles : Response [Status.ok, Status.not_found] :=
    Response.bad_gateway

Using Lean4 in "practical" scenarios like this gave me a sense of accomplishment and demonstrated its utility beyond theoretical exercises. In the event called "Rinha de Backend" I used Lean4 to make a HTTP Server and a Postgres client, and I needed to deal a lot with native bindings and how things worked under the hood (by reading the compiler’s code). Reaching the current days, I’m thinking a lot about maintaining some HTTP libraries for Lean4 as it reaches a stable release.

The challenges of learning Lean4

Lean4 stands out as my preferred theorem prover, offering a unique blend between common language features and innovative ones like dependent types. However, currently, it comes with a set of challenges. One notable one is the instability of the LSP, which can lead to crashes. Additionally, Lean4’s language and standard library are unstable and change a lot to make it suitable for the compiler.

Lean4 is a great choice if you’re trying to learn innovative concepts, but it comes with a set of challenges. If you’re willing to learn Lean4, I recommend this book.

References

We want to work with you. Check out our "What We Do" section!