Efficient, Extensible, Expressive: Typed Tagless Final Interpreters in Rust

An explanation of typed tagless final interpreters by Carette et al., with examples in Rust. The main contribution of this post is to explain what "typed tagless final" means, and to show that Rust with generic associated types (GATs) is a good fit for writing interpreters in the final style.

Programming involves using and creating APIs: for iterators (map, filter, collect, ...), plotting, data frames, testing, validation or simply formatting strings. When we're creating a new API, we typically think in terms of defining functions to allow users to work with our library. Our job as library implementers is to implement the functions and define the abstractions.

A different perspective is to think of these APIs as little languages, or domain-specific languages. As opposed to general-purpose languages, domain-specific languages or DSLs focus on solving one particular problem well.

From the DSL perspective, we're thinking in terms of defining language constructs instead of functions. And our job as language implementers is to write an interpreter for the language. More specifically, we can think of it as embedding a DSL, say for formatting strings, into a general-purpose host language like Rust. With this perspective, the first argument to format!("Hello, {}!", "world") is a small language specifying how to format values of various types: it even has its own syntax.

In this post we'll focus on the embedded language perspective - just keep in the back of your head that these ideas apply more broadly than to interpreters, strictly speaking. As an illustration, in the last part we describe a safe, type-checked implementation of string formatting: just like with Rust's format strings, the arguments to the format specification are type-checked, but we don't use macros or compiler extensions. Additionally, the same format specification can be used for scanf, i.e. parsing.

Full Rust code available in the companion repo.

Before we begin

This post contains dense Rust code, and not much in terms of visualizations. I assume proficiency with Rust, and familiarity with enums, generics, traits, and associated types.

The build-up and code examples closely follow Oleg Kiselyov's lecture notes on typed tagless final interpreters. All I did was translate the examples to Rust. As it turns out, since Rust gained generic associated types (GATs), it's a pretty good vehicle for embedding languages in the final style.

This post has three parts.

  1. Warming up: An introduction to writing interpreters for a simple language with addition, negation, and integers. We build up to the final style, address some of its apparent shortcomings, and show its extensibility advantages.

  2. Higher-order fun: Writing an interpreter for a higher-order language, with functions, increases the complexity but also shows an important advantage of the final style.

  3. Applying the final style: A translation of Kiselyov's type-safe formatting to Rust.

Part 1: Warming up

Consider first the typical way of defining a simple expression language as an embedded DSL. For reasons that will become clear later, the language we'll look at is just slightly more expressive than Hutton's razor: we'll write a calculator that can only add or negate integers. We'll introduce the language using the familiar style which Kiselyov calls "initial". Then, we'll build up to the final style, and discuss its trade-offs.

Expressions, initial style

We start with defining the abstract syntax tree for our language using a Rust enum:

enum Expr {
    Lit(i32),                   // integer literal
    Neg(Box<Expr>),             // negation
    Add(Box<Expr>, Box<Expr>),  // addition
}

With some helpful constructors:

impl Expr {
    fn lit(i: i32) -> Expr {
        Expr::Lit(i)
    }

    fn neg(r: Expr) -> Expr {
        Expr::Neg(Box::new(r))
    }

    fn add(r1: Expr, r2: Expr) -> Expr {
        Expr::Add(Box::new(r1), Box::new(r2))
    }
}

We can now define simple expressions in our language.

Expr::add(
    Expr::lit(8),
    Expr::neg(Expr::add(Expr::lit(1), Expr::lit(2))),
)

We can't do anything with those expressions yet. Let's try evaluating them by writing an interpreter.

impl Expr {
    fn eval(&self) -> i32 {
        match self {
            Expr::Lit(i) => *i,
            Expr::Neg(r) => -r.eval(),
            Expr::Add(r1, r2) => r1.eval() + r2.eval(),
        }
    }
}

Easy enough - we pattern match on Expr and recursively call eval. Calling eval on the example yields 5, as expected.

It's also straightforward to interpret the same expression differently. For example, we can view it:

impl Expr {
    fn view(&self) -> String {
        match self {
            Expr::Lit(i) => i.to_string(),
            Expr::Neg(r) => format!("(-{})", r.view()),
            Expr::Add(r1, r2) => format!("({} + {})", r1.view(), r2.view()),
        }
    }
}

Again, pattern matching is very useful here. Calling view on the example yields "(8 + (-(1 + 2)))".

Kiselyov calls this style of writing EDSLs the "initial" style. Loosely speaking, the language is defined as an abstract syntax tree using an enum, and interpreted by pattern matching on expressions.

Expressions, final style

Suppose we just want to evaluate expressions. We could do that more simply by working with a representation of integers in the host language Rust directly.

type Repr = i32;

fn lit(i: i32) -> Repr {
    i
}

fn neg(r: Repr) -> Repr {
    -r
}

fn add(r1: Repr, r2: Repr) -> Repr {
    r1 + r2
}

Compared with the initial style, we now have a function corresponding to each enum variant. Writing expressions in this style looks very similar to writing expressions in the initial style:

add(lit(8), neg(add(lit(1), lit(2))))

Calling this function immediately evaluates the expression: the expression is its own interpreter. There is no intermediate representation for us to pattern match on. The interpreter uses the host language's operations (addition, negation) and data types (integers) directly.

However, it is not flexible enough. We can't pretty-print expressions, or manipulate them in any way except evaluate them. What we'd like to do is overload these functions so we can define new implementations, and choose a different representation. That's exactly the kind of thing traits are for.

trait ExprSym {
    type Repr;

    fn lit(i: i32) -> Self::Repr;
    fn neg(r: Self::Repr) -> Self::Repr;
    fn add(r1: Self::Repr, r2: Self::Repr) -> Self::Repr;
}

All we've done is moved the top-level functions to a trait and added an associated type for the representation.

We can now re-implement the evaluator, as an implementation of this trait:

struct Eval;
impl ExprSym for Eval {
    type Repr = i32;

    fn lit(i: i32) -> Self::Repr {
        i
    }

    fn neg(r: Self::Repr) -> Self::Repr {
        -r
    }

    fn add(r1: Self::Repr, r2: Self::Repr) -> Self::Repr {
        r1 + r2
    }
}

Building an expression now needs access to a type parameter E: ExprSym - the counterpart for Expr in the initial style - so I'll show the full function. Other than that, it looks remarkably similar to the initial approach:

fn tf1<E: ExprSym>() -> E::Repr {
    E::add(E::lit(8), E::neg(E::add(E::lit(1), E::lit(2))))
}

And we can implement a viewer now:

struct View;
impl ExprSym for View {
    type Repr = String;

    fn lit(i: i32) -> Self::Repr {
        i.to_string()
    }

    fn neg(r: Self::Repr) -> Self::Repr {
        format!("(-{})", r)
    }

    fn add(r1: Self::Repr, r2: Self::Repr) -> Self::Repr {
        format!("({} + {})", r1, r2)
    }
}

Again, compared to the initial style, there is no intermediate representation. Integers in the embedded language are Rust integers. Addition is Rust addition. The final style certainly looks like a zero-cost abstraction, while the initial style may not be - or at least it'd need many more compiler optimizations to get rid of the enum variants and pattern matching.

A Rust-specific snag

The eagle-eyed among you may have noticed I haven't defined a final counterpart for eval and view yet. With the code above, you don't need it - evaluate using tf1::<Eval>() and view using tf1::<View>(). The type argument selects the interpreter.

Still, it'd be nice to be able to write eval(tf1()) or view(tf1()) instead. Let's try:

fn exprsym_eval(e: i32) -> i32 {
    e
}

fn exprsym_view(e: String) -> String {
    e
}

These trivial implementations reveal the lack of intermediate representation in the final style. The idea is to tell the compiler which interpreter (Eval or View) we want by constraining the type of representation. Once we do that, the act of calling tf1 with the right interpreter has already done all the work, and we can just return the result. Alas:

exprsym_eval(tf1());

> error[E0282]: type annotations needed
> cannot infer type of the type parameter `E` declared on the function `tf1`

Rust can't figure out E from an E::Repr - in the type signature fn tf1<E: ExprSym>() -> E::Repr, the function exprsym_eval constrains E::Repr to be i32, but the Rust compiler is not smart enough to see that the only possible E for E::Repr == i32 is Eval.

It would be cool if the compiler could do this. In the general case, though, there isn't always a one-to-one mapping from an associated type E::Repr to an E: ExprSym: there may be multiple implementations of ExprSym with the same Repr, in which case the compiler would have to ask for a type annotation to disambiguate.

We can give the compiler a hand by making the reverse mapping from Repr to ExprSym ourselves. Unfortunately, that means some boilerplate:

// map from E::Repr to E
trait HasExprSym {
    type ES: ExprSym;
}

// i32 -> Eval
impl HasExprSym for i32 {
    type ES = Eval;
}

// String -> View
impl HasExprSym for String {
    type ES = View;
}

// Introduce new type argument T = E::Repr, and wire everything up.
fn tf1<E: ExprSym<Repr = T>, T: HasExprSym<ES = E>>() -> T {
    E::add(E::lit(8), E::neg(E::add(E::lit(1), E::lit(2))))
}

With that, exprsym_eval(tf1()) compiles and runs without trouble.

Finally solving the expression problem

The expression problem is an (in)famous extensibility problem in statically typed programming languages. There are a few subtleties, but the long and the short of it is that it's hard to come up with an abstraction that is easily extensible with both behaviors and representations. To illustrate, let's consider the initial style once again.

Adding new behaviors in the initial style corresponds to adding new interpreters of our language. Easy! We can for example write a new function Expr::count that counts the number of sub-expressions, similar to Expr::view and Expr::eval. However, suppose we want to support multiplication in the EDSL. Now we have to rewrite our enum to add a new variant, and modify all existing interpreters to deal with it. If we can't access the source code for the enum or the interpreters, we're stuck. Object-oriented approaches have the complementary problem: they are easy to extend with new variants, like multiplication, but adding new behaviors requires modifying all classes.

In the final style, this tension does not exist. We've already shown new interpreters are easy to add, by implementing the ExprSym trait. Adding multiplication can be done without changing existing code, and in a type-safe way.

First, define a sub-trait of ExprSym which adds the new operations, and implement it for all interpreters that need to support the new operation:

trait MulExprSym: ExprSym {
    fn mul(r1: Self::Repr, r2: Self::Repr) -> Self::Repr;
}

impl MulExprSym for Eval {
    fn mul(r1: Self::Repr, r2: Self::Repr) -> Self::Repr {
        r1 * r2
    }
}

That's it. We can now use the new operation and interpreter:

fn tfm1<E: MulExprSym<Repr = T>, T: HasExprSym<ES = E>>() -> T {
    E::add(E::lit(7), E::neg(E::mul(E::lit(1), E::lit(2))))
}

let final_style = exprsym_eval(tfm1());
assert_eq!(5, final_style);

We haven't implemented View for MulExprSym yet. Crucially, the compiler will tell us as much:

exprsym_view(tfm1())

> error[E0277]: the trait bound `View: MulExprSym` is not satisfied

This is the equivalent of pattern-match exhaustiveness checking, and it comes with the final style for free, i.e. without any special-purpose compiler check or linter.

We never have to modify the original trait definition or implementations. They could live in another crate, we can still extend them without modifying the source. This is one of the main advantages of the final style over the initial style. Soon, we'll move on to higher-order languages, where we'll see another major advantage of the final style.

But what about pattern matching

Before we do that though, it's worth exploring whether the final style can express all of the things that the initial style can. We're especially worried about the loss of pattern matching in the final style.

For example, let's consider the case where we want to push down negation to literals, getting rid of double negation. You can think of this as an example of an optimization or rewriting pass.

impl Expr {
    fn push_neg(self) -> Expr {
        match &self {
            Expr::Lit(_) => self,
            Expr::Neg(content) => match content.as_ref() {
                Expr::Lit(_) => self,
                Expr::Neg(c) => c.to_owned().push_neg(),
                Expr::Add(r1, r2) => Expr::add(
                    Expr::Neg(r1.clone()).push_neg(),
                    Expr::Neg(r2.clone()).push_neg(),
                ),
            },
            Expr::Add(r1, r2) => Expr::add(r1.to_owned().push_neg(), r2.to_owned().push_neg()),
        }
    }
}

push_neg transforms an expression like (8 + (-(1 + 2))) into (8 + ((-1) + (-2))). The nested pattern match reveals a context dependence: we can no longer use the simple compositional structure of eval, by calling eval on sub-expressions and putting the results back together.

A compounding problem is that the result of push_neg is a new expression, which we can then interpret further in many ways, like view or eval. Can we write push_neg in the final style, without pattern matching, and produce a new expression that can then be further interpreted?

The only thing we can do in the final style is writing an interpreter by implementing the ExprSym trait, so that's what we'll do.

The key to this seemingly unsolvable problem is to make context dependence explicit. In this case, the interpreter must keep track of whether an expression occurs as part of another negation. I'll show the code first, then discuss it.

enum Ctx {
    Pos,
    Neg,
}

struct CtxFun<TRepr>(Box<dyn Fn(&Ctx) -> TRepr>);

// Helper constructor used by PushNeg below: box up the closure.
impl<TRepr> CtxFun<TRepr> {
    fn new(f: impl Fn(&Ctx) -> TRepr + 'static) -> Self {
        CtxFun(Box::new(f))
    }
}

struct PushNeg<T>(PhantomData<T>);
impl<T: ExprSym> ExprSym for PushNeg<T>
where
    for<'a> T: 'a,
{
    type Repr = CtxFun<T::Repr>;

    fn lit(i: i32) -> Self::Repr {
        CtxFun::new(move |ctx| match ctx {
            Ctx::Pos => T::lit(i),
            Ctx::Neg => T::neg(T::lit(i)),
        })
    }

    fn neg(r: Self::Repr) -> Self::Repr {
        CtxFun::new(move |ctx| match ctx {
            Ctx::Pos => r.0(&Ctx::Neg),
            Ctx::Neg => r.0(&Ctx::Pos),
        })
    }

    fn add(r1: Self::Repr, r2: Self::Repr) -> Self::Repr {
        CtxFun::new(move |ctx| T::add(r1.0(ctx), r2.0(ctx)))
    }
}

fn exprsym_push_neg<S: ExprSym>(e: CtxFun<S::Repr>) -> S::Repr {
    e.0(&Ctx::Pos)
}

We can verify the result:

exprsym_view(tf1())
> (8 + (-(1 + 2)))

exprsym_view(exprsym_push_neg(tf1()))

> (8 + ((-1) + (-2)))

One way to think about it is that PushNeg interprets an expression into a Rust function. The Rust function takes a context and produces a new expression, which can then be further interpreted. This is visible in the declaration type Repr = CtxFun<T::Repr>, as well as in the implementation of lit. To get the new expression out in the final style, exprsym_push_neg calls the function with Ctx::Pos.

As an intermediate step, it might help to write a version of the initial version that similarly takes a Ctx argument - the structure is then similar to the final style, with pattern matching on enum variants taking the place of implementing lit, neg and add trait methods.

The implementation type PushNeg<T> includes a type parameter. Eval and View have no such parameter. This type parameter is necessarily an ExprSym and is needed to produce a new final style expression that we can re-interpret. We gain another complementary insight by expanding a few functions:

// note: this is a tf1 version that does not use HasExprSym.
// ctx_fun.0: Fn(&Ctx) -> String
let ctx_fun: CtxFun<String> = tf1::<PushNeg<View>>();
let view: String = ctx_fun.0(&Ctx::Pos);

The type arguments to tf1 can be understood as specifying a stack of interpreters: first, push negations down, then view the result. What tf1 returns is a function of those type arguments - in this case it's a CtxFun because of PushNeg, which returns a String, because of View. At run time, just like in the initial style, there are two passes: first tf1 is executed, producing a CtxFun. Second, the CtxFun is executed, which produces a String via the View interpreter.

Arguably, the context dependence is much more visible in the final style - Ctx shows exactly the bit of context we need, no more, no less. It's also clearly total: it doesn't fail and it terminates - in fact, there are no recursive calls at all.

Side note: it is once again annoying to help Rust infer the ExprSym from the representation. The trick with HasExprSym I described earlier results in much more boilerplate still, because Rust does not allow "unconstrained type parameter T" in trait implementations. The full code is in the repo but is otherwise not very illuminating.

Time for a recap

This section introduced the mind-boggling but wonderful final style. It will become truly mind-boggling and more wonderful in the next sections.

For now, let's summarize the salient points.

  • The final style is more direct than the initial style in using the host language, Rust. There is no intermediate representation.

  • Final style has significant extensibility advantages over the initial style. It effectively solves the expression problem.

  • The loss of pattern matching is not the restriction it may first appear. We can make the necessary context explicit, and write a correct, total interpreter.

Homework: it is straightforward to implement a final style interpreter that produces an initial style Expr, and vice versa. That shows that they are essentially equivalent. If this is not immediately clear to you, I recommend trying it - you will learn much more than by reading how it's done.

Part 2: Higher-order fun

All in all, the final style doesn't look so different from approaches like overloading numerical operators (e.g. addition and multiplication) via the Add and Mul traits. One syntactic difference - that I was using an associated type instead of implementing a trait on the representation type directly - seems trivial. However, for higher-order languages, which we'll discuss now, we need associated types.

What makes a language higher-order is that it treats functions as values. This makes it possible to pass functions as arguments to other functions and other fun stuff. Higher-order operations are quite common in embedded languages - for example, any data frame library worth its salt will have functions to map a given column, filter values using a predicate, and so on. Those are all higher-order operations.

Revisiting the initial style

Once again, let's start by adding functions to our little language in the initial style. I've removed addition and negation because they don't add anything new - as a result, we have a very minimalistic language that just has literals, functions, and function applications.

enum Expr {
    Var(VarId),    // variable - to refer to function arguments in function body
    Int(i32),      // literal integer
    Lam(Rc<Expr>), // function with given body
    App(Rc<Expr>, Rc<Expr>), // apply the given function to the given argument
}

Some people may recognize this as the lambda calculus. With a few helper constructor functions, which I've omitted, here's a simple expression in our language.

// (\x -> x) 1
Expr::app(
    Expr::lam(Expr::var(0)),
    Expr::int(1)
)

This applies the identity function to argument 1, so we'd expect it to evaluate to 1.

(I'm not sure if the reference counting Rcs are needed, but after struggling with lifetimes, ownership, and the borrow checker in the evaluator I caved. Since this is irrelevant to the topic of this post, I won't come back to it. I'd be interested if you can get rid of them, though.)

Now, if you think about writing an evaluator for this language, there are a few complications that we didn't have with the simple language from the previous section. First of all, it's possible to write nonsensical expressions:

// 1 2
Expr::app(
    Expr::int(1),
    Expr::int(2)
)

You can only apply functions, it's an error to apply an integer to anything. Even assuming the user only writes correct expressions, what is the result type of eval(Expr) -> ...? Since we've introduced a new type of value - a function - users can write:

// \x -> x
Expr::lam(Expr::var(0))

We have no choice but to introduce a new enum for eval's return type. Evaluation can produce an integer or a function:

enum Val {
    Int(i32),
    Fun(Rc<dyn Fn(Val) -> Val>),
}

type Env = Vec<Val>;

impl Expr {
    fn eval(env: Env, expr: Expr) -> Val {
        todo!()
    }
}

Just like evaluating an expression to an integer means evaluating to a Rust i32, so does evaluating to a function mean evaluating to a Rust Fn. The evaluator also needs an environment, to look up the values of any variables captured in closures. I'll just dump the code here without explanation - the details are not all that relevant:

fn eval(env: Env, expr: Expr) -> Val {
    match expr {
        Expr::Var(id) => env[id].clone(),
        Expr::Int(i) => Val::Int(i),
        Expr::Lam(e) => Val::Fun(Rc::new(move |x| {
            let mut envr = env.clone();
            envr.push(x);
            Expr::eval(envr, e.as_ref().clone())
        })),

        Expr::App(e1, e2) => {
            let eval_e1 = Expr::eval(env.clone(), e1.as_ref().clone());
            let eval_e2 = Expr::eval(env, e2.as_ref().clone());
            match eval_e1 {
                Val::Fun(f) => f(eval_e2),
                _ => panic!("Expected function"),
            }
        }
    }
}

What is important to note is that there are two ways this evaluator can fail:

  1. There is an obvious panic when trying to apply a non-function.

  2. Lookup of variables in the environment may fail - they must be numbered correctly.

If we'd like to make our language type-safe, we'd have to implement a type checker ourselves, turning our language into the simply typed lambda calculus.

This is all very annoying - especially given that Rust itself, our host language, already has a type system, functions, and all that jazz, which prevent all these errors at compile time, on the Rust level. Is there no way to leverage Rust's type system to make our embedded language type safe too?

Indeed there is - using the final style.

Finally higher-order

We'll proceed much as in the first-order case - instead of using an enum to define our language, we'll use a trait. The extra (and last) ingredient in the final style is that we'll use a generic associated type (GAT), stable since Rust 1.65. In Haskell, Kiselyov uses higher-kinded types instead, but as we'll see, they're pretty straightforward to translate. (And yes, this is the reason I used associated types for the first-order case too, although it wasn't necessarily the right approach there.)

type Fun<A, B> = Box<dyn Fn(A) -> B>;

trait ExprSym {
    type Repr<T>;

    fn int(i: i32) -> Self::Repr<i32>;
    fn add(a: &Self::Repr<i32>, b: &Self::Repr<i32>) -> Self::Repr<i32>;
    fn lam<A, B, F: Fn(Self::Repr<A>) -> Self::Repr<B>>(f: F) -> Self::Repr<Fun<A, B>>
    where
        for<'a> F: 'a;
    fn app<F: Fn(A) -> B, A, B>(f: Self::Repr<F>, arg: Self::Repr<A>) -> Self::Repr<B>;
}

(I had a few shenanigans with functions, which make this a bit more complicated than I would like. My first approach was to use impl Fn(A) -> B everywhere, instead of the boxed function and the F type arguments, but Rust doesn't (yet?) support those in the places I need them.)

You'll probably want to stare at that for a bit. The salient point is that the Repr type now takes a generic parameter T. This effectively co-opts Rust's type system to work in the embedded language as well. add is an operation that takes two integers and returns an integer. app takes a function and an argument and returns the result of the function. Adding two functions, or applying an integer is disallowed by Rust's type system.

fn tf2a<T, E: ExprSym>() -> E::Repr<T> {
    // error[E0277]: expected a `std::ops::Fn<(_,)>` closure, found `i32`
    E::app(E::int(2), E::int(3))
}

At the risk of belaboring the point, in the final approach, there's no intermediate representation. Integers are Rust integers, variables are Rust variables, functions are Rust functions, and types are Rust types. The evaluator looks like the identity function.

struct Eval;
impl ExprSym for Eval {
    type Repr<T> = T;

    fn int(i: i32) -> Self::Repr<i32> {
        i
    }

    fn add(a: &Self::Repr<i32>, b: &Self::Repr<i32>) -> Self::Repr<i32> {
        a + b
    }

    fn lam<A, B, F: Fn(Self::Repr<A>) -> Self::Repr<B>>(f: F) -> Self::Repr<Box<dyn Fn(A) -> B>>
    where
        for<'a> F: 'a,
    {
        Box::new(f)
    }

    fn app<F: Fn(A) -> B, A, B>(f: Self::Repr<F>, arg: Self::Repr<A>) -> Self::Repr<B> {
        f(arg)
    }
}

A few examples of the evaluator in action:

fn th1<E: ExprSym>() -> E::Repr<i32> {
    E::add(&E::int(1), &E::int(2))
}

th1::<Eval>()
> 3

fn th2<E: ExprSym>() -> E::Repr<Fun<i32, i32>> {
    E::lam(|x| E::add(&x, &x))
}

th2::<Eval>()(2)
> 4

In th2, the interpreter produces a Rust function that we can call.

The implementation of a viewer is similar. The representation we choose is Fn(VarCounter) -> String so that we can generate unique names for variables. The full implementation is in the repo, here are some examples:

th1::<View>()(0)
> "(1 + 2)"

th2::<View>()(0)
> "(\\x0 -> (x0 + x0))"

Higher-order recap

This second part shows how the final style maximizes the effectiveness of the host language in the embedded language. It succeeds in co-opting most if not all of the constructs of the host language. That's great because it's exactly why we are embedding the DSL, instead of writing a new, separate language.

Furthermore, all the advantages of the previous section like extensibility carry over. Try adding multiplication as an exercise, by defining a new MulExprSym sub-trait of ExprSym.

I can also now explain what tagless means. Tagging refers to how, in the initial style, the enum "tags" the different cases, and interpreters pattern match on these tags at runtime. In the final style, there are no tags; we never introduce them. Instead, we rely on overloads via traits, which are resolved at compile time.

As a result, the final code is easier to optimize and should compile as if the additions and functions were written in Rust directly. It's conceivable that Rust/LLVM is smart enough to compile the initial style to similar code - though that seems to require more advanced optimizations.

Whatever the case, tagging has the annoying side-effect that types in the embedded language are erased in the host language - every expression becomes of type Expr, and so when writing expressions in the embedded language, the host language's type system becomes useless. In the final style, in contrast, the host language's type system is leveraged directly.

Part 3: Typed final tagless formatting

For the final part, we'll apply the final style to a more realistic problem: type-safe printing and parsing based on a format specification. I'll just quote Oleg Kiselyov's lecture notes here:

The typed formatting problem is to write type-safe versions of the familiar C functions printf and scanf. The polyvariadic function sprintf should take the formatting specification (the formatting pattern) and the values to format, and return the formatted string. The types and the number of sprintf’s arguments have to match the formatting specification. The typed sscanf takes the input string, the format specification, and the consumer function. It parses data from the string according to the formatting specification, passing them to the consumer. The number and the types of the arguments for the consumer function have to match the formatting specification. Since parsing is necessarily partial, sscanf should return the result of the consumer function in the Maybe monad.

I translated Kiselyov's Haskell implementation to safe Rust. In contrast with this remark from the documentation of Rust's std::fmt::Arguments:

This structure represents a safely precompiled version of a format string and its arguments. This cannot be generated at runtime because it cannot safely be done, so no constructors are given and the fields are private to prevent modification.

A format specification and its arguments can be safely generated at runtime.

I'll first show a few examples of type-safe printing and scanning in Rust, and then explain how it works. The interface is not ergonomic at all and needs a macro or two to be fit for actual use. However, the macro would be thin - there'd be no need to validate the format specification. That the format pattern and arguments match is statically checked by the Rust type system. It doesn't need compiler plugins or elaborate macro-level checks.

The simple formatting language supports four constructs:

  • lit for strings,

  • char for characters,

  • int for integers,

  • compose to append two format specs.

It's bare-bones, but because it's implemented in the final style, it's easily extensible without changing existing interpreters.

As another direct consequence of using the final style, the same pattern can be used for formatted printing or scanning - we write one interpreter for printing, and a separate one for scanning.

Here's the simplest pattern, just a string literal:

fn fmt1<F: FormattingSpec, A>() -> F::Repr<A, A> {
    F::lit("Hello, ")
}

sprintf(fmt1::<Print, _>())
> "Hello, "

sscanf("Hello, ", fmt1::<Scan, _>(), ())
> Some(())

sscanf("World", fmt1::<Scan, _>(), ())
> None

Taking it up a notch:

fn fmt2<F: FormattingSpec, A>() -> F::Repr<A, Fun<char, A>>
where
    for<'a> A: 'a,
{
    F::compose(F::lit("Hello, world"), F::char())
}

sprintf(fmt2::<Print, _>())('!')
> "Hello, world!"

sscanf("Hello, world?", fmt2::<Scan, _>(), new_fun(|x| x))
> Some('?')

And finally:

fn fmt3<F: FormattingSpec, A>() -> F::Repr<A, Fun<char, Fun<i32, A>>>
where
    for<'a> A: 'a,
{
    F::compose(
        F::lit("The value of "),
        F::compose(F::char(), F::compose(F::lit(" is "), F::int())),
    )
}

sprintf(fmt3::<Print, _>())('C')(67)
> "The value of C is 67"

sscanf(
    "The value of C is 67",
    fmt3::<Scan, _>(),
    new_fun(|c| new_fun(move |i| (c, i))),
);
> Some(('C', 67))

Note how the compiler can infer from the pattern fmt3 that sprintf takes a character and an integer - as expected, trying to pass anything else is an error. Similarly, the result of sscanf is an option containing a typed char and integer. You can also see those types in the return type of fmt3.

If you've never seen this before I bet it is quite surprising that it's possible to do type-safe formatting at all. Almost all languages either use untyped format strings or have special compiler plugins to make the format strings type-safe. The insight there is not related to the tagless final style - it has a rather long history, and you don't need advanced type system tricks to achieve it. Kiselyov's lecture notes have the necessary references. I'll give the highlights of the implementation here.

Here is the final style definition of the format specification:

// NOTE: lifetime constraints omitted for brevity
trait FormattingSpec {
    type Repr<A, B>;

    fn lit<A>(s: &str) -> Self::Repr<A, A>;
    fn int<A>() -> Self::Repr<A, Fun<i32, A>>;
    fn char<A>() -> Self::Repr<A, Fun<char, A>>;
    fn compose<A, B, C>(f: Self::Repr<B, C>, g: Self::Repr<A, B>) -> Self::Repr<A, C>;
}

One way to think of the types of the primitives lit, int, and char is that they take an "input" type A and, if necessary, append the type they represent. The crucial function compose then combines two such specifications by "appending" the function arguments from both. The formatting specification builds a nested list of functions - the function arguments form a polyvariadic list of sorts. (A polyvariadic list is a heterogeneous list: it combines the heterogeneity of tuples with the ability to add as many elements as you like, as in a Vec.)

While the details of the implementation are certainly mind-expanding, they are not related to the final style per se. See the companion repo for more details and examples.

While typed sprintf was a solved problem, casting it to the final style allowed Kiselyov to implement typed scanf as well, sharing the same format specification.

The formatting specification is extensible with width specifiers, other primitives, and so on. We can also straightforwardly write new interpreters, for example, to output a more traditional format string from the specification. Finally, unlike Rust's format strings, the format specifications are first-class values: they can be combined at runtime, loaded from a file, and so on.

Thanks for reading! For more, subscribe to my newsletter and follow me on Twitter or Mastodon.

References