CPS conversion
Before presenting my actual argument, I will show two analogous implementations of CPS conversion 1 for a very simple language, without making any conclusions. The general approach is borrowed from “Compiling with Continuations” by Andrew W. Appel. No worries if you are not familiar with the idea; the only thing to focus your attention on is how the idea is implemented in both Rust and OCaml.
Here is CPS conversion written in Rust 2:
use std::cell::RefCell;
use std::ops::Deref;
use std::rc::Rc;
// A variable identifier of the lambda language `Term`.
type Var = String;
// The lambda language; direct style.
type Term = Rc<TermTree>;
enum TermTree {
Var(Var),
Fix(Vec<(Var, Vec<Var>, Term)>, Term),
Appl(Term, Vec<Term>),
Record(Vec<Term>),
Select(Term, u32),
}
use TermTree::*;
#[derive(Clone)]
enum CpsVar {
// Taken from the lambda term during CPS conversion.
CLamVar(Var),
// Generated uniquely during CPS conversion.
CGenVar(u32),
}
use CpsVar::*;
// The resulting CPS term.
enum CpsTerm {
CFix(Vec<(CpsVar, Vec<CpsVar>, CpsTerm)>, Box<CpsTerm>),
CAppl(CpsVar, Vec<CpsVar>),
CRecord(Vec<CpsVar>, Binder),
CSelect(CpsVar, u32, Binder),
CHalt(CpsVar),
}
use CpsTerm::*;
// Binds a unique `CpsVar` within `CpsTerm`.
type Binder = (CpsVar, Box<CpsTerm>);
// Generates a unique CPS variable given the current `i`.
fn gensym(i: RefCell<u32>) -> CpsVar {
let x = CGenVar(i.clone().into_inner());
i.replace_with(|&mut i| i + 1);
x
}
// Converts `Term` to `CpsTerm`, applying `finish` to the resulting
// CPS variable.
fn convert(gen: RefCell<u32>, finish: impl FnOnce(CpsVar) -> CpsTerm, term: Term) -> CpsTerm {
match term.deref() {
Var(x) => finish(CLamVar(x.to_string())),
Fix(defs, m) => CFix(
defs.iter()
.map(|def| convert_def(gen.clone(), def.clone()))
.collect(),
Box::new(convert(gen, finish, m.clone())),
),
Appl(f, args) => {
let ret_k = gensym(gen.clone());
let ret_k_x = gensym(gen.clone());
CFix(
vec![(ret_k.clone(), vec![ret_k_x.clone()], finish(ret_k_x))],
Box::new(convert(
gen.clone(),
|f_cps| {
convert_list(
gen,
|args_cps| {
CAppl(f_cps, args_cps.into_iter().chain(vec![ret_k]).collect())
},
args,
)
},
f.clone(),
)),
)
}
Record(fields) => convert_list(
gen.clone(),
|fields_cps| {
let x = gensym(gen);
CRecord(fields_cps, (x.clone(), Box::new(finish(x))))
},
fields,
),
Select(m, i) => convert(
gen.clone(),
|m_cps| {
let x = gensym(gen);
CSelect(m_cps, *i, (x.clone(), Box::new(finish(x))))
},
m.clone(),
),
}
}
// Converts `Vec<Term>` to `Vec<CpsVar>` and applies `finish` to it.
fn convert_list(
gen: RefCell<u32>,
finish: impl FnOnce(Vec<CpsVar>) -> CpsTerm,
terms: &[Term],
) -> CpsTerm {
fn go(
gen: RefCell<u32>,
finish: impl FnOnce(Vec<CpsVar>) -> CpsTerm,
mut acc: Vec<CpsVar>,
terms: &[Term],
) -> CpsTerm {
match terms.split_first() {
None => finish(acc),
Some((x, xs)) => convert(
gen.clone(),
|x_cps| {
acc.push(x_cps);
go(gen, finish, acc, xs)
},
x.clone(),
),
}
}
let acc = Vec::with_capacity(terms.len());
go(gen, finish, acc, terms)
}
// Converts a single function definition to its CPS form.
fn convert_def(
gen: RefCell<u32>,
(f, params, m): (Var, Vec<Var>, Term),
) -> (CpsVar, Vec<CpsVar>, CpsTerm) {
let k = gensym(gen.clone());
(
CLamVar(f),
params
.into_iter()
.map(CLamVar)
.chain(std::iter::once(k.clone()))
.collect(),
convert(gen, |m_cps| CAppl(k, vec![m_cps]), m),
)
}
The code is 145 lines long, including comments and blank lines.
The same algorithm in idiomatic OCaml 3:
(* A variable identifier of the lambda language [term]. *)
type var = string
(* The lambda language; direct style. *)
type term =
| Var of var
| Fix of (var * var list * term) list * term
| Appl of term * term list
| Record of term list
| Select of term * int
type cps_var =
(* Taken from the lambda term during CPS conversion. *)
| CLamVar of var
(* Generated uniquely during CPS conversion. *)
| CGenVar of int
(* The resulting CPS term. *)
type cps_term =
| CFix of (cps_var * cps_var list * cps_term) list * cps_term
| CAppl of cps_var * cps_var list
| CRecord of cps_var list * binder
| CSelect of cps_var * int * binder
| CHalt of cps_var
(* Binds a unique [cps_var] within [cps_term]. *)
and binder = cps_var * cps_term
(* Generates a unique CPS variable given the current [i]. *)
let gensym i =
let x = CGenVar !i in
i := !i + 1;
x
(* Converts [term] to [cps_term], applying [finish] to the resulting
CPS variable. *)
let rec convert gen finish = function
| Var x -> finish (CLamVar x)
| Fix (defs, m) -> CFix (List.map (convert_def gen) defs, convert gen finish m)
| Appl (f, args) ->
let ret_k = gensym gen in
let ret_k_x = gensym gen in
CFix
( [ (ret_k, [ ret_k_x ], finish ret_k_x) ],
f
|> convert gen (fun f_cps ->
args
|> convert_list gen (fun args_cps ->
CAppl (f_cps, args_cps @ [ ret_k ]))) )
| Record fields ->
fields
|> convert_list gen (fun fields_cps ->
let x = gensym gen in
CRecord (fields_cps, (x, finish x)))
| Select (m, i) ->
m
|> convert gen (fun m_cps ->
let x = gensym gen in
CSelect (m_cps, i, (x, finish x)))
(* Converts [term list] to [cps_var list] and applies [finish] to it. *)
and convert_list gen finish =
let rec go acc = function
| [] -> finish (List.rev acc)
| x :: xs -> x |> convert gen (fun x_cps -> go (x_cps :: acc) xs)
in
go []
(* Converts a single function definition to its CPS form. *)
and convert_def gen (f, params, m) =
let k = gensym gen in
( CLamVar f,
List.map (fun x -> CLamVar x) params @ [ k ],
m |> convert gen (fun m_cps -> CAppl (k, [ m_cps ])) )
The code is 74 lines long, including comments and blank lines. This is ~2.0 times shorter than the Rust version.
Comparing the two implementations
Compiler development is characterized by:
- a lot of recursively defined data structures,
- a lot of complex data transformation.
How do Rust and OCaml handle these two aspects? Below is a brief summary:
- Recursive data structures:
- OCaml: recursive data structures are supported natively.
- Rust: we need to imitate data recursion by wrapping
recursive occurences of
TermTree
andCpsTerm
intoRc
s 4 andBox
es.
- Complex data transformation:
- OCaml:
- Recursion is a common practice. OCaml has tail-call optimization and “Tail Modulo Constructor (TMC)” optimization.
- Pattern matching is made very ergonomic. With
function
, we can pattern-match the “last parameter” 5 of a function without introducing any extra indentation. (function
also lets us omit the last parameter with oftentimes a dummy name liketerm
; if you think the parameter name is useful, you can write it in the signature.) Lists can be matched as simply as| [] -> ...
and| x :: xs -> ...
without further hussle. - The majority of standard data structures are immutable. This makes it easy to reason about the code.
- Rust:
- Recursion is uncommon. TCO is not
guaranteed (compare it with OCaml’s
[@tailcall]
and[@tail_mod_cons]
annotations). - Pattern matching requires extra indentation and the need to explicate the matched parameter. There are several ways to “match” vectors, but they all are more verbose than OCaml’s built-in syntax.
- The majority of standard data structures are mutable, which inclines
us towards the imperative style instead of the applicative style.
Iterators provide us with a hatch to write code in the pipelined
fashion, but first we need to
.iter()
/.iter_mut()
/.into_iter()
the data structure, perform the work, and then.collect()
.
- Recursion is uncommon. TCO is not
guaranteed (compare it with OCaml’s
- OCaml:
In addition to being syntactically more verbose than OCaml, Rust is a
language without garbage collection. This forces us to make certain
explicit choices about memory management: you can observe the plentitude
of plumbing with boxes, references (both &
and
Rc
), cloning, etc. Although it provides us with a greater
sense of how the code is executing, it brings very little value
to the algorithm itself.
Even mutation can be more challenging in Rust:
fn gensym(i: RefCell<u32>) -> CpsVar {
let x = CGenVar(i.clone().into_inner());
i.replace_with(|&mut i| i + 1);
x
}
In OCaml, it is just:
Why RefCell<u32>
instead of
&mut u32
? Because Rust requires us to have a single
mutable reference to a value at any given time. This is a very
reasonable requirement in multithreaded code, but we do not use more
than one thread in our algorithm. We need RefCell
just to
circumvent this superfluous limitation 6.
The last thing to note is the implementation of
convert_list
in Rust. Since fn
s are inherently
no more than code pointers, we need to pass gen
and
finish
explicitly on each call to go
7. In turn, this leads us to
duplicating the types of these variables in the signature of
go
(in Rust, there is no type inference of function
parameters). In contrast, OCaml captures gen
and
finish
automatically.
While the algorithm presented here is not very complex, it does already demonstrate the convenience of programming in a language from the ML family. However, let us see some more examples concerning type systems of both languages.
Type safety: GADTs
Resource management aside, OCaml’s type system is generally more expressive than that of Rust. For example, OCaml supports Generalized Algebraic Data Types (GADTs) to enforce certain invariants on the structure of data. Let us imagine an object language of booleans, integers, and operations upon them:
enum Term {
Bool(bool),
Not(Box<Term>),
And(Box<Term>, Box<Term>),
Int(i32),
Neg(Box<Term>),
Add(Box<Term>, Box<Term>),
}
enum Value {
Bool(bool),
Int(i32),
}
How do we write an evaluator for it? Here is a possible solution:
fn eval(term: &Term) -> Value {
use Term::*;
match term {
Bool(b) => Value::Bool(*b),
Not(m) => match eval(m) {
Value::Bool(b) => Value::Bool(!b),
_ => panic!("`Not` on a non-boolean value"),
},
And(m, n) => match (eval(m), eval(n)) {
(Value::Bool(b1), Value::Bool(b2)) => Value::Bool(b1 && b2),
_ => panic!("`And` on non-boolean values"),
},
Int(i) => Value::Int(*i),
Neg(m) => match eval(m) {
Value::Int(i) => Value::Int(-i),
_ => panic!("`Neg` on a non-integer value"),
},
Add(m, n) => match (eval(m), eval(n)) {
(Value::Int(i1), Value::Int(i2)) => Value::Int(i1 + i2),
_ => panic!("`Add` on non-integer values"),
},
}
}
The solution is simple enough; however, it is not very robust. What
happens if an input to eval
is ill-typed? Take the
following example:
fn main() {
use Term::*;
let term = Not(Box::new(And(Box::new(Bool(true)), Box::new(Int(42)))));
dbg!(eval(&term));
}
The program panics with “And
on non-boolean values”,
because the second operand of And
must necessarily be a
boolean, not an integer.
To prevent this kind of errors, we can use GADTs in OCaml:
type _ term =
| Bool : bool -> bool term
| Not : bool term -> bool term
| And : bool term * bool term -> bool term
| Int : int -> int term
| Neg : int term -> int term
| Add : int term * int term -> int term
let rec eval : type a. a term -> a = function
| Bool b -> b
| Not m -> not (eval m)
| And (m, n) ->
let b1, b2 = (eval m, eval n) in
b1 && b2
| Int i -> i
| Neg m -> -eval m
| Add (m, n) ->
let i1, i2 = (eval m, eval n) in
i1 + i2
Now what happens if we construct an ill-typed term?:
It just will not type-check!:
File "bin/main.ml", line 22, characters 35-41:
22 | let _term = Not (And (Bool true, Int 42)) in
^^^^^^
Error: This expression has type int term
but an expression was expected of type bool term
Type int is not compatible with type bool
This is possible because we essentially encoded the object language
type system in the definition of term
. OCaml knows that
And
accepts boolean-typed terms, not integer-typed ones. In
a real-world scenario, we can have an unrestricted term
akin to Rust’s Term
, which is produced by parsing and
elaborated further into a proper GADT-style term
. The
latter can be handled by eval
(or compile
,
whatever).
Type flexibility: First-class modules
Another neat feature of OCaml not present in Rust is first-class
modules. Can you imagine a module that is stored in a variable,
passed as a parameter, or returned from a regular function? This is what
first-class modules are about. Suppose that your object language
includes various fixed-size integers, such as
i8
/u8
, i16
/u16
, and
so on. With OCaml, you can represent them via (regular) modules:
fixed_ints.mli
(* [u8], [u16], etc. are defined by us. *)
module type S = sig
type t
val add : t -> t -> t
val sub : t -> t -> t
val mul : t -> t -> t
val div : t -> t -> t
val rem : t -> t -> t
(* Some more operations here. *)
end
module U8 : S with type t = u8
module U16 : S with type t = u16
module U32 : S with type t = u32
module U64 : S with type t = u64
module U128 : S with type t = u128
module I8 : S with type t = i8
module I16 : S with type t = i16
module I32 : S with type t = i32
module I64 : S with type t = i64
module I128 : S with type t = i128
In the AST, we can represent integer values as follows:
type generic =
| U8 of u8
| U16 of u16
| U32 of u32
| U64 of u64
| U128 of u128
| I8 of i8
| I16 of i16
| I32 of i32
| I64 of i64
| I128 of i128
Having so many possible combinations of arithmetical operators
add
/sub
/mul
/div
/rem
and variously typed operands, how to implement evaluation sanely?
Here is an idea:
- Define a function
pair_exn
that maps twogeneric
s into a first-class modulePair
. - Define a module
Pair
that implementsS
for a given pair of integers. - Define a function
do_int_bin_op
that acceptsPair
as a parameter and performs an operationop
on the pair of integers. - Call
do_int_bin_op
fromeval
.
In OCaml:
fixed_ints.mli
module type Pair = sig
type t
include S with type t := t
val pair : t * t
end
val pair_exn : generic * generic -> (module Pair)
The implementation of pair
would be:
fixed_ints.ml
let pair_exn =
let make (type a) (module S : S with type t = a) (x, y) =
(module struct
include S
let pair = x, y
end : Pair)
in
function
| U8 x, U8 y -> make (module U8) (x, y)
| U16 x, U16 y -> make (module U16) (x, y)
| U32 x, U32 y -> make (module U32) (x, y)
| U64 x, U64 y -> make (module U64) (x, y)
| U128 x, U128 y -> make (module U128) (x, y)
| I8 x, I8 y -> make (module I8) (x, y)
| I16 x, I16 y -> make (module I16) (x, y)
| I32 x, I32 y -> make (module I32) (x, y)
| I64 x, I64 y -> make (module I64) (x, y)
| I128 x, I128 y -> make (module I128) (x, y)
| _, _ -> raise (invalid_arg "Fixed_ints.pair_exn")
;;
Now we can write eval
as follows:
(* Somewhere within the definition of [eval]. *)
| IntBinOp (op, ty, m, n) ->
let x = extract_int_exn (eval m) in
let y = extract_int_exn (eval n) in
let (module Pair) = Fixed_ints.pair_exn (x, y) in
do_int_bin_op op (module Pair)
extract_int_exn
takes a value and extracts an integer
generic
, raising an exception if the value is not an
integer.
Finally, do_int_bin_op
is defined as follows:
let do_int_bin_op op (module S : Fixed_ints.Pair) =
let x, y = S.pair in
match op with
| Add -> S.add x y |> S.to_value
| Sub -> S.sub x y |> S.to_value
| Mul -> S.mul x y |> S.to_value
| Div -> S.div x y |> S.to_value
| Rem -> S.rem x y |> S.to_value
;;
S.to_value
converts a typed integer back to a value
holding generic
.
With the aid of first-class modules, we were able to implement
evaluation of fixed-size integers without much boilerplate. The best you
could do in Rust is to resort to macro_rules!
, which are
notorious for their hard-to-decipher syntax, shallow integration with
the rest of the language, and poor IDE support.
Final words
While Rust excels at resource management, OCaml turns out to be a more suitable choice for compiler development. We have not covered many other interesting features of it, such as polymorphic variants, custom binding operators, and effect handlers. Due to its completely static and flexible type system, OCaml has been historically used as a host language for many projects, including the Frama-C toolchain, the Coq theorem prover, and early versions of the Rust compiler itself.
OCaml is not without its flaws, though. The standard library and the
overall ecosystem is clearly inferior to that of Rust. The full set of
fixed-size integers found in Rust is not directly available in OCaml,
although it can be implemented with a combination of native OCaml
integers, the Int32
and Int64
modules from the
standard library, and C FFI. (Pro tip: do not use ocaml-stdint
,
it is unmaintained and is very buggy as of Aug 6, 2023. ocaml-integers
is a more robust alternative but it lacks support for Int8
,
Int16
, and 128-bit integers and has problems with
documentation.)
As Rust is gaining more and more popularity, more and more desperate developers from GitHub will start their compiler projects in it. I believe this can be a good decision either if 1) you are trying to learn Rust by writing “too many compilers” in it, or 2) you do really know what you are doing. If your intention is in compiler development itself, OCaml will save you a lot of time and undamaged nerves.
Other alternatives to consider is Haskell and various Lisp dialects. If you have already “tamed” Haskell (my congratulations and condolences), probably learning OCaml just for writing a compiler is not going to be worth it; if you have not, OCaml is a much more approachable language. Lisps can be very flexible, but they usually lack static type safety, opening a wide and horrible door to run-time errors.
Appendix: Getting started with OCaml
Here is an easy way to get started with OCaml:
- Install OCaml on Linux, macOS, *BSD, or Windows >>
- Install the Dune build
system:
opam install dune
. - Create a new project:
dune init project my_compiler
.
The directory my_compiler
will look like this:
my_compiler/
├── bin
│ ├── dune
│ └── main.ml
├── _build
│ └── log
├── dune-project
├── lib
│ └── dune
├── my_compiler.opam
└── test
├── dune
└── my_compiler.ml
bin/
is for setup code and CLI.lib/
is where most of the code lives.test/
is for tests.
I recommend alcotest
for
unit tests and ppx_deriving
for the deriving functionality (akin to #[derive(...)]
from
Rust). Install them as follows:
Edit my_compiler/lib/dune
as follows:
And my_compiler/test/dune
as follows:
- Type
dune build
to build the project. - Type
dune test
to run the tests. - Type
dune exec my_compiler
to execute the binary.
You can now create a file foo.ml
with a corresponding
foo.mli
in my_compiler/lib
and access it as
My_compiler.Foo
from bin/
and
lib/
.
For test coverage, consider using bisect_ppx
.
If you know Rust, you will find OCaml very familiar. I recommend the following resources for learning the language:
- OCaml Programming: Correct + Efficient + Beautiful
- Real World OCaml
- The official tutorial (can be read in an evening!)
References
CPS is the central representation of the compiler Standard ML of New Jersey.↩︎
Update: several people have suggested to use arenas (regions) instead of the approach I have demonstrated. I am well aware of the technique; however, I do not think that arenas would make a significant difference in code clarity and ergonomics. Flattening an AST has many performance benefits, such as spatial locality and cheap allocation and deallocation, but they add a little value to the overall discussion.↩︎
You can access the code itself and accompanying tests here.↩︎
Rc
was chosen to avoid expensive cloning ofTermTree
s in some places.↩︎To be precise, all functions are curried in OCaml, so
function
just “defines” a function with a single parameter and pattern-matches on it.↩︎Update: this is actually not true that this requirement is only needed in multithreaded code. However, I do still think it is superfluous in the code I have suggested.↩︎
Unfortunately, closures provide us with no solution here: they cannot be called recursively, at least without prior hoodoo.↩︎