# Lecture 3: Closure conversion, Defunctionalization

**Teacher**: François Pottier

# Closure conversion

Goal: compile a language with arbitrary first-class functions (i.e., the $λ$-calculus) to a language with only closed first-class functions (e.g. C).

In C, closures are simulated by hand (systems programming, etc.).

- it is a step in the compilation of functional programming languages;
- it explains how first-class functions work;
- it makes their space and time cost explicit;
- it is a programming technique in languages without first-class functions (e.g. C).

```
let iter f t =
  for i = 0 to Array.length t - 1 do f t.(i) done

let sum t =
  let s = ref 0 in
  let add x = (s := !s + x) in
  iter add t;
  !s
```

⟶ How is this program transformed by a compiler without using local variables and nested functions?

⟶ the function `add` could be returned as the output of a function
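One possible answer, as a minimal sketch: `add` becomes a top-level closed function `add_code` that receives the reference `s` through an explicit environment. The record type `closure` and the names `add_code`, `code` and `env` are illustrative assumptions, not the lecture's definitions.

```ocaml
(* A closure pairs a closed function with the data it needs. *)
type ('env, 'a, 'b) closure = { code : 'env * 'a -> 'b; env : 'env }

(* [iter] calls a closure by passing its environment explicitly. *)
let iter (f : (_, 'a, unit) closure) (t : 'a array) : unit =
  for i = 0 to Array.length t - 1 do
    f.code (f.env, t.(i))
  done

(* [add_code] is closed: it reaches [s] only through its first argument. *)
let add_code ((s, x) : int ref * int) : unit = s := !s + x

let sum (t : int array) : int =
  let s = ref 0 in
  iter { code = add_code; env = s } t;
  !s
```

Here the environment is just the reference `s`, because `s` is the only free variable of `add`.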

Procedural abstraction (in OOP): you always have to access the data through methods (direct access to the data is not given):

```
let make x =
  let cell = ref x in
  let get () = !cell
  and set x = (cell := x) in
  get, set

let () =
  let get, set = make 3 in
  set (get () + 1)
```

Question: if you were to explain this program by hand (using only closed functions), how would you do it?

- Pass one more argument to `get` and `set` so that they know how to find the data in the heap.

```
(* [...] *)
  let get env () = !(env.cell)
  and set env x = (env.cell := x) in
  let env = {cell = cell} in
  (get, env), (set, env)

let () =
  let (cget, eget), (cset, eset) = make 3 in
  cset eset (cget eget () + 1)
```

In OCaml, you could have

```
type ('a, 'b) closure = { code: 'a; cell: 'b }
```

and

```
let make x =
  let cell = ref x in
  let get (env, ()) = !(env.cell)
  and set (env, x) = (env.cell := x) in
  { code = get; cell = cell }, { code = set; cell = cell }

let () =
  let get, set = make 3 in
  set.code (set, get.code (get, ()) + 1)
```

Here, `get` and `set` are closed functions, so they can be hoisted to the top level:

```
let get (env, ()) = !(env.cell)
let set (env, x) = (env.cell := x)

let make x =
  let cell = ref x in
  { code = get; cell = cell }, { code = set; cell = cell }

let () =
  let get, set = make 3 in
  set.code (set, get.code (get, ()) + 1)
```

A closure combines code and data. In `{ code = get; cell = cell }`:

- the first field contains a pointer to a (closed) function
- the second one contains a pointer to a piece of data allocated on the heap
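A self-contained variant of the `make` example, as a sketch under one simplifying assumption: the code receives only its environment rather than the whole closure, which keeps the types simple. The three-parameter record type and the names `get_code`, `set_code`, `apply` and `result` are hypothetical.

```ocaml
(* A closure: a pointer to closed code, plus a pointer to heap data. *)
type ('env, 'a, 'b) closure = { code : 'env * 'a -> 'b; env : 'env }

(* Both functions are closed: the cell arrives as an argument. *)
let get_code (cell, ()) = !cell
let set_code (cell, x) = cell := x

(* The two closures share the same heap-allocated cell. *)
let make x =
  let cell = ref x in
  { code = get_code; env = cell }, { code = set_code; env = cell }

(* Calling a closure: fetch the code pointer, pass it the environment. *)
let apply clo x = clo.code (clo.env, x)

let result =
  let get, set = make 3 in
  apply set (apply get () + 1);
  apply get ()
```

Because both closures point to the same `cell`, the update performed through `set` is visible through `get`.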

Another example: we want to transform the non-closed function `f` (which appears in `map` as well):

```
let rec map f xs = match xs with
  | [] -> []
  | x :: xs ->
    f x :: map f xs

let scale k xs =
  map (fun x -> k * x) xs
```

- `f x` will be turned into `f.code (f, x)` (get the code pointer, then pass it the closure itself and the argument)
- `fun x -> k*x` will be turned into `{code = fun (env,x) -> env.k * x; k = k}`
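Put together, the two rewritings give something like the following sketch. It assumes a generic record type with an `env` field (instead of a field named `k`) and passes the environment rather than the closure itself; `scale_code` is a hypothetical name.

```ocaml
type ('env, 'a, 'b) closure = { code : 'env * 'a -> 'b; env : 'env }

(* [f x] has become [f.code (f.env, x)]. *)
let rec map f xs =
  match xs with
  | [] -> []
  | x :: xs -> f.code (f.env, x) :: map f xs

(* The body of [fun x -> k * x], with [k] read from the environment. *)
let scale_code (k, x) = k * x

(* The lambda-abstraction has become the allocation of a closure. *)
let scale k xs = map { code = scale_code; env = k } xs
```

`map` itself needs no closure here, since it is already a closed top-level function.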

## Definition and proof (of closure conversion)

Application must not be translated naively: we don't want to duplicate $⟦t_1⟧$ (it would even be incorrect: if $t_1$ had side effects, they would be run twice).

Now, what about $λ$-abstractions?

If $\lbrace x_1, …, x_n\rbrace \, ≝ \, fv(λx.t)$:
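A sketch of the translation in the usual tuple-based presentation, consistent with the informal rules given earlier (`f x` becomes `f.code (f, x)`). The notation is an assumption: closures are tuples whose component $0$ is the code pointer, and $\pi_0$ selects it.

$$
\begin{aligned}
⟦x⟧ &= x \\
⟦t_1\ t_2⟧ &= \mathsf{let}\ clo = ⟦t_1⟧\ \mathsf{in}\ \mathsf{let}\ code = \pi_0\ clo\ \mathsf{in}\ code\ (clo, ⟦t_2⟧) \\
⟦λx.t⟧ &= \mathsf{let}\ code = λ(clo, x).\ \mathsf{let}\ (\_, x_1, …, x_n) = clo\ \mathsf{in}\ ⟦t⟧\ \mathsf{in}\ (code, x_1, …, x_n)
\end{aligned}
$$

The $\mathsf{let}$-binding of $clo$ in the application case is what avoids evaluating $⟦t_1⟧$ twice.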

### Soundness of closure conversion

Which semantics to use for the source calculus?

- small-step, substitution-based?
- big-step, substitution-based?
- big-step, environment-based?
- interpreter, with fuel, environment-based?

Since closures contain environments, **big-step, environment-based** is a no-brainer.
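To make the choice concrete, here is a minimal sketch of a big-step, environment-based, call-by-value evaluator. The term language (with integers and addition) and all constructor names are assumptions made for illustration; the point is that a $λ$-abstraction evaluates to a closure that captures the current environment.

```ocaml
type term =
  | Var of string
  | Lam of string * term
  | App of term * term
  | Int of int
  | Add of term * term

type value =
  | VInt of int
  | VClo of string * term * env  (* parameter, body, captured environment *)
and env = (string * value) list

let rec eval (e : env) (t : term) : value =
  match t with
  | Var x -> List.assoc x e
  | Int n -> VInt n
  | Lam (x, body) -> VClo (x, body, e)  (* capture the environment *)
  | Add (t1, t2) ->
    (match eval e t1, eval e t2 with
     | VInt a, VInt b -> VInt (a + b)
     | _ -> failwith "type error")
  | App (t1, t2) ->
    (match eval e t1 with
     | VClo (x, body, e') ->
       let v = eval e t2 in
       (* The body runs in the closure's environment, not the caller's. *)
       eval ((x, v) :: e') body
     | VInt _ -> failwith "not a function")
```

For instance, evaluating $(λk.λx.\,x + k)\ 2\ 40$ goes through an intermediate closure whose environment binds `k` to `2`.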

The target language should be simpler: after the translation, every $λ$-abstraction is closed. Its semantics can therefore be simplified as well (Metal semantics, closer to the machine, denoted by $↓↓_{cbv}$).

If target programs could be non-deterministic, we would also have to check backward preservation: « the behaviors of the source program form a superset of the behaviors of the transformed program ».

### Recursive functions

Applications don’t change as we don’t want to change how functions are called. As for $λ$-abstractions:

If $\lbrace f, x_1, …, x_n\rbrace \, ≝ \, fv(λx.t)$:
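The idea can be sketched in OCaml: since the code receives the closure itself as an argument, a recursive call simply goes through that closure, and no `let rec` is needed. The record type and the example function `pow` (computing `base` to the power `n`) are hypothetical.

```ocaml
(* Here the code receives the whole closure, not just its environment. *)
type ('env, 'a, 'b) closure = {
  code : ('env, 'a, 'b) closure * 'a -> 'b;
  env : 'env;
}

(* Applications are unchanged: pass the closure and the argument. *)
let apply clo x = clo.code (clo, x)

(* Closed code for [let rec pow n = if n = 0 then 1 else base * pow (n - 1)]:
   the recursive occurrence of [pow] is [clo]; [base] is read from [clo.env]. *)
let pow_code (clo, n) =
  if n = 0 then 1 else clo.env * apply clo (n - 1)

let pow base = { code = pow_code; env = base }
```

This is one reason to pass the closure, rather than only its environment, to the code.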

## Understanding programs through closure conversion

### Trick 1: difference lists

```
type tree =
| Leaf of int
| Node of tree * tree
```

Suppose you want to retrieve the labels of all the leaves of the tree (its fringe):

```
let rec fringe (t : tree) : int list = match t with
  | Leaf i -> [ i ]
  | Node (t1, t2) -> fringe t1 @ fringe t2
```

Nice-looking piece of code, but very inefficient: quadratic complexity in the worst case, because `@` copies its left argument.
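A worked worst case, assuming a left-leaning tree with $n$ leaves (every right child is a leaf): `xs @ ys` costs time proportional to the length of `xs`, so the concatenation at the root costs $n-1$, the one below it $n-2$, and so on.

$$
\sum_{k=1}^{n-1} k \;=\; \frac{n(n-1)}{2} \;=\; Θ(n^2)
$$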

Remedy: use difference lists:

```
type 'a diff =
  'a list -> 'a list

let singleton (x : 'a) : 'a diff =
  fun xs -> x :: xs

let concat (xs : 'a diff) (ys : 'a diff) : 'a diff =
  fun zs -> xs (ys zs)
```

`concat` becomes a constant-time operation, as it is function composition, which amounts to allocating a closure.

And then, fringe:

```
let rec fringe_ (t : tree) : int diff =
  match t with
  | Leaf i -> singleton i
  | Node (t1, t2) -> concat (fringe_ t1) (fringe_ t2)

let fringe t = fringe_ t []
```

Is it really more efficient? Complexity: $O(n)$ to build the closures, plus $O(n)$ to apply them, so $O(n)+O(n) = O(n)$.

If we want to know what OCaml does, we can mentally closure-convert it.

Existential type for closures in OCaml:

```
type ('a, 'b) closure =
  | Clo:
      ('a * 'e -> 'b)      (* A (closed) function... *)
      * 'e                 (* ...and its environment... *)
      -> ('a, 'b) closure  (* ...together form a closure. *)
```

Invoking closures:

```
let apply (f : ('a, 'b) closure) (x : 'a) : 'b =
  let Clo (code, env) = f in
  code (x, env)
```

Now, closure conversion:

```
type 'a diff =
  ('a list, 'a list) closure

let singleton_code =
  fun (xs, x) -> x :: xs

let singleton (x : 'a) : 'a diff =
  Clo (singleton_code, x)

let concat_code =
  fun (zs, (xs, ys)) -> apply xs (apply ys zs)

let concat (xs : 'a diff) (ys : 'a diff) : 'a diff =
  Clo (concat_code, (xs, ys))

let rec fringe_ (t : tree) : int diff =
  match t with
  | Leaf i -> singleton i
  | Node (t1, t2) -> concat (fringe_ t1) (fringe_ t2)

let fringe t =
  apply (fringe_ t) []
```

`fringe_` copies the tree into a tree made up of closures.

But this is not so smart: it runs in linear time, yet copying the tree first seems useless (the constant factor leaves a lot to be desired).
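For comparison, a sketch of a direct version that threads an accumulator through the traversal and allocates no closures at all. This is not taken from the lecture's code; `fringe_acc` is a hypothetical name.

```ocaml
type tree =
  | Leaf of int
  | Node of tree * tree

(* [fringe_acc t acc] returns the fringe of [t] followed by [acc]. *)
let rec fringe_acc (t : tree) (acc : int list) : int list =
  match t with
  | Leaf i -> i :: acc
  | Node (t1, t2) -> fringe_acc t1 (fringe_acc t2 acc)

let fringe (t : tree) : int list = fringe_acc t []
```

Each leaf costs one cons and each node two calls, so the traversal is linear without the intermediate tree of closures.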

Beware of mutable local variables in your programming language: closures can then behave unexpectedly, e.g. in JavaScript:

```
var messages = ["Wow!", "Hi!", "Closures are fun!"];
for (var i = 0; i < messages.length; i++) {
  setTimeout(function () {
    say(messages[i]);
  }, i * 1500);
}
```

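yields three `undefined`: the three callbacks share the single mutable variable `i`, which equals `messages.length` by the time they run.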
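By contrast, here is a sketch of the same loop in OCaml, where the loop index is an immutable binding that is fresh at each iteration, so each closure captures its own `i`. The callbacks are collected in a list and run afterwards instead of using a timer; `callbacks` and `outputs` are hypothetical names.

```ocaml
let messages = [| "Wow!"; "Hi!"; "Closures are fun!" |]

(* Build one closure per iteration; each captures its own value of [i]. *)
let callbacks =
  let cbs = ref [] in
  for i = 0 to Array.length messages - 1 do
    cbs := (fun () -> messages.(i)) :: !cbs
  done;
  List.rev !cbs

(* Run the closures only after the loop has finished. *)
let outputs = List.map (fun cb -> cb ()) callbacks
```

Even though the closures run after the loop has ended, each one still sees the index it was created with.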

# Defunctionalization

Used in some compilers, like MLton (ML compiler).

If $\lbrace x_1, …, x_n \rbrace = fv(λx.t)$
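Concretely, each $λ$-abstraction becomes a constructor carrying its free variables, and a single `apply` function dispatches on that constructor. A minimal sketch on the `scale` example, assuming the program contains only two abstractions; the type `lam` and the constructor `Succ` are hypothetical.

```ocaml
(* One constructor per lambda-abstraction in the program,
   with the free variables of the abstraction as arguments. *)
type lam =
  | Scale of int  (* fun x -> k * x, with free variable k *)
  | Succ          (* fun x -> x + 1, no free variables *)

(* The bodies of all abstractions are gathered in one function. *)
let apply (f : lam) (x : int) : int =
  match f with
  | Scale k -> k * x
  | Succ -> x + 1

(* [f x] has become [apply f x]. *)
let rec map (f : lam) (xs : int list) : int list =
  match xs with
  | [] -> []
  | x :: xs -> apply f x :: map f xs

let scale k xs = map (Scale k) xs
```

Unlike closure conversion, this needs to know every abstraction in the program, which makes it a whole-program transformation.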
