Lecture 10: Abstract Machines
Reminder:
cf. pictures
Notations:
 $⟶$: rewriting for code ($λ$-calculus)
 $\leadsto$: rewriting for abstract machines
 $\leadsto_{SEA}$: search
 $\leadsto_{SUB}$: substitute
 $\overline u^α$: $u$ well-formed and correctly $α$-renamed
| Code | Stack | | Code | Stack |
|---|---|---|---|---|
| $tu$ | $π$ | $\leadsto_{SEA}$ | $t$ | $u::π$ |
| $λx.t$ | $u::π$ | $\leadsto_β$ | $t \lbrace x ← u \rbrace$ | $π$ |

$$s \leadsto_β s' ⟹ \underline{s} ⟶_β \underline{s'}$$

$$s \leadsto_{SEA} s' ⟹ \underline{s} = \underline{s'}$$
 $\leadsto_{SEA}$ terminates
 $⟶$ and $\leadsto_{SEA}$ are deterministic
 if $s$ is a final state, then $\underline s$ is normal
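The two transitions of this first machine can be sketched in Python as follows. This is only a minimal sketch under assumptions that are not in the notes: the names `Var`, `Lam`, `App`, `subst`, `step`, `run` and the `fuel` bound are hypothetical, and substitution is the naive one, which is safe only because the initial term is assumed closed (so every argument pushed on the stack is closed and no capture can occur).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def subst(t, x, u):
    """Naive t{x <- u}; correct here only because u is assumed closed."""
    if isinstance(t, Var):
        return u if t.name == x else t
    if isinstance(t, Lam):
        # An inner binder with the same name shadows x: stop there.
        return t if t.param == x else Lam(t.param, subst(t.body, x, u))
    return App(subst(t.fun, x, u), subst(t.arg, x, u))

def step(code, stack):
    """One transition: (label, code, stack), or None on a final state."""
    if isinstance(code, App):              # SEA: push the argument
        return "SEA", code.fun, [code.arg] + stack
    if isinstance(code, Lam) and stack:    # beta: substitute the top of the stack
        return "beta", subst(code.body, code.param, stack[0]), stack[1:]
    return None

def run(t, fuel=1000):
    """Iterate `step` from (t, empty stack), at most `fuel` times."""
    code, stack, trace = t, [], []
    while fuel > 0:
        nxt = step(code, stack)
        if nxt is None:
            break
        label, code, stack = nxt
        trace.append(label)
        fuel -= 1
    return code, stack, trace

I = Lam("z", Var("z"))
delta = Lam("x", App(Var("x"), Var("x")))
code, stack, trace = run(App(delta, I))
# trace == ["SEA", "beta", "SEA", "beta"], code == I, stack == []
```

On $δI$ the run alternates search and $β$ transitions and stops on $I$ with an empty stack, which matches the fact that a final state decodes to a normal term.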
Micro Abstract Machine
| Code | Environment | | Code | Environment |
|---|---|---|---|---|
| $(λx.t)u r_1 ⋯ r_n$ | $E$ | $\leadsto_β$ | $t r_1 ⋯ r_n$ | $[x ← u] :: E$ |
| $\underline{x} r_1 ⋯ r_n$ | $E_1 :: [x ← u] :: E_2$ | $\leadsto_{SUB}$ | $u^α r_1 ⋯ r_n$ | $E_1 :: [x ← u] :: E_2$ |
Ex: You may need two $\leadsto_{SUB}$ in a row:
Number of consecutive $\leadsto_{SUB}$: bounded by the “size” of the environment, provided it is not “weird”. We must ensure that we have:
Lemma: Let $s = (t, E)$ be a reachable MicroAM state.
 Abs: if $λx. \overline u$ is a subterm of $\overline t$ or $E$, then $x$ occurs only in $\overline u$
 Env scope: if $E=E'::[x ← u]::E''$, then $x$ is fresh wrt $u$ and $E''$
So
Ex: Pay attention to renaming. Example of a mistake (we don’t rename in the second step):
So it reduces to $δδ$, which diverges, but it should in fact reduce to $δ$ (in $(λy.(λxy.xy)yIδ)$, not both $y$’s are bound by $δ$), which converges!
Milner Abstract Machine (MAM)
Simplified version of the Krivine AM.
| Code | Stack | Environment | | Code | Stack | Environment |
|---|---|---|---|---|---|---|
| $\overline t \overline u$ | $π$ | $E$ | $\leadsto_{SEA}$ | $\overline t$ | $\overline u :: π$ | $E$ |
| $λx. \overline t$ | $\overline u :: π$ | $E$ | $\leadsto_{β}$ | $\overline t$ | $π$ | $[x ← \overline u] :: E$ |
| $x$ | $π$ | $E_1 :: [x ← u] :: E_2$ | $\leadsto_{SUB}$ | $\overline u^α$ | $π$ | $E_1 :: [x ← u] :: E_2$ |
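The three MAM transitions can be sketched in Python as below. This is a minimal sketch, not a reference implementation, and it relies on assumptions that are not in the notes: the initial term is closed and well-named (all binders distinct), the environment is a dictionary (acceptable because bound names are globally unique), fresh names are built with an apostrophe suffix that source names are assumed not to contain, and `rename`, `mam_run`, `fuel` are hypothetical names.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

_fresh = itertools.count()  # global supply of fresh name indices

def rename(t, ren=None):
    """Copy t, giving every binder of t a fresh name (the u^alpha of SUB)."""
    ren = ren or {}
    if isinstance(t, Var):
        # Free variables (bound in the environment) are left untouched.
        return Var(ren.get(t.name, t.name))
    if isinstance(t, Lam):
        base = t.param.split("'")[0]  # assumes source names contain no apostrophe
        new = f"{base}'{next(_fresh)}"
        return Lam(new, rename(t.body, {**ren, t.param: new}))
    return App(rename(t.fun, ren), rename(t.arg, ren))

def mam_run(t, fuel=10_000):
    """Run the machine from (t, empty stack, empty env); count transitions."""
    code, stack, env = t, [], {}
    counts = {"SEA": 0, "beta": 0, "SUB": 0}
    while fuel > 0:
        fuel -= 1
        if isinstance(code, App):                         # SEA
            code, stack = code.fun, [code.arg] + stack
            counts["SEA"] += 1
        elif isinstance(code, Lam) and stack:             # beta
            env[code.param] = stack[0]                    # [x <- u] :: E
            code, stack = code.body, stack[1:]
            counts["beta"] += 1
        elif isinstance(code, Var) and code.name in env:  # SUB
            code = rename(env[code.name])
            counts["SUB"] += 1
        else:
            break                                         # final state
    return code, stack, env, counts

delta = Lam("x", App(Var("x"), Var("x")))
I = Lam("y", Var("y"))
code, stack, env, counts = mam_run(App(delta, I))
# counts == {"SEA": 2, "beta": 2, "SUB": 3}; code is a renamed copy of I
```

The dictionary plays the role of an environment accessed through pointers: the entry for $x$ is found without traversing $E_1$. The renaming in `rename` is what keeps the state well-named after a duplication.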
TODO: write $↓$
Complexity analysis
Let $d: t_0 ⟶_{β_{cbn}}^k s$ be a derivation
 Input: the size $\vert t_0 \vert$ of the initial term
 Length: $\vert d \vert = k$
 Number of machine transitions
 Cost of a single transition
 Combine the two
| Code | Environment | | Code | Environment |
|---|---|---|---|---|
| $(λx.t)u r_1 ⋯ r_k$ | $E$ | $\leadsto_β$ | $t r_1 ⋯ r_k$ | $[x ← u] :: E$ |
| $\underline{x} r_1 ⋯ r_k$ | $E_1 :: [x ← u] :: E_2$ | $\leadsto_{SUB}$ | $u^α r_1 ⋯ r_k$ | $E_1 :: [x ← u] :: E_2$ |
Let $ρ: s \leadsto^\ast s'$
How do $\vert ρ \vert_{SUB}$ and $\vert ρ \vert_β$ compare?
The size of the environment is the number of $β$ transitions. But if
then $k ≤ n + \text{size of the environment at } s_0$.
Therefore, if
then
where
So $\vert ρ \vert_{SUB} = O(\vert ρ \vert_{β}^2)$
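One way to spell out this counting argument is sketched below. It assumes that the run starts from the empty environment, and the block lengths $a_i$, $b_i$ and the number of blocks $m$ are names introduced here for illustration. Split $ρ$ into alternating blocks of $β$ and SUB transitions:

$$ρ : s \leadsto_β^{a_1} \leadsto_{SUB}^{b_1} \leadsto_β^{a_2} \leadsto_{SUB}^{b_2} ⋯ \leadsto_β^{a_m} \leadsto_{SUB}^{b_m} s' \qquad (a_i ≥ 1)$$

A block of consecutive SUB transitions is bounded by the size of the environment at that point, which is the number of $β$ transitions performed so far. Since every block starts with at least one $β$ transition, there are at most $\vert ρ \vert_β$ blocks:

$$b_i ≤ \sum_{j ≤ i} a_j ≤ \vert ρ \vert_β \qquad \text{and} \qquad m ≤ \vert ρ \vert_β$$

Summing over the blocks:

$$\vert ρ \vert_{SUB} = \sum_{i=1}^{m} b_i ≤ m \cdot \vert ρ \vert_β ≤ \vert ρ \vert_β^2$$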
Is this bound reached? Yes:
Subterm Invariant
The “equivalent” of the Hauptsatz in sequent calculus, or the subformula property.
Lemma (Subterm Invariant): Let $ρ: (\overline{t_0}, ε, ε) \leadsto^\ast (\overline u, π, E)$ be an execution. Then $u$ and any code in $E$ and $π$ are subterms of $t_0$ (up to $α$-renaming).
Proof: By induction on the length of $ρ$. The only subtle case is $\leadsto_{SUB}$, the only transition where the machine duplicates code: the duplicated term $u$ comes from the environment, whose codes are subterms of $t_0$ by the induction hypothesis.
This gives us a bound on the size of duplicated terms.
Recall
This lemma tells us: whenever
as in each step you can only duplicate subterms, then the size of $s’$ is bounded by
⟹ there’s no size explosion wrt the number of steps.
Warning: it could happen that the number of steps is itself exponential ⟹ we have to make sure that the number of transitions is reasonable (from $λ$ to $M$, before even going from $M$ to $RAM$)
by the subterm invariant.
But $\leadsto_{SUB}$ increases the size of the term: replaces a variable (size $1$) by a term (size $≥ 1$). But by the subterm invariant, this term is a subterm of $t_0$, so:
Recall that AMs take care of SEA(rch), SUB(stitution), and NAMES. But the number of SEA transitions is at most quadratic wrt the number of $β$ transitions, like SUB ⟹ we can afford not to take it into account, as it doesn’t impact the complexity much. NAMES impact the complexity even less.
| | Number of transitions |
|---|---|
| SEA | $(\vert t_0 \vert + 1) \vert ρ \vert_β^2$ |
| $β$ | $\vert ρ \vert_β$ |
| SUB | $\vert ρ \vert_β^2$ |
With pointers, the SEA and the $β$ transitions take constant time.
As for SUB: if environments are implemented as lists, looking up a variable does not take constant time. But if variables are pointers, the term substituted for $x$ can be retrieved from the environment in constant time ⟹ the cost of a SUB transition is then the cost of copying (and renaming) $u$, which is bounded by $\vert t_0 \vert$ (as $u$ is a subterm thereof).
Therefore:
| | Number of transitions | Cost of single transition | Global cost |
|---|---|---|---|
| SEA | $(\vert t_0 \vert + 1) \vert ρ \vert_β^2$ | $O(1)$ | $O((\vert t_0 \vert + 1) \vert ρ \vert_β^2)$ |
| $β$ | $\vert ρ \vert_β$ | $O(1)$ | $O(\vert ρ \vert_β)$ |
| SUB | $\vert ρ \vert_β^2$ | $O(\vert t_0 \vert)$ | $O((\vert t_0 \vert + 1) \vert ρ \vert_β^2)$ |
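Summing the three global costs gives the overall overhead of the machine. The computation below is a short sketch that only uses the bounds of the table and the fact that $\vert ρ \vert_β ≤ \vert ρ \vert_β^2$ for a natural number:

$$\underbrace{O((\vert t_0 \vert + 1) \vert ρ \vert_β^2)}_{SEA} + \underbrace{O(\vert ρ \vert_β)}_{β} + \underbrace{O((\vert t_0 \vert + 1) \vert ρ \vert_β^2)}_{SUB} = O((\vert t_0 \vert + 1) \vert ρ \vert_β^2)$$

So the total cost is linear in the size of the initial term and quadratic in the number of $β$ steps.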
digraph {
rankdir=LR;
λ -> MAM[label="t₀ρ_β²"];
MAM -> RAM -> TM;
TM -> λ[label="linear"];
}
Call-by-Value evaluation
We saw that there’s an efficient abstract machine to implement call-by-name evaluation. But reasonable cost models are not about finding such efficient machines.
But $λ$-calculus is bigger, and there are terms where the evaluation strategy matters.
Ex: Duplicator:
⟹ CBN seems to be silly (we duplicate work), but we just showed that it is reasonable.
Ex: Eraser:
⟹ CBV seems to be silly, but we can show that it is reasonable too.
Being reasonable has nothing to do with finding an efficient strategy. It means that the overhead is not too complex (polynomial) ⟹ relative efficiency.
But still, are there unreasonable strategies? Yes (there’s one example in the literature: the one by Jean-Jacques Lévy).
Is there an optimal strategy (that takes the least number of steps)? Not a computable one: the optimal strategy is not recursive.
But even though it is not recursive, we can have a notion of parallel optimal strategy, which is recursive (shown by Lévy). But Lévy didn’t know how to implement it. It was done a few years later by someone else. Question that arose: can we take $k$ (the minimal number of steps for this optimal parallel strategy) as the complexity of $t$? It was proven that no, it’s not reasonable (this is an example of an unreasonable strategy).
But that doesn’t make it useless: just because it’s not reasonable doesn’t mean that it’s not efficient (hidden but wrong assumption: steps count as 1, i.e. they are reasonable). Nowadays, we still don’t know whether it’s efficient or not.
Comparing the number of steps of strategies that are not reasonable doesn’t make sense.
Weak Call-by-Value (CBV) $λ$-calculus
NB: we consider only the weak version (we don’t reduce under $λ$’s) because generally it’s used with a programming application in mind
Harmony property: if $t$ is closed, then
 either $t ⟶_{wβ_v} t'$
 or $t$ is a value (an abstraction)
Proof: by induction on $t$.
So if $t$ is closed:
 either $t$ reduces to a value
 or it diverges
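Harmony is what makes a simple evaluation loop well-behaved on closed terms: it never gets stuck on something that is not a value. The sketch below assumes the standard weak $β_v$ rule (fire $(λx.t)v$ only when $v$ is an abstraction, never under a $λ$) and an argument-first order picked purely for illustration. `step_cbv`, `evaluate` and `fuel` are hypothetical names, and divergence can only be observed as running out of fuel.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def subst(t, x, v):
    """Naive t{x <- v}; safe because v is assumed to be a closed value."""
    if isinstance(t, Var):
        return v if t.name == x else t
    if isinstance(t, Lam):
        return t if t.param == x else Lam(t.param, subst(t.body, x, v))
    return App(subst(t.fun, x, v), subst(t.arg, x, v))

def is_value(t):
    return isinstance(t, Lam)

def step_cbv(t):
    """One weak beta_v step, or None if no step applies."""
    if isinstance(t, App):
        s = step_cbv(t.arg)              # evaluate the argument first
        if s is not None:
            return App(t.fun, s)
        s = step_cbv(t.fun)              # then the function part
        if s is not None:
            return App(s, t.arg)
        if isinstance(t.fun, Lam) and is_value(t.arg):
            return subst(t.fun.body, t.fun.param, t.arg)
    return None                          # variables and abstractions: no step (weak)

def evaluate(t, fuel=1000):
    """Return (term, steps); the term is a value if evaluation finished in time."""
    steps = 0
    while steps < fuel:
        nxt = step_cbv(t)
        if nxt is None:
            break
        t, steps = nxt, steps + 1
    return t, steps

I = Lam("z", Var("z"))
delta = Lam("x", App(Var("x"), Var("x")))
result = evaluate(App(delta, App(I, I)))   # (I, 3): delta(II) -> delta I -> II -> I
looping = evaluate(App(delta, delta), 50)  # fuel exhausted: never a value
```

On the closed term $δ(II)$ the loop stops on the value $I$, while on $δδ$ every step returns the same term, so the loop only stops when the fuel runs out.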
NB: if we remove variables from values, nothing changes.
In theoretical papers, values are defined as
In papers about abstract machines (practical), values are defined as
⟹ Why does nothing change? Because, in a closed term, a variable can never be an argument of a redex that fires: the only way to have a subterm of the form $tx$ is to bind the variable, as in $λx.tx$, and then we’re stuck, since we don’t reduce under abstractions.
This is better than confluent, it has the diamond property:
⟹ we can always close the diagram in one step.
It fails when we’re allowed to duplicate redexes, as in $δ(II)$. But in CBV, we can’t: the only terms we can duplicate are values, which are normal.
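A worked instance of this failure, with $δ = λx.xx$ and $I = λz.z$ (a sketch added for illustration). In the unrestricted calculus, $δ(II)$ has two redexes:

$$δ(II) ⟶_β (II)(II) \qquad \text{and} \qquad δ(II) ⟶_β δI$$

From $δI$, one step gives $II$. But from $(II)(II)$, where the redex $II$ has been duplicated, two steps are needed to reach $II$:

$$(II)(II) ⟶_β I(II) ⟶_β II$$

so the diagram cannot be closed in one step on both sides. In weak CBV the first step is not allowed, because the argument $II$ is not a value, and the only reduction sequence is:

$$δ(II) ⟶_{wβ_v} δI ⟶_{wβ_v} II ⟶_{wβ_v} I$$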
On top of that, all reduction sequences have the same length.
Right-to-left strategy (the only rule that changes):
So contexts are given by:
AM:
We can rewrite the last ones in the same fashion, where $A$ is an applicative context:
| Code | Environment | | Code | Environment |
|---|---|---|---|---|
| $A⟨(λx.t)u⟩$ | $E$ | $\leadsto_β$ | $A⟨t⟩$ | $[x ← u] :: E$ |
| $A⟨x⟩$ | $E_1 :: [x ← u] :: E_2$ | $\leadsto_{SUB}$ | $A⟨u^α⟩$ | $E_1 :: [x ← u] :: E_2$ |