Lecture 1: Introduction
Introduction
\[\newcommand{\dom}{\mathop{\rm dom}\nolimits}\]
Regular word languages
graph {
rankdir=LR;
"finite word automata" -- "regular word expressions", "MSO on words", "finite monoids";
}
-
regular expressions: denotational
-
finite automata: operational
-
MSO: logic
-
finite monoids: algebraic, reason about the infixes of the language
- \[\begin{cases} φ: Σ^\ast ⟶ M \\ ε \mapsto 1 \\ u v w ⟼ φ(u)φ(v)φ(w) \end{cases}\]
- aperiodic monoids ⇔ first order logic
graph {
rankdir=LR;
"finite tree automata" -- "regular tree expressions", "MSO on trees", "finite algebra";
}
regular tree expressions, finite algebra: not that easy to manipulate…
Definitions
Tree
- A ranked alphabet:
-
is a pair $⟨𝔉, arity⟩$ where $arity: 𝔉 ⟶ ℕ$
notations:
- $f^{(n)}$ for $f ∈ 𝔉$ with arity $n$
- $𝔉_n ≝ \lbrace f ∈ 𝔉 \mid arity(f) = n\rbrace$, \(𝔉 ≝ \bigcup_{n∈ℕ} 𝔉_n\)
Rooted, ordered, labelled, finite trees
- as partial functions:
-
let $𝔉$ be a ranked alphabet. A partial function $t: ℕ_{>0}^\ast ⟶ 𝔉$ is a tree iff
-
${\rm dom} (t)$ is finite and non-empty
-
${\rm dom} (t)$ is prefix-closed: \(∀p, p' ∈ ℕ_{>0}^\ast, \qquad pp' ∈ {\rm dom} (t) ⟹ p ∈ {\rm dom} (t)\)
-
labels are consistent with $𝔉$: $∀p ∈ {\rm dom} (t)$, if $t(p) ∈ 𝔉_n$ then \(\lbrace pi ∈ {\rm dom} (t) \mid i∈ ℕ_{>0} \rbrace = \lbrace p1, \ldots, pn \rbrace\)
-
graph {
rankdir=TB;
"f^(2) | ε" -- "g^(1) | 1", "g^(1) | 2";
"g^(1) | 1" -- "a^(0) | 11";
"g^(1) | 2" -- "b^(0) | 21";
}
notation: $T(𝔉)$ is the set of trees labelled by $𝔉$
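The three conditions can be checked mechanically. Below is a minimal Python sketch, assuming a tree is stored as a dict from positions (tuples of positive integers) to symbols and the ranked alphabet as a dict from symbols to arities; the names `ARITY` and `is_tree` are illustrative and not part of the lecture.

```python
# A ranked alphabet as a dict symbol -> arity (illustrative encoding).
ARITY = {"f": 2, "g": 1, "a": 0, "b": 0}

def is_tree(t, arity):
    """Check the three conditions for a dict t: position (tuple of ints > 0) -> symbol."""
    if not t:                                   # dom(t) must be non-empty (a dict is finite)
        return False
    for p, sym in t.items():
        if sym not in arity:
            return False
        if p and p[:-1] not in t:               # parent in dom(t) for every p gives prefix-closure
            return False
        children = {q[-1] for q in t if len(q) == len(p) + 1 and q[:-1] == p}
        if children != set(range(1, arity[sym] + 1)):   # children are exactly p1, ..., pn
            return False
    return True

# The tree f(g(a), g(b)) drawn above.
t = {(): "f", (1,): "g", (2,): "g", (1, 1): "a", (2, 1): "b"}
```

Checking only that the parent of each position belongs to the domain is enough for prefix-closure, since every proper prefix is reached by repeatedly taking parents.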
- the subtree of $t$ at $p ∈ {\rm dom} (t)$ is:
-
$t_{|p}$, defined by \(\begin{cases} {\rm dom} (t_{|p}) ≝ \lbrace p' \mid pp' ∈ {\rm dom} (t) \rbrace \\ t_{|p}(p') ≝ t(pp') \end{cases}\)
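With the same assumed dict encoding (positions as tuples, an illustrative choice), the subtree at $p$ is obtained by keeping the positions that start with $p$ and stripping that prefix:

```python
def subtree(t, p):
    """Return t|p for a tree t given as dict position -> symbol, with p in dom(t)."""
    if p not in t:
        raise KeyError(f"{p} is not a position of the tree")
    n = len(p)
    # dom(t|p) = { p' | p p' in dom(t) } and t|p(p') = t(p p')
    return {q[n:]: sym for q, sym in t.items() if q[:n] == p}

# The tree f(g(a), g(b)).
t = {(): "f", (1,): "g", (2,): "g", (1, 1): "a", (2, 1): "b"}
```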
- A non-deterministic finite tree automaton (NFTA) is:
-
a tuple $𝒜 ≝ ⟨Q, 𝔉, Q_f, Δ⟩$ where
- $Q$ is a finite set of states
- $𝔉$ is a finite ranked alphabet
- $Q_f ⊆ Q$ is a set of final states
- $Δ ⊆ \bigcup_n Q × 𝔉_n × Q^n$
- A run of $𝒜$ on a tree $t$ is:
-
a tree $ρ ∈ T(Q × ℕ)$, where $arity(q, n) ≝ n$, such that
- $\dom ρ = \dom t$
- \[∀p ∈ \dom ρ, \text{ if } ρ(p) = (q, n) \text{ then } t(p) ∈ 𝔉_n \text{ and } ∃ (q, t(p), q_1, \ldots, q_n)∈ Δ \text{ s.t. } ∀ 1 ≤ i ≤ n, \; ρ(pi) = (q_i, n_i) \text{ for some } n_i\]
A run $ρ$ is accepting if $ρ(ε) = (q, n)$ for some $q ∈ Q_f$
- The language of $𝒜$ is:
- \[L(𝒜) ≝ \lbrace t ∈ T(𝔉) \mid ∃ \text{ an accepting run of } 𝒜 \text{ on } t \rbrace\]
Example:
-
$Q ≝ \lbrace q_f, q_g, q_a, q_b \rbrace$
-
$Q_f ≝ \lbrace q_f \rbrace$
-
$𝔉 ≝ \lbrace f^{(2)}, g^{(1)}, a^{(0)}, b^{(0)}\rbrace$
- \[Δ ≝ \lbrace (q_f, f^{(2)}, q_g, q_g), \\ (q_g, g^{(1)}, q_g), \\ (q_g, g^{(1)}, q_a), \\ (q_g, g^{(1)}, q_b), \\ (q_a, a^{(0)}), \\ (q_b, b^{(0)}) \rbrace\]
graph {
rankdir=TB;
q_g1[label= "g^(1) | q_g"];
q_g2[label= "g^(1) | q_g"];
"f^(2) | q_f" -- q_g1, q_g2;
q_g1 -- "a^(0) | q_a";
q_g2 -- "b^(0) | q_b";
}
graph {
rankdir=TB;
g1[label= "g"];
g2[label= "g"];
g3[label= "g"];
g4[label= "g"];
b1[label= "⋮"];
b2[label= "⋮"];
f -- g1, g2;
g1 -- b1 -- g3;
g2 -- b2 -- g4;
l1[label= "a or b"];
l2[label= "a or b"];
g3 -- l1;
g4 -- l2;
}
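Membership in the language of this example can be tested by computing, bottom-up, the set of states that some run can put at the root. This is a minimal Python sketch, assuming trees are nested tuples `(symbol, child_1, ..., child_n)` and transitions are stored in a dict; the state names `"qf"`, `"qg"`, `"qa"`, `"qb"` stand for $q_f, q_g, q_a, q_b$, and the function names are illustrative.

```python
from itertools import product

# Transitions (q, f, q_1, ..., q_n) of the example, stored as (f, (q_1, ..., q_n)) -> {q}.
DELTA = {
    ("f", ("qg", "qg")): {"qf"},
    ("g", ("qg",)): {"qg"},
    ("g", ("qa",)): {"qg"},
    ("g", ("qb",)): {"qg"},
    ("a", ()): {"qa"},
    ("b", ()): {"qb"},
}
FINAL = {"qf"}

def states(t, delta):
    """States q such that some run labels the root of t with q."""
    sym, *children = t
    child_states = [states(c, delta) for c in children]
    result = set()
    for qs in product(*child_states):      # one choice of state per child
        result |= delta.get((sym, qs), set())
    return result

def accepts(t, delta, final):
    """True iff some run on t ends in a final state at the root."""
    return bool(states(t, delta) & final)
```

For instance, `accepts(("f", ("g", ("a",)), ("g", ("g", ("b",)))), DELTA, FINAL)` holds, while `f(a, g(b))` is rejected because no transition reads $f$ above $q_a$.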
Example:
\(𝔉 ≝ \lbrace ∨, ∧, ¬, \top, \bot \rbrace\) (with obvious arities)
$𝒜$ s.t. $L(𝒜)$ is the set of Boolean formulae that evaluate to true
-
$Q ≝ \lbrace q_0, q_1 \rbrace$, $Q_f ≝ \lbrace q_1 \rbrace$
- \[Δ ≝ \lbrace (q_1, \top), \\ (q_0, \bot), \\ (q_1, ¬, q_0), \\ (q_0, ¬, q_1), \\ (q_1, ∧, q_1, q_1), (q_0, ∧, q_1, q_0), \\ (q_0, ∧, q_0, q_1), (q_0, ∧, q_0, q_0), \\ \vdots \\ \rbrace\]
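One way to double-check such a transition set is to generate it from the truth tables of the connectives. The sketch below does this in Python, assuming the symbol names `"top"`, `"bot"`, `"not"`, `"and"`, `"or"` for $\top, \bot, ¬, ∧, ∨$ and the integers `1`, `0` for $q_1, q_0$; these encodings are illustrative choices.

```python
from itertools import product

# Arity and truth table of each connective (illustrative names).
OPS = {
    "top": (0, lambda: True),
    "bot": (0, lambda: False),
    "not": (1, lambda x: not x),
    "and": (2, lambda x, y: x and y),
    "or": (2, lambda x, y: x or y),
}

# State 1 plays the role of q_1 ("evaluates to true"), state 0 that of q_0.
DELTA = {
    (int(fn(*map(bool, args))), sym, *args)
    for sym, (n, fn) in OPS.items()
    for args in product((0, 1), repeat=n)
}
FINAL = {1}

def state(t):
    """The unique state reached at the root of formula t (nested tuples)."""
    sym, *children = t
    qs = tuple(state(c) for c in children)
    (q,) = [q for (q, s, *rest) in DELTA if s == sym and tuple(rest) == qs]
    return q
```

The generated automaton has exactly one transition per symbol and tuple of child states, so computing `state(t)` amounts to evaluating the formula.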
Bottom-up
- Inductive definition:
-
let $𝔉$ be a ranked alphabet. The set of trees is the smallest set s.t.
- every $a^{(0)} ∈ 𝔉_0$ is a tree
- if $t_1, \ldots, t_n$ are trees and $f^{(n)} ∈ 𝔉_n$, then $f^{(n)}(t_1, \ldots, t_n)$ is a tree
Let $𝒳$ be a countable set of variables with $𝒳 ∩ 𝔉 = ∅$.
We set $arity(x) = 0$ for all $x∈𝒳$.
A tree in $T(𝔉 ∪ 𝒳)$ is called a term, and we write $T(𝔉, 𝒳)$ instead in that case.
- A substitution:
-
is a function $σ : 𝒳 ⟶ T(𝔉, 𝒳)$ with $\lbrace x∈𝒳 \mid σ(x) ≠ x\rbrace$ finite
It defines a function $T(𝔉, 𝒳) ⟶ T(𝔉, 𝒳)$ by congruence:
- $x σ ≝ σ(x)$
- $f^{(n)}(t_1, \ldots, t_n)σ ≝ f^{(n)}(t_1σ, \ldots, t_nσ)$
thus $a^{(0)} σ = a^{(0)}$
Example:
if \(σ : \begin{cases} x ⟼ a^{(0)} \\ y ⟼ g(x) \\ \end{cases}\)
$t$:
graph {
rankdir=TB;
g1[label= "g"];
f -- g1, y;
g1 -- x;
}
⇓
$tσ$:
graph {
rankdir=TB;
g1[label= "g"];
g2[label= "g"];
f -- g1, g2;
g1 -- a;
g2 -- x;
}
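The example can be replayed with a short Python sketch, assuming terms are nested tuples `(symbol, child_1, ..., child_n)` and a substitution is a dict listing only the variables it moves; `VARIABLES` and `apply` are illustrative names.

```python
VARIABLES = {"x", "y"}   # assumed finite stand-in for the countable set of variables

def apply(t, sigma):
    """Compute t sigma; terms are nested tuples (symbol, child_1, ..., child_n)."""
    sym, *children = t
    if sym in VARIABLES:
        return sigma.get(sym, t)          # sigma is the identity outside its finite support
    return (sym, *(apply(c, sigma) for c in children))

sigma = {"x": ("a",), "y": ("g", ("x",))}
t = ("f", ("g", ("x",)), ("y",))
```

Note that the substitution is applied in parallel: the $x$ introduced by $σ(y) = g(x)$ is not replaced again, which is why $tσ = f(g(a), g(x))$.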
- A term $t∈ T(𝔉, 𝒳)$ is linear:
-
if every variable of $𝒳$ appears at most once in $t$
- A context:
-
is a term in $T(𝔉, \lbrace \square \rbrace)$ where $\square$ occurs exactly once.
notation :
- $tσ$ is an instance of $t$.
- a term in $T(𝔉, 𝒳)$ with no variables is a ground term.
Example:
$t$:
graph {
rankdir=TB;
g1[label= "g"];
g2[label= "g"];
f -- g1, g2;
g1 -- a;
g2 -- b;
}
⇓
$t = C[t’]$, where $C[t’] ≝ C σ$ for $σ(\square) = t’ = g(b)$ and
$C ≝ $
graph {
rankdir=TB;
g2[label= "g"];
f -- g2, □;
g2 -- a;
}
and
$t’ ≝ $
graph {
rankdir=TB;
g -- b;
}
notations: $C(𝔉)$ for the set of contexts over $𝔉$
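Plugging a term into a context can be sketched as follows in Python, assuming nested-tuple terms and a dedicated constant for the hole; `HOLE`, `holes` and `plug` are illustrative names, and the context used here is $f(g(a), \square)$.

```python
HOLE = ("□",)   # the hole, treated as a constant

def holes(c):
    """Number of occurrences of the hole in c."""
    return 1 if c == HOLE else sum(holes(child) for child in c[1:])

def plug(c, t):
    """Compute C[t] for a context c (exactly one hole) and a term t."""
    assert holes(c) == 1, "a context contains exactly one hole"
    def go(u):
        return t if u == HOLE else (u[0], *(go(child) for child in u[1:]))
    return go(c)

C = ("f", ("g", ("a",)), HOLE)
```

Here `plug(C, ("g", ("b",)))` rebuilds the term $f(g(a), g(b))$.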
- A term rewriting system $R$ over $T(𝔉, 𝒳)$:
-
is a set of pairs $l ⟶ r$ with $l, r ∈ T(𝔉, 𝒳)$ and $vars(r) ⊆ vars(l)$, where \(vars(t) ≝ \lbrace x ∈ 𝒳 \mid ∃ p ∈ \dom t; \; t(p) = x \rbrace\)
$R$ defines a rewriting relation $⟶_R ⊆ T(𝔉) × T(𝔉)$ by:
- $t ⟶_R t’$
- iff $∃ C ∈ C(𝔉)$, a rule $l ⟶ r$ in $R$ and a substitution $σ$ s.t. $t = C[lσ]$ and $t’ = C[rσ]$
Example:
\[g(x) ⟶ g(g(x))\]
$t$:
graph {
rankdir=TB;
g1[label= "g"];
g2[label= "g"];
f -- g1, g2;
g1 -- a;
g2 -- b;
}
1) $C = f(\square, g(b))$, $σ= x ⟼ a$
$t ⟶_R$
graph {
rankdir=TB;
g1[label= "g"];
g2[label= "g"];
g3[label= "g"];
f -- g1, g2;
g1 -- g3;
g2 -- b;
g3 -- a;
}
2) $C = f(g(a), \square)$, $σ= x ⟼ b$
$t ⟶_R$
graph {
rankdir=TB;
g1[label= "g"];
g2[label= "g"];
g3[label= "g"];
f -- g1, g2;
g1 -- a;
g2 -- g3;
g3 -- b;
}
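Example 2: a non-example. With the rule below, the step drawn underneath is not a valid rewrite: with $C = \square$ it would require $f(x, x)σ = t$, i.e. $σ(x) = g(a)$ and $σ(x) = g(b)$ at the same time.
\[f(x, x) ⟶ x\]
$t$: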
graph {
rankdir=TB;
g1[label= "g"];
g2[label= "g"];
f -- g1, g2;
g1 -- a;
g2 -- b;
}
$⟶_R$
graph {
rankdir=TB;
g -- a;
}
where $C = \square$, $σ : x ⟼ g(a)$
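Both examples can be replayed with a small matcher. The Python sketch below enumerates all one-step rewrites of a term under a single rule, assuming nested-tuple terms and a single variable `x`; `match`, `apply` and `rewrite_steps` are illustrative names.

```python
VARIABLES = {"x"}

def match(pattern, t, sigma):
    """Extend sigma so that pattern sigma == t, or return None if impossible."""
    if pattern[0] in VARIABLES:
        if pattern[0] in sigma and sigma[pattern[0]] != t:
            return None                    # a repeated variable must match equal subterms
        return {**sigma, pattern[0]: t}
    if pattern[0] != t[0] or len(pattern) != len(t):
        return None
    for p_child, t_child in zip(pattern[1:], t[1:]):
        sigma = match(p_child, t_child, sigma)
        if sigma is None:
            return None
    return sigma

def apply(t, sigma):
    """Apply sigma to t; every variable of t is assumed to be bound by sigma."""
    if t[0] in VARIABLES:
        return sigma[t[0]]
    return (t[0], *(apply(c, sigma) for c in t[1:]))

def rewrite_steps(t, lhs, rhs):
    """All t' with t -> t' using the single rule lhs -> rhs."""
    out = []
    sigma = match(lhs, t, {})
    if sigma is not None:                  # redex at the root (the context is the hole)
        out.append(apply(rhs, sigma))
    for i in range(1, len(t)):             # redex inside the i-th child
        for child in rewrite_steps(t[i], lhs, rhs):
            out.append(t[:i] + (child,) + t[i + 1:])
    return out

t = ("f", ("g", ("a",)), ("g", ("b",)))
```

With the rule $g(x) ⟶ g(g(x))$ the function returns the two terms drawn in the first example. With $f(x, x) ⟶ x$ it returns nothing for $t$, because the matcher refuses to bind $x$ to both $g(a)$ and $g(b)$.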
Bottom-up tree automata
Given an NFTA $𝒜 ≝ ⟨Q, 𝔉, Q_f, Δ⟩$, we define the bottom-up rewrite rules by:
- $f^{(n)}(q_1, \ldots, q_n) ⟶_𝒜 q$ for all $(q, f^{(n)}, q_1, \ldots, q_n) ∈ Δ$, setting $arity(q) = 0$ for all $q∈Q$
Proposition: \(L(𝒜) = \lbrace t ∈ T(𝔉) \mid ∃ q∈Q_f; t⟶_𝒜^\ast q \rbrace\)
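As a worked instance of the proposition, here is one possible rewriting sequence for the tree $f(g(a), g(b))$ with the transitions of the first example automaton (where $Q_f = \lbrace q_f \rbrace$); the order in which the leaves are reduced is an arbitrary choice:
\[f(g(a), g(b)) ⟶_𝒜 f(g(q_a), g(b)) ⟶_𝒜 f(q_g, g(b)) ⟶_𝒜 f(q_g, g(q_b)) ⟶_𝒜 f(q_g, q_g) ⟶_𝒜 q_f\]
Since $q_f ∈ Q_f$, the tree belongs to $L(𝒜)$.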
Example: for the previous automaton recognizing the Boolean formulae that evaluate to true:
\[\top ⟶ q_1 \\ ¬(q_1) ⟶ q_0 \\ ∨(q_0, q_1) ⟶ q_1 \\ \vdots\]
Other view:
$arity(q) ≝ 1$ for all $q∈ Q$
and
\[f^{(n)}(q_1(x_1), \ldots, q_n(x_n)) ⟶_𝒜 q(f^{(n)}(x_1, \ldots, x_n))\]
in that view:
\[L(𝒜) = \lbrace t ∈ T(𝔉) \mid ∃q ∈ Q_f; \; t ⟶_𝒜^\ast q(t) \rbrace\]
- an NFTA is complete:
-
if $∀n, ∀f∈ 𝔉_n, ∀q_1, \ldots, q_n ∈ Q, \; ∃ q∈Q$ s.t. \((q, f^{(n)}, q_1, \ldots, q_n) ∈ Δ\)
NB: If $𝒜$ is complete, for all $t$, there exists a $q$ s.t. $t ⟶_𝒜^\ast q$
- an NFTA is deterministic:
-
if $∀n, ∀f∈ 𝔉_n, ∀q_1, \ldots, q_n ∈ Q$, \(\vert \lbrace q∈ Q \mid (q, f^{(n)}, q_1, \ldots, q_n)∈ Δ \rbrace \vert ≤ 1\)
NB: If $𝒜$ is deterministic, for all $t$, there exists at most one $q$ s.t. $t ⟶_𝒜^\ast q$
NB: If $𝒜$ is complete and deterministic, for all $t$, there exists a unique $q$ s.t. $t ⟶_𝒜^\ast q$
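Both properties are simple counting conditions on $Δ$. A minimal Python sketch, assuming transitions are tuples `(q, f, q_1, ..., q_n)` and using the first example automaton as data; the function names are illustrative.

```python
from itertools import product

def targets(delta, sym, qs):
    """All q with (q, sym, *qs) in delta; transitions are tuples (q, f, q_1, ..., q_n)."""
    return {tr[0] for tr in delta if tr[1] == sym and tr[2:] == qs}

def is_complete(states, arity, delta):
    """At least one target for every symbol and every tuple of child states."""
    return all(len(targets(delta, f, qs)) >= 1
               for f, n in arity.items() for qs in product(states, repeat=n))

def is_deterministic(states, arity, delta):
    """At most one target for every symbol and every tuple of child states."""
    return all(len(targets(delta, f, qs)) <= 1
               for f, n in arity.items() for qs in product(states, repeat=n))

ARITY = {"f": 2, "g": 1, "a": 0, "b": 0}
STATES = {"qf", "qg", "qa", "qb"}
DELTA = {
    ("qf", "f", "qg", "qg"),
    ("qg", "g", "qg"), ("qg", "g", "qa"), ("qg", "g", "qb"),
    ("qa", "a"), ("qb", "b"),
}
```

Under this encoding the first example automaton is deterministic but not complete: for instance, no transition reads $g$ above $q_f$.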
Closure properties
As for word automata, one can complete and determinize any NFTA.
Proposition: Given an NFTA $𝒜$, we can construct an equivalent complete NFTA $𝒜’$.
\[Q' ≝ Q \sqcup \lbrace sink \rbrace\]
We add a transition to the non-final state $sink$ for every symbol and every tuple of states, which covers in particular any missing transition:
\[Δ' ≝ Δ ∪ \lbrace (sink, f^{(n)}, q_1, \ldots, q_n) \mid f∈ 𝔉_n, \; q_1, \ldots, q_n ∈ Q'\rbrace\]
Proposition: Given an NFTA $𝒜$, we can construct an equivalent complete and deterministic FTA $𝒜’$.
- $Q’ ≝ 2^Q$
- $Q_f’ ≝ \lbrace E ⊆ Q \mid E ∩ Q_f ≠ ∅ \rbrace$
- \[Δ' ≝ \Big\lbrace (E, f^{(n)}, E_1, \ldots, E_n) \mid E_1, \ldots, E_n ⊆ Q \text{ and } \\ E = \lbrace q ∈ Q \mid ∃(q_1, \ldots, q_n) ∈ E_1 × ⋯ × E_n; \; (q, f^{(n)}, q_1, \ldots, q_n) ∈ Δ \rbrace \Big\rbrace\]
The construction satisfies, for every $t ∈ T(𝔉)$ and every $E ⊆ Q$:
\[t ⟶_{𝒜'}^\ast E ⟺ E = \lbrace q ∈ Q \mid t ⟶_𝒜^\ast q\rbrace\]
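The subset construction can be sketched directly in Python. This is a minimal version that enumerates all of $2^Q$ (exponential, so only suitable for small automata), assuming transitions are tuples `(q, f, q_1, ..., q_n)`; the small non-deterministic automaton used as data is hypothetical and not taken from the lecture.

```python
from itertools import combinations, product

def powerset(states):
    """All subsets of states, as frozensets."""
    s = sorted(states)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def determinize(states, arity, final, delta):
    """Subset construction; delta is a set of tuples (q, f, q_1, ..., q_n)."""
    subsets = powerset(states)
    new_delta = {}
    for f, n in arity.items():
        for es in product(subsets, repeat=n):
            # E = { q | some (q_1..q_n) in E_1 x ... x E_n has (q, f, q_1..q_n) in delta }
            e = frozenset(tr[0] for tr in delta
                          if tr[1] == f and all(q in e_i for q, e_i in zip(tr[2:], es)))
            new_delta[(f, es)] = e
    new_final = {e for e in subsets if e & final}
    return subsets, new_final, new_delta

# Hypothetical automaton: a -> q1, a -> q2, f(q1, q2) -> qf, accepting only f(a, a).
STATES = {"q1", "q2", "qf"}
ARITY = {"a": 0, "f": 2}
FINAL = {"qf"}
DELTA = {("q1", "a"), ("q2", "a"), ("qf", "f", "q1", "q2")}
subsets, new_final, new_delta = determinize(STATES, ARITY, FINAL, DELTA)
```

Storing the new transitions in a dict keyed by `(f, (E_1, ..., E_n))` makes the result complete and deterministic by construction: every key has exactly one target, possibly the empty set, which acts as a sink.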
Closure properties
Boolean closure quite similar to the word case:
-
Union is disjoint union of tree automata
-
Complementation: determinize and complete, then complement $Q_f$
-
Intersection: product automaton
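The product construction for intersection can be sketched as follows in Python, assuming both automata share the same ranked alphabet and store transitions as tuples `(q, f, q_1, ..., q_n)`. The two small automata over $\lbrace g^{(1)}, a^{(0)}, b^{(0)} \rbrace$ are hypothetical examples: the first accepts the trees $g^k(a)$, the second the trees with an even number of $g$.

```python
def intersection(a, b):
    """Product of two NFTAs given as (final_states, delta) over the same alphabet.

    delta is a set of tuples (q, f, q_1, ..., q_n); product states are pairs.
    """
    final_a, delta_a = a
    final_b, delta_b = b
    delta = {
        ((ta[0], tb[0]), ta[1], *zip(ta[2:], tb[2:]))
        for ta in delta_a for tb in delta_b
        if ta[1] == tb[1] and len(ta) == len(tb)
    }
    final = {(p, q) for p in final_a for q in final_b}
    return final, delta

def accepts(t, final, delta):
    """Bottom-up membership test; trees are nested tuples (symbol, children...)."""
    def states(u):
        child = [states(c) for c in u[1:]]
        return {tr[0] for tr in delta
                if tr[1] == u[0] and len(tr) == len(u) + 1
                and all(q in s for q, s in zip(tr[2:], child))}
    return bool(states(t) & final)

A = ({"p"}, {("p", "a"), ("p", "g", "p")})
B = ({"e"}, {("e", "a"), ("e", "b"), ("o", "g", "e"), ("e", "g", "o")})
FINAL, DELTA = intersection(A, B)
```

The product accepts exactly the trees $g^{2k}(a)$: a pair of states is reachable only if both components are reachable on the same tree.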
Top-down tree automata
Given an NFTA $𝒜 ≝ ⟨Q, 𝔉, Q_f, Δ⟩$, we define the top-down rewrite rules by:
- $q ⟶_𝒜 f^{(n)}(q_1, \ldots, q_n)$ for all $(q, f^{(n)}, q_1, \ldots, q_n) ∈ Δ$, setting $arity(q) = 0$ for all $q∈Q$
Proposition: \(L(𝒜) = \lbrace t ∈ T(𝔉) \mid ∃ q∈Q_f; q ⟶_𝒜^\ast t \rbrace\)
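Read top-down, the same transitions give a recursive membership test that starts from a final state at the root and guesses a transition at each node. A minimal Python sketch, reusing the transitions of the first example automaton and assuming nested-tuple trees; `derives` and `accepts` are illustrative names.

```python
# Transitions (q, f, q_1, ..., q_n) of the first example automaton.
DELTA = {
    ("qf", "f", "qg", "qg"),
    ("qg", "g", "qg"), ("qg", "g", "qa"), ("qg", "g", "qb"),
    ("qa", "a"), ("qb", "b"),
}
FINAL = {"qf"}

def derives(q, t):
    """True iff q ->* t using the rules q -> f(q_1, ..., q_n), read top-down."""
    sym, *children = t
    return any(
        all(derives(qi, c) for qi, c in zip(tr[2:], children))
        for tr in DELTA
        if tr[0] == q and tr[1] == sym and len(tr) == len(t) + 1
    )

def accepts(t):
    """True iff some final state derives t."""
    return any(derives(q, t) for q in FINAL)
```

The test explores every transition compatible with the current state and symbol, so it handles the non-determinism of the top-down reading by backtracking.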
Example:
If the language is \(\lbrace f(a, b), f(b, a) \rbrace\)
then
\[Δ ≝ \lbrace (q_f, f, q_1, q_2), (q_f, f, q_2, q_1), (q_1, a), (q_2, b) \rbrace\]
(chap. 1, 3, 8)