Lecture 2: Tree languages that are not regular
graph {
rankdir=TB;
f[label="f^(2)"];
g1[label="g^(3)"];
h1, h2, h3[label="h^(2)"];
a1, a2, a3[label="a^(0)"];
g2, g3[label="g^(3)"];
d1, d2[label="d^(0)"];
b1, b2, b3[label="b^(0)"];
c1, c2, c3[label="c^(0)"];
bl1, bl2[label="⋮"];
f -- g1, h1;
g1 -- a1, g2, b1;
g2 -- bl1 -- a2, g3, b2;
g3 -- a3, d1, b3;
h1 -- h2, c1;
h2 -- bl2 -- h3, c2;
h3 -- d2, c3;
}
- Yield (the word read on the leaves, from left to right):
- \[Yield(a^{(0)}) ≝ a \\ Yield(f^{(n)}(t_1, \ldots, t_n)) ≝ Yield(t_1) ⋯ Yield(t_n) \\ Yield(L) ≝ \lbrace Yield(t) \mid t ∈ L \rbrace\]
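The definition of Yield can be tried out on small trees. The sketch below is only illustrative: the tuple encoding of trees and the names `tree_yield` and `language_yield` are assumptions, not part of the lecture.

```python
# Assumed encoding (not from the notes): a tree is a tuple
# (label, child_1, ..., child_n); a constant a^(0) is the 1-tuple ("a",).

def tree_yield(t):
    """Yield(t): the leaf labels of t, concatenated from left to right."""
    label, *children = t
    if not children:                      # Yield(a) = a
        return label
    return "".join(tree_yield(c) for c in children)


def language_yield(trees):
    """Yield(L) for a finite set of trees L: the set of the yields."""
    return {tree_yield(t) for t in trees}


d = ("d",)
# f(g(a, d, b), h(d, c)): the shape of the figure with one g and one h
example = ("f", ("g", ("a",), d, ("b",)), ("h", d, ("c",)))
print(tree_yield(example))
```

Here `language_yield` only handles finite sets of trees; for an infinite regular language the yield is described by the grammar constructed in the theorem that follows.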
Theorem: If $L⊆ T(𝔉)$ is regular then $Yield(L) ⊆ 𝔉_0^\ast$ is context-free
Proof: Let $𝒜$ be an NTFA with $L = L(𝒜)$.
We construct a CFG $G$ s.t. $Yield(L) = L(G)$:
- $𝒜 =(Q, 𝔉, Δ, Q_f)$
- $G ≝ (N, Σ, P, S)$
where:
- $N ≝ Q \sqcup \lbrace S \rbrace$
- $Σ ≝ 𝔉_0$
- \[P ≝ \lbrace S ⟶ q \mid q ∈ Q_f \rbrace \\ ∪ \lbrace q ⟶ q_1 ⋯ q_n \mid ∃ f ∈ 𝔉_n, (q, f, q_1, \ldots, q_n) ∈ Δ, n>0\rbrace \\ ∪ \lbrace q ⟶ a \mid (q,a) ∈ Δ \rbrace\]
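The construction of $P$ can be sketched in a few lines. This is a minimal sketch under assumptions: rules of $Δ$ are encoded as tuples `(q, f, q_1, ..., q_n)` as in the notes, productions as pairs `(lhs, rhs)`, and the fresh start symbol `"S"` is assumed not to be a state name; the function name `cfg_of_automaton` is hypothetical.

```python
def cfg_of_automaton(rules, final_states, start="S"):
    """Productions of G, each a pair (lhs, rhs) with rhs a tuple of symbols.

    rules: set of tuples (q, f, q_1, ..., q_n), one per transition of Δ.
    start: fresh start symbol, assumed not to be a state name.
    """
    productions = {(start, (q,)) for q in final_states}   # S -> q, q final
    for q, f, *children in rules:
        if children:
            # q -> q_1 ... q_n: the inner label f is forgotten
            productions.add((q, tuple(children)))
        else:
            # q -> a for a constant a
            productions.add((q, (f,)))
    return productions


# Example automaton (an assumption for illustration): q -> f(qa, q, qb) | d
RULES = {
    ("q", "f", "qa", "q", "qb"),
    ("q", "d"),
    ("qa", "a"),
    ("qb", "b"),
}
P = cfg_of_automaton(RULES, {"q"})
for lhs, rhs in sorted(P):
    print(lhs, "->", " ".join(rhs))
```

The resulting grammar generates $\lbrace a^n d b^n \mid n ≥ 0 \rbrace$, which is the yield of the trees accepted by this example automaton.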
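Theorem: If $G$ is a CFG, then its set of derivation trees $T(G)$ is regular.
Proof: Let $G ≝ (N, Σ, P, S)$. We construct an NTFA $𝒜 ≝ (Q, 𝔉, Δ, Q_f)$ s.t. $L(𝒜) = T(G)$, where:
- $𝔉_0 ≝ Σ ∪ \lbrace ε^{(0)} \rbrace$
- $𝔉_n ≝ \lbrace A^{(n)} \mid A ⟶ X_1 ⋯ X_n ∈ P \rbrace$ for $n > 0$, together with $A^{(1)} ∈ 𝔉_1$ for each $A ⟶ ε ∈ P$
- $Q ≝ N ∪ Σ ∪ \lbrace ε \rbrace$
- $Q_f ≝ \lbrace S \rbrace$
- \[Δ ≝ \lbrace (A, A^{(1)}, ε) \mid A ⟶ ε ∈ P \rbrace \\ ∪ \lbrace (A, A^{(n)}, X_1, \ldots, X_n) \mid n>0, A ⟶ X_1 ⋯ X_n ∈ P \rbrace \\ ∪ \lbrace (a, a^{(0)}) \mid a ∈ Σ \rbrace \\ ∪ \lbrace (ε, ε^{(0)}) \rbrace\]
Ex:
$S ⟶ a S b \mid ε$ gives $Δ = \lbrace (S, S^{(3)}, a, S, b), (S, S^{(1)}, ε), (a, a^{(0)}), (b, b^{(0)}), (ε, ε^{(0)}) \rbrace$
NB: $T(G)$ is a local tree language, since the state is determined by the label of the current node.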
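The automaton for the derivation trees can be computed mechanically from the productions. The sketch below assumes productions are given as pairs `(A, rhs)` with `rhs` a tuple of symbols (the empty tuple for $ε$) and encodes rules of $Δ$ as tuples `(state, label, child_states...)`; the name `derivation_tree_automaton` is hypothetical. The arity of a label is left implicit in the length of the rule.

```python
EPS = "ε"


def derivation_tree_automaton(productions, terminals, start):
    """Return (rules, final_states) of an automaton for the derivation trees.

    productions: iterable of (A, rhs), rhs a tuple of symbols, () for ε.
    """
    rules = set()
    for lhs, rhs in productions:
        if rhs:
            rules.add((lhs, lhs, *rhs))      # (A, A^(n), X_1, ..., X_n)
        else:
            rules.add((lhs, lhs, EPS))       # (A, A^(1), ε)
    rules |= {(a, a) for a in terminals}     # (a, a^(0))
    rules.add((EPS, EPS))                    # (ε, ε^(0))
    return rules, {start}


# The example grammar S -> a S b | ε
rules, finals = derivation_tree_automaton(
    [("S", ("a", "S", "b")), ("S", ())], {"a", "b"}, "S"
)
for rule in sorted(rules):
    print(rule)
```

Each state is also the only label it can read, which is the locality property noted above.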
Other techniques to show that a language is not regular
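\[L_2 ≝ \lbrace f(g(\square)^n \circ d, h(\square)^n \circ d) \mid n > 0 \rbrace \\ Yield(L_2) = \lbrace dd \rbrace\]
$Yield(L_2)$ is context-free, so the previous theorem cannot be used to show that $L_2$ is not regular.
By contradiction: assume $L_2$ is regular. Then there exists an NTFA $𝒜$ with $L_2 = L(𝒜)$.
Let $N ≝ \vert Q \vert$ be the number of states of $𝒜$ and consider the tree
\[t ≝ f(g(\square)^{N+1} \circ d, h(\square)^{N+1} \circ d) ∈ L_2\]
As $t ∈ L_2 = L(𝒜)$, it has an accepting run $ρ$ with $ρ(ε) ∈ Q_f$; write $q_n ≝ ρ(1 1^n)$ for all $0 ≤ n < N+1$ (the states along the $g$ branch).
By the pigeonhole principle, there exist $0 ≤ i < j < N+1$ s.t. $q_i = q_j$. Removing the part of the $g$ branch between positions $1 1^i$ and $1 1^j$ still gives an accepting run, so the tree
\[t' ≝ f(g(\square)^{N+1-(j-i)} \circ d, h(\square)^{N+1} \circ d) ∈ L(𝒜)\]
but $t' ∉ L_2$ since $j - i > 0$: contradiction.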
Pumping Lemma:
Let $L ⊆ T(𝔉)$ be a regular tree language. Then $∃ N ∈ ℕ$ s.t. for all $t ∈ L$,
\[Height(t) > N\]
implies that there exist a context $C$, a non-trivial context $C'$ and a tree $t'$ s.t.
\[C[C'[t']] = t\]
and
\[∀n ≥ 0, \; C[C'^n[t']] ∈ L\]
Application to the previous example: $C$ is
- either trivial: then $C'$ contains the root, and pumping repeats the $f$ node ⟶ not in the language
- or non-trivial, with its hole in the left (or right, wlog) branch: then pumping changes the number of $g$ (or $h$) nodes in that branch only, and we conclude as in the previous proof
Tree homomorphisms
- Tree homomorphism:
- \[φ : ∀n, \; 𝔉_n ⟶ T(𝔉', X_n) \quad \text{ where } X_n ≝ \lbrace x_1, \ldots, x_n \rbrace\]
Ex:
\[φ : \begin{cases} f^{(2)} ⟼ f^{(2)}(x_1, x_2)\\ g^{(1)} ⟼ g^{(3)}(a^{(0)}, x_1, b^{(0)})\\ h^{(1)} ⟼ h^{(2)}(x_1, c^{(0)})\\ d^{(0)} ⟼ d^{(0)} \end{cases}\]
with
\[𝔉 ≝ \lbrace f^{(2)}, g^{(1)}, h^{(1)}, d^{(0)} \rbrace \\ 𝔉' ≝ \lbrace f^{(2)}, g^{(3)}, h^{(2)}, a^{(0)}, b^{(0)}, c^{(0)}, d^{(0)} \rbrace\]
$φ$ extends to a tree homomorphism on trees:
\[φ : \begin{cases} T(𝔉) ⟶ T(𝔉') \\ f(t_1, \ldots, t_n)\mapsto φ(f(t_1, \ldots, t_n)) ≝ \underbrace{φ(f)}_{\text{term in } T(𝔉', X_n)}\underbrace{[x_1 ↦ φ(t_1), \ldots, x_n ↦ φ(t_n)]}_{\text{ substitution } X_n ⟶ T(𝔉')} \end{cases}\]
Ex:
\[\begin{align*} φ(f(g(d), h(h(d)))) & = f(φ(g(d)), φ(h(h(d)))) \\ & = f(g(a, φ(d), b), h(φ(h(d)), c)) \\ & = f(g(a, d, b), h(h(φ(d), c), c)) \\ & = f(g(a, d, b), h(h(d, c), c)) \end{align*}\]
⟶ \(φ(L_2) = L_1\)
$φ$ is linear: for every $f ∈ 𝔉_n$, each variable $x_i$ appears at most once in $φ(f)$
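The extension of $φ$ to trees and the linearity test can be sketched as follows. The encoding is an assumption made for illustration: trees are tuples `(label, children...)`, the variable $x_i$ is the bare integer `i`, and the names `substitute`, `apply_hom` and `is_linear` are hypothetical.

```python
def substitute(term, images):
    """Replace each variable x_i (the int i) in term by images[i - 1]."""
    if isinstance(term, int):
        return images[term - 1]
    label, *children = term
    return (label, *(substitute(c, images) for c in children))


def apply_hom(phi, t):
    """φ(f(t_1, ..., t_n)) = φ(f)[x_1 ↦ φ(t_1), ..., x_n ↦ φ(t_n)]."""
    label, *children = t
    return substitute(phi[label], [apply_hom(phi, c) for c in children])


def variables(term):
    """List of the variable occurrences of term, with repetitions."""
    if isinstance(term, int):
        return [term]
    return [v for c in term[1:] for v in variables(c)]


def is_linear(phi):
    """True iff no variable occurs twice in any φ(f)."""
    return all(len(vs) == len(set(vs)) for vs in map(variables, phi.values()))


# The homomorphism φ of the example above
PHI = {
    "f": ("f", 1, 2),
    "g": ("g", ("a",), 1, ("b",)),
    "h": ("h", 1, ("c",)),
    "d": ("d",),
}
d = ("d",)
t = ("f", ("g", d), ("h", ("h", d)))
print(apply_hom(PHI, t))
print(is_linear(PHI))
```

On `t` this reproduces the computation of the example, $f(g(a, d, b), h(h(d, c), c))$.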
Ex:
- $𝔉 ≝ \lbrace f^{(1)}, g^{(1)}, a^{(0)} \rbrace$
- $𝔉’ ≝ \lbrace f^{(2)}, g^{(1)}, a^{(0)} \rbrace$
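NB: tree homomorphisms do not preserve regularity in general. For instance, with the non-linear $φ(f) = f(x_1, x_1)$, $φ(g) = g(x_1)$, $φ(a) = a$, the regular language $\lbrace f(g^n(a)) \mid n ≥ 0 \rbrace$ is mapped to $\lbrace f(g^n(a), g^n(a)) \mid n ≥ 0 \rbrace$, which is not regular.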
Th: linear tree homomorphisms preserve regularity.
Proof: Let $L = L(𝒜)$ for $𝒜 ≝ (Q, 𝔉, Δ, Q_f)$ and $φ$ be a linear tree homomorphism $T(𝔉) ⟶ T(𝔉’)$
We construct an NTFA $𝒜' ≝ (Q', 𝔉', Δ', Q_f')$ s.t.
\[L(𝒜') = φ(L(𝒜))\]
The idea is to replace each variable by a state of the automaton.
- \[Q' ≝ \lbrace tσ \mid ∃ f∈ 𝔉_n; ∃ C ∈ C(𝔉', X_n); \; φ(f) = C[t], σ: X_n ⟶ Q \rbrace ∪ Q\]
- \[Δ' ≝ \lbrace (q, g^{(k)}, t_1[x_i ⟼ q_i], \ldots, t_k[x_i ⟼ q_i]) \mid (q, f^{(n)}, q_1, \ldots, q_n) ∈ Δ, \quad φ(f) = g^{(k)}(t_1, \ldots, t_k) \rbrace \qquad (⊛)\\ ∪ \lbrace (f^{(n)}(t_1, \ldots, t_n), f^{(n)}, t_1, \ldots, t_n) \mid f^{(n)}(t_1, \ldots, t_n) ∈ Q' \rbrace \qquad (⊛⊛)\]
- $Q_f' ≝ Q_f$
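The two families of rules $(⊛)$ and $(⊛⊛)$ can be generated by a short program. The following is a sketch under stated assumptions, not the lecture's own algorithm: states of $Q$ are strings, term-states of $Q'$ are tuples, the variable $x_i$ is the integer `i`, every $φ(f)$ is assumed not to be a bare variable, and only the term-states actually used are generated. It is run on the automaton and homomorphism of the example that follows.

```python
def substitute(term, states):
    """Replace each variable x_i (the int i) by the state q_i."""
    if isinstance(term, int):
        return states[term - 1]
    label, *children = term
    return (label, *(substitute(c, states) for c in children))


def image_automaton(rules, final_states, phi):
    """Rules and final states of an automaton for φ(L(A)), φ linear."""
    new_rules = set()
    todo = []                                  # term-states needing (⊛⊛) rules
    for q, f, *qs in rules:
        g, *subterms = phi[f]                  # assumes phi[f] is not a variable
        args = [substitute(s, qs) for s in subterms]
        new_rules.add((q, g, *args))           # rule of kind (⊛)
        todo.extend(a for a in args if isinstance(a, tuple))
    while todo:
        state = todo.pop()
        g, *args = state
        if (state, g, *args) not in new_rules:
            new_rules.add((state, g, *args))   # rule of kind (⊛⊛)
            todo.extend(a for a in args if isinstance(a, tuple))
    return new_rules, set(final_states)


RULES = {
    ("q", "f", "q1", "q2"),
    ("q1", "g", "q1"), ("q1", "d"),
    ("q2", "h", "q2"), ("q2", "d"),
}
PHI = {
    "f": ("f", 1, 2),
    "g": ("g", ("a",), 1, ("b",)),
    "h": ("h", 1, ("g", ("c",))),
    "d": ("d",),
}
new_rules, new_finals = image_automaton(RULES, {"q"}, PHI)
print(len(new_rules))
```

The five $(⊛)$ rules mirror the rules of $𝒜$, and four $(⊛⊛)$ rules recognize the constant subterms $a$, $b$, $g(c)$ and $c$.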
Ex:
\[φ : \begin{cases} f^{(2)} ⟼ f^{(2)}(x_1, x_2)\\ g^{(1)} ⟼ g^{(3)}(a^{(0)}, x_1, b^{(0)})\\ h^{(1)} ⟼ h^{(2)}(x_1, g^{(1)}(c^{(0)}))\\ d^{(0)} ⟼ d^{(0)} \end{cases}\]
\[𝒜 ≝ \begin{cases} q ⟼ f^{(2)}(q_1, q_2) \\ q_1 ⟼ g^{(1)}(q_1) \\ q_1 ⟼ d^{(0)} \\ q_2 ⟼ h^{(1)}(q_2) \\ q_2 ⟼ d^{(0)} \end{cases}\]
\[𝒜' ≝ \begin{cases} (⊛) \\ q ⟼ f^{(2)}(q_1, q_2) \\ q_1 ⟼ g^{(3)}(a, q_1, b) \\ q_1 ⟼ d^{(0)} \\ q_2 ⟼ h^{(2)}(q_2, g(c)) \\ q_2 ⟼ d^{(0)} \\ (⊛⊛) \\ a ⟼ a^{(0)} \\ b ⟼ b^{(0)} \\ g(c) ⟼ g^{(1)}(c) \\ c ⟼ c^{(0)} \end{cases}\]
$φ(L(𝒜)) ⊆ L(𝒜')$:
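We show by induction on $m$ that $∀m, ∀q∈Q$:
\[q ⟶^m_𝒜 t \text{ implies } q ⟶^\ast_{𝒜'} φ(t)\]
- Base case: $q ⟶_𝒜 a^{(0)}$. Then $φ(a) = t'$ is a ground term, and indeed $q ⟶_{𝒜'} t' ⟶^\ast_{𝒜'} t'$, where the first step uses $(⊛)$ (reading the subterms of $t'$ as states of $Q'$) and the remaining steps use $(⊛⊛)$.
- Induction step: $q ⟶_𝒜 f(q_1, \ldots, q_n)$ and $q_i ⟶^{m_i}_𝒜 t_i$ with $m_i < m$ for all $1 ≤ i ≤ n$.
By def: $φ(f(t_1, \ldots, t_n)) = φ(f)[x_i ⟼ φ(t_i)]$:
let $φ(f) = g^{(k)}(t_1', \ldots, t_k')$, thus $φ(f(t_1, \ldots, t_n)) = g^{(k)}(t_1'[x_i ⟼ φ(t_i)], \ldots, t_k'[x_i ⟼ φ(t_i)])$
Then, by $(⊛)$, $q ⟶_{𝒜'} g^{(k)}(t_1'[x_i ⟼ q_i], \ldots, t_k'[x_i ⟼ q_i])$
By $(⊛⊛)$, $∀ 1 ≤ j ≤ k, \; t_j'[x_i ⟼ q_i] ⟶^\ast_{𝒜'} t_j'[x_i ⟼ q_i]$ (from the state on the left to the term on the right, whose only remaining states are the $q_i$)
By I.H. $q_i ⟶^\ast_{𝒜'} φ(t_i)$, hence $q ⟶^\ast_{𝒜'} φ(f(t_1, \ldots, t_n))$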
$L(𝒜') ⊆ φ(L(𝒜))$: exercise.
For this converse inclusion, linearity is necessary: we go bottom-up, and for instance in
\[φ(f^{(1)}) = f^{(2)}(x_1, x_1)\]
each occurrence of $x_1$ leads to $q_1$, but the trees recognized on the left and on the right might be different.
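Th: Regular tree languages are closed under inverse tree homomorphisms.
Application: with the homomorphism $φ$ s.t. $φ(L_2) = L_1$ seen above,
\[L_2 = φ^{-1}(L_1)\]
so if $L_1$ were regular, $L_2$ would be regular too; hence $L_1$ is not regular.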
Decision problems
Uniform Membership
- input: NTFA $𝒜$, $t ∈ T(𝔉)$
- question: $t ∈ L(𝒜)$ ?
- Build $𝒜_t$ with $L(𝒜_t) = \lbrace t \rbrace$
- Build $𝒜'$ with $L(𝒜') = L(𝒜) ∩ \lbrace t \rbrace$
- Check whether $L(𝒜') ≠ ∅$
⟶ Reduction from Uniform Membership to Emptiness
In linear time for DFTA
$∈ LogDCFL$: the class of languages that are logspace-reducible to a deterministic context-free language.
Or: in alternating logspace: ALogspace = P
For words: in NLogspace, since you know that an accepted word is of length bounded by the number of states
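Besides the reduction to emptiness described above, membership can also be checked by evaluating the tree bottom-up. The sketch below is this direct alternative, not the reduction itself; the tuple encodings of trees and rules and the names `states_of` and `accepts` are assumptions.

```python
def states_of(rules, t):
    """Set of states q s.t. q ->* t, computed bottom-up on t."""
    label, *children = t
    child_states = [states_of(rules, c) for c in children]
    result = set()
    for q, f, *qs in rules:
        if (
            f == label
            and len(qs) == len(children)
            and all(qi in si for qi, si in zip(qs, child_states))
        ):
            result.add(q)
    return result


def accepts(rules, final_states, t):
    """t is accepted iff some final state derives it."""
    return bool(states_of(rules, t) & set(final_states))


# Example automaton (an assumption for illustration): q -> f(qa, q, qb) | d
RULES = {
    ("q", "f", "qa", "q", "qb"),
    ("q", "d"),
    ("qa", "a"),
    ("qb", "b"),
}
good = ("f", ("a",), ("f", ("a",), ("d",), ("b",)), ("b",))
bad = ("f", ("b",), ("d",), ("a",))
print(accepts(RULES, {"q"}, good), accepts(RULES, {"q"}, bad))
```

Each node is visited once and every rule is scanned at each node, so the running time is polynomial in the size of the tree and of $Δ$.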
Emptiness
P-complete
- input: NTFA $𝒜$
- question: $L(𝒜) = ∅$ ?
(Reduces to Horn-SAT)
- compute co-accessible states in P-Time: \(w ≝ \lbrace q ∈ Q \mid ∃ t ∈ T(𝔉); q ⟶^\ast_𝒜 t \rbrace\)
check this algorithm
- compute accessible and co-accessible states: in NL
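The set of states that derive at least one tree can be computed by a simple marking fixpoint. Each rule $(q, f, q_1, \ldots, q_n)$ is read as the Horn clause $q_1 ∧ ⋯ ∧ q_n ⇒ q$. The sketch below is a naive quadratic version under the same assumed tuple encoding of rules; the names `productive_states` and `is_empty` are hypothetical.

```python
def productive_states(rules):
    """States q for which some tree t satisfies q ->* t."""
    marked = set()
    changed = True
    while changed:
        changed = False
        for q, _f, *qs in rules:
            # The rule fires once all its child states are marked.
            if q not in marked and all(p in marked for p in qs):
                marked.add(q)
                changed = True
    return marked


def is_empty(rules, final_states):
    """L(A) is empty iff no final state is productive."""
    return not (productive_states(rules) & set(final_states))


print(is_empty({("q", "f", "q", "q")}, {"q"}))
print(is_empty({("q", "f", "q", "q"), ("q", "d")}, {"q"}))
```

The first automaton has no rule for a constant, so nothing is ever marked; adding the rule for $d$ makes $q$ productive.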
Universality
- input: an NTFA $𝒜$
- question: $L(𝒜) = T(𝔉)$ ?
EXP-complete:
in EXP:
- determinize $𝒜$: $O(2^n)$
- complement: $O(2^n)$
- test for emptiness: $poly(2^n)$
⟶ In the word case: PSPACE-complete (check this algorithm) ⟶ here: APSPACE-complete, and APSPACE = EXP
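The three steps of the EXP upper bound can be merged into one on-the-fly subset construction: compute every reachable subset of states, and check that each one contains a final state. This is a sketch under assumptions (rules as tuples, a hypothetical `alphabet` dictionary mapping each symbol to its arity), exponential in the number of states as expected.

```python
from itertools import product


def is_universal(rules, final_states, alphabet):
    """True iff every tree over alphabet is accepted.

    alphabet: dict symbol -> arity (an assumed encoding of the signature).
    """
    finals = frozenset(final_states)
    reachable = set()          # subsets {q | q ->* t}, for some tree t
    changed = True
    while changed:
        changed = False
        for f, n in alphabet.items():
            for args in product(list(reachable), repeat=n):
                # Deterministic transition on the subsets args
                s = frozenset(
                    q for q, g, *qs in rules
                    if g == f and len(qs) == n
                    and all(qi in si for qi, si in zip(qs, args))
                )
                if s not in reachable:
                    reachable.add(s)
                    changed = True
    # Complement is empty iff every reachable subset meets the final states.
    return all(s & finals for s in reachable)


ALPHABET = {"a": 0, "f": 2}
print(is_universal({("q", "a"), ("q", "f", "q", "q")}, {"q"}, ALPHABET))
print(is_universal({("q", "a"), ("p", "f", "q", "q")}, {"q"}, ALPHABET))
```

In the second automaton the tree $f(a, a)$ evaluates to the subset $\lbrace p \rbrace$, which contains no final state, so the language is not universal.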