# Lecture 2: Tree languages that are not regular

$L_1 ≝ \lbrace f(g(a, \square, b)^n \circ d, h(\square, c)^n \circ d) \mid n>0 \rbrace$
  graph {
rankdir=TB;
f[label="f^(2)"];
g1[label="g^(3)"];
h1, h2, h3[label="h^(2)"];
a1, a2, a3[label="a^(0)"];
g2, g3[label="g^(3)"];
d1, d2[label="d^(0)"];
b1, b2, b3[label="b^(0)"];
c1, c2, c3[label="c^(0)"];
bl1, bl2[label="⋮"];
f -- g1, h1;
g1 -- a1, g2, b1;
g2 -- bl1 -- a2, g3, b2;
g3 -- a3, d1, b3;
h1 -- h2, c1;
h2 -- bl2 -- h3, c2;
h3 -- d2, c3;
}

Yield (the language of the leaves):
$Yield(a^{(0)}) ≝ a \\ Yield(f^{(n)}(t_1, \ldots, t_n)) ≝ Yield(t_1) ⋯ Yield(t_n) \\ Yield(L) ≝ \lbrace Yield(t) \mid t∈L \rbrace$
$Yield(L_1) = \lbrace a^n d b^n d c^n \mid n>0 \rbrace$
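The definition of $Yield$ can be checked on a small instance. The following is a minimal sketch, assuming trees are encoded as Python tuples `(symbol, child_1, ..., child_n)`; the names `tree_yield` and `l1_tree` are illustrative and not part of the lecture.

```python
# A tree is a tuple (symbol, child_1, ..., child_n); a constant is (symbol,).

def tree_yield(t):
    """Concatenate the leaf symbols of t from left to right."""
    symbol, *children = t
    if not children:
        return symbol
    return "".join(tree_yield(c) for c in children)

def l1_tree(n):
    """Build the tree f(g(a, □, b)^n ∘ d, h(□, c)^n ∘ d) of L_1 for n > 0."""
    left = right = ("d",)
    for _ in range(n):
        left = ("g", ("a",), left, ("b",))
        right = ("h", right, ("c",))
    return ("f", left, right)

print(tree_yield(l1_tree(2)))  # aadbbdcc
```

For $n = 2$ the yield is $a^2 d b^2 d c^2$, as expected from the formula above.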

Theorem: If $L⊆ T(𝔉)$ is regular then $Yield(L) ⊆ 𝔉_0^\ast$ is context-free

Proof: Let $𝒜$ be an NFTA with $L = L(𝒜)$.

We construct a CFG $G$ s.t. $Yield(L) = L(G)$

• $𝒜 =(Q, 𝔉, Δ, Q_f)$
• $G ≝ (N, Σ, P, S)$

where:

• $N ≝ Q \sqcup \lbrace S \rbrace$
• $Σ ≝ 𝔉_0$
• $P ≝ \lbrace S ⟶ q \mid q ∈ Q_f \rbrace \\ ∪ \lbrace q ⟶ q_1 ⋯ q_n \mid ∃ f ∈ 𝔉_n, (q, f, q_1, \ldots, q_n) ∈ Δ, n>0\rbrace \\ ∪ \lbrace q ⟶ a \mid (q,a) ∈ Δ \rbrace$
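The construction of $P$ is mechanical. Here is a minimal sketch, assuming $Δ$ is given as a set of tuples `(q, f, q_1, ..., q_n)` as in the definition above, and that the fresh start symbol `"S"` is not already a state name; `nfta_to_cfg` is a hypothetical helper name.

```python
def nfta_to_cfg(delta, final_states, start="S"):
    """Return the productions P as a set of (lhs, rhs) pairs, rhs being a tuple."""
    # S -> q for every final state q
    productions = {(start, (q,)) for q in final_states}
    for q, symbol, *children in delta:
        if children:
            productions.add((q, tuple(children)))  # q -> q_1 ... q_n
        else:
            productions.add((q, (symbol,)))        # q -> a for a constant a
    return productions

# Illustrative automaton recognising the single tree f(a, b).
delta = {("q", "f", "qa", "qb"), ("qa", "a"), ("qb", "b")}
cfg = nfta_to_cfg(delta, {"q"})
```

On this toy automaton the grammar derives exactly the word `ab`, which is the yield of `f(a, b)`.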

Theorem: If $G$ is a CFG, then its set of derivation trees $T(G)$ is regular.

• $G ≝ (N, Σ, P, S)$
• $𝒜 =(Q, 𝔉, Δ, Q_f)$
• $𝔉_0 ≝ Σ ∪ \lbrace ε^{(0)} \rbrace$
$∀n > 0, \; 𝔉_n ≝ \lbrace A^{(n)} \mid A ∈ N, n>0, ∃ A ⟶ X_1 ⋯ X_n ∈ P\rbrace\\ ∪ \lbrace A^{(1)} \mid A ∈ N, A ⟶ ε ∈ P \rbrace$

Ex:

$S ⟶ a S b \mid ε$

• $Q ≝ N ∪ Σ ∪ \lbrace ε \rbrace$
• $Q_f ≝ \lbrace S \rbrace$
• $Δ ≝ \lbrace (A, A^{(1)}, ε) \mid A ⟶ ε ∈ P \rbrace \\ ∪ \lbrace (A, A^{(n)}, X_1, \ldots, X_n) \mid n>0, A ⟶ X_1 ⋯ X_n ∈ P \rbrace \\ ∪ \lbrace (a, a^{(0)}) \mid a ∈ Σ \rbrace \\ ∪ \lbrace (ε, ε^{(0)}) \rbrace$
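The definition of $Δ$ can be instantiated on the example grammar $S ⟶ a S b \mid ε$. This is a sketch under assumed encodings: a ranked symbol $A^{(n)}$ is the pair `(A, n)`, a production is a pair `(lhs, rhs)` with `rhs` a tuple, and the empty tuple stands for $A ⟶ ε$. The function name is hypothetical.

```python
EPS = "ε"

def derivation_tree_automaton(terminals, productions, start):
    """Build (Δ, Q_f) for the derivation trees of a CFG."""
    delta = {(a, (a, 0)) for a in terminals}         # (a, a^(0))
    delta.add((EPS, (EPS, 0)))                       # (ε, ε^(0))
    for lhs, rhs in productions:
        if rhs:
            delta.add((lhs, (lhs, len(rhs)), *rhs))  # (A, A^(n), X_1, ..., X_n)
        else:
            delta.add((lhs, (lhs, 1), EPS))          # (A, A^(1), ε)
    return delta, {start}

delta, final_states = derivation_tree_automaton(
    {"a", "b"},
    [("S", ("a", "S", "b")), ("S", ())],
    "S",
)
```

Each state is simply the label of the node it is attached to, which is what makes the resulting language local.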

NB: $T(G)$ is a local tree language, since the state at a node is determined by the label of that node.

## Other techniques to show that a language is not regular

$L_2 ≝ \lbrace f(g(\square)^n \circ d, h(\square)^n \circ d) \mid n> 0 \rbrace \\ Yield(L_2) = \lbrace dd \rbrace$

By contradiction: Assume $L_2$ is regular. Then there exists an NFTA $𝒜$ with $L_2 = L(𝒜)$.

Let $N ≝ \vert Q \vert$ be the number of states of $𝒜$ and consider the tree

$t ≝ f(g(\square)^{N+1} \circ d, h(\square)^{N+1} \circ d) ∈ L_2$

As $t∈ L_2 = L(𝒜)$, it has an accepting run $ρ$ with $ρ(ε) ∈ Q_f$ and $ρ(1 1^n) = q_n$ for all $0 ≤ n < N+1$.

These are $N+1$ positions for only $N$ states, hence there exist $0 ≤ i < j < N+1$ s.t. $q_i =q_j$ and the tree

$t' ≝ f(g(\square)^{N+1-(j-i)} \circ d, h(\square)^{N+1} \circ d) ∈ L(𝒜)$

but $t' ∉ L_2$, since $j - i > 0$ and the two branches no longer have the same length.

Pumping Lemma:

Let $L ⊆ T(𝔉)$ be a regular tree language. Then $∃ N ∈ ℕ$ s.t.

$∀t ∈ L, \; Height(t) > N$

implies there exist a context $C$, a non-trivial context $C'$ and a tree $t'$ s.t.

$C[C'[t']] = t$

and

$∀n, C[C'^n[t']] ∈ L$

Application to the previous example:

$C$ is

• either: trivial, then pumping repeats the $f$ node ⟶ not in the language

• or: its hole is in the left (or right, wlog) branch ⟶ then same conclusion as in the previous proof

## Tree homomorphisms

Tree homomorphism:
$h : ∀n, \; 𝔉_n ⟶ T(𝔉', X_n) \quad \text{ where } X_n ≝ \lbrace x_1, \ldots, x_n \rbrace$

Ex:

$φ : \begin{cases} f^{(2)} ⟼ f^{(2)}(x_1, x_2)\\ g^{(1)} ⟼ g^{(3)}(a^{(0)}, x_1, b^{(0)})\\ h^{(1)} ⟼ h^{(2)}(x_1, c^{(0)})\\ d^{(0)} ⟼ d^{(0)} \end{cases}$ $𝔉 ≝ \lbrace f^{(2)}, g^{(1)}, h^{(1)}, d^{(0)} \rbrace \\ 𝔉' ≝ \lbrace f^{(2)}, g^{(3)}, h^{(2)}, a^{(0)}, b^{(0)}, c^{(0)}, d^{(0)} \rbrace$

$φ$ defines a tree homomorphism:

$φ : \begin{cases} T(𝔉) ⟶ T(𝔉') \\ f(t_1, \ldots, t_n)\mapsto φ(f(t_1, \ldots, t_n)) ≝ \underbrace{φ(f)}_{\text{term in } T(𝔉', X_n)}\underbrace{[x_1 ↦ φ(t_1), \ldots, x_n ↦ φ(t_n)]}_{\text{ substitution } X_n ⟶ T(𝔉')} \end{cases}$

Ex:

\begin{align*} φ(f(g(d), h(h(d)))) & = f(φ(g(d)), φ(h(h(d)))) \\ & = f(g(a, φ(d), b), h(φ(h(d)), c)) \\ & = f(g(a, d, b), h(h(φ(d), c), c)) \\ & = f(g(a, d, b), h(h(d, c), c)) \end{align*}

⟶ $φ(L_2) = L_1$
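The extension of $φ$ from symbols to trees is a plain recursive substitution. A minimal sketch, assuming trees are tuples `(symbol, child_1, ..., child_n)` and that the variables $x_i$ are encoded as leaves named `"x1"`, `"x2"`, ... (so no function symbol may use such a name); `substitute` and `apply_hom` are illustrative names.

```python
# φ on symbols, for the example above: each right-hand side is a term over 𝔉' and x_i.
PHI = {
    "f": ("f", ("x1",), ("x2",)),
    "g": ("g", ("a",), ("x1",), ("b",)),
    "h": ("h", ("x1",), ("c",)),
    "d": ("d",),
}

def substitute(term, sigma):
    """Replace every variable leaf x_i of term by sigma[x_i]."""
    symbol, *children = term
    if symbol in sigma:
        return sigma[symbol]
    return (symbol, *(substitute(c, sigma) for c in children))

def apply_hom(phi, t):
    """Compute φ(f(t_1, ..., t_n)) = φ(f)[x_1 ↦ φ(t_1), ..., x_n ↦ φ(t_n)]."""
    symbol, *children = t
    sigma = {f"x{i}": apply_hom(phi, c) for i, c in enumerate(children, start=1)}
    return substitute(phi[symbol], sigma)

t = ("f", ("g", ("d",)), ("h", ("h", ("d",))))
image = apply_hom(PHI, t)
# image encodes f(g(a, d, b), h(h(d, c), c))
```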

$φ$ is linear: for every $f ∈ 𝔉_n$, each variable $x_i$ appears at most once in $φ(f)$

Ex:

• $𝔉 ≝ \lbrace f^{(1)}, g^{(1)}, a^{(0)} \rbrace$
• $𝔉' ≝ \lbrace f^{(2)}, g^{(1)}, a^{(0)} \rbrace$
$h : \begin{cases} f^{(1)} ⟼ f^{(2)}(x_1, x_1) \\ g^{(1)} ⟼ g^{(1)}(x_1) \\ a^{(0)} ⟼ a^{(0)} \end{cases}$ $h(\lbrace f(g(\square)^n \circ a) \mid n≥0 \rbrace) = \lbrace f(g(\square)^n \circ a, g(\square)^n \circ a) \mid n≥0 \rbrace$

NB: tree homomorphisms do not preserve regularity in general: here $h$ is not linear ($x_1$ is duplicated), the source language is regular, and its image is not.

Th: linear tree homomorphisms preserve regularity.

Proof: Let $L = L(𝒜)$ for $𝒜 ≝ (Q, 𝔉, Δ, Q_f)$ and $φ$ be a linear tree homomorphism $T(𝔉) ⟶ T(𝔉')$

We construct an NFTA $𝒜' ≝ (Q', 𝔉', Δ', Q_f')$ s.t.

$L(𝒜') = φ(L(𝒜))$

One will replace each variable by a state of the automaton.

• $Q' ≝ \lbrace tσ \mid ∃ f∈ 𝔉_n; ∃ C ∈ C(𝔉', X_n); \; φ(f) = C[t], σ: X_n ⟶ Q \rbrace ∪ Q$
• $Δ' ≝ \lbrace (q, g^{(k)}, t_1[x_i ⟼ q_i], \ldots, t_k[x_i ⟼ q_i]) \mid (q, f^{(n)}, q_1, \ldots, q_n) ∈ Δ, \quad φ(f) = g^{(k)}(t_1, \ldots, t_k) \rbrace \qquad (⊛)\\ ∪ \lbrace (f^{(n)}(t_1, \ldots, t_n), f^{(n)}, t_1, \ldots, t_n) \mid f^{(n)}(t_1, \ldots, t_n) ∈ Q' \rbrace \qquad (⊛⊛)$

Ex:

$φ : \begin{cases} f^{(2)} ⟼ f^{(2)}(x_1, x_2)\\ g^{(1)} ⟼ g^{(3)}(a^{(0)}, x_1, b^{(0)})\\ h^{(1)} ⟼ h^{(2)}(x_1, g^{(1)}(c^{(0)}))\\ d^{(0)} ⟼ d^{(0)} \end{cases}$ $𝒜 ≝ \begin{cases} q ⟼ f^{(2)}(q_1, q_2) \\ q_1 ⟼ g^{(1)}(q_1) \\ q_1 ⟼ d^{(0)} \\ q_2 ⟼ h^{(1)}(q_2) \\ q_2 ⟼ d^{(0)} \end{cases}$ $𝒜' ≝ \begin{cases} (⊛) \\ q ⟼ f^{(2)}(q_1, q_2) \\ q_1 ⟼ g^{(3)}(a, q_1, b) \\ q_1 ⟼ d^{(0)} \\ q_2 ⟼ h^{(2)}(q_2, g(c)) \\ q_2 ⟼ d^{(0)} \\ (⊛⊛) \\ a ⟼ a^{(0)} \\ b ⟼ b^{(0)} \\ g(c) ⟼ g^{(1)}(c) \\ c ⟼ c^{(0)} \end{cases}$
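The rules $(⊛)$ and $(⊛⊛)$ can be generated mechanically. The sketch below makes several assumptions that are not in the lecture: states of $Q$ are strings, the new term-states are tuples `(symbol, child_1, ..., child_n)`, variables are leaves named `"x1"`, `"x2"`, ..., and no $φ(f)$ is a bare variable. It only creates the term-states that are actually used, and it is run on the automaton with rules $q ⟼ f(q_1, q_2)$, $q_1 ⟼ g(q_1)$, $q_1 ⟼ d$, $q_2 ⟼ h(q_2)$, $q_2 ⟼ d$.

```python
def substitute(term, sigma):
    """Replace every variable leaf x_i of term by the state sigma[x_i]."""
    symbol, *children = term
    if symbol in sigma:
        return sigma[symbol]
    return (symbol, *(substitute(c, sigma) for c in children))

def image_automaton(delta, phi):
    """Build Δ' from Δ and a linear homomorphism phi (symbol -> term)."""
    new_delta, pending = set(), []
    for q, f, *qs in delta:
        sigma = {f"x{i}": qi for i, qi in enumerate(qs, start=1)}
        g, *subterms = phi[f]
        args = [substitute(s, sigma) for s in subterms]
        new_delta.add((q, g, *args))          # rules (⊛)
        pending.extend(args)
    while pending:
        state = pending.pop()
        if isinstance(state, tuple):          # a term-state, not a state of Q
            g, *args = state
            new_delta.add((state, g, *args))  # rules (⊛⊛)
            pending.extend(args)
    return new_delta

PHI = {
    "f": ("f", ("x1",), ("x2",)),
    "g": ("g", ("a",), ("x1",), ("b",)),
    "h": ("h", ("x1",), ("g", ("c",))),
    "d": ("d",),
}
DELTA = {("q", "f", "q1", "q2"), ("q1", "g", "q1"), ("q1", "d"),
         ("q2", "h", "q2"), ("q2", "d")}
delta_prime = image_automaton(DELTA, PHI)
```

The result has five $(⊛)$ rules and four $(⊛⊛)$ rules, one for each of the term-states $a$, $b$, $g(c)$ and $c$.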

### $φ(L(𝒜)) ⊆ L(𝒜')$

$∀n, ∀q∈Q:$

$q ⟶^n_𝒜 t \text{ implies } q ⟶^\ast_{𝒜'} φ(t)$
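• Base case: $q ⟶_𝒜 a^{(0)}$. Let $t' ≝ φ(a)$, a ground tree. By $(⊛)$, $q ⟶_{𝒜'} t'$ where the subtrees of $t'$ are read as states, and by $(⊛⊛)$ these states rewrite to the actual subtrees: $q ⟶_{𝒜'} t' ⟶^\ast_{𝒜'} t'$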

• Induction step: $q ⟶_𝒜 f(q_1, \ldots, q_n)$ and $q_i ⟶^{n_i}_𝒜 t_i \quad ∀ 1 ≤ i ≤ n, n_i < n$

By def: $φ(f(t_1, \ldots, t_n)) = φ(f)[x_i ⟼ φ(t_i)]$:

let $φ(f) = g^{(k)}(t_1', \ldots, t_k')$, thus $φ(f(t_1, \ldots, t_n)) = g^{(k)}(t_1'[x_i ⟼ φ(t_i)], \ldots, t_k'[x_i ⟼ φ(t_i)])$

By $(⊛)$: $q ⟶_{𝒜'} g^{(k)}(t_1'[x_i ⟼ q_i], \ldots, t_k'[x_i ⟼ q_i])$

By $(⊛⊛)$, each of these states rewrites to the corresponding term over $𝔉' ∪ Q$: $∀ 1 ≤ j ≤k, \; t_j'[x_i ⟼ q_i] ⟶^\ast_{𝒜'} t_j'[x_i ⟼ q_i]$

By I.H. $q_i ⟶^\ast_{𝒜'} φ(t_i)$, hence $q ⟶^\ast_{𝒜'} φ(f(t_1, \ldots, t_n))$

### $L(𝒜') ⊆ φ(L(𝒜))$

exercise

For this converse inclusion, linearity is necessary: the argument goes bottom-up, and for instance in

$φ(f^{(1)}) = f^{(2)}(x_1, x_1)$

each $x_1$ will lead to $q_1$, but the trees recognized on the left and on the right might be different.

Th: Regular tree languages are closed under inverse tree homomorphisms.

Ex: $L_2 = φ^{-1}(L_1)$, so since $L_2$ is not regular, $L_1$ cannot be regular either.

# Decision problems

### Uniform Membership

• input: NFTA $𝒜$, $t ∈ T(𝔉)$

• question: $t ∈ L(𝒜)$ ?

1. Build $𝒜_t$ with $L(𝒜_t) = \lbrace t \rbrace$

2. Build $𝒜’$ with $L(𝒜’) = L(𝒜) ∩ \lbrace t \rbrace$

3. Check whether $L(𝒜’) ≠ ∅$

⟶ Reduction from Uniform Membership to Emptiness
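As a complement to the reduction, membership can also be decided directly by computing, bottom-up, the set of states that reach each subtree. This is a different route from the three steps above (it amounts to exploring the product with $𝒜_t$ on the fly), sketched here under the assumption that $Δ$ is a set of tuples `(q, f, q_1, ..., q_n)` and trees are tuples `(symbol, child_1, ..., child_n)`; the function names are illustrative.

```python
def reachable_states(delta, t):
    """Return the set of states q such that q ->* t."""
    symbol, *children = t
    child_states = [reachable_states(delta, c) for c in children]
    return {
        q
        for q, f, *qs in delta
        if f == symbol
        and len(qs) == len(children)
        and all(qi in s for qi, s in zip(qs, child_states))
    }

def accepts(delta, final_states, t):
    """Decide t ∈ L(𝒜) for 𝒜 given by (delta, final_states)."""
    return bool(reachable_states(delta, t) & set(final_states))

# Illustrative automaton: accepts the trees f(g^m(d), h^n(d)).
DELTA = {("q", "f", "q1", "q2"), ("q1", "g", "q1"), ("q1", "d"),
         ("q2", "h", "q2"), ("q2", "d")}
in_language = accepts(DELTA, {"q"}, ("f", ("g", ("d",)), ("h", ("h", ("d",)))))
not_in_language = accepts(DELTA, {"q"}, ("g", ("d",)))
```

Each node of $t$ is visited once and each visit scans $Δ$, so the running time is polynomial in the sizes of $t$ and $𝒜$.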

In linear time for DFTA

$∈ \mathrm{LogDCFL}$: the class of languages that are logspace-reducible to a deterministic context-free language.

Or: in alternating logspace, since ALOGSPACE = P

For words: in NLogspace, since you know that an accepted word is of length bounded by the number of states

### Emptiness

P-complete

• input: NFTA $𝒜$

• question: $L(𝒜) = ∅$ ?

(Reduces to Horn-SAT)

1. compute co-accessible states in P-Time: $w ≝ \lbrace q ∈ Q \mid ∃ t ∈ T(𝔉); q ⟶^\ast_𝒜 t \rbrace$

check this algorithm

2. compute accessible and co-accessible states:
$Q' ≝ \lbrace q ∈ w \mid ∃ q_f ∈ Q_f; ∃ C ∈ C(𝔉 ∪ w); q_f ⟶^\ast_𝒜 C[q] \rbrace$

in NL
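The set $w$ of step 1 is a least fixpoint and can be computed by saturation, in the same way unit propagation solves Horn-SAT. A minimal sketch, assuming $Δ$ is a set of tuples `(q, f, q_1, ..., q_n)`; `productive_states` and `is_empty` are hypothetical names, and this naive loop is quadratic rather than the optimal linear-time variant.

```python
def productive_states(delta):
    """Compute w = { q | there is a tree t with q ->* t } by saturation."""
    w = set()
    changed = True
    while changed:
        changed = False
        for q, _symbol, *qs in delta:
            # q derives a tree as soon as all q_1, ..., q_n do
            if q not in w and all(qi in w for qi in qs):
                w.add(q)
                changed = True
    return w

def is_empty(delta, final_states):
    """L(𝒜) = ∅ iff no final state belongs to w."""
    return not (productive_states(delta) & set(final_states))

DELTA = {("q", "f", "q1", "q2"), ("q1", "g", "q1"), ("q1", "d"),
         ("q2", "h", "q2"), ("q2", "d")}
nonempty_case = is_empty(DELTA, {"q"})
empty_case = is_empty({("q", "f", "q", "q")}, {"q"})
```

The second automaton has no rule for a constant, so no state ever enters $w$ and its language is empty.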

### Universality

• input: an NFTA $𝒜$

• question: $L(𝒜) = T(𝔉)$ ?

EXP-complete:

in EXP:

• determinize $𝒜$: $O(2^n)$
• complement: $O(2^n)$
• test for emptiness: $poly(2^n)$
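The three steps can be merged into one exploration of the reachable subsets of the determinized automaton: $𝒜$ is universal iff every reachable subset contains a final state. The sketch below assumes $Δ$ is a set of tuples `(q, f, q_1, ..., q_n)` and that the ranked alphabet is a dict from symbols to arities; `is_universal` is an illustrative name, and the loop is exponential in the worst case, as expected.

```python
from itertools import product

def is_universal(delta, final_states, alphabet):
    """Decide L(𝒜) = T(𝔉) by exploring the subset construction."""
    final_states = frozenset(final_states)
    subsets = set()  # reachable states of the determinized automaton
    changed = True
    while changed:
        changed = False
        for symbol, arity in alphabet.items():
            for args in product(tuple(subsets), repeat=arity):
                target = frozenset(
                    q
                    for q, f, *qs in delta
                    if f == symbol
                    and len(qs) == arity
                    and all(qi in s for qi, s in zip(qs, args))
                )
                if target not in subsets:
                    subsets.add(target)
                    changed = True
    # The complement is empty iff no reachable subset avoids Q_f.
    return all(s & final_states for s in subsets)

ALPHABET = {"a": 0, "f": 1}
universal_case = is_universal({("q", "a"), ("q", "f", "q")}, {"q"}, ALPHABET)
non_universal_case = is_universal({("q", "a"), ("p", "f", "q")}, {"q", "p"}, ALPHABET)
```

In the second automaton the tree $f(f(a))$ has no run, so the empty subset is reachable and universality fails.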

⟶ In the word-case: PSPACE-complete (check this algorithm) ⟶ here: APSPACE-complete = EXP-complete
