Lecture 2: Non-regular tree languages

\[L_1 ≝ \lbrace f(g(a, \square, b)^n \circ d, h(\square, c)^n \circ d) \mid n>0 \rbrace\]
  graph {
    rankdir=TB;
    f[label="f^(2)"];
    g1[label="g^(3)"];
    h1, h2, h3[label="h^(2)"];
    a1, a2, a3[label="a^(0)"];
    g2, g3[label="g^(3)"];
    d1, d2[label="d^(0)"];
    b1, b2, b3[label="b^(0)"];
    c1, c2, c3[label="c^(0)"];
    bl1, bl2[label="⋮"];
    f -- g1, h1;
    g1 -- a1, g2, b1;
    g2 -- bl1 -- a2, g3, b2;
    g3 -- a3, d1, b3;
    h1 -- h2, c1;
    h2 -- bl2 -- h3, c2;
    h3 -- d2, c3;
  }
Yield (the language of the leaves):
\[Yield(a^{(0)}) ≝ a \\ Yield(f^{(n)}(t_1, \ldots, t_n)) ≝ Yield(t_1) ⋯ Yield(t_n) \\ Yield(L) ≝ \lbrace Yield(t) \mid t ∈ L \rbrace\]
\[Yield(L_1) = \lbrace a^n d b^n d c^n \mid n>0 \rbrace\]
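The yield can be checked mechanically on small instances. The following is a minimal sketch, assuming trees are encoded as nested Python tuples `(symbol, child_1, ..., child_k)`; the names `tree_yield` and `l1_tree` are hypothetical helpers introduced here for illustration.

```python
# Trees are nested tuples: (symbol, child_1, ..., child_k); a leaf is (symbol,).
# This encoding is an assumption of the sketch, not part of the lecture.

def tree_yield(t):
    """Concatenate the leaf symbols of t from left to right."""
    symbol, *children = t
    if not children:
        return symbol
    return "".join(tree_yield(c) for c in children)

def l1_tree(n):
    """Build the tree f(g(a, □, b)^n ∘ d, h(□, c)^n ∘ d) of L_1, for n > 0."""
    left = right = ("d",)
    for _ in range(n):
        left = ("g", ("a",), left, ("b",))
        right = ("h", right, ("c",))
    return ("f", left, right)

print(tree_yield(l1_tree(3)))  # aaadbbbdccc
```

For each `n` the result has the shape `a^n d b^n d c^n`, as in the formula above.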

Theorem: If $L⊆ T(𝔉)$ is regular, then $Yield(L) ⊆ 𝔉_0^\ast$ is context-free. (Hence $L_1$ is not regular, since $Yield(L_1)$ is not context-free.)

Proof: Let $𝒜$ be an NTFA with $L = L(𝒜)$.

We construct a CFG $G$ s.t. $Yield(L) = L(G)$

  • $𝒜 =(Q, 𝔉, Δ, Q_f)$
  • $G ≝ (N, Σ, P, S)$

where:

  • $N ≝ Q \sqcup \lbrace S \rbrace$
  • $Σ ≝ 𝔉_0$
  • \[P ≝ \lbrace S ⟶ q \mid q ∈ Q_f \rbrace \\ ∪ \lbrace q ⟶ q_1 ⋯ q_n \mid ∃ f ∈ 𝔉_n, (q, f, q_1, \ldots, q_n) ∈ Δ, n>0\rbrace \\ ∪ \lbrace q ⟶ a \mid (q,a) ∈ Δ \rbrace\]
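The construction of $P$ can be sketched in Python. This is only an illustration under assumptions: a transition $(q, f, q_1, \ldots, q_n)$ is encoded as a tuple of strings, a production as a pair `(lhs, rhs)`, and the small automaton `DELTA` is a hypothetical example, not one fixed by the proof.

```python
def ntfa_to_cfg(delta, final_states, start="S"):
    """Productions of the grammar G built from the transitions of the automaton.

    delta: set of tuples (q, f, q_1, ..., q_n); a nullary rule is (q, a).
    Returns a set of pairs (lhs, rhs) where rhs is a tuple of symbols.
    """
    productions = {(start, (q,)) for q in final_states}  # S -> q, q final
    for q, f, *children in delta:
        if children:
            productions.add((q, tuple(children)))        # q -> q_1 ... q_n
        else:
            productions.add((q, (f,)))                   # q -> a
    return productions

# Hypothetical automaton: q -> f(q1, q2), q1 -> g(q1) | d, q2 -> h(q2) | d
DELTA = {
    ("q", "f", "q1", "q2"),
    ("q1", "g", "q1"), ("q1", "d"),
    ("q2", "h", "q2"), ("q2", "d"),
}
for lhs, rhs in sorted(ntfa_to_cfg(DELTA, {"q"})):
    print(lhs, "->", " ".join(rhs))
```

The function symbol of a non-nullary transition is dropped: only the states below it matter for the yield.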

Theorem: If $G$ is a CFG, then its set of derivation trees $T(G)$ is regular.

  • $G ≝ (N, Σ, P, S)$
  • $𝒜 =(Q, 𝔉, Δ, Q_f)$
  • $𝔉_0 ≝ Σ ∪ \lbrace ε^{(0)} \rbrace$
\[∀n > 0, \; 𝔉_n ≝ \lbrace A^{(n)} \mid A ∈ N, ∃ A ⟶ X_1 ⋯ X_n ∈ P\rbrace\\ \text{and, in addition, } A^{(1)} ∈ 𝔉_1 \text{ for each } A ∈ N \text{ with } A ⟶ ε ∈ P\]

Ex:

$S ⟶ a S b \mid ε$

  • $Q ≝ N ∪ Σ ∪ \lbrace ε \rbrace$
  • $Q_f ≝ \lbrace S \rbrace$
  • \[Δ ≝ \lbrace (A, A^{(1)}, ε) \mid A ⟶ ε ∈ P \rbrace \\ ∪ \lbrace (A, A^{(n)}, X_1, \ldots, X_n) \mid n>0, A ⟶ X_1 ⋯ X_n ∈ P \rbrace \\ ∪ \lbrace (a, a^{(0)}) \mid a ∈ Σ \rbrace \\ ∪ \lbrace (ε, ε^{(0)}) \rbrace\]

NB: $T(G)$ is a local tree language, since the state at a node is determined by the label of that node alone.
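As a worked instance (a sketch that simply unfolds the definitions above), for the grammar $S ⟶ a S b \mid ε$ one gets:

\[𝔉_0 = \lbrace a^{(0)}, b^{(0)}, ε^{(0)} \rbrace, \quad 𝔉_1 = \lbrace S^{(1)} \rbrace, \quad 𝔉_3 = \lbrace S^{(3)} \rbrace\]
\[Q = \lbrace S, a, b, ε \rbrace, \quad Q_f = \lbrace S \rbrace\]
\[Δ = \lbrace (S, S^{(1)}, ε), \; (S, S^{(3)}, a, S, b), \; (a, a^{(0)}), \; (b, b^{(0)}), \; (ε, ε^{(0)}) \rbrace\]

For example, the derivation tree of the word $ab$ is $S^{(3)}(a, S^{(1)}(ε), b)$, and it is accepted from the state $S$.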


Other techniques to show that a language is not regular

\[L_2 ≝ \lbrace f(g(\square)^n \circ d, h(\square)^n \circ d) \mid n> 0 \rbrace \\ Yield(L_2) = \lbrace dd \rbrace\]

Here $Yield(L_2)$ is context-free, so the previous theorem does not help.

By contradiction: Assume $L_2$ is regular. Then there is an NTFA $𝒜$ with $L_2 = L(𝒜)$.

Let $N ≝ \vert Q \vert$ be the number of states of $𝒜$ and consider the tree

\[t ≝ f(g(\square)^{N+1} \circ d, h(\square)^{N+1} \circ d) ∈ L_2\]

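As $t∈ L_2 = L(𝒜)$, it has an accepting run $ρ$ with $ρ(ε) ∈ Q_f$; write $ρ(1 1^n) = q_n$ for all $0 ≤ n < N+1$ (the states at the $N+1$ nodes labelled $g$).

By the pigeonhole principle, there exist $0 ≤ i < j < N+1$ s.t. $q_i =q_j$. Removing the $j-i$ nodes $g$ between these two positions yields the tree

\[t' ≝ f(g(\square)^{N+1-(j-i)} \circ d, h(\square)^{N+1} \circ d) ∈ L(𝒜)\]

but $t' ∉ L_2$, since $j-i > 0$ and the two branches of $t'$ have different heights: contradiction.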


Pumping Lemma:

Let $L ⊆ T(𝔉)$ be a regular tree language. Then $∃ N ∈ ℕ$ s.t.

\[∀t ∈ L, \; Height(t) > N\]

implies there exist a context $C$, a non-trivial context $C'$ and a tree $t'$ s.t.

\[C[C'[t']] = t\]

and

\[∀n ≥ 0, \; C[C'^n[t']] ∈ L\]

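Application to the previous example: take $t ∈ L_2$ of height $> N$. Then

  • either: $C$ is trivial, so $C'$ contains the root and pumping repeats the $f$ node ⟶ not in the language

  • or: the hole of $C$ is in the left (or right, wlog) branch, so pumping changes the height of that branch only ⟶ same conclusion as in the previous proof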

Tree homomorphisms

Tree homomorphism:
\[φ : ∀n, \; 𝔉_n ⟶ T(𝔉', X_n) \quad \text{ where } X_n ≝ \lbrace x_1, \ldots, x_n \rbrace\]

Ex:

\[φ : \begin{cases} f^{(2)} ⟼ f^{(2)}(x_1, x_2)\\ g^{(1)} ⟼ g^{(3)}(a^{(0)}, x_1, b^{(0)})\\ h^{(1)} ⟼ h^{(2)}(x_1, c^{(0)})\\ d^{(0)} ⟼ d^{(0)} \end{cases}\] \[𝔉 ≝ \lbrace f^{(2)}, g^{(1)}, h^{(1)}, d^{(0)} \rbrace \\ 𝔉' ≝ \lbrace f^{(2)}, g^{(3)}, h^{(2)}, a^{(0)}, b^{(0)}, c^{(0)}, d^{(0)} \rbrace\]

$φ$ defines a tree homomorphism:

\[φ : \begin{cases} T(𝔉) ⟶ T(𝔉') \\ f(t_1, \ldots, t_n)\mapsto φ(f(t_1, \ldots, t_n)) ≝ \underbrace{φ(f)}_{\text{term in } T(𝔉', X_n)}\underbrace{[x_1 ↦ φ(t_1), \ldots, x_n ↦ φ(t_n)]}_{\text{ substitution } X_n ⟶ T(𝔉')} \end{cases}\]

Ex:

\[\begin{align*} φ(f(g(d), h(h(d)))) & = f(φ(g(d)), φ(h(h(d)))) \\ & = f(g(a, φ(d), b), h(φ(h(d)), c)) \\ & = f(g(a, d, b), h(h(φ(d), c), c)) \\ & = f(g(a, d, b), h(h(d, c), c)) \end{align*}\]

⟶ \(φ(L_2) = L_1\)
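The computation above can be reproduced with a few lines of Python. This is a minimal sketch under assumptions: trees are nested tuples `(symbol, child_1, ..., child_k)`, variables are the strings `"x1"`, `"x2"`, ..., and `substitute` and `apply_hom` are hypothetical helper names.

```python
# phi maps each symbol to a term over the target signature with variables "x<i>".
PHI = {
    "f": ("f", "x1", "x2"),
    "g": ("g", ("a",), "x1", ("b",)),
    "h": ("h", "x1", ("c",)),
    "d": ("d",),
}

def substitute(term, images):
    """Replace each variable "x<i>" occurring in term by images[i - 1]."""
    if isinstance(term, str):
        return images[int(term[1:]) - 1]
    symbol, *args = term
    return (symbol, *(substitute(a, images) for a in args))

def apply_hom(phi, t):
    """phi(f(t_1, ..., t_n)) = phi(f)[x_1 -> phi(t_1), ..., x_n -> phi(t_n)]."""
    symbol, *children = t
    return substitute(phi[symbol], [apply_hom(phi, c) for c in children])

t = ("f", ("g", ("d",)), ("h", ("h", ("d",))))
print(apply_hom(PHI, t))
```

The printed tree encodes $f(g(a, d, b), h(h(d, c), c))$, in agreement with the computation by hand.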

$φ$ is linear: for every $f$, each variable $x_i$ appears at most once in $φ(f)$

Ex:

  • $𝔉 ≝ \lbrace f^{(1)}, g^{(1)}, a^{(0)} \rbrace$
  • $𝔉' ≝ \lbrace f^{(2)}, g^{(1)}, a^{(0)} \rbrace$
\[φ : \begin{cases} f^{(1)} ⟼ f^{(2)}(x_1, x_1) \\ g^{(1)} ⟼ g^{(1)}(x_1) \\ a^{(0)} ⟼ a^{(0)} \end{cases}\] \[φ(\lbrace f(g(\square)^n \circ a) \mid n≥0 \rbrace) = \lbrace f(g(\square)^n \circ a, g(\square)^n \circ a) \mid n≥0 \rbrace\]

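NB: tree homomorphisms do not preserve regularity in general: the image above is not regular, although $\lbrace f(g(\square)^n \circ a) \mid n≥0 \rbrace$ is.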
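Linearity is a purely syntactic condition on the images $φ(f)$, so it is easy to test. A minimal sketch, assuming the same hypothetical encoding as before (terms as nested tuples, variables as strings `"x1"`, `"x2"`, ...):

```python
def variables(term):
    """List the variable occurrences of a term, from left to right."""
    if isinstance(term, str):
        return [term]
    return [v for arg in term[1:] for v in variables(arg)]

def is_linear(phi):
    """True if no variable occurs twice in any image phi(f)."""
    return all(
        len(occ) == len(set(occ)) for occ in map(variables, phi.values())
    )

# The homomorphism sending L_2 to L_1 (linear) ...
PHI_LINEAR = {
    "f": ("f", "x1", "x2"),
    "g": ("g", ("a",), "x1", ("b",)),
    "h": ("h", "x1", ("c",)),
    "d": ("d",),
}
# ... and the copying homomorphism of the last example (not linear).
PHI_COPY = {"f": ("f", "x1", "x1"), "g": ("g", "x1"), "a": ("a",)}

print(is_linear(PHI_LINEAR), is_linear(PHI_COPY))  # True False
```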

Th: linear tree homomorphisms preserve regularity.

Proof: Let $L = L(𝒜)$ for $𝒜 ≝ (Q, 𝔉, Δ, Q_f)$ and $φ$ be a linear tree homomorphism $T(𝔉) ⟶ T(𝔉')$

We construct an NTFA $𝒜' ≝ (Q', 𝔉', Δ', Q_f')$ s.t.

\[L(𝒜') = φ(L(𝒜))\]

Idea: in each $φ(f)$, replace every variable by a state of the automaton.

  • \[Q' ≝ \lbrace tσ \mid ∃ f∈ 𝔉_n; ∃ C ∈ C(𝔉', X_n); \; φ(f) = C[t], σ: X_n ⟶ Q \rbrace ∪ Q\]
  • \[Δ' ≝ \lbrace (q, g^{(k)}, t_1[x_i ⟼ q_i], \ldots, t_k[x_i ⟼ q_i]) \mid (q, f^{(n)}, q_1, \ldots, q_n) ∈ Δ, \quad φ(f) = g^{(k)}(t_1, \ldots, t_k) \rbrace \qquad (⊛)\\ ∪ \lbrace (f^{(n)}(t_1, \ldots, t_n), f^{(n)}, t_1, \ldots, t_n) \mid f^{(n)}(t_1, \ldots, t_n) ∈ Q' \rbrace \qquad (⊛⊛)\]
  • $Q_f' ≝ Q_f$

Ex:

\[φ : \begin{cases} f^{(2)} ⟼ f^{(2)}(x_1, x_2)\\ g^{(1)} ⟼ g^{(3)}(a^{(0)}, x_1, b^{(0)})\\ h^{(1)} ⟼ h^{(2)}(x_1, g^{(1)}(c^{(0)}))\\ d^{(0)} ⟼ d^{(0)} \end{cases}\] \[𝒜 ≝ \begin{cases} q ⟼ f^{(2)}(q_1, q_2) \\ q_1 ⟼ g^{(1)}(q_1) \\ q_1 ⟼ d^{(0)} \\ q_2 ⟼ h^{(1)}(q_2) \\ q_2 ⟼ d^{(0)} \end{cases}\] \[𝒜' ≝ \begin{cases} (⊛) \\ q ⟼ f^{(2)}(q_1, q_2) \\ q_1 ⟼ g^{(3)}(a, q_1, b) \\ q_1 ⟼ d^{(0)} \\ q_2 ⟼ h^{(2)}(q_2, g(c)) \\ q_2 ⟼ d^{(0)} \\ (⊛⊛) \\ a ⟼ a^{(0)} \\ b ⟼ b^{(0)} \\ g(c) ⟼ g^{(1)}(c) \\ c ⟼ c^{(0)} \end{cases}\]
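The two families of rules can be generated mechanically. Below is a sketch in Python, under assumptions that are not part of the lecture: states of $Q$ are strings, the new "term states" are nested tuples, variables are the strings `"x1"`, `"x2"`, ..., and no $φ(f)$ is a bare variable. The function name `image_automaton` is hypothetical.

```python
def subst(term, qs):
    """Replace each variable "x<i>" in term by the state qs[i - 1]."""
    if isinstance(term, str):
        return qs[int(term[1:]) - 1]
    return (term[0], *(subst(a, qs) for a in term[1:]))

def image_automaton(delta, phi):
    """Transitions of A' from those of A (assumes phi(f) is never a variable)."""
    new_delta, todo = set(), []
    for q, f, *qs in delta:                  # rules of kind (⊛)
        g, *args = subst(phi[f], qs)
        new_delta.add((q, g, *args))
        todo.extend(args)
    while todo:                              # rules of kind (⊛⊛)
        s = todo.pop()
        if isinstance(s, tuple):             # a term state, not a state of Q
            g, *args = s
            if (s, g, *args) not in new_delta:
                new_delta.add((s, g, *args))
                todo.extend(args)
    return new_delta

PHI = {
    "f": ("f", "x1", "x2"),
    "g": ("g", ("a",), "x1", ("b",)),
    "h": ("h", "x1", ("g", ("c",))),
    "d": ("d",),
}
DELTA = {
    ("q", "f", "q1", "q2"),
    ("q1", "g", "q1"), ("q1", "d"),
    ("q2", "h", "q2"), ("q2", "d"),
}
for rule in sorted(image_automaton(DELTA, PHI), key=repr):
    print(rule)
```

On this input the sketch produces nine rules: the five rules of kind $(⊛)$ and the four rules of kind $(⊛⊛)$ for the term states $a$, $b$, $g(c)$ and $c$.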

$φ(L(𝒜)) ⊆ L(𝒜')$

By induction on $n$, $∀n, ∀q∈Q:$

\[q ⟶^n_𝒜 t \text{ implies } q ⟶^\ast_{𝒜'} φ(t)\]
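  • Base case: $q ⟶_𝒜 a^{(0)}$. Let $t' ≝ φ(a) ∈ T(𝔉')$: by $(⊛)$, $q ⟶_{𝒜'} t'$ where the immediate subterms of $t'$ are read as states, and by $(⊛⊛)$ these states derive themselves as trees, so $q ⟶_{𝒜'} t' ⟶^\ast_{𝒜'} t'$

  • Induction step: $q ⟶_𝒜 f(q_1, \ldots, q_m)$ and $q_i ⟶^{n_i}_𝒜 t_i \quad ∀ 1 ≤ i ≤ m, n_i < n$

    By def: $φ(f(t_1, \ldots, t_m)) = φ(f)[x_i ⟼ φ(t_i)]$:

    let $φ(f) = g^{(k)}(t_1', \ldots, t_k')$ thus $φ(f(t_1, \ldots, t_m)) = g^{(k)}(t_1'[x_i ⟼ φ(t_i)], \ldots, t_k'[x_i ⟼ φ(t_i)])$

    Then, by $(⊛)$, $q ⟶_{𝒜'} g^{(k)}(t_1'[x_i ⟼ q_i], \ldots, t_k'[x_i ⟼ q_i])$, where the arguments are states of $Q'$

    By $(⊛⊛)$, $∀ 1 ≤ j ≤k, \; t_j'[x_i ⟼ q_i] ⟶^\ast_{𝒜'} t_j'[x_i ⟼ q_i]$, from the state on the left to the term with leaves $q_i$ on the right

    By I.H. $q_i ⟶^\ast_{𝒜'} φ(t_i)$, hence $q ⟶^\ast_{𝒜'} g^{(k)}(t_1'[x_i ⟼ φ(t_i)], \ldots, t_k'[x_i ⟼ φ(t_i)]) = φ(f(t_1, \ldots, t_m))$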

$L(𝒜') ⊆ φ(L(𝒜))$

Left as an exercise.


For this converse inclusion, linearity is necessary: the proof goes bottom-up, and for instance with

\[φ(f^{(1)}) = f^{(2)}(x_1, x_1)\]

each occurrence of $x_1$ leads to $q_1$, but the trees recognized on the left and on the right might be different.


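Th: Regular tree languages are closed under inverse tree homomorphisms.

Ex: with the homomorphism $φ$ sending $L_2$ to $L_1$,

\[L_2 = φ^{-1}(L_1)\]

so, since $L_2$ is not regular, $L_1$ is not regular either.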

Decision problems

Uniform Membership

  • input: NTFA $𝒜$, $t ∈ T(𝔉)$

  • question: $t ∈ L(𝒜)$ ?

  1. Build $𝒜_t$ with $L(𝒜_t) = \lbrace t \rbrace$

  2. Build $𝒜’$ with $L(𝒜’) = L(𝒜) ∩ \lbrace t \rbrace$

  3. Check whether $L(𝒜’) ≠ ∅$

⟶ Reduction from Uniform Membership to Emptiness

In linear time for DFTA

$∈ LogDCFL$: the class of languages that are logspace-reducible to a deterministic context-free language.

Alternatively: in alternating logspace, and ALogspace = P

For words: in NLogspace, since an accepted word can be taken of length bounded by the number of states
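Membership can also be tested directly, without building the product automaton explicitly. The sketch below is not the reduction above but a bottom-up evaluation: it computes, for each subtree, the set of states that derive it, which amounts to computing the productive states of $𝒜 × 𝒜_t$. The encoding (transitions as tuples of strings, trees as nested tuples) and the automaton `DELTA` are assumptions made for illustration.

```python
def states_of(delta, t):
    """Set of states q with q ->* t, for rules (q, f, q_1, ..., q_n) in delta."""
    symbol, *children = t
    child_states = [states_of(delta, c) for c in children]
    return {
        q
        for q, f, *qs in delta
        if f == symbol
        and len(qs) == len(children)
        and all(qi in s for qi, s in zip(qs, child_states))
    }

def accepts(delta, final_states, t):
    """True if some final state derives t."""
    return bool(states_of(delta, t) & set(final_states))

# Hypothetical automaton: q -> f(q1, q2), q1 -> g(q1) | d, q2 -> h(q2) | d
DELTA = {
    ("q", "f", "q1", "q2"),
    ("q1", "g", "q1"), ("q1", "d"),
    ("q2", "h", "q2"), ("q2", "d"),
}
t = ("f", ("g", ("d",)), ("h", ("h", ("d",))))
print(accepts(DELTA, {"q"}, t))            # True
print(accepts(DELTA, {"q"}, ("g", ("d",))))  # False
```

Each node of $t$ is visited once and each transition is examined once per node, so the running time is polynomial in the sizes of $𝒜$ and $t$.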

Emptiness

P-complete

  • input: NTFA $𝒜$

  • question: $L(𝒜) = ∅$ ?

(Reduces to Horn-SAT)

  1. compute the co-accessible states in P-Time: \(w ≝ \lbrace q ∈ Q \mid ∃ t ∈ T(𝔉); q ⟶^\ast_𝒜 t \rbrace\); then $L(𝒜) = ∅$ iff $w ∩ Q_f = ∅$

check this algorithm

  2. compute accessible and co-accessible states:
\[Q' ≝ \lbrace q ∈ w \mid ∃ q_f ∈ Q_f; ∃ C ∈ C(𝔉 ∪ w); q_f ⟶^\ast_𝒜 C[q] \rbrace\]

in NL
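The first step is a least fixpoint computation, in the same spirit as unit propagation for Horn-SAT. A minimal sketch, assuming transitions are encoded as tuples `(q, f, q_1, ..., q_n)` of strings; the function names are hypothetical and the loop is the naive quadratic version, not an optimized one.

```python
def productive_states(delta):
    """States q such that q ->* t for some tree t (the set w), as a least fixpoint."""
    productive = set()
    changed = True
    while changed:
        changed = False
        for q, _f, *qs in delta:
            # q derives a tree as soon as all states below it do.
            if q not in productive and all(qi in productive for qi in qs):
                productive.add(q)
                changed = True
    return productive

def is_empty(delta, final_states):
    """L(A) is empty iff no final state derives a tree."""
    return not (productive_states(delta) & set(final_states))

DELTA = {
    ("q", "f", "q1", "q2"),
    ("q1", "g", "q1"), ("q1", "d"),
    ("q2", "h", "q2"), ("q2", "d"),
}
print(is_empty(DELTA, {"q"}))                  # False
print(is_empty({("q", "f", "q", "q")}, {"q"}))  # True
```

Each pass over the transitions either adds a state or stops, so there are at most $\vert Q \vert + 1$ passes.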

Universality

  • input: a NTFA $𝒜$

  • question: $L(𝒜) = T(𝔉)$ ?

EXP-complete:

in EXP:

  • determinize $𝒜$: $O(2^n)$
  • complement: $O(2^n)$
  • test for emptiness: $poly(2^n)$

⟶ In the word case: PSPACE-complete (check this algorithm) ⟶ here: APSPACE-complete, and APSPACE = EXP
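The EXP upper bound can be illustrated by a bottom-up subset construction that only builds the reachable subsets: $𝒜$ is universal iff every reachable subset contains a final state. This is a sketch under assumptions (transitions as tuples of strings, the ranked alphabet as a dictionary from symbols to arities, hypothetical function names); it is exponential in the worst case, as expected.

```python
from itertools import product

def reachable_subsets(delta, alphabet):
    """Reachable states of the determinized automaton, as frozensets of states.

    alphabet: dict mapping each symbol to its arity.
    """
    reached = set()
    changed = True
    while changed:
        changed = False
        for f, arity in alphabet.items():
            # product() copies its input, so adding to `reached` below is safe.
            for args in product(list(reached), repeat=arity):
                s = frozenset(
                    q
                    for q, g, *qs in delta
                    if g == f and len(qs) == arity
                    and all(qi in a for qi, a in zip(qs, args))
                )
                if s not in reached:
                    reached.add(s)
                    changed = True
    return reached

def is_universal(delta, final_states, alphabet):
    """True iff every tree over the alphabet is accepted."""
    finals = set(final_states)
    return all(s & finals for s in reachable_subsets(delta, alphabet))

ALPHABET = {"a": 0, "g": 1}
print(is_universal({("q", "a"), ("q", "g", "q")}, {"q"}, ALPHABET))  # True
print(is_universal({("q", "a"), ("p", "g", "q")}, {"q"}, ALPHABET))  # False
```

In the second automaton the tree $g(a)$ is only derived from $p$, which is not final, so the subset $\lbrace p \rbrace$ witnesses non-universality.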
