Lecture 3: Monadic Second Order Logic

  digraph {
    rankdir=TB;
    NFTA -> EMSO -> MSO;
    MSO -> {WSkS NFTA};
    WSkS -> NFTA;
  }
  • $EMSO$: Existential MSO
  • $WSkS$: Weak monadic second order with $k$ successors

MSO on finite trees

  • $𝒳_1, 𝒳_2$ two countable sets of variables
  • $𝔉$ finite ranked alphabet
MSO formulae defined by abstract syntax:

For $x,y∈𝒳_1, \; X ∈ 𝒳_2$: \(ψ ≝ \underbrace{P_f(x)}_{f ∈ 𝔉} \mid \underbrace{x \downarrow_i y}_{1 ≤ i ≤ k, \text{ where } k \text{ is the maximal arity in } 𝔉} \mid x = y \\ \mid x ∈ X \mid ¬ ψ \mid ψ ∧ ψ \mid ∃ x. ψ \mid ∃ X. ψ\)

Trees as relational structures

Viewed as a function $Pos(t) ⟶ 𝔉$, a tree $t$ defines a relational structure

\[⟨Pos(t), \; (\downarrow_i)_{1≤i≤k}, \; (P_f)_{f ∈𝔉}⟩\]

where the $\downarrow_i$ have arity 2 and $P_f$ arity 1:

  • $\downarrow_i^t ⊆ Pos(t) × Pos(t)$
  • $P_f^t ⊆ Pos(t)$

We define them by:

  • \[\downarrow_i^t ≝ \lbrace (p, p \cdot i) \mid p ∈ Pos(t), \; p \cdot i ∈ Pos(t) \rbrace\]
  • \[P_f^t ≝ \lbrace p ∈ Pos(t) \mid t(p) = f \rbrace\]
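
To make the two definitions concrete, here is a minimal Python sketch that builds the structure for a small tree. The encoding is an assumption made for illustration only: a tree is a nested tuple `(label, child_1, ..., child_n)`, a position is a tuple of 1-based child indices (so `()` plays the role of $ε$), and the names `labelled_positions` and `relational_structure` are hypothetical.

```python
def labelled_positions(t, p=()):
    """Yield (position, label) pairs of a tree given as (label, child_1, ..., child_n)."""
    label, *children = t
    yield p, label
    for i, child in enumerate(children, start=1):
        yield from labelled_positions(child, p + (i,))


def relational_structure(t, k):
    """Return (Pos(t), {i: down_i}, {f: P_f}) for a tree of maximal arity k."""
    lab = dict(labelled_positions(t))
    pos = set(lab)
    # down_i relates p to p.i whenever both are positions of t
    down = {i: {(p, p + (i,)) for p in pos if p + (i,) in pos}
            for i in range(1, k + 1)}
    # P_f collects the positions labelled by f
    pred = {}
    for p, f in lab.items():
        pred.setdefault(f, set()).add(p)
    return pos, down, pred


# Hypothetical example tree f(a, g(b)).
t = ("f", ("a",), ("g", ("b",)))
pos, down, pred = relational_structure(t, k=2)
```

On this example, `down[1]` contains the pairs $(ε, 1)$ and $(2, 21)$, and `pred["g"]` is the singleton containing position $2$.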

Semantics

Valuations:

  • $v_1: 𝒳_1 ⟶ Pos(t)$
  • $v_2: 𝒳_2 ⟶ 2^{Pos(t)}$

$t ⊨_{v_1, v_2} ψ$ in the following cases:

  1. $t ⊨_{v_1, v_2} P_f(x)$ iff $v_1(x)∈ P_f^t$

  2. $t ⊨_{v_1, v_2} x \downarrow_i y$ iff $(v_1(x), v_1(y))∈ \downarrow_i^t$

  3. $t ⊨_{v_1, v_2} x=y$ iff $v_1(x) = v_1(y)$

  4. $t ⊨_{v_1, v_2} x ∈ X$ iff $v_1(x)∈ v_2(X)$

  5. $t ⊨_{v_1, v_2} ¬ψ$ iff $t \not⊨_{v_1, v_2} ψ$

  6. $t ⊨_{v_1, v_2} ψ ∧ φ$ iff $t ⊨_{v_1, v_2} ψ$ and $t ⊨_{v_1, v_2} φ$

  7. $t ⊨_{v_1, v_2} ∃x. ψ$ iff $∃p ∈ Pos(t)$ s.t. $t ⊨_{v_1[x ⟼ p], v_2} ψ$

  8. $t ⊨_{v_1, v_2} ∃X. ψ$ iff $∃P ⊆ Pos(t)$ s.t. $t ⊨_{v_1, v_2[X ⟼ P]} ψ$
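
The eight clauses translate directly into a brute-force evaluator. The following Python sketch is only an illustration (it enumerates all subsets of $Pos(t)$, so it is exponential): formulas are encoded as nested tuples with hypothetical tags `"P"`, `"down"`, `"eq"`, `"in"`, `"not"`, `"and"`, `"ex1"`, `"ex2"`, and trees as nested tuples `(label, child_1, ..., child_n)`; none of these names come from the lecture.

```python
from itertools import combinations


def labelling(t, p=()):
    """Map each position (tuple of 1-based child indices) to its label."""
    label, *children = t
    lab = {p: label}
    for i, child in enumerate(children, start=1):
        lab.update(labelling(child, p + (i,)))
    return lab


def subsets(items):
    """Enumerate all subsets of a finite collection."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def holds(lab, phi, v1, v2):
    """Decide t |= phi under valuations v1 (positions) and v2 (sets of positions)."""
    op = phi[0]
    if op == "P":                      # clause 1: P_f(x)
        _, f, x = phi
        return lab[v1[x]] == f
    if op == "down":                   # clause 2: x down_i y
        _, i, x, y = phi
        return v1[y] == v1[x] + (i,)
    if op == "eq":                     # clause 3: x = y
        return v1[phi[1]] == v1[phi[2]]
    if op == "in":                     # clause 4: x in X
        return v1[phi[1]] in v2[phi[2]]
    if op == "not":                    # clause 5
        return not holds(lab, phi[1], v1, v2)
    if op == "and":                    # clause 6
        return holds(lab, phi[1], v1, v2) and holds(lab, phi[2], v1, v2)
    if op == "ex1":                    # clause 7: some position p
        _, x, body = phi
        return any(holds(lab, body, {**v1, x: p}, v2) for p in lab)
    if op == "ex2":                    # clause 8: some subset P of Pos(t)
        _, X, body = phi
        return any(holds(lab, body, v1, {**v2, X: P}) for P in subsets(lab))
    raise ValueError(f"unknown connective: {op!r}")


def disj(a, b):
    """a or b, expressed with the primitive connectives."""
    return ("not", ("and", ("not", a), ("not", b)))


lab = labelling(("f", ("a",), ("b",)))          # hypothetical tree f(a, b)
some_b = ("ex1", "x", ("P", "b", "x"))          # "some node is labelled b"
root_x = ("not", ("ex1", "y", disj(("down", 1, "y", "x"),
                                   ("down", 2, "y", "x"))))
```

For instance, `holds(lab, root_x, {"x": ()}, {})` evaluates the formula "no position has $x$ as a child" at the root.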


  • FO (first-order logic) is defined similarly, without $∃ X. ψ$ or $x ∈ X$

  • EMSO is defined by the formulae $∃X_1, X_2, \ldots, X_n. ψ$ where

    • $n≥0$
    • $ψ$ is FO + $(x ∈ X_i)$, that is: FO + the ability to create new unary predicates
  • $fv_1(ψ)$ (resp. $fv_2(ψ)$) denotes the set of free first-order (resp. second-order) variables of $ψ$

    • of course, $t ⊨_{v_1, v_2} ψ$ depends only on the values of $v_1, v_2$ on $fv_1(ψ), fv_2(ψ)$
The language of $ψ$ is:
\[L(ψ) ≝ \lbrace t ∈ T(𝔉) \mid ∃v_1: fv_1(ψ) ⟶ Pos(t), \; v_2: fv_2(ψ) ⟶ 2^{Pos(t)} \text{ s.t. } t ⊨_{v_1, v_2} ψ \rbrace\]

NB:

  • with the convention that: \(L(ψ(x_1, \ldots, x_n, X_1, \ldots, X_m)) = L(∃x_1, \ldots, x_n, ∃ X_1, \ldots, X_m. ψ)\)
  • $∀, ∨, ⟹$ are defined as usual
$ψ$ is a sentence:

iff $fv_1(ψ), fv_2(ψ)$ are empty

Ex:

$v_1(x)$ is the parent of $v_1(y)$:
\[x \downarrow y ≝ \bigvee_{1≤ i ≤k} x \downarrow_i y\]
$v_1(x) = ε$:
\[root(x) ≝ ¬ ∃y. y \downarrow x\]
$v_2(X) ⊆ v_2(Y)$:
\[X ⊆ Y ≝ ∀x. x∈X ⟹ x ∈ Y\]
$v_2(X) ∩ v_2(Y) = v_2(Z)$:
\[X ∩ Y = Z ≝ ∀x. \Big((x∈X ∧ x ∈ Y) ⟹ x ∈ Z\Big) \\ ∧ \Big(x∈Z ⟹ (x ∈ X ∧ x∈Y)\Big)\]
$X = \bigcup_{1 ≤ i ≤ n} X_i$:
\[X = \bigcup_{1 ≤ i ≤ n} X_i ≝ ∀x. \Big(x ∈ X ⟹ \bigvee_{1 ≤ i ≤n} x ∈ X_i\Big) ∧ \Big(\bigvee_{1 ≤ i ≤n} x ∈ X_i ⟹ x∈X\Big)\]
$v_1(y)$ is a descendant of $v_1(x)$ (possibly $v_1(x)$ itself):
\[x \downarrow_\ast y ≝ ∀X. \Big(x∈X ∧ \downarrow\text{-closed}(X)\Big) ⟹ y∈X\]

where

\[\downarrow\text{-closed}(X) ≝ ∀z. z∈X ⟹ \Big(∀z'. z \downarrow z' ⟹ z'∈X\Big)\]

The formula holds exactly when $v_1(y)$ belongs to the least fixed point:

\[\downarrow_\ast(p) ≝ \lbrace p' ∈ Pos(t) \mid p \downarrow_\ast p'\rbrace\]

which, for $p = v_1(x)$, is the smallest set $P$ s.t. $p∈P$ and $P$ is $\downarrow$-closed.
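
As a sanity check, the following Python sketch compares the universally quantified formula with the least $\downarrow$-closed set on a small example, by brute force over all subsets. It assumes positions are tuples of 1-based child indices and uses the positions of a hypothetical tree $f(a, g(b, c))$; the function names are illustrative. Note that the formula also holds when both variables denote the same position, so what it captures is the reflexive descendant relation.

```python
from itertools import combinations

# Positions of a hypothetical tree f(a, g(b, c)).
POS = [(), (1,), (2,), (2, 1), (2, 2)]


def is_child(p, q):
    """q is a child of p (q = p.i for some i)."""
    return len(q) == len(p) + 1 and q[:len(p)] == p


def down_closed(X):
    """X contains every child of each of its elements."""
    return all(q in X for p in X for q in POS if is_child(p, q))


def all_subsets():
    for r in range(len(POS) + 1):
        for combo in combinations(POS, r):
            yield set(combo)


def in_every_closed_superset(p, q):
    """Semantics of: forall X. (p in X and down-closed(X)) implies q in X."""
    return all(q in X for X in all_subsets() if p in X and down_closed(X))


def least_closed_set(p):
    """Positions having p as a prefix: the subtree rooted at p."""
    return {q for q in POS if q[:len(p)] == p}


agree = all(in_every_closed_superset(p, q) == (q in least_closed_set(p))
            for p in POS for q in POS)
```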

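NB: $x \downarrow_+ y ≝ ∃X. x ∈ X \ ∧ \downarrow\text{-closed}(X) \ ∧ ∀z. z \downarrow x ⟹ z∈X \ ∧ y∈X ∧ y≠x$ doesn’t work: an existential quantifier cannot single out the least $\downarrow$-closed set, and e.g. $X = Pos(t)$ is a witness whenever $v_1(y) ≠ v_1(x)$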


\[leaf(x) ≝ ¬ ∃y. x \downarrow y\]

\[branch(X) ≝ ∃x. leaf(x) ∧ ∀ y. y∈X ⟺ y \downarrow_\ast x\]

“in every branch, every $a$-labelled node has a $b$-labelled parent”:
\[∀X. branch(X) ⟹ \Big(∀x. x∈X ∧ P_a(x) ⟹ ∃y. y \downarrow x ∧ P_b(y) ∧ y ∈ X\Big)\]

Th: Let $L$ be a recognizable tree language over $𝔉$. Then \(L = L(ψ)\) for an EMSO sentence $ψ$

Proof:

$L = L(𝒜)$ for an NFTA $𝒜 ≝ ⟨Q, 𝔉, Δ, Q_f⟩$

$t∈L$ iff there exists an accepting run $ρ: Pos(t)⟶ Q$

Let $Q ≝ \lbrace q_1, \ldots, q_n \rbrace$. We define

\[ψ ≝ ∃ X_{q_1}, \ldots, X_{q_n}. ψ'\]

where

\[t ⊨_{v_1, v_2} ψ'\]

means that “$t$ + labels in $Q$ form an accepting run $ρ$”

\[\text{For all } n: \quad partition(X, X_1, \ldots, X_n) ≝ X = \bigcup_{1 ≤ i ≤ n} X_i \\ ∧ ∃ Z. \Big((∀x. x∉Z) ∧ \bigwedge_{1 ≤ i < j ≤ n} X_i ∩ X_j = Z\Big)\]

(“$Z = ∅$”, “the $X_i$’s are disjoint”)

\[ψ' ≝ ∃ X. (∀x. x∈ X) \qquad \text{“}X = Pos(t)\text{”} \\ ∧ partition(X, X_{q_1}, \ldots, X_{q_n}) \\ ∧ ∀x. \Big(root(x) ⟹ \bigvee_{q ∈ Q_f} x ∈ X_q\Big) \qquad \text{“}ρ(ε) ∈ Q_f\text{”} \\ ∧ ∀x. \bigwedge_{f ∈ 𝔉} \bigwedge_{q ∈ Q} \Bigg(P_f(x) ∧ x ∈ X_q \\ ⟹ \bigvee_{(q, f, q_1, \ldots, q_r) ∈ Δ} \bigwedge_{1 ≤ i ≤ r} ∃x_i. x \downarrow_i x_i ∧ x_i ∈ X_{q_i} \Bigg) \qquad \text{“}ρ \text{ consistent with } Δ\text{”}\]
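
The conjuncts of $ψ'$ can be read as a checker for a candidate run. The Python sketch below is an illustration under assumptions: the automaton (states `q0`, `q1`, accepting the trees over $a, b, f$ that contain at least one $b$) is a made-up example, $Δ$ is stored as a set of tuples $(q, f, q_1, \ldots, q_r)$, and guessing the sets $X_q$ is simulated by enumerating all functions $ρ: Pos(t) ⟶ Q$.

```python
from itertools import product


def labelling(t, p=()):
    """Map each position (tuple of 1-based child indices) to its label."""
    label, *children = t
    lab = {p: label}
    for i, child in enumerate(children, start=1):
        lab.update(labelling(child, p + (i,)))
    return lab


# Hypothetical NFTA: q1 means "the subtree contains a b".
STATES = ["q0", "q1"]
FINAL = {"q1"}
DELTA = {
    ("q0", "a"), ("q1", "b"),
    ("q0", "f", "q0", "q0"),
    ("q1", "f", "q0", "q1"), ("q1", "f", "q1", "q0"), ("q1", "f", "q1", "q1"),
}


def is_accepting_run(lab, rho):
    """rho maps every position to one state, so the sets X_q partition Pos(t)."""
    if rho[()] not in FINAL:                      # the root carries a final state
        return False
    for p, f in lab.items():                      # rho is consistent with DELTA
        kids = []
        i = 1
        while p + (i,) in lab:
            kids.append(rho[p + (i,)])
            i += 1
        if (rho[p], f, *kids) not in DELTA:
            return False
    return True


def accepts(t):
    """Existential quantification over X_{q_1}, ..., X_{q_n}: try every labelling."""
    lab = labelling(t)
    pos = sorted(lab)
    return any(is_accepting_run(lab, dict(zip(pos, states)))
               for states in product(STATES, repeat=len(pos)))
```

Enumerating all labellings is of course exponential; it only mirrors the shape of the formula, whereas an actual membership test would compute reachable states bottom-up.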

Th: Let $ψ$ be an MSO sentence over $𝔉$. Then $L(ψ)$ is recognized by an NFTA of size $tower(\vert ψ\vert)$

where \(tower(n) ≝ 2^{2^{\vdots^{2^n} }}\) (tower of size $n$)

Proof:

valuated tree over $fv_1(ψ)$ and $fv_2(ψ)$:

is a tree over $𝔉 × \lbrace 0,1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}$ where $(f, \overline{b})$ is of arity $arity(f)$

\[\begin{cases} t' ≝ val_{v_1, v_2}(t) \\ t = π_1(t') \end{cases}\]

\[V(ψ) ≝ \lbrace t' ∈ T(𝔉× \lbrace 0, 1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}) \mid ∃ t ∈ T(𝔉), v_1, v_2. \; t' = val_{v_1, v_2}(t) ∧ t ⊨_{v_1, v_2} ψ\rbrace\]

the set of valuated trees of $ψ$.

NB: if $ψ$ is a sentence, then $L(ψ) = V(ψ)$

Claim: If $ψ$ is an MSO formula over $𝔉$, then $V(ψ)$ is recognized by an NFTA $𝒜_ψ$ over $𝔉 × \lbrace 0, 1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}$

Proof:

By induction over $ψ$. We need a few basic constructions on NFTA first.

$𝒜_{singleton}$ with language \(\lbrace t ∈ T(𝔉 × \lbrace 0, 1 \rbrace) \mid \vert \lbrace p ∈ Pos(t) \mid ∃f∈𝔉; \; t(p) = (f, 1) \rbrace \vert = 1\rbrace\)

$Q = \lbrace q_0, q_1 \rbrace, Q_f = \lbrace q_1 \rbrace$

for all $f∈𝔉$:

  • $\Big((f, 0)^{(n)}, \underbrace{q_0, \ldots, q_0}_{n \text{ times}}\Big) ⟶ q_0$
  • $\Big((f, 0)^{(n)}, \underbrace{q_0, \ldots, q_1, \ldots, q_0}_{n \text{ times, with one } \; q_1}\Big) ⟶ q_1$
  • $\Big((f, 1)^{(n)}, \underbrace{q_0, \ldots, q_0}_{n \text{ times }}\Big) ⟶ q_1$
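
A minimal Python sketch of $𝒜_{singleton}$, assuming a hypothetical alphabet with a constant $a$ and a binary symbol $f$. Rules are stored as triples `(label, child_states, target)` and trees as nested tuples whose first component is the pair `(f, bit)`; membership is decided by computing the set of reachable states bottom-up.

```python
ARITY = {"a": 0, "f": 2}          # hypothetical ranked alphabet
FINAL = {"q1"}


def singleton_rules():
    """The three rule schemes above, instantiated for every symbol of ARITY."""
    rules = set()
    for f, n in ARITY.items():
        rules.add(((f, 0), ("q0",) * n, "q0"))          # no mark seen so far
        rules.add(((f, 1), ("q0",) * n, "q1"))          # the mark is here
        for j in range(n):                              # the mark is below child j+1
            kids = ("q0",) * j + ("q1",) + ("q0",) * (n - j - 1)
            rules.add(((f, 0), kids, "q1"))
    return rules


def reachable(t, rules):
    """Set of states some run can assign to the root of t."""
    label, *children = t
    child_states = [reachable(c, rules) for c in children]
    return {q for (g, kids, q) in rules
            if g == label and len(kids) == len(children)
            and all(s in S for s, S in zip(kids, child_states))}


def accepts(t, rules, final):
    return bool(reachable(t, rules) & final)


RULES = singleton_rules()
one_mark = (("f", 0), (("a", 1),), (("a", 0),))
no_mark = (("f", 0), (("a", 0),), (("a", 0),))
two_marks = (("f", 1), (("a", 1),), (("a", 0),))
```

Only `one_mark` is accepted: with no mark the root can only reach $q_0$, and with two marks no rule applies at the root.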

$π_{i, n}$: linear tree homomorphism:
\[π_{i, n}\big((f, b_0, \ldots, b_i, \ldots, b_n)^{(r)}\big) ≝ (f, b_0, \ldots, b_{i-1}, b_{i+1}, \ldots, b_n)^{(r)}(x_1, \ldots, x_r)\]

Base case: $𝒜_{P_f(x)}$ over $𝔉 × \lbrace 0, 1 \rbrace$

\[Q = \lbrace q_0, q_1 \rbrace, \; Q_f = \lbrace q_1 \rbrace\]
  • $(g, 0)^{(n)}(q_0, \ldots, q_0) ⟶ q_0$ for all $g∈ 𝔉$
  • $(f, 1)^{(n)}(q_0, \ldots, q_0) ⟶ q_1$
  • $(g, 0)^{(n)}(q_0, \ldots, q_1, \ldots, q_0) ⟶ q_1$ for all $g∈ 𝔉$

Base case: $𝒜_{x \downarrow_i y}$ over $𝔉 × \underbrace{\lbrace 0, 1 \rbrace}_{x} × \underbrace{\lbrace 0, 1 \rbrace}_{y}$

\[Q = \lbrace q_0, q_x, q_y \rbrace, \; Q_f = \lbrace q_x \rbrace\]

for all $f∈ 𝔉$

  • $(f, 0, 0)^{(n)}(q_0, \ldots, q_0) ⟶ q_0$
  • $(f, 0, 1)^{(n)}(q_0, \ldots, q_0) ⟶ q_y$
  • $(f, 1, 0)^{(n)}(q_0, \ldots, \underbrace{q_y}_{i\text{-th position}}, \ldots, q_0) ⟶ q_x$ where $n ≥ i, \; f ∈ 𝔉_n$
  • $(f, 0, 0)^{(n)}(q_0, \ldots, q_x, \ldots, q_0) ⟶ q_x$

Base case: $𝒜_{x = y}$ over $𝔉 × \underbrace{\lbrace 0, 1 \rbrace}_{x} × \underbrace{\lbrace 0, 1 \rbrace}_{y}$

\[Q = \lbrace q_0, q_1 \rbrace, \; Q_f = \lbrace q_1 \rbrace\]

for all $f∈ 𝔉$

  • $(f, 0, 0)^{(n)}(q_0, \ldots, q_0) ⟶ q_0$
  • $(f, 1, 1)^{(n)}(q_0, \ldots, q_0) ⟶ q_1$
  • $(f, 0, 0)^{(n)}(q_0, \ldots, q_1, \ldots, q_0) ⟶ q_1$

Base case: $𝒜_{x ∈ X}$ over $𝔉 × \underbrace{\lbrace 0, 1 \rbrace}_{x} × \underbrace{\lbrace 0, 1 \rbrace}_{X}$

\[Q = \lbrace q_0, q_1 \rbrace, \; Q_f = \lbrace q_1 \rbrace\]

for all $f∈ 𝔉, \; b∈ \lbrace 0, 1 \rbrace$

  • $(f, 0, b)^{(n)}(q_0, \ldots, q_0) ⟶ q_0$
  • $(f, 0, b)^{(n)}(q_0, \ldots, q_1, \ldots, q_0) ⟶ q_1$
  • $(f, 1, 1)^{(n)}(q_0, \ldots, q_0) ⟶ q_1$

Induction step: $𝒜_{¬ ψ}$ over $𝔉 × \lbrace 0, 1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}$

by induction hypothesis, we have constructed $𝒜_ψ$. We complement it over $𝔉 × \lbrace 0, 1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}$ to obtain $𝒜_{¬ ψ}$ (exponential blow-up!)
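
The blow-up comes from determinization: complementing an NFTA by swapping final and non-final states is only sound on a complete deterministic automaton, whose states are sets of original states. The Python sketch below illustrates this with a standard subset construction restricted to reachable subsets; the input is the singleton automaton over a hypothetical alphabet ($a$ constant, $f$ binary), and all names are illustrative.

```python
from itertools import product

# Singleton automaton over the hypothetical alphabet {a (arity 0), f (arity 2)} x {0, 1}.
ARITY = {("a", 0): 0, ("a", 1): 0, ("f", 0): 2, ("f", 1): 2}
FINAL = {"q1"}
RULES = {
    (("a", 0), (), "q0"), (("a", 1), (), "q1"),
    (("f", 0), ("q0", "q0"), "q0"), (("f", 1), ("q0", "q0"), "q1"),
    (("f", 0), ("q1", "q0"), "q1"), (("f", 0), ("q0", "q1"), "q1"),
}


def determinize(rules, arity):
    """Subset construction: deterministic states are frozensets of states."""
    states, table = set(), {}
    changed = True
    while changed:
        changed = False
        for label, n in arity.items():
            for kids in product(list(states), repeat=n):
                if (label, kids) in table:
                    continue
                # all states reachable from some choice of child states
                target = frozenset(
                    q for (g, ks, q) in rules
                    if g == label and all(k in S for k, S in zip(ks, kids)))
                table[(label, kids)] = target
                states.add(target)      # the empty set acts as a sink state
                changed = True
    return states, table


def state_of(t, table):
    """The unique state the deterministic automaton assigns to t."""
    label, *children = t
    return table[(label, tuple(state_of(c, table) for c in children))]


def complement_accepts(t, table, final):
    """Accept iff the original automaton has no accepting run on t."""
    return not (state_of(t, table) & final)


DSTATES, TABLE = determinize(RULES, ARITY)
```

Here only three subsets are reachable ($\lbrace q_0 \rbrace$, $\lbrace q_1 \rbrace$ and $∅$), but in general up to $2^{\vert Q \vert}$ may be.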

Induction step: $𝒜_{ψ ∧ ψ'}$ over $𝔉 × \lbrace 0, 1 \rbrace^{\sum\limits_{ i=1 }^2 \vert fv_i(ψ ∧ ψ') \vert}$

by induction hypothesis, we have constructed $𝒜_ψ$ and $𝒜_{ψ'}$ over $𝔉 × \lbrace 0, 1 \rbrace^{\sum\limits_{ i=1 }^2 \vert fv_i(ψ) \vert}$ and $𝔉 × \lbrace 0, 1 \rbrace^{\sum\limits_{ i=1 }^2 \vert fv_i(ψ') \vert}$ respectively.

For each variable in $\big(fv_1(ψ ∧ ψ') \backslash fv_1(ψ')\big) ∪ \big(fv_2(ψ ∧ ψ') \backslash fv_2(ψ')\big)$, we apply an inverse projection, which yields

\[π^{-1} ⋯ π^{-1} π^{-1}_{\text{the right index}} (𝒜_{ψ'}) ≝ 𝒜'_{ψ'}\]

We do the same for the variables missing from $ψ$, which yields $𝒜'_ψ$. Both automata are now over $𝔉 × \lbrace 0, 1 \rbrace^{\sum\limits_{ i=1 }^2 \vert fv_i(ψ ∧ ψ') \vert}$, and $𝒜_{ψ ∧ ψ'}$ is constructed as the intersection of $𝒜'_ψ$ and $𝒜'_{ψ'}$.
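
A Python sketch of the two ingredients, under assumptions: labels are flat tuples `(f, bit, ...)`, rules are triples `(label, child_states, target)`, the alphabet ($a$, $b$ constants, $f$ binary) is hypothetical, and the running example is $P_a(x) ∧ P_a(y)$, built from two copies of the base-case automaton $𝒜_{P_a(x)}$. The inverse projection inserts an unconstrained bit, and the intersection is the usual product.

```python
ARITY = {"a": 0, "b": 0, "f": 2}   # hypothetical ranked alphabet


def rules_P_a():
    """Rules of A_{P_a(x)} over labels (g, bit), following the base case."""
    rules = {(("a", 1), (), "q1")}
    for g, n in ARITY.items():
        rules.add(((g, 0), ("q0",) * n, "q0"))
        for j in range(n):
            kids = ("q0",) * j + ("q1",) + ("q0",) * (n - j - 1)
            rules.add(((g, 0), kids, "q1"))
    return rules


def cylindrify(rules, i):
    """Inverse projection: insert an unconstrained bit at component i of each label."""
    return {(label[:i] + (b,) + label[i:], kids, q)
            for (label, kids, q) in rules for b in (0, 1)}


def intersect(rules1, rules2):
    """Product automaton: run both automata in lockstep on the same label."""
    return {(l1, tuple(zip(k1, k2)), (q1, q2))
            for (l1, k1, q1) in rules1
            for (l2, k2, q2) in rules2
            if l1 == l2}


def reachable(t, rules):
    """Set of states some run can assign to the root of t."""
    label, *children = t
    child_states = [reachable(c, rules) for c in children]
    return {q for (g, kids, q) in rules
            if g == label and len(kids) == len(children)
            and all(s in S for s, S in zip(kids, child_states))}


A_x = cylindrify(rules_P_a(), 2)   # labels (g, bit_x, bit_y), bit_y unconstrained
A_y = cylindrify(rules_P_a(), 1)   # labels (g, bit_x, bit_y), bit_x unconstrained
A_and = intersect(A_x, A_y)
FINAL = {("q1", "q1")}

both_on_a = (("f", 0, 0), (("a", 1, 0),), (("a", 0, 1),))
y_on_b = (("f", 0, 0), (("a", 1, 0),), (("b", 0, 1),))
```

The product accepts `both_on_a` and rejects `y_on_b`, where the position marked for $y$ is labelled $b$.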

Induction step: $𝒜_{∃x. ψ}$

by induction hypothesis, we have constructed $𝒜_ψ$ over $𝔉 × \lbrace 0, 1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}$

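We define $𝒜_{∃x. ψ}$ as the projection of the intersection of $𝒜_ψ$ with the inverse-projected singleton automaton, which forces $x$ to be interpreted by exactly one position:

\[π_{\text{the right index}} \Big(𝒜_ψ ∩ π^{-1} ⋯ π^{-1} π^{-1} (𝒜_{singleton}) \Big)\]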

Induction step: $𝒜_{∃X. ψ}$

by induction hypothesis, we have constructed $𝒜_ψ$.

We define $𝒜_{∃X. ψ}$ by projection:

\[π_{\text{right index}} (𝒜_ψ)\]
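
On automata, projection simply erases one bit from every label, which may introduce nondeterminism. As a small illustration in Python (rules as triples `(label, child_states, target)`, hypothetical alphabet with $a$ constant and $f$ binary), projecting the $X$ component out of the base-case automaton $𝒜_{x ∈ X}$ should give an automaton for $∃X. x ∈ X$; since that formula holds for every position $x$, the result is expected to coincide with $𝒜_{singleton}$.

```python
ARITY = {"a": 0, "f": 2}   # hypothetical ranked alphabet


def one_q1(n):
    """All n-tuples of states with exactly one q1 among q0's."""
    return [("q0",) * j + ("q1",) + ("q0",) * (n - j - 1) for j in range(n)]


def rules_x_in_X():
    """Base case A_{x in X}; labels are (f, bit_x, bit_X)."""
    rules = set()
    for f, n in ARITY.items():
        for b in (0, 1):
            rules.add(((f, 0, b), ("q0",) * n, "q0"))
            for kids in one_q1(n):
                rules.add(((f, 0, b), kids, "q1"))
        rules.add(((f, 1, 1), ("q0",) * n, "q1"))
    return rules


def rules_singleton():
    """A_singleton; labels are (f, bit_x)."""
    rules = set()
    for f, n in ARITY.items():
        rules.add(((f, 0), ("q0",) * n, "q0"))
        rules.add(((f, 1), ("q0",) * n, "q1"))
        for kids in one_q1(n):
            rules.add(((f, 0), kids, "q1"))
    return rules


def project(rules, i):
    """Erase component i of every label; states and final states are unchanged."""
    return {(label[:i] + label[i + 1:], kids, q) for (label, kids, q) in rules}


projected = project(rules_x_in_X(), 2)
```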

$WSkS$: Weak monadic second order with $k$ successors

\[φ ≝ x = y \underbrace{i}_{1 ≤ i ≤k} \mid x = ε \mid x ∈ X \mid ε ∈ X \\ \mid ¬ φ \mid φ ∧ φ \mid ∃x. φ \mid ∃ X. φ\]

we are quantifying over strings in $\lbrace 1, \ldots, k \rbrace^\ast$.

Semantics

  • $v_1: 𝒳_1 ⟶ \lbrace 1, \ldots, k \rbrace^\ast$
  • $v_2: 𝒳_2 ⟶ P_{fin}(\lbrace 1, \ldots, k \rbrace^\ast)$ (finite subsets)

$⊨_{v_1, v_2} φ$ in the following cases (there is no tree $t$ here: formulae are interpreted over $\lbrace 1, \ldots, k \rbrace^\ast$):

  1. $⊨_{v_1, v_2} x = yi$ iff $v_1(x) = v_1(y)\cdot i$

  2. $⊨_{v_1, v_2} x=ε$ iff $v_1(x) = ε$

  3. $⊨_{v_1, v_2} ε ∈ X$ iff $ε∈ v_2(X)$

the rest is similar to what has been done before.

Interpreting MSO into WSkS

Read Section 3.3 of TATA, especially “coding of trees”, p. 89.

  digraph {
    rankdir=TB;
    NFTA -> EMSO[label="poly"];
    EMSO -> MSO[label="⊆"];
    MSO -> WSkS[label="poly"];
    MSO -> NFTA[label="tower"];
    WSkS -> NFTA[label="tower"];
  }
