Lecture 3: Monadic Second Order Logic
digraph {
rankdir=TB;
NFTA -> EMSO -> MSO;
MSO -> WSkS, NFTA;
WSkS -> NFTA;
}
- $EMSO$: Existential MSO
- $WSkS$: Weak monadic second order with $k$ successors
MSO on finite trees
- $𝒳_1, 𝒳_2$ two countable sets of variables
- $𝔉$ finite ranked alphabet
- MSO formulae defined by abstract syntax:
-
For $x,y∈𝒳_1, \; X ∈ 𝒳_2$: \(ψ ≝ \underbrace{P_f(x)}_{f ∈ 𝔉} \mid \underbrace{x \downarrow_i y}_{1 ≤ i ≤ k, \text{ where } k \text{ is the maximal arity in } 𝔉} \mid x = y \\ \mid x ∈ X \mid ¬ ψ \mid ψ ∧ ψ \mid ∃ x. ψ \mid ∃ X. ψ\)
Trees as relational structures
A tree $t$, seen as a function $Pos(t) ⟶ 𝔉$, defines a relational structure
\[⟨Pos(t), \; (\downarrow_i)_{1≤i≤k}, \; (P_f)_{f ∈𝔉}⟩\]
where the $\downarrow_i$ have arity 2 and the $P_f$ have arity 1:
- $\downarrow_i^t ⊆ Pos(t) × Pos(t)$
- $P_f^t ⊆ Pos(t)$
We define them by:
- \[\downarrow_i^t ≝ \lbrace (p, pi) ∈ Pos(t) × Pos(t) \rbrace\]
- \[P_f^t ≝ \lbrace p ∈ Pos(t) \mid t(p) = f \rbrace\]
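To make the relational structure concrete, here is a minimal Python sketch. The encoding is an assumption of this example, not part of the lecture: a tree is a nested tuple `(label, child_1, ..., child_n)` and a position is a tuple of 1-based child indices, with `()` standing for $ε$. The helper names `positions` and `structure` are hypothetical.

```python
def positions(t, p=()):
    """Yield (position, label) pairs of a tree (label, child_1, ..., child_n)."""
    yield p, t[0]
    for i, child in enumerate(t[1:], start=1):
        yield from positions(child, p + (i,))

def structure(t, k):
    """Return Pos(t), the relations down_1..down_k and the predicates P_f."""
    label = dict(positions(t))
    pos = set(label)
    # down_i contains the pairs (p, p.i) such that both positions exist in t.
    down = {i: {(p, p + (i,)) for p in pos if p + (i,) in pos}
            for i in range(1, k + 1)}
    # P_f contains the positions labelled by f.
    pred = {}
    for p, f in label.items():
        pred.setdefault(f, set()).add(p)
    return pos, down, pred

# The tree f(a, g(b)), with maximal arity k = 2.
t = ("f", ("a",), ("g", ("b",)))
pos, down, pred = structure(t, k=2)
```

Here `down[2]` holds the single pair `((), (2,))`, since only the root has a second child.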
Semantics
Valuations:
- $v_1: 𝒳_1 ⟶ Pos(t)$
- $v_2: 𝒳_2 ⟶ 2^{Pos(t)}$
$t ⊨_{v_1, v_2} ψ$ in the following cases:
-
$t ⊨_{v_1, v_2} P_f(x)$ iff $v_1(x)∈ P_f^t$
-
$t ⊨_{v_1, v_2} x \downarrow_i y$ iff $(v_1(x), v_1(y))∈ \downarrow_i^t$
-
$t ⊨_{v_1, v_2} x=y$ iff $v_1(x) = v_1(y)$
-
$t ⊨_{v_1, v_2} x ∈ X$ iff $v_1(x)∈ v_2(X)$
-
$t ⊨_{v_1, v_2} ¬ψ$ iff $t \not⊨_{v_1, v_2} ψ$
-
$t ⊨_{v_1, v_2} ψ ∧ φ$ iff $t ⊨_{v_1, v_2} ψ$ and $t ⊨_{v_1, v_2} φ$
-
$t ⊨_{v_1, v_2} ∃x. ψ$ iff $∃p ∈ Pos(t)$ s.t. $t ⊨_{v_1[x ⟼ p], v_2} ψ$
-
$t ⊨_{v_1, v_2} ∃X. ψ$ iff $∃P ⊆ Pos(t)$ s.t. $t ⊨_{v_1, v_2[X ⟼ P]} ψ$
-
FO (first-order logic) is defined similarly, without $∃ X. ψ$ and without $x ∈ X$
-
EMSO is defined by the formulae $∃X_1, X_2, \ldots, X_n. ψ$ where
- $n≥0$
- $ψ$ is FO + $(x ∈ X_i)$, that is: FO + the ability to create new unary predicates
-
$fv_1(ψ)$ (resp. $fv_2(ψ)$) denotes the set of free first-order (resp. second-order) variables of $ψ$
- of course, $t ⊨_{v_1, v_2} ψ$ only depends on the values of $v_1, v_2$ on $fv_1(ψ), fv_2(ψ)$
- The language of $ψ$ is:
- \[L(ψ) ≝ \lbrace t ∈ T(𝔉) \mid ∃v_1: fv_1(ψ) ⟶ Pos(t), \; v_2: fv_2(ψ) ⟶ 2^{Pos(t)} \text{ s.t. } t ⊨_{v_1, v_2} ψ \rbrace\]
NB:
- with the convention that: \(L(ψ(x_1, \ldots, x_n, X_1, \ldots, X_m)) = L(∃x_1, \ldots, x_n, ∃ X_1, \ldots, X_m. ψ)\)
- $∀, ∨, ⟹$ are defined as usual
- $ψ$ is a sentence iff $fv_1(ψ)$ and $fv_2(ψ)$ are empty
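The semantics above can be read as a (very inefficient) evaluation procedure. The following sketch is a brute-force checker under assumptions of this example only: trees are nested tuples `(label, child_1, ..., child_n)`, positions are tuples of 1-based indices, and formulas are tagged tuples such as `("P", f, x)` or `("ex2", X, psi)`. None of these names come from the lecture.

```python
from itertools import combinations

def positions(t, p=()):
    """Yield (position, label) pairs of a tree (label, child_1, ..., child_n)."""
    yield p, t[0]
    for i, child in enumerate(t[1:], start=1):
        yield from positions(child, p + (i,))

def holds(t, phi, v1=None, v2=None):
    """Brute-force check of t |= phi under valuations v1 (positions) and v2 (sets)."""
    label = dict(positions(t))
    pos = list(label)

    def ev(phi, v1, v2):
        op = phi[0]
        if op == "P":        # ("P", f, x)
            return label[v1[phi[2]]] == phi[1]
        if op == "down":     # ("down", i, x, y) stands for x down_i y
            return v1[phi[3]] == v1[phi[2]] + (phi[1],)
        if op == "eq":       # ("eq", x, y)
            return v1[phi[1]] == v1[phi[2]]
        if op == "in":       # ("in", x, X)
            return v1[phi[1]] in v2[phi[2]]
        if op == "not":
            return not ev(phi[1], v1, v2)
        if op == "and":
            return ev(phi[1], v1, v2) and ev(phi[2], v1, v2)
        if op == "ex1":      # ("ex1", x, psi): try every position
            return any(ev(phi[2], {**v1, phi[1]: p}, v2) for p in pos)
        if op == "ex2":      # ("ex2", X, psi): try every subset of Pos(t)
            return any(ev(phi[2], v1, {**v2, phi[1]: set(c)})
                       for r in range(len(pos) + 1)
                       for c in combinations(pos, r))
        raise ValueError(op)

    return ev(phi, v1 or {}, v2 or {})

def Or(a, b):
    """Disjunction, defined from negation and conjunction as usual."""
    return ("not", ("and", ("not", a), ("not", b)))

t = ("f", ("a",), ("g", ("b",)))
has_a = ("ex1", "x", ("P", "a", "x"))
has_c = ("ex1", "x", ("P", "c", "x"))
# "some set contains a b-labelled node"
some_set = ("ex2", "X", ("ex1", "x", ("and", ("in", "x", "X"), ("P", "b", "x"))))
```

Second-order quantification enumerates all $2^{\vert Pos(t) \vert}$ subsets, so this is only usable on tiny trees; it is meant to mirror the definition, not to be efficient.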
Ex:
- $v_1(x)$ is the parent of $v_1(y)$:
- \[x \downarrow y ≝ \bigvee_{1≤ i ≤k} x \downarrow_i y\]
- $v_1(x) = ε$:
- \[root(x) ≝ ¬ ∃y. y \downarrow x\]
- $v_2(X) ⊆ v_2(Y)$:
- \[X ⊆ Y ≝ ∀x. x∈X ⟹ x ∈ Y\]
- $v_2(X) ∩ v_2(Y) = v_2(Z)$:
- \[X ∩ Y = Z ≝ ∀x. \Big(x∈X ∧ x ∈ Y ⟹ x ∈ Z\Big) \\ ∧ \Big(x∈Z ⟹ x ∈ X ∧ x∈Y\Big)\]
- $X = \bigcup_{1 ≤ i ≤ n} X_i$:
- \[X = \bigcup_{1 ≤ i ≤ n} X_i ≝ ∀x. \Big(x ∈ X ⟹ \bigvee_{1 ≤ i ≤n} x ∈ X_i\Big) ∧ \Big(\bigvee_{1 ≤ i ≤n} x ∈ X_i ⟹ x∈X\Big)\]
- $v_1(y)$ is a descendant of $v_1(x)$ (possibly $v_1(x)$ itself; add $∧ \; x ≠ y$ for the strict version):
- \[x \downarrow_\ast y ≝ ∀X. \Big(x∈X ∧ \downarrow\text{-closed}(X)\Big) ⟹ y∈X\]
where
\[\downarrow\text{-closed}(X) ≝ ∀z. z∈X ⟹ \Big(∀z'. z \downarrow z' ⟹ z'∈X\Big)\]
In particular, the implication must hold for the least fixed point:
\[\downarrow_\ast(p) ≝ \lbrace p' ∈ Pos(t) \mid p \downarrow_\ast p'\rbrace\]
which is the smallest set $P$ s.t. $p∈P$ and $P$ is $\downarrow$-closed.
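As a sanity check, the smallest $\downarrow$-closed set containing a position $p$ can be computed by a naive fixed-point iteration. This is a sketch under the assumption that positions are tuples of 1-based child indices; `down_closure` is a hypothetical helper name.

```python
def positions(t, p=()):
    """Yield the positions of a tree (label, child_1, ..., child_n)."""
    yield p
    for i, child in enumerate(t[1:], start=1):
        yield from positions(child, p + (i,))

def down_closure(pos, p):
    """Smallest subset of pos that contains p and is closed under the child relation."""
    closed = {p}
    changed = True
    while changed:
        changed = False
        for q in pos:
            # q is a child of q[:-1]; add it as soon as its parent is in the set.
            if q and q[:-1] in closed and q not in closed:
                closed.add(q)
                changed = True
    return closed

t = ("f", ("a",), ("g", ("b",), ("h", ("c",))))
pos = set(positions(t))
below_g = down_closure(pos, (2,))
```

The result coincides with the set of positions having `(2,)` as a prefix, which is the set of descendants of that node, the node itself included.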
NB: $x \downarrow_+ y ≝ ∃X. x ∈ X \ ∧ \downarrow\text{-closed}(X) \ ∧ ∀z. z \downarrow x ⟹ z∈X \ ∧ y∈X ∧ y≠x$ doesn’t work
- $v_1(x)$ is a leaf:
- \[leaf(x) ≝ ¬ ∃y. x \downarrow y\]
- $v_2(X)$ is a branch:
- \[branch(X) ≝ ∃x. leaf(x) ∧ ∀ y. \Big(y∈X ⟺ y \downarrow_\ast x\Big)\]
- “in every branch, every $a$-labelled node has a $b$-labelled parent”:
- \[∀X. branch(X) ⟹ \Big(∀x. x∈X ∧ P_a(x) ⟹ ∃y. y \downarrow x ∧ P_b(y) ∧ y ∈ X\Big)\]
Th: Let $L$ be a recognizable tree language over $𝔉$. Then \(L = L(ψ)\) for an EMSO sentence $ψ$
Proof:
$L = L(𝒜)$ for an NFTA $𝒜 ≝ ⟨Q, 𝔉, Δ, Q_f⟩$
$t∈L$ iff there exists an accepting run $ρ: Pos(t)⟶ Q$
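The existence of an accepting run can be tested bottom-up by computing, for each subtree, the set of states some run can reach at its root. The sketch below assumes a representation chosen for this example: $Δ$ is a dictionary mapping `(f, (q_1, ..., q_r))` to a set of target states, and trees are nested tuples.

```python
from itertools import product

def reachable(t, delta):
    """States q such that some run on t labels the root with q."""
    child_sets = [reachable(child, delta) for child in t[1:]]
    states = set()
    # Try every combination of states reachable at the children.
    for qs in product(*child_sets):
        states |= delta.get((t[0], qs), set())
    return states

def accepts(t, delta, final):
    """t is accepted iff some run reaches a final state at the root."""
    return bool(reachable(t, delta) & final)

# Example automaton (an assumption of this sketch): trees over {f/2, a, b}
# containing at least one a. State q1 means "an a has been seen".
delta = {("a", ()): {"q1"}, ("b", ()): {"q0"}}
for left, right in product(["q0", "q1"], repeat=2):
    delta[("f", (left, right))] = {"q1" if "q1" in (left, right) else "q0"}
final = {"q1"}
```

For a leaf, `product()` yields the single empty tuple, so the lookup key is `(label, ())`.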
Let $Q ≝ \lbrace q_1, \ldots, q_n \rbrace$. We define
\[ψ ≝ ∃ X_{q_1}, \ldots, X_{q_n}. ψ'\]
where
\[t ⊨_{v_1, v_2} ψ'\]
means that “$t$ + labels in $Q$ form an accepting run $ρ$”. For all $n$, let
\[partition(X, X_1, \ldots, X_n) ≝ X = \bigcup_{1 ≤ i ≤ n} X_i \\ ∧ ∃ Z. \Big( (∀x. x∉Z) \\ ∧ \bigwedge_{1 ≤ i < j ≤ n} X_i ∩ X_j = Z \Big)\]
(“$Z = ∅$”, “the $X_i$’s are disjoint”). Then
\[ψ' ≝ ∃ X. \Big( (∀x. x∈ X) \qquad \text{“}X = Pos(t)\text{”} \\ ∧ partition(X, X_{q_1}, \ldots, X_{q_n}) \Big) \\ ∧ ∀x. \Big( root(x) ⟹ \bigvee_{q ∈ Q_f} x ∈ X_q \Big) \qquad \text{“}ρ(ε) ∈ Q_f\text{”} \\ ∧ ∀x. \bigwedge_{f ∈ 𝔉} \bigwedge_{q ∈ Q} \Bigg(P_f(x) ∧ x ∈ X_q \\ ⟹ \bigvee_{(q, f, q'_1, \ldots, q'_r) ∈ Δ} \bigwedge_{1 ≤ i ≤ r} ∃x_i. x \downarrow_i x_i ∧ x_i ∈ X_{q'_i} \Bigg) \qquad \text{“}ρ \text{ consistent with } Δ\text{”}\]
Th: Let $ψ$ be an MSO sentence over $𝔉$. Then $L(ψ)$ is recognized by an NFTA of size $tower(\vert ψ\vert)$
where \(tower(n) ≝ 2^{2^{\vdots^{2^n} }}\) (tower of size $n$)
Proof:
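- A valuated tree over $fv_1(ψ)$ and $fv_2(ψ)$ is a tree over $𝔉 × \lbrace 0,1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}$ where $(f, \overline{b})$ is of arity $arity(f)$. It encodes a tree $t$ together with valuations $v_1, v_2$: at position $p$, the bit of $x$ (resp. $X$) is $1$ iff $p = v_1(x)$ (resp. $p ∈ v_2(X)$).
- $V(ψ)$ is the set of valuated trees of $ψ$, i.e. those encoding some $t, v_1, v_2$ such that $t ⊨_{v_1, v_2} ψ$.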
NB: if $ψ$ is a sentence, then $L(ψ) = V(ψ)$
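The encoding of a tree and its valuations as a single tree over the extended alphabet can be sketched as follows. The function name `valuated` and the explicit variable orderings `order1`, `order2` (which fix the position of each bit) are assumptions of this example.

```python
def valuated(t, v1, v2, order1, order2, p=()):
    """Relabel each node f at position p with (f, bits): one bit per free variable."""
    bits = (tuple(int(v1[x] == p) for x in order1)      # first-order: p == v1(x)
            + tuple(int(p in v2[X]) for X in order2))   # second-order: p in v2(X)
    children = tuple(valuated(c, v1, v2, order1, order2, p + (i,))
                     for i, c in enumerate(t[1:], start=1))
    return ((t[0], bits),) + children

t = ("f", ("a",), ("b",))
v1 = {"x": (1,)}            # x is the first child
v2 = {"X": {(), (2,)}}      # X contains the root and the second child
vt = valuated(t, v1, v2, order1=["x"], order2=["X"])
```

By construction, the bit of a first-order variable is set at exactly one position, which is what $𝒜_{singleton}$ is later used to enforce.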
Claim: If $ψ$ is an MSO formula over $𝔉$, then $V(ψ)$ is recognized by an NFTA $𝒜_ψ$ over $𝔉 × \lbrace 0, 1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}$
Proof:
By induction over $ψ$. We need a few basic constructions on NFTA first.
$𝒜_{singleton}$ with language \(\lbrace t ∈ T(𝔉 × \lbrace 0, 1 \rbrace) \mid \vert \lbrace p ∈ Pos(t) \mid ∃f∈𝔉; \; t(p) = (f, 1) \rbrace \vert = 1\rbrace\)
$Q = \lbrace q_0, q_1 \rbrace, Q_f = \lbrace q_1 \rbrace$
for all $f∈𝔉$:
- $\Big((f, 0)^{(n)}, \underbrace{q_0, \ldots, q_0}_{n \text{ times}}\Big) ⟶ q_0$
- $\Big((f, 0)^{(n)}, \underbrace{q_0, \ldots, q_1, \ldots, q_0}_{n \text{ times, with one } \; q_1}\Big) ⟶ q_1$
- $\Big((f, 1)^{(n)}, \underbrace{q_0, \ldots, q_0}_{n \text{ times }}\Big) ⟶ q_1$
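The three transition schemes of $𝒜_{singleton}$ can be simulated directly. In this sketch the transition relation is given as a Python function rather than a finite table, and node labels are pairs `(f, bit)`; both choices are assumptions of the example.

```python
from itertools import product

def delta_singleton(label, child_states):
    """Transitions of A_singleton: q1 means 'exactly one 1-bit seen so far'."""
    _f, bit = label
    ones = child_states.count("q1")
    if bit == 0 and ones == 0:
        return {"q0"}
    if (bit == 0 and ones == 1) or (bit == 1 and ones == 0):
        return {"q1"}
    return set()   # two or more 1-bits: no transition applies

def reachable(t, delta):
    """States reachable at the root of t by some run."""
    child_sets = [reachable(c, delta) for c in t[1:]]
    return {q for qs in product(*child_sets) for q in delta(t[0], qs)}

def accepts(t, delta, final):
    return bool(reachable(t, delta) & final)

one = (("f", 0), (("a", 1),), (("b", 0),))    # exactly one marked node
two = (("f", 1), (("a", 1),), (("b", 0),))    # two marked nodes
none = (("f", 0), (("a", 0),), (("b", 0),))   # no marked node
```

Only `one` is accepted with final states `{"q1"}`: `two` has no run at all, and `none` ends in the non-final state `q0`.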
- $π_{i, n}$: linear tree homomorphism erasing the $i$-th bit:
- \[π_{i, n}\Big((f, b_0, \ldots, b_i, \ldots, b_n)^{(r)}\Big) ≝ (f, b_0, \ldots, b_{i-1}, b_{i+1}, \ldots, b_n)^{(r)}(x_1, \ldots, x_r)\]
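Since the homomorphism only relabels nodes, it can be applied both to trees and to the transitions of an automaton. The sketch below assumes labels of the form `(f, bits)` with `bits` a tuple indexed from 0, and a transition table mapping `(label, child_states)` to a set of states; the function names are hypothetical.

```python
def project_label(label, i):
    """Erase the i-th bit (0-based) of a label (f, bits)."""
    f, bits = label
    return (f, bits[:i] + bits[i + 1:])

def project_tree(t, i):
    """Apply the projection to every node of a valuated tree."""
    return (project_label(t[0], i),) + tuple(project_tree(c, i) for c in t[1:])

def project_delta(delta, i):
    """Relabel transitions; rules that collapse onto the same key are merged."""
    out = {}
    for (label, qs), targets in delta.items():
        out.setdefault((project_label(label, i), qs), set()).update(targets)
    return out

vt = (("f", (0, 1)), (("a", (1, 0)),))
delta = {(("a", (0,)), ()): {"q0"}, (("a", (1,)), ()): {"q1"}}
projected = project_delta(delta, 0)
```

After projection the automaton may choose either `q0` or `q1` at an `a`-leaf: it guesses the erased bit, which is why projection yields a nondeterministic automaton in general.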
Base case: $𝒜_{P_f(x)}$ over $𝔉 × \lbrace 0, 1 \rbrace$
\[Q = \lbrace q_0, q_1 \rbrace, \; Q_f = \lbrace q_1 \rbrace\]
- $(g, 0)^{(n)}(q_0, \ldots, q_0) ⟶ q_0$ for all $g∈ 𝔉$
- $(f, 1)^{(n)}(q_0, \ldots, q_0) ⟶ q_1$
- $(g, 0)^{(n)}(q_0, \ldots, q_1, \ldots, q_0) ⟶ q_1$ for all $g∈ 𝔉$
Base case: $𝒜_{x \downarrow_i y}$ over $𝔉 × \underbrace{\lbrace 0, 1 \rbrace}_{x} × \underbrace{\lbrace 0, 1 \rbrace}_{y}$
\[Q = \lbrace q_0, q_x, q_y \rbrace, \; Q_f = \lbrace q_x \rbrace\]
for all $f∈ 𝔉$
- $(f, 0, 0)^{(n)}(q_0, \ldots, q_0) ⟶ q_0$
- $(f, 0, 1)^{(n)}(q_0, \ldots, q_0) ⟶ q_y$
- $(f, 1, 0)^{(n)}(q_0, \ldots, \underbrace{q_y}_{i\text{-th position}}, \ldots, q_0) ⟶ q_x$ where $n ≥ i, \; f ∈ 𝔉_n$
- $(f, 0, 0)^{(n)}(q_0, \ldots, q_x, \ldots, q_0) ⟶ q_x$
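The four transition schemes of $𝒜_{x \downarrow_i y}$ can be checked on small valuated trees. As before, this is a sketch with assumed conventions: labels are triples `(f, bit_x, bit_y)`, the transition relation is a Python function, and `make_delta` is a hypothetical name.

```python
from itertools import product

def make_delta(i):
    """Transition function of the automaton for x down_i y (i is 1-based)."""
    def delta(label, qs):
        _f, bx, by = label

        def only(state, j):
            # True iff child j is in `state` and every other child is in q0.
            return qs[j] == state and all(
                q == "q0" for m, q in enumerate(qs) if m != j)

        out = set()
        if (bx, by) == (0, 0):
            if all(q == "q0" for q in qs):
                out.add("q0")                 # nothing seen yet
            if any(only("qx", j) for j in range(len(qs))):
                out.add("qx")                 # propagate success upwards
        elif (bx, by) == (0, 1) and all(q == "q0" for q in qs):
            out.add("qy")                     # this node is y
        elif (bx, by) == (1, 0) and len(qs) >= i and only("qy", i - 1):
            out.add("qx")                     # this node is x, its i-th child is y
        return out
    return delta

def reachable(t, delta):
    child_sets = [reachable(c, delta) for c in t[1:]]
    return {q for qs in product(*child_sets) for q in delta(t[0], qs)}

def accepts(t, delta, final):
    return bool(reachable(t, delta) & final)

# f(a, g(b)) with x at the root and y at its second child.
root_x = (("f", 1, 0), (("a", 0, 0),), (("g", 0, 1), (("b", 0, 0),)))
# Same tree with x at the node g and y at its only child b.
inner_x = (("f", 0, 0), (("a", 0, 0),), (("g", 1, 0), (("b", 0, 1),)))
```

`root_x` is accepted for $i = 2$ but not for $i = 1$; `inner_x` is accepted for $i = 1$, the state `qx` being propagated from `g` up to the root.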
Base case: $𝒜_{x = y}$ over $𝔉 × \underbrace{\lbrace 0, 1 \rbrace}_{x} × \underbrace{\lbrace 0, 1 \rbrace}_{y}$
\[Q = \lbrace q_0, q_1 \rbrace, \; Q_f = \lbrace q_1 \rbrace\]
for all $f∈ 𝔉$
- $(f, 0, 0)^{(n)}(q_0, \ldots, q_0) ⟶ q_0$
- $(f, 1, 1)^{(n)}(q_0, \ldots, q_0) ⟶ q_1$
- $(f, 0, 0)^{(n)}(q_0, \ldots, q_1, \ldots, q_0) ⟶ q_1$
Base case: $𝒜_{x ∈ X}$ over $𝔉 × \underbrace{\lbrace 0, 1 \rbrace}_{x} × \underbrace{\lbrace 0, 1 \rbrace}_{X}$
\[Q = \lbrace q_0, q_1 \rbrace, \; Q_f = \lbrace q_1 \rbrace\]
for all $f∈ 𝔉, \; b∈ \lbrace 0, 1 \rbrace$
- $(f, 0, b)^{(n)}(q_0, \ldots, q_0) ⟶ q_0$
- $(f, 0, b)^{(n)}(q_0, \ldots, q_1, \ldots, q_0) ⟶ q_1$
- $(f, 1, 1)^{(n)}(q_0, \ldots, q_0) ⟶ q_1$
Induction step: $𝒜_{¬ ψ}$ over $𝔉 × \lbrace 0, 1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}$
by induction hypothesis, we have constructed $𝒜_ψ$. We complement it over $𝔉 × \lbrace 0, 1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}$ to obtain $𝒜_{¬ ψ}$. Complementing a nondeterministic automaton requires determinizing it first, hence an exponential blow-up!
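The complementation step can be sketched with the usual subset construction for bottom-up tree automata, followed by swapping final and non-final states. The representation is an assumption of this example: `alphabet` maps each symbol to its arity and `delta` maps `(f, (q_1, ..., q_r))` to a set of states.

```python
from itertools import product

def determinize(alphabet, delta):
    """Subset construction; only subsets reachable bottom-up are built."""
    dstates, ddelta = set(), {}
    changed = True
    while changed:
        changed = False
        for f, arity in alphabet.items():
            for subsets in product(list(dstates), repeat=arity):
                if (f, subsets) in ddelta:
                    continue
                # Union of the targets over all choices of one state per child.
                target = frozenset(q for qs in product(*subsets)
                                   for q in delta.get((f, qs), ()))
                ddelta[(f, subsets)] = target
                dstates.add(target)
                changed = True
    return dstates, ddelta

def run(t, ddelta):
    """The unique state reached by the deterministic, complete automaton."""
    return ddelta[(t[0], tuple(run(c, ddelta) for c in t[1:]))]

# "Contains at least one a" over {f/2, a, b}, as a nondeterministic table.
alphabet = {"f": 2, "a": 0, "b": 0}
delta = {("a", ()): {"q1"}, ("b", ()): {"q0"}}
for l, r in product(["q0", "q1"], repeat=2):
    delta[("f", (l, r))] = {"q1" if "q1" in (l, r) else "q0"}
final = {"q1"}

dstates, ddelta = determinize(alphabet, delta)
# Complement: a subset is final iff it contains no original final state.
co_final = {s for s in dstates if not (s & final)}
```

With these definitions, `run(t, ddelta) in co_final` holds exactly for the trees without any `a`. In the worst case `dstates` has $2^{\vert Q \vert}$ elements, which is the source of the blow-up.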
Induction step: $𝒜_{ψ ∧ ψ'}$ over $𝔉 × \lbrace 0, 1 \rbrace^{\sum\limits_{ i=1 }^2 \vert fv_i(ψ ∧ ψ') \vert}$
by induction hypothesis, we have constructed $𝒜_ψ$ and $𝒜_{ψ'}$ over $𝔉 × \lbrace 0, 1 \rbrace^{\sum\limits_{ i=1 }^2 \vert fv_i(ψ) \vert}$ and $𝔉 × \lbrace 0, 1 \rbrace^{\sum\limits_{ i=1 }^2 \vert fv_i(ψ') \vert}$ respectively.
For each variable in $\big(fv_1(ψ ∧ ψ') \backslash fv_1(ψ')\big) ∪ \big(fv_2(ψ ∧ ψ') \backslash fv_2(ψ')\big)$, we apply an inverse projection to build
\[𝒜'_{ψ'} ≝ π^{-1} ⋯ π^{-1} π^{-1}_{\text{the right index}} (𝒜_{ψ'})\]
and do the same for the variables missing from $ψ$, which gives $𝒜'_ψ$. Both automata are now over $𝔉 × \lbrace 0, 1 \rbrace^{\sum\limits_{ i=1 }^2 \vert fv_i(ψ ∧ ψ') \vert}$; we then construct $𝒜_{ψ ∧ ψ'}$ as the intersection of $𝒜'_ψ$ and $𝒜'_{ψ'}$.
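Once both automata read the same alphabet, the intersection is the standard product construction. The sketch below uses an assumed table representation (`(f, child_states)` mapped to a set of states) and two toy automata over $\lbrace f/2, a, b \rbrace$ that are not taken from the lecture.

```python
from itertools import product

def intersect(delta1, final1, delta2, final2):
    """Product automaton: states are pairs, both components move in lockstep."""
    delta = {}
    for (f, qs1), targets1 in delta1.items():
        for (g, qs2), targets2 in delta2.items():
            if f == g and len(qs1) == len(qs2):
                key = (f, tuple(zip(qs1, qs2)))
                delta.setdefault(key, set()).update(
                    (a, b) for a in targets1 for b in targets2)
    final = {(a, b) for a in final1 for b in final2}
    return delta, final

def reachable(t, delta):
    child_sets = [reachable(c, delta) for c in t[1:]]
    return {q for qs in product(*child_sets)
            for q in delta.get((t[0], qs), set())}

def accepts(t, delta, final):
    return bool(reachable(t, delta) & final)

def contains(sym):
    """Toy automaton: state 'y' iff the symbol `sym` occurs in the subtree."""
    d = {(leaf, ()): {"y" if leaf == sym else "n"} for leaf in ("a", "b")}
    for l, r in product("yn", repeat=2):
        d[("f", (l, r))] = {"y" if "y" in (l, r) else "n"}
    return d

both, both_final = intersect(contains("a"), {"y"}, contains("b"), {"y"})
```

The product accepts a tree iff it contains both an `a` and a `b`; its state set has size $\vert Q_1 \vert · \vert Q_2 \vert$, so conjunction is only a polynomial step.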
Induction step: $𝒜_{∃x. ψ}$
by induction hypothesis, we have constructed $𝒜_ψ$ over $𝔉 × \lbrace 0, 1 \rbrace^{\vert fv_1(ψ) \vert + \vert fv_2(ψ) \vert}$
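We define $𝒜_{∃x. ψ}$ as the projection of an intersection:
\[π_{\text{the right index}} \Big(𝒜_ψ ∩ π^{-1} ⋯ π^{-1} π^{-1} (𝒜_{singleton}) \Big)\]
where intersecting with the inverse projections of $𝒜_{singleton}$ ensures that the bit of $x$ is set at exactly one position.
Induction step: $𝒜_{∃X. ψ}$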
by induction hypothesis, we have constructed $𝒜_ψ$.
We define $𝒜_{∃X. ψ}$ by
\[π_{\text{right index}} (𝒜_ψ)\]
$WSkS$: Weak monadic second order with $k$ successors
\[φ ≝ x = y \underbrace{i}_{1 ≤ i ≤k} \mid x = ε \mid x ∈ X \mid ε ∈ X \\ \mid ¬ φ \mid φ ∧ φ \mid ∃x. φ \mid ∃ X. φ\]
We are quantifying over strings in $\lbrace 1, \ldots, k \rbrace^\ast$.
Semantics
- $v_1: 𝒳_1 ⟶ \lbrace 1, \ldots, k \rbrace^\ast$
- $v_2: 𝒳_2 ⟶ P_{fin}(\lbrace 1, \ldots, k \rbrace^\ast)$ (finite subsets)
There is no tree $t$ here, since the structure $\lbrace 1, \ldots, k \rbrace^\ast$ is fixed. $⊨_{v_1, v_2} φ$ in the following cases:
-
$⊨_{v_1, v_2} x = yi$ iff $v_1(x) = v_1(y)\cdot i$
-
$⊨_{v_1, v_2} x=ε$ iff $v_1(x) = ε$
-
$⊨_{v_1, v_2} ε ∈ X$ iff $ε∈ v_2(X)$
the rest is similar to what has been done before.
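The atomic cases of WSkS are simple string tests. A minimal sketch, assuming strings over $\lbrace 1, \ldots, k \rbrace$ are encoded as tuples of integers and atoms as tagged tuples (names chosen for this example only):

```python
def holds_atom(atom, v1, v2):
    """Evaluate an atomic WSkS formula under valuations v1 (strings) and v2 (finite sets)."""
    op = atom[0]
    if op == "succ":      # ("succ", x, y, i) stands for x = y i
        _, x, y, i = atom
        return v1[x] == v1[y] + (i,)
    if op == "is_eps":    # ("is_eps", x) stands for x = epsilon
        return v1[atom[1]] == ()
    if op == "in":        # ("in", x, X)
        return v1[atom[1]] in v2[atom[2]]
    if op == "eps_in":    # ("eps_in", X) stands for epsilon in X
        return () in v2[atom[1]]
    raise ValueError(op)

v1 = {"x": (1, 2), "y": (1,)}
v2 = {"X": {(), (1, 2)}}
```

Quantifiers are deliberately left out: they range over the infinite set $\lbrace 1, \ldots, k \rbrace^\ast$ and its finite subsets, so they cannot be handled by the naive enumeration that works on a fixed finite tree.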
Interpreting MSO into WSkS
Read Section 3.3 of TATA, especially “coding of trees”, p. 89.
digraph {
rankdir=TB;
NFTA -> EMSO[label="poly"];
EMSO -> MSO[label="⊆"];
MSO -> WSkS[label="poly"];
MSO -> NFTA[label="tower"];
WSkS -> NFTA[label="tower"];
}