Lab: K-means and expectation maximization algorithms
Lecturer: Lyudmila Kushnir
Problem 1. K-means clustering
Link to the associated Jupyter notebook
The distortion measure to minimize, where $r_{nk} ∈ \{0, 1\}$ indicates whether point $x_n$ is assigned to cluster $k$:
\[J = \sum\limits_{ n=1 }^N \sum\limits_{ k=1 }^K r_{nk} \Vert x_n - μ_k \Vert^2\]
E-step:
assign each data point $x_n$ to the cluster with the closest mean $μ_k$, i.e. set $r_{nk} = 1$ for the nearest cluster and $0$ otherwise
M-step:
\[\frac{∂J}{∂μ_k} = \frac{∂}{∂μ_k}\left( \sum\limits_{ n=1 }^N r_{nk} \Vert x_n - μ_k \Vert^2\right) = -2 \sum\limits_{ n=1 }^N r_{nk} (x_n - μ_k)\]
So:
\[\begin{align*} & \frac{∂J}{∂μ_k} = 0 \\ ⟺ \quad & -2 \sum\limits_{ n=1 }^N r_{nk} (x_n - μ_k) = 0 \\ ⟺ \quad & μ_k = \frac{\sum\limits_{ n=1 }^N r_{nk} x_n}{\sum\limits_{ n=1 }^N r_{nk}}\\ \end{align*}\]
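As a sanity check, here is a minimal NumPy sketch of these two alternating steps (the function name `kmeans`, the random initialization, and the convergence test are illustrative choices, not taken from the notebook):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Minimal K-means on data X of shape (N, D); returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Initialize the K centers at randomly chosen data points (one common heuristic)
    mu = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # E-step: assign each point to its nearest center (r_nk = 1 for the argmin)
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (N, K) squared distances
        r = d2.argmin(axis=1)
        # M-step: each center becomes the mean of its assigned points
        new_mu = np.array([X[r == k].mean(axis=0) if np.any(r == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):  # assignments stable: J can no longer decrease
            break
        mu = new_mu
    return r, mu
```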
Problem 2. Expectation Maximization for Gaussian mixtures
For instance, for a one-dimensional mixture of $K$ Gaussians:
\[P(X) = \sum\limits_{ k=1 }^K π_k \frac{1}{\sqrt{2π σ_k^2}} \; \exp\left(-\frac{(X-μ_k)^2}{2σ_k^2}\right)\]
Latent variable:
\[\textbf{Z} = \begin{pmatrix} 0 \\ \vdots \\ 1\\ \vdots\\ 0 \end{pmatrix}\]a one-hot vector with the $1$ in position $k$, where $\textbf{Z}_k = 1$ with probability $π_k$
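To make the generative story concrete, here is a small NumPy sketch that samples from such a mixture through the latent variable (the parameter values below are made-up illustrative numbers):

```python
import numpy as np

rng = np.random.default_rng(0)
pi = np.array([0.3, 0.7])       # mixing coefficients π_k (illustrative values)
mu = np.array([-2.0, 3.0])      # component means μ_k
sigma = np.array([0.5, 1.0])    # component standard deviations σ_k

# First draw the latent component z ~ Categorical(π), then x | z=k ~ N(μ_k, σ_k²)
z = rng.choice(len(pi), size=1000, p=pi)
x = rng.normal(mu[z], sigma[z])
```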
E-step:
\[γ_k(x_n) = P(z=k \mid x_n) = \frac{P(z=k) P(x_n \mid k)}{P(x_n)} = \frac{π_k 𝒩(x_n \mid μ_k, \textbf{Σ}_k)}{\sum\limits_{ j } π_j 𝒩(x_n \mid μ_j, \textbf{Σ}_j) }\]
and then the M-step:
\[μ_k^{new} = \frac{\sum\limits_{ n=1 }^N γ_k(x_n) \, x_n }{\sum\limits_{ n=1 }^N γ_k(x_n)}\\ \textbf{Σ}_k^{new} = \frac {\sum\limits_{ n=1 }^N γ_k(x_n) (x_n - μ_k^{new}) (x_n - μ_k^{new})^T}{\sum\limits_{ n=1 }^N γ_k(x_n)}\\ π_k^{new} = \frac{\sum\limits_{ n=1 }^N γ_k(x_n)}{N}\]
Each iteration increases the likelihood of the data under the mixture model:
\[P(X \mid π, μ, \textbf{Σ}) = \prod\limits_{ n=1 }^N P(x_n \mid π, μ, \textbf{Σ}) = \prod\limits_{ n=1 }^N \Bigg(\sum\limits_{ k=1 }^K \underbrace{P(z_k)}_{≝ \; π_k} \; \underbrace{P(x_n \mid z_k, π, μ, \textbf{Σ})}_{≝ \; 𝒩(x_n \mid μ_k, \textbf{Σ}_k)}\Bigg)\]
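Putting the E-step and M-step together, a minimal NumPy sketch of EM for the one-dimensional mixture above might look as follows (the function name `em_gmm_1d` and the initialization scheme are illustrative assumptions, not taken from the notebook):

```python
import numpy as np

def em_gmm_1d(x, K, n_iters=100, seed=0):
    """Minimal EM for a 1-D Gaussian mixture; returns (pi, mu, sigma2, gamma)."""
    rng = np.random.default_rng(seed)
    N = len(x)
    pi = np.full(K, 1.0 / K)                    # uniform mixing weights to start
    mu = rng.choice(x, size=K, replace=False)   # init means at random data points
    sigma2 = np.full(K, x.var())                # init all variances to the data variance
    for _ in range(n_iters):
        # E-step: responsibilities γ_k(x_n) ∝ π_k N(x_n | μ_k, σ_k²)
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
        gamma = pi * dens                       # (N, K) unnormalized posteriors
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the responsibilities
        Nk = gamma.sum(axis=0)                  # effective number of points per component
        mu = (gamma * x[:, None]).sum(axis=0) / Nk
        sigma2 = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
        pi = Nk / N
    return pi, mu, sigma2, gamma
```

Running `em_gmm_1d(x, K=2)` on the sampled data from the previous sketch should recover parameters close to the generating `pi`, `mu`, and `sigma**2`.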