Lecture 2: Probabilistic approaches to neural computation: the Bayesian Brain
Lecturer: Lyudmila Kushnir
Uncertainty matters: the Bayesian Brain
All of our decisions are subject to uncertainty
The Bayesian Brain
- $x$: Prior: $P(x)$
  - prior expectation (ex: it’s unlikely to see an elephant in the street)
Objects ⟶ Receptors ⟶ Response
- $s$: spike counts
- Central Nervous System:
  - has an internal model $x ⟶ s$
- Posterior probability: $P(x \mid s)$
digraph {
rankdir=TB;
X1[label="X"]
X1 -> S[label=" Likelihood: P(S | X)"];
"Prior: P(X)"[shape=none];
"S" -> "Posterior: P(X | S)"[label=" CNS: has a model x ⟶ s "];
}
Poisson Variability in Cortex
Experiments: the variance of the spike count appears to grow linearly with the mean spike count
\[p(\text{spike in } δt) = r δt\]
- $r$: firing rate
The mean spike count (related to $r$) is a function of the stimulus
\[p(s) = \frac{(rΔt)^s \exp(-rΔt)}{s!}\]
Assumption: spikes are not correlated with each other
NB:
- Mean = Variance $= r Δt$
- As individual spikes are generated randomly, it is the firing rate that carries the information
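The Mean = Variance property can be checked numerically. The sketch below is only an illustration: it assumes small time bins of width `dt`, each spiking independently with probability $r\,δt$, and the rate, window length and number of trials are hypothetical values chosen for the example.

```python
import random

def simulate_spike_count(rate_hz, duration_s, dt=0.001, rng=random):
    """Count spikes when each bin of width dt spikes independently with probability rate_hz * dt."""
    n_bins = int(round(duration_s / dt))
    p_spike = rate_hz * dt
    return sum(1 for _ in range(n_bins) if rng.random() < p_spike)

def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((c - m) ** 2 for c in values) / (len(values) - 1)
    return m, v

rng = random.Random(0)
rate_hz, duration_s = 20.0, 0.5  # hypothetical values, so r * Δt = 10
counts = [simulate_spike_count(rate_hz, duration_s, rng=rng) for _ in range(2000)]
mean, var = mean_and_variance(counts)
print(f"mean = {mean:.2f}, variance = {var:.2f}, r*Δt = {rate_hz * duration_s:.2f}")
```

Both printed statistics should land close to $rΔt$; the agreement is only approximate because the number of trials is finite and the bins have a finite width.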
Tuning curves
NB: $r$ is denoted by $f$
Assumption: the mean firing rate is a function of the stimulus, i.e. of objects in the world
- Tuning curve of a particular neuron: its mean firing rate as a function of the value of the stimulus
How to guess the direction of the stimulus?
\[⟨s_i⟩ = f(x-x_i) ≝ f_i(x)\]
Average pattern of activity:
- x-axis: neurons, according to their preferred direction
- y-axis: activity
$s$ (integer): activity pattern that we measure
\[p(s_i \mid x) = \frac{f_i(x)^{s_i} \exp(-f_i(x))}{s_i !}\]
- Independent neurons:
  \[p(s \mid x) = \prod\limits_{ i } p(s_i \mid x)\]
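To make the independent-Poisson likelihood concrete, here is a minimal sketch that evaluates $\log p(s \mid x)$ on a grid of stimulus values and picks the maximum. The bell-shaped tuning curve, its parameters, the set of preferred stimuli and the grid are all assumptions made for this example, not something specified in the lecture.

```python
import math
import random

def tuning_curve(x, preferred, gain=20.0, width=0.5, baseline=1.0):
    """Hypothetical bell-shaped tuning curve f_i(x) = f(x - x_i), in mean spike counts."""
    return baseline + gain * math.exp(-0.5 * ((x - preferred) / width) ** 2)

def poisson_sample(mean, rng):
    """Draw one Poisson count (Knuth's multiplication method, adequate for small means)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def log_likelihood(s, x, preferred):
    """log p(s | x) = sum_i [ s_i log f_i(x) - f_i(x) - log(s_i!) ] for independent Poisson neurons."""
    total = 0.0
    for s_i, x_i in zip(s, preferred):
        f = tuning_curve(x, x_i)
        total += s_i * math.log(f) - f - math.lgamma(s_i + 1)
    return total

rng = random.Random(1)
preferred = [-3.0 + 0.25 * i for i in range(25)]  # preferred stimuli x_i (assumed)
x_true = 0.7
s = [poisson_sample(tuning_curve(x_true, x_i), rng) for x_i in preferred]

# Brute-force search over a grid instead of solving d/dx log p(s | x) = 0.
grid = [-3.0 + 0.01 * k for k in range(601)]
x_ml = max(grid, key=lambda x: log_likelihood(s, x, preferred))
print(f"true x = {x_true}, maximum-likelihood estimate = {x_ml:.2f}")
```

The grid search is a deliberately naive choice: it avoids any calculus, at the cost of evaluating the likelihood at every candidate value.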
- $x$: events
⇓ Likelihood: $p(s \mid x)$
- $s$: sensory input, i.e. neural activity in the sensory areas
⇓ CNS
- $\hat{x} = f(s)$
- Likelihood: $p(s \mid \bullet)$, viewed as a function of $x$ for a fixed $s$
NB: as a function of $x$, the likelihood does not necessarily sum to $1$
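\[\log(p(s\mid x)) = \sum\limits_{ i } \log(f_i(x)) s_i - \sum\limits_{ i } f_i(x) - \sum\limits_{ i } \log(s_i !)\]
(the last term does not depend on $x$, so it can be dropped when maximizing over $x$)
How to compute the maximum of the likelihood?
⟶ take the derivative with respect to $x$ and solve for zero ⟹ too hard to do in practice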
Instead, consider:
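\[p(x \mid s) = \frac{p(s \mid x) p(x)}{p(s)}\]
Then:
\[\log(p(x \mid s)) = L_0(x) + \sum\limits_{ i } \log(f_i(x)) s_i - \sum\limits_{ i } f_i(x)\]
where $L_0(x)$ is the log prior $\log(p(x))$, up to terms that do not depend on $x$.
⟹ Log of posterior probability (for $x =x_j$):
\[L_j = \sum\limits_{ i } \underbrace{ w_{i, j}}_{\text{synaptic weight}} s_i - \underbrace{θ_j}_{\text{bias}}\]
with $w_{i,j} = \log(f_i(x_j))$ and $θ_j = \sum\limits_{ i } f_i(x_j) - L_0(x_j)$.
Ex: trying to jump over a hole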
- $x_j$: the hole width is $m$ meters
- $L_j = \log(p(x_j \mid s)) = \sum\limits_{ i } w_{i,j} s_i - θ_j$
NB: the weights $w_{i,j}$ are part of the internal model of the brain; they correspond to synaptic strengths
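The linear readout $L_j = \sum_i w_{i,j} s_i - θ_j$ can be sketched in a few lines. This is an illustration under stated assumptions: the weights are taken as $\log f_i(x_j)$ and the biases as $\sum_i f_i(x_j) - L_0(x_j)$, which follows from the log posterior above; the tuning curve, the candidate values $x_j$ and the Gaussian-shaped log prior are hypothetical choices.

```python
import math

def tuning_curve(x, preferred, gain=20.0, width=0.5, baseline=1.0):
    """Hypothetical bell-shaped tuning curve f_i(x), in mean spike counts."""
    return baseline + gain * math.exp(-0.5 * ((x - preferred) / width) ** 2)

preferred = [-3.0 + 0.25 * i for i in range(25)]     # input neurons, preferred stimuli x_i
candidates = [-2.0 + 0.1 * j for j in range(41)]     # readout units, one per hypothesis x_j
log_prior = [-0.5 * x_j ** 2 for x_j in candidates]  # L_0(x_j): assumed, unnormalised

# w[j][i] stores w_{i,j} = log f_i(x_j); theta[j] = sum_i f_i(x_j) - L_0(x_j).
w = [[math.log(tuning_curve(x_j, x_i)) for x_i in preferred] for x_j in candidates]
theta = [sum(tuning_curve(x_j, x_i) for x_i in preferred) - l0
         for x_j, l0 in zip(candidates, log_prior)]

def log_posterior_units(s):
    """L_j = sum_i w_{i,j} s_i - theta_j for every candidate x_j (unnormalised)."""
    return [sum(w_ji * s_i for w_ji, s_i in zip(w_j, s)) - th
            for w_j, th in zip(w, theta)]

def normalise(log_values):
    """Exponentiate and rescale so that the probabilities sum to 1."""
    top = max(log_values)
    weights = [math.exp(v - top) for v in log_values]
    z = sum(weights)
    return [v / z for v in weights]

# Noise-free activity pattern for a stimulus at x = 1.0, rounded to integer counts.
s = [round(tuning_curve(1.0, x_i)) for x_i in preferred]
posterior = normalise(log_posterior_units(s))
best = candidates[max(range(len(candidates)), key=posterior.__getitem__)]
print(f"most probable candidate: x_j = {best:.1f}")
```

Each readout unit only needs a weighted sum of its inputs and a bias, which is why the weights can be read as synaptic strengths.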
Cue combination
Cue combination is equivalent to summing activities
Independent stimuli ⟹ product of probabilities ⟹ sum of log posteriors
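A small numerical check of this equivalence, as a sketch only: it assumes two populations with identical (hypothetical) tuning curves, a flat prior, and a linear readout with weights $\log f_i(x_j)$ and biases $\sum_i f_i(x_j)$. Under those assumptions, reading out the summed activity with a doubled bias gives the same values as adding the two log posteriors.

```python
import math

def tuning_curve(x, preferred, gain=20.0, width=0.5, baseline=1.0):
    """Hypothetical bell-shaped tuning curve shared by both populations."""
    return baseline + gain * math.exp(-0.5 * ((x - preferred) / width) ** 2)

preferred = [-3.0 + 0.25 * i for i in range(25)]
candidates = [-2.0 + 0.1 * j for j in range(41)]
w = [[math.log(tuning_curve(x_j, x_i)) for x_i in preferred] for x_j in candidates]
theta = [sum(tuning_curve(x_j, x_i) for x_i in preferred) for x_j in candidates]  # flat prior

def log_posterior(s, n_cues=1):
    """Unnormalised log posterior per candidate; the bias is counted once per cue."""
    return [sum(w_ji * s_i for w_ji, s_i in zip(w_j, s)) - n_cues * th
            for w_j, th in zip(w, theta)]

# Two cues about the same stimulus, slightly disagreeing (noise-free patterns).
s_a = [round(tuning_curve(0.8, x_i)) for x_i in preferred]
s_b = [round(tuning_curve(1.2, x_i)) for x_i in preferred]

sum_of_logs = [a + b for a, b in zip(log_posterior(s_a), log_posterior(s_b))]
s_sum = [a + b for a, b in zip(s_a, s_b)]
from_summed_activity = log_posterior(s_sum, n_cues=2)

max_gap = max(abs(u - v) for u, v in zip(sum_of_logs, from_summed_activity))
peak = candidates[max(range(len(candidates)), key=from_summed_activity.__getitem__)]
print(f"largest difference between the two readouts: {max_gap:.2e}")
print(f"combined estimate: x = {peak:.1f}")
```

With a non-flat prior, adding the two log posteriors would count the prior twice, so the flat-prior assumption matters here.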
Multi-dimensional stimulus: population code
\[x^1, x^2 ⟶ s\]
\[L_{j, k} = \log(p(x_j^1, x_k^2 \mid s)) = \sum\limits_{ i } \underbrace{\log(f_i(x_j^1, x_k^2))}_{W_{j,k}} s_i - \underbrace{\Big(\sum\limits_{ i } f_i(x_j^1, x_k^2) - L_0(x_j^1, x_k^2)\Big)}_{θ_{j,k}}\]
Alternative neural code for uncertainty: sampling code
\[x^1, x^2 ⟶ s ⟶ x_s^1, x_s^2\]
With the posterior $p(x^1, x^2 \mid s)$, we get $x_s^1, x_s^2$, whose variability represents what happens with $x^1$ and $x^2$ (which are uncertain) in the external world.
Ex: you can infer that whenever $x^1$ is active, $x^2$ is too.
NB: the variability is no longer Poisson ⟶ it mimics the posterior, which is not necessarily Poisson
Ex: if the posterior is very narrow, the uncertainty is low, and the variability of the activity is also low.
Experimental evidence backing this up: The prior tends to the average posterior of natural stimuli
Population code: higher certainty increases the response gain
Sampling code: higher certainty decreases the response variance
Sampling: it is not clear how it is implemented, but computations become easy
digraph {
rankdir=TB;
"x^1", "x^2" -> s -> "x^1_s", "x^2_s";
"x^1_s" -> y;
"x^2_s" -> y;
}
- Chain rule:
  \[p(y \mid s) = \sum\limits_{ x } p(y \mid x) p(x \mid s)\]
Ex: back to jumping over a hole
- $x^1$: distance between the edges
- $x^2$: alligators are there
- $y$ is active to the extent that there is danger:
  \[y = (x^1 > \underbrace{d}_{\text{my threshold, how far I can jump}}) \text{ and } x^2\]
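The hole example can be turned into a short sketch of why sampling makes downstream computations easy: $p(y \mid s)$ is estimated by applying the rule for $y$ to each sample $(x^1_s, x^2_s)$ and averaging. The joint posterior table, the threshold `d` and the number of samples are invented for illustration.

```python
import random

def danger(x1, x2, d=2.0):
    """y = (x^1 > d) and x^2: the hole is too wide to jump AND alligators are there."""
    return x1 > d and x2

# Hypothetical posterior p(x^1, x^2 | s) over a few joint states (width in meters, alligators?).
posterior = {
    (1.5, False): 0.30,
    (1.5, True): 0.10,
    (2.5, False): 0.20,
    (2.5, True): 0.40,
}

# Exact answer: p(y | s) = sum_x p(y | x) p(x | s), with p(y | x) equal to 0 or 1 here.
p_exact = sum(p for (x1, x2), p in posterior.items() if danger(x1, x2))

# Sampling answer: draw (x^1_s, x^2_s) from the posterior and average y over the samples.
rng = random.Random(0)
states = list(posterior)
samples = rng.choices(states, weights=[posterior[st] for st in states], k=10000)
p_sampled = sum(danger(x1, x2) for x1, x2 in samples) / len(samples)

print(f"exact p(y | s) = {p_exact:.2f}, sampled estimate = {p_sampled:.3f}")
```

The downstream unit never needs the full posterior table: it only has to evaluate $y$ on each incoming sample.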
digraph {
rankdir=TB;
"x^3" -> "y^1";
}
- $x^3$: collection of sensory evidence that a tiger is there or not
- $y^1$: the tiger is there
NB: the activities of the $x^i_s$ are samples from the posterior distribution, whereas before (in the population code) the activity represented an average quantity: the log posterior.