Lecture 2: Probabilistic approaches to neural computation: the Bayesian Brain
Lecturer: Lyudmila Kushnir
Uncertainty matters: the Bayesian Brain
All of our decisions are subject to uncertainty
The Bayesian Brain

$x$: Prior: $P(x)$
 prior expectation (ex: it’s unlikely to see an elephant in the street)
Objects ⟶ Receptors ⟶ Response

$s$: spike counts

Central Nervous System:
 has an internal model $x ⟶ s$

Posterior probability: $P(x \mid s)$
digraph {
rankdir=TB;
X1[label="X"]
X1 -> S[label=" Likelihood: P(S | X)"];
"Prior: P(X)"[shape=none];
S -> "Posterior: P(X | S)"[label=" CNS: has a model x ⟶ s "];
}
Poisson Variability in Cortex
Experiments: the variance of the spike count grows approximately linearly with the mean spike count
\[p(\text{spike in } δt) = r δt\] $r$: firing rate
The mean spike count (related to $r$) is a function of the stimulus
\[p(s) = \frac{(rΔt)^s \exp(-rΔt)}{s!}\]Assumption: spikes are not correlated with each other
NB:
 Mean = Variance $= r Δt$
 As spikes are generated randomly, firing rate carries the information
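A quick sanity check of the mean = variance property (a sketch; the rate, duration, and bin width below are made-up numbers, not from the lecture): generating a spike independently in each small bin with probability $rδt$ gives a count whose mean and variance both come out near $rΔt$.

```python
import random

random.seed(0)

r = 20.0       # firing rate in Hz (assumed value)
T = 0.5        # trial duration in seconds (assumed value)
dt = 1e-3      # bin width: small enough that r*dt << 1
n_trials = 10000
n_bins = round(T / dt)

# In each bin, a spike occurs with probability r*dt, independently
counts = []
for _ in range(n_trials):
    s = sum(1 for _ in range(n_bins) if random.random() < r * dt)
    counts.append(s)

mean = sum(counts) / n_trials
var = sum((c - mean) ** 2 for c in counts) / n_trials

# For a Poisson process both should be close to r*T = 10
print(mean, var)
```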
Tuning curves
NB: $r$ is denoted by $f$
Assumption: the mean firing rate is a function of the stimulus (objects in the world)
 Tuning curve of some particular neuron:

when the mean firing rate depends on the value of the stimulus
How to guess the direction of the stimulus?
\[⟨s_i⟩ = f(x - x_i) ≝ f_i(x)\]Average pattern of activity:

x-axis: neurons, sorted by their preferred direction

y-axis: activity
$s$ (integer): activity pattern that we measure
\[p(s_i \mid x) = \frac{f_i(x)^{s_i} \exp(-f_i(x))}{s_i !}\] Independent neurons:
 \[p(s \mid x) = \prod\limits_{ i } p(s_i \mid x)\]
 $x$: events
⇓ Likelihood: $p(s \mid x)$
 $s$: sensory input, i.e. neural activity in the sensory areas
⇓ $CNS$
 $\hat{x} = f(s)$
 Likelihood:
 \[p(s\mid \bullet)\]
NB: doesn’t sum to $1$ necessarily
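As a sketch, this product-of-Poissons likelihood can be evaluated directly and maximized over $x$ (the Gaussian tuning-curve shape and all numbers below are illustrative assumptions, not from the lecture):

```python
import math

def f(x, x_pref, f_max=2.0, sigma=15.0):
    """Assumed Gaussian tuning curve: mean spike count of a neuron
    preferring x_pref, in response to stimulus x."""
    return f_max * math.exp(-((x - x_pref) ** 2) / (2 * sigma ** 2))

def log_likelihood(s, x, preferred):
    """log p(s | x) = sum_i [ s_i log f_i(x) - f_i(x) - log(s_i!) ]
    (independent Poisson neurons)."""
    total = 0.0
    for s_i, x_i in zip(s, preferred):
        fi = f(x, x_i)
        total += s_i * math.log(fi) - fi - math.lgamma(s_i + 1)
    return total

# Neurons with preferred directions from -90 to 90 degrees
preferred = list(range(-90, 91, 10))
# A made-up measured activity pattern, peaking around -20 degrees
s = [0, 0, 0, 1, 2, 4, 6, 7, 6, 4, 2, 1, 0, 0, 0, 0, 0, 0, 0]

# Maximum-likelihood estimate by brute-force search over x
best = max(range(-90, 91), key=lambda x: log_likelihood(s, x, preferred))
print(best)
```

The brute-force search stands in for the intractable "take the derivative and solve for zero" step mentioned below.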
\[\log(p(s\mid x)) = \sum\limits_{ i } \log(f_i(x)) s_i - \sum\limits_{ i } f_i(x)\]How to compute the maximum-likelihood estimate?
⟶ taking the derivative and trying to solve for zero ⟹ too hard to do in practice
Instead, consider:
\[p(x \mid s) = \frac{p(s \mid x) p(x)}{p(s)}\]Then:
\[\log(p(x \mid s)) = L_0(x) + \sum\limits_{ i } \log(f_i(x)) s_i - \sum\limits_{ i } f_i(x)\]⟹ Log of the posterior probability (for $x = x_j$):
\[L_j = \sum\limits_{ i } \underbrace{ w_{i, j}}_{\text{synaptic weight}} s_i - \underbrace{θ_j}_{\text{bias}}\]Ex: trying to jump over a hole

$x_j$: the hole width is $m$ meters

$L_j = \log(p(x_j \mid s)) = \sum\limits_{ i } w_{i,j} s_i - θ_j$
NB: the $w_{ij}$ weights are part of the internal model of the brain, they correspond to the synaptic strength
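As an illustrative sketch (the tuning curves, spike counts, and flat prior are assumptions, not from the lecture): with $w_{i,j} = \log f_i(x_j)$ and $θ_j = \sum_i f_i(x_j) - \log p(x_j)$, the posterior over candidate stimuli $x_j$ is just this linear readout, exponentiated and normalized.

```python
import math

def f(x, x_pref, f_max=2.0, sigma=15.0):
    """Assumed Gaussian tuning curve (illustrative parameters)."""
    return f_max * math.exp(-((x - x_pref) ** 2) / (2 * sigma ** 2))

preferred = list(range(-90, 91, 10))          # neurons' preferred directions
candidates = list(range(-90, 91, 10))         # candidate stimuli x_j
prior = [1.0 / len(candidates)] * len(candidates)   # assumed flat prior

# Made-up activity pattern, peaking around -20 degrees
s = [0, 0, 0, 1, 2, 4, 6, 7, 6, 4, 2, 1, 0, 0, 0, 0, 0, 0, 0]

# Linear readout L_j = sum_i w_ij * s_i - theta_j
L = []
for x_j, p_j in zip(candidates, prior):
    w = [math.log(f(x_j, x_i)) for x_i in preferred]
    theta = sum(f(x_j, x_i) for x_i in preferred) - math.log(p_j)
    L.append(sum(w_i * s_i for w_i, s_i in zip(w, s)) - theta)

# Exponentiate and normalize: the s_i! terms are constant in j,
# so they drop out in the normalization
m = max(L)
post = [math.exp(l - m) for l in L]
Z = sum(post)
post = [p / Z for p in post]

map_estimate = candidates[post.index(max(post))]
print(map_estimate)
```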
Cue combination
Cue combination is equivalent to summing activities
Independent cues ⟹ product of probabilities ⟹ sum of log posteriors
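A minimal numeric sketch of this (the Gaussian log posteriors, their means, and their widths are made-up): combining two independent cues about the same $x$ amounts to summing their log posteriors elementwise, and the combined posterior comes out narrower than either cue alone.

```python
import math

xs = list(range(-30, 31))

def log_gauss(x, mu, sigma):
    # Log posterior up to a constant (flat prior assumed)
    return -((x - mu) ** 2) / (2 * sigma ** 2)

L1 = [log_gauss(x, -4.0, 8.0) for x in xs]    # cue 1 (e.g. vision)
L2 = [log_gauss(x, 6.0, 12.0) for x in xs]    # cue 2 (e.g. audition)
L = [a + b for a, b in zip(L1, L2)]           # cue combination: sum the logs

def normalize(L):
    m = max(L)
    p = [math.exp(l - m) for l in L]
    Z = sum(p)
    return [q / Z for q in p]

def variance(p):
    mu = sum(x * q for x, q in zip(xs, p))
    return sum((x - mu) ** 2 * q for x, q in zip(xs, p))

p1, p2, p12 = normalize(L1), normalize(L2), normalize(L)

# The combined posterior is narrower than either single-cue posterior
print(variance(p12), variance(p1), variance(p2))
```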
Multidimensional stimulus: population code
\[x^1, x^2 ⟶ s\] \[L_{j, k} = \log(p(x_j^1, x_k^2 \mid s)) = \sum\limits_{ i } \underbrace{\log(f_i(x_j^1, x_k^2))}_{w_{i,(j,k)}} s_i - \Big(\underbrace{\sum\limits_{ i } f_i(x_j^1, x_k^2) - L_0(x_j^1, x_k^2)}_{θ_{j,k}} \Big)\]Alternative neural code for uncertainty: sampling code
\[x^1, x^2 ⟶ s ⟶ x_s^1, x_s^2\]With the posterior $p(x^1, x^2 \mid s)$: we get $x_s^1, x_s^2$, whose variability represents the uncertainty about $x^1$ and $x^2$ in the external world.
Ex: you can infer that whenever $x^1$ is active, $x^2$ is too.
NB: the variability is no longer Poisson ⟶ it mimics the posterior, which is not necessarily Poisson
Ex: if the posterior is very narrow, the uncertainty is low, and the variability of the activity is also very narrow.
Experimental evidence backing this up: The prior tends to the average posterior of natural stimuli
Population code: increases the response gain
Sample code: decreases the response variance
Sampling: computations are easy, but it is not clear how to implement it neurally
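A sketch of the narrow-posterior point above (the discretized Gaussian posteriors and their widths are made-up examples): in a sampling code the activity $x_s$ is drawn from $p(x \mid s)$, so trial-to-trial variability tracks posterior width.

```python
import math
import random

random.seed(0)

xs = list(range(-30, 31))

def posterior(sigma):
    """Assumed discretized Gaussian posterior over the grid xs."""
    w = [math.exp(-(x ** 2) / (2 * sigma ** 2)) for x in xs]
    Z = sum(w)
    return [v / Z for v in w]

def sample_std(p, n=50000):
    """Std of n samples drawn from the posterior (the sampling code)."""
    samples = random.choices(xs, weights=p, k=n)
    mu = sum(samples) / n
    return (sum((v - mu) ** 2 for v in samples) / n) ** 0.5

narrow = sample_std(posterior(2.0))    # low uncertainty -> low variability
broad = sample_std(posterior(10.0))    # high uncertainty -> high variability
print(narrow, broad)
```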
digraph {
rankdir=TB;
"x^1" -> s; "x^2" -> s;
s -> "x^1_s"; s -> "x^2_s";
"x^1_s" -> y;
"x^2_s" -> y;
}
 Chain rule:
 \[p(y \mid s) = \sum\limits_{ x } p(y \mid x) p(x \mid s)\]
Ex: back to jumping over a hole

$x^1$: distance between the edges

$x^2$: alligators are there

$y$ is active as much as there is danger:
\[y = (x^1 > \underbrace{d}_{\text{my threshold, how far I can jump}}) \text{ and } x^2\]
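A sketch of this example with made-up numbers (and assuming $x^1$ and $x^2$ are independent given $s$): $p(y \mid s) = \sum_x p(y \mid x)\, p(x \mid s)$ can be computed exactly, or estimated by averaging $y$ over samples $x_s$ drawn from the posterior, as in the sampling code.

```python
import random

random.seed(1)

widths = [1.0, 1.5, 2.0, 2.5, 3.0]       # candidate hole widths (m)
p_width = [0.1, 0.2, 0.4, 0.2, 0.1]      # assumed posterior p(x^1 | s)
p_allig = 0.3                             # assumed posterior p(x^2 = 1 | s)
d = 1.8                                   # my threshold: how far I can jump

# Exact marginalization: y is deterministic given x, so
# p(y=1 | s) = p(x^1 > d | s) * p(x^2 = 1 | s)
exact = sum(p for w, p in zip(widths, p_width) if w > d) * p_allig

# Monte Carlo estimate from posterior samples (sampling code)
n = 100000
hits = 0
for _ in range(n):
    w = random.choices(widths, weights=p_width)[0]
    a = random.random() < p_allig
    if w > d and a:
        hits += 1

print(exact, hits / n)
```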
digraph {
rankdir=TB;
"x^3" -> "y^1";
}

$x^3$: collection of sensory evidence that a tiger is there or not

$y^1$: the tiger is there
NB: the activity of the $x^i_s$ are sampled from the posterior distribution, whereas before (in the population code), they were an average: the log posterior.