Lecture 2: Rescorla-Wagner Rule

Classical conditioning

Pavlovian experiment: make a dog salivate by providing food whenever there’s a certain stimulus

Stimulus then Unconditional stimulus ⇒ Unconditional response

Then, add a new stimulus ⟶ the animal will learn to associate it with the food

Extinction:

the stimulus is still presented, but no food follows ⟶ the animal “unlearns” the pairing between the stimulus and the food

Rescorla-Wagner Rule

  • Stimulus: $u_i ∈ \lbrace 0, 1 \rbrace$
  • Reward: $r_i ∈ \lbrace 0, 1 \rbrace$

  • Predictor: $v_i ≝ w u_i$

Loss (squared prediction error, with $δ_i ≝ r_i - v_i$):
\[L_i ≝ δ_i^2 = (r_i - v_i)^2\]

Then, gradient descent:

\[w ← w - ε \underbrace{\frac{\partial}{\partial w} L_i}_{=-2u_i δ_i}\]

Absorbing the constant factor 2 into the learning rate $ε$ gives the Rescorla-Wagner Rule:

\[w ← w + ε u_i δ_i\]
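
To see the update rule produce both conditioning and extinction, here is a minimal sketch in Python. The function name `rescorla_wagner`, the learning rate `epsilon=0.1` and the 50/50 trial split are illustrative assumptions, not values from the lecture.

```python
def rescorla_wagner(stimuli, rewards, epsilon=0.1, w=0.0):
    """Apply w <- w + epsilon * u_i * delta_i trial by trial.

    Returns the list of weights after each trial.
    epsilon and the initial weight are illustrative choices.
    """
    history = []
    for u, r in zip(stimuli, rewards):
        v = w * u                    # prediction v_i = w u_i
        delta = r - v                # prediction error delta_i = r_i - v_i
        w = w + epsilon * u * delta  # Rescorla-Wagner update
        history.append(w)
    return history


# Acquisition: 50 trials where the stimulus is followed by the reward.
# Extinction: 50 trials where the stimulus is shown without reward.
stimuli = [1] * 100
rewards = [1] * 50 + [0] * 50
weights = rescorla_wagner(stimuli, rewards)

print(f"weight after acquisition: {weights[49]:.3f}")
print(f"weight after extinction:  {weights[99]:.3f}")
```

With these settings the weight climbs towards 1 while the reward is delivered and decays back towards 0 once it is withheld, which is the extinction behaviour described earlier.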

Partial Reinforcement

Reward ⟶ delivered with a certain probability $p$

⟹ the prediction $v_i$ (here equal to the weight $w$, since the stimulus is present on every trial) fluctuates around $p$ ⟶ this is what is actually seen in experiments
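
The fluctuation around $p$ can be checked numerically. The sketch below assumes the stimulus is present on every trial ($u_i = 1$) and uses hypothetical settings (`epsilon=0.05`, 2000 trials, a fixed random seed) chosen only for illustration.

```python
import random


def simulate_partial_reinforcement(p, n_trials=2000, epsilon=0.05, seed=0):
    """Run the Rescorla-Wagner rule when the reward arrives with probability p.

    The stimulus is assumed present on every trial (u_i = 1), so v_i = w.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    w = 0.0
    history = []
    for _ in range(n_trials):
        r = 1 if rng.random() < p else 0  # reward delivered with probability p
        delta = r - w                     # prediction error
        w += epsilon * delta              # update with u_i = 1
        history.append(w)
    return history


history = simulate_partial_reinforcement(p=0.5)
late_mean = sum(history[-1000:]) / 1000
print(f"mean of w over the last 1000 trials: {late_mean:.3f}")
```

The weight never settles on a single value: each rewarded trial pushes it up and each unrewarded trial pushes it down, so it keeps hovering near $p$. A smaller `epsilon` gives smaller fluctuations but slower learning.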

Multi-dimensional

\[v_i ≝ ⟨\vec{w}, \vec{u_i}⟩ \\ \vec{w} ← \vec{w} + ε δ_i \vec{u_i}\]

Blocking

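The animal can’t learn the association between the second stimulus and the reward if the reward is already fully predicted by the first stimulus: the prediction error $δ_i$ is then close to zero, so the weight of the second stimulus barely changes.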
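
Blocking falls out of the multi-dimensional rule directly, as the following sketch suggests. The helper `rw_vector_update`, the learning rate and the number of trials per phase are assumptions made for the example.

```python
def rw_vector_update(w, u, r, epsilon=0.1):
    """One step of the multi-dimensional rule: w <- w + epsilon * delta * u."""
    v = sum(wj * uj for wj, uj in zip(w, u))  # prediction <w, u>
    delta = r - v                             # prediction error
    return [wj + epsilon * delta * uj for wj, uj in zip(w, u)]


w = [0.0, 0.0]

# Phase 1: only the first stimulus is shown, always followed by the reward.
for _ in range(100):
    w = rw_vector_update(w, [1, 0], 1)

# Phase 2: both stimuli are shown together, still followed by the reward.
for _ in range(100):
    w = rw_vector_update(w, [1, 1], 1)

print(f"w1 = {w[0]:.4f}, w2 = {w[1]:.4f}")
```

After phase 1 the first weight already predicts the reward almost perfectly, so in phase 2 the error is nearly zero and the second weight stays close to zero.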

Secondary conditioning

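The aim is to transfer what the animal learnt with a first stimulus to a second one: once the first stimulus predicts the reward, the second stimulus is paired with the first one, without any reward, so that the animal ends up expecting the reward from the second stimulus as well.

In practice, the animal does learn this second association even though no reward is delivered, so the model does not fit this case: with $r_i = 0$ and a positive prediction coming from the first stimulus, $δ_i < 0$, and the rule can only decrease the weight of the second stimulus.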
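
The mismatch can be made concrete with a short simulation. This is a sketch under a stated assumption: the rule has no notion of time within a trial, so the second phase is encoded here as both stimuli present on the same trial with no reward. The helper `rw_step` and the trial counts are hypothetical.

```python
def rw_step(w, u, r, epsilon=0.1):
    """One Rescorla-Wagner step for a vector of stimuli."""
    v = sum(wj * uj for wj, uj in zip(w, u))  # prediction <w, u>
    delta = r - v                             # prediction error
    return [wj + epsilon * delta * uj for wj, uj in zip(w, u)]


w = [0.0, 0.0]

# Phase 1: first stimulus alone, followed by the reward.
for _ in range(100):
    w = rw_step(w, [1, 0], 1)

# Phase 2 (assumed encoding): both stimuli together, no reward.
for _ in range(10):
    w = rw_step(w, [1, 1], 0)

print(f"w1 = {w[0]:.3f}, w2 = {w[1]:.3f}")
```

Under this encoding the second weight becomes negative, i.e. the rule predicts that the second stimulus signals the absence of reward, which is the opposite of what secondary conditioning shows.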
