Lecture 2: Rescorla-Wagner Rule
Classical conditioning
Pavlovian experiment: make a dog salivate by providing food whenever there is a certain stimulus
Unconditioned stimulus (food) ⇒ Unconditioned response (salivation)
Then, pair a new stimulus with the food ⟶ the animal learns to associate it with the food (conditioned response)
- Extinction: still the stimulus, but no food afterwards ⟶ the animal “unlearns” the pairing between the stimulus and the food
Rescorla-Wagner Rule
- Stimulus: $u_i ∈ \lbrace 0, 1 \rbrace$
- Reward: $r_i ∈ \lbrace 0, 1 \rbrace$
- Predictor: $v_i ≝ w u_i$
- Prediction error: $δ_i ≝ r_i - v_i$
- Loss:
\[L_i ≝ δ_i^2 = (r_i - v_i)^2\]
Then, gradient descent:
\[w ← w - ε \underbrace{\frac{\partial}{\partial w} L_i}_{=-2u_i δ_i}\]
Rescorla-Wagner rule (absorbing the factor $2$ into $ε$):
\[w ← w + ε u_i δ_i\]
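A minimal sketch of the update in code (the function name and trial counts are my own choices): running acquisition trials with reward, then extinction trials without, shows the weight rise toward $1$ and then decay back toward $0$.

```python
import numpy as np

def rescorla_wagner(rewards, eps=0.1):
    """Apply the Rescorla-Wagner update w <- w + eps * u * delta,
    with a single stimulus present (u = 1) on every trial."""
    w = 0.0
    ws = []
    for r in rewards:
        delta = r - w          # prediction error: delta = r - v, v = w * u
        w += eps * 1 * delta   # u = 1 on every trial
        ws.append(w)
    return np.array(ws)

# Acquisition: 50 rewarded trials, then extinction: 50 unrewarded trials.
rewards = np.concatenate([np.ones(50), np.zeros(50)])
ws = rescorla_wagner(rewards)
print(ws[49])   # close to 1 after acquisition
print(ws[-1])   # back near 0 after extinction
```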
Partial Reinforcement
Reward ⟶ delivered with a certain probability $p$
⟹ the prediction $v$ fluctuates around $p$ ⟶ this is what is actually observed in experiments
Multi-dimensional
\[v_i ≝ ⟨\vec{w}, \vec{u_i}⟩ \\ \vec{w} ← \vec{w} + ε δ_i \vec{u_i}\]
Blocking
The animal can’t learn an association between a second stimulus and the reward if the reward is already fully predicted by the first stimulus: $δ_i ≈ 0$, so the second weight never grows.
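The vector form of the rule reproduces blocking directly (phase lengths are arbitrary): after stimulus 1 alone has been conditioned, presenting both stimuli together with the same reward leaves the second weight near zero, since the prediction error is already $≈ 0$.

```python
import numpy as np

eps = 0.1
w = np.zeros(2)

# Phase 1: stimulus 1 alone, always rewarded.
u = np.array([1.0, 0.0])
for _ in range(100):
    delta = 1.0 - w @ u          # delta_i = r_i - <w, u_i>
    w += eps * delta * u

# Phase 2: both stimuli together, still rewarded.
u = np.array([1.0, 1.0])
for _ in range(100):
    delta = 1.0 - w @ u
    w += eps * delta * u

print(w)  # w[0] near 1, w[1] near 0: stimulus 2 is "blocked"
```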
Secondary conditioning
Pair a second stimulus with a first, already-conditioned stimulus, with no reward delivered: the animal transfers the learned association to the second stimulus.
In practice, the animal can learn this association without any reward, so the model does not account for this case.
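A short simulation shows why the model fails here (phase lengths are my own choices): in phase 2 the reward is $0$ while the prediction is positive, so $δ < 0$ and the second weight is driven negative, i.e. the model predicts inhibition rather than the positive association animals actually acquire.

```python
import numpy as np

eps = 0.1
w = np.zeros(2)

# Phase 1: stimulus 1 paired with reward.
u = np.array([1.0, 0.0])
for _ in range(100):
    delta = 1.0 - w @ u
    w += eps * delta * u

# Phase 2 (secondary conditioning): stimulus 2 presented together
# with stimulus 1, but no reward is delivered.
u = np.array([1.0, 1.0])
for _ in range(20):
    delta = 0.0 - w @ u          # delta is negative: v > 0, r = 0
    w += eps * delta * u

print(w[1])  # negative: the model predicts inhibition, not transfer
```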