Lecture 4: Addiction
Addiction: affects all parts of the brain
- Anhedonia hypothesis: the drug gives a high that is progressively countered by an opponent process
- Incentive sensitization: instrumental conditioning hypothesis, with the drug acting as a reinforcer (reinforcement learning)
- Habit motivation hypothesis: drug taking gets transferred from ‘motivated’ to habitual behavior.
→ loss of cognitive control: memory pathology
Anhedonia hypothesis: homeostasis
- taking a drug has high hedonic value
- addict feels better than the homeostatic norm
- absence of the drug leads to a sharp fall below the homeostatic norm
- addict takes the drug to redress the “withdrawal”
Hedonic allostasis:
- hedonic set point is affected by drug
- progressive decrease in the set point with drug intake: opponent process
- addict escalates drug seeking to redress withdrawal
BUT the most addictive drugs are not necessarily the most hedonic (ex: cocaine less addictive than nicotine)
Negative reinforcement is key (the absence of the drug makes the addict feel bad)
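A minimal sketch of the allostasis idea, under a toy model that is not from the lecture: each dose gives a fixed high, an opponent process builds up with every dose and decays slowly between doses, and the hedonic set point is the negative of that opponent process. The function name and the parameters `high`, `gain` and `decay` are illustrative assumptions.

```python
def simulate_allostasis(n_doses, high=1.0, gain=0.08, decay=0.9):
    """Toy opponent-process model (illustrative assumption, not from the lecture).

    Returns (peaks, troughs): the net hedonic state just after each dose and
    the state once the dose has worn off, relative to a drug-free baseline of 0.
    """
    opponent = 0.0                      # current strength of the opponent process
    peaks, troughs = [], []
    for _ in range(n_doses):
        peaks.append(high - opponent)   # the high is countered by the opponent process
        # Each dose strengthens the opponent process; it decays only slowly.
        opponent = decay * opponent + gain * high
        troughs.append(-opponent)       # drug gone, opponent process still active
    return peaks, troughs


peaks, troughs = simulate_allostasis(20)
print([round(p, 2) for p in peaks[:5]])
print([round(t, 2) for t in troughs[:5]])
```

In this sketch the same dose feels progressively less good while the drug-free state sinks further below baseline, which is the pattern the set-point bullets describe.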
Incentive sensitization
You should separate liking (about hedonics) and wanting (about motivation)
Addicts want to take the drug, irrespective of whether they like it or not
→ reinforcement learning: the drug creates a pathological motivation: the drug is a pharmacological positive reinforcer
Habit motivation hypothesis
At the beginning, taking the drug is under motivational control, but progressively, it gets out of control and becomes a habit
As drug taking continues, control gets transferred from motivational/reward circuits to action circuits (the behavior is carried out automatically)
Biological basis
Key point: dopamine → the dopamine circuit is the common target of drugs. Drugs boost dopamine release.
Ex:
- cocaine: inhibits DA reuptake
  - habitual cocaine taking in rats can boost average dopamine by as much as 300%
  - leads to dose escalation: the dose needed to obtain the same effect increases from day to day
- nicotine: acts on DA neurons
- Ventral pathway: critic (evaluates whether something is rewarding or not)
- Dorsal pathway: actor (generates action policy)
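A minimal actor-critic sketch of the critic/actor split, assuming a hypothetical two-lever task (lever 0 pays 1, lever 1 pays 0) with a single state. The task, the softmax policy and the learning rates `alpha` and `beta` are illustrative assumptions, not details from the lecture.

```python
import math
import random


def run_actor_critic(n_trials=200, alpha=0.1, beta=0.5, seed=0):
    """Toy actor-critic: the critic learns a value, the actor learns preferences."""
    rng = random.Random(seed)
    rewards = [1.0, 0.0]      # hypothetical payoffs of the two levers
    value = 0.0               # critic ("ventral"): expected reward
    prefs = [0.0, 0.0]        # actor ("dorsal"): preference for each lever
    for _ in range(n_trials):
        # Actor: softmax choice between the two levers.
        p_lever0 = 1.0 / (1.0 + math.exp(prefs[1] - prefs[0]))
        action = 0 if rng.random() < p_lever0 else 1
        # Critic: prediction error between the obtained and the expected reward.
        delta = rewards[action] - value
        value += alpha * delta            # critic update
        prefs[action] += beta * delta     # actor update, driven by the same error
    return value, prefs


value, prefs = run_actor_critic()
print(round(value, 2), [round(p, 2) for p in prefs])
```

The single error signal `delta` trains both parts, which is why a drug acting on dopamine would, in this picture, distort evaluation and action selection at once.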
Behavioral models of addiction
Addiction: really hard to model as a whole
- Gambling tasks (humans)
- Linear-track motor sensitization task in rodents
  - mouse on one side, something (food, other mice, drugs, etc.) on the other side: does the mouse head towards this “something”?
  - cocaine increases the speed at which the mouse covers the distance
- “reward” threshold
- self-administration task
  - reduced model for drug taking/using
Reward threshold task
The rat can self-stimulate its reward circuitry by pressing a lever or turning a roller ⟶ reinforcement
Now: adjust the strength of the stimulation and find the threshold at which the rat stops pressing the lever
Then: give the rat a drug and check whether the threshold has changed: if it has decreased ⟶ the drug is a booster (ex: this is the case for cocaine)
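The threshold procedure can be sketched as follows, assuming a hypothetical rat that presses whenever the effective stimulation reaches an internal criterion, and a drug that simply multiplies the effect of the stimulation. `make_rat`, `reward_threshold`, `criterion` and `drug_gain` are made-up names and values for illustration only.

```python
def make_rat(criterion=5.0, drug_gain=1.0):
    """Hypothetical rat: presses when the effective stimulation reaches its criterion."""
    return lambda intensity: intensity * drug_gain >= criterion


def reward_threshold(presses, intensities):
    """Lowest tested intensity at which the rat still presses (None if it never does)."""
    responding = [i for i in intensities if presses(i)]
    return min(responding) if responding else None


intensities = [float(i) for i in range(1, 11)]   # stimulation strength, arbitrary units
baseline = reward_threshold(make_rat(), intensities)
on_drug = reward_threshold(make_rat(drug_gain=2.0), intensities)
print(baseline, on_drug)   # prints: 5.0 3.0
```

A lower threshold on the drug means weaker stimulation is now enough to sustain pressing, which is the sense in which the drug acts as a booster.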
Self-administration
Have the rat press a lever or poke its nose in a hole: the rat has the choice between 2 types of levers ⟶ which one does the rat end up choosing the most?
Computational models of self-administration
- Positive Reinforcement: Reinforcement Learning (RL)
- Negative Reinforcement Learning
- Dynamical Model: progression to habit
- Economic model of the rationality of addiction
Positive Reinforcement: DA rewards
- Phasic DA represents the reward prediction error
- Normal: error is learned away
- Cocaine persistently provokes phasic DA response
- Excessive TD error
Ex: dopamine self-administration by rats whenever a green light is switched on ⟶ positive prediction error all the time for this action!
\[V(t) = \int_{τ=t}^{∞} γ^{τ-t} E(R(τ)) \, dτ\]
\[δ(t) = γ^d [R(S_l) + V(S_l)] - V(S_k)\]
Action selection is value driven:
\[δ = \max \lbrace γ^d [R(S_l) + V(S_l)] - V(S_k) + D(S_l), D(S_l)\rbrace\]
This model says “In a self-administration paradigm, drugs will always win” ⟶ not true in real life
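A minimal sketch of why the max rule makes the drug “always win”, assuming a one-step task in which the action taken in state k leads to a terminal state l with value 0, and reading D as a drug-induced dopamine surge (the notes do not define D, so this is an assumption). The learning rate `alpha` and all numerical values are illustrative.

```python
def learn_value(n_trials, reward, drug_surge=0.0, gamma=0.9, alpha=0.1, delay=1):
    """Track the learned value of state k over repeated trials (toy one-step task)."""
    v_k = 0.0      # value of the state in which the action is taken
    v_l = 0.0      # value of the state reached afterwards (terminal, so 0)
    history = []
    for _ in range(n_trials):
        td = gamma ** delay * (reward + v_l) - v_k
        if drug_surge > 0.0:
            # Drug rule: the error can never fall below the surge D.
            delta = max(td + drug_surge, drug_surge)
        else:
            # Natural reward: ordinary TD error, which is learned away.
            delta = td
        v_k += alpha * delta
        history.append(v_k)
    return history


natural = learn_value(500, reward=1.0)
drug = learn_value(500, reward=1.0, drug_surge=0.5)
print(round(natural[-1], 3), round(drug[-1], 3))
```

With a natural reward the value settles at γ·R and the error vanishes. With the drug the error stays at least D on every trial, so the value keeps growing by `alpha * D` per trial and would eventually exceed that of any natural reward.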
Weird phenomenon: present saccharose to cocaine-addicted rats ⟶ they will favour saccharose, and only take cocaine from time to time.
\[δ_t^c = \max \lbrace r_{t+1} + V_{S_{t+1}} - V_{S_t} + D_{S_t}, D_{S_t}\rbrace - ρ_t\]
where $ρ_t$ tracks the average rewarding value of the environment
- Solves the problem of infinite values
- Explains why addicts become more impulsive in delay discounting tasks (DDT) after long-term drug exposure
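A minimal sketch of how subtracting $ρ_t$ keeps values finite, assuming a single drug-taking state that loops back to itself and reading the first term of the error as r + V(next state) − V(current state). The update rule for `rho` (moving it in proportion to the error, as in average-reward RL) and the rates `alpha` and `sigma` are assumptions for illustration; the notes only say that ρ tracks the average rewarding value of the environment.

```python
def average_reward_td(n_trials, reward=1.0, drug_surge=0.5, alpha=0.1, sigma=0.1):
    """Toy average-reward TD learning with a drug surge (single self-looping state)."""
    value = 0.0    # V of the drug-taking state
    rho = 0.0      # running estimate of the average reward of the environment
    values = []
    for _ in range(n_trials):
        next_value = value     # self-loop: the next state is the same state
        delta = max(reward + next_value - value + drug_surge, drug_surge) - rho
        value += alpha * delta
        rho += sigma * delta   # assumed update: rho absorbs the persistent error
        values.append(value)
    return values, rho


values, rho = average_reward_td(500)
print(round(values[-1], 3), round(rho, 3))
```

Here `rho` rises until it cancels the drug-inflated reward, the error goes to zero, and the value converges instead of growing without bound.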
Problem: this model has no term evaluating the future consequences of a given action
- Conflict between what addicts say or want and what they actually do
- Sense of loss of control
Speculation: prefrontal cortical control systems need to work harder to inhibit drug taking.