Lecture 4: Addiction

Addiction: affects all parts of the brain

  • Anhedonia hypothesis: the drug gives a high that is progressively countered by an opponent process

  • Incentive sensitization: instrumental conditioning hypothesis, drug acting as a reinforcer (reinforcement learning)

  • Habit motivation hypothesis: drug taking gets transferred from ‘motivated’ to habitual behavior.

→ loss of cognitive control: memory pathology

Anhedonia hypothesis: homeostasis

  • taking a drug has high hedonic value
  • addict feels better than the homeostatic norm
  • absence of the drug leads to a sharp fall below the homeostatic norm
  • addict takes the drug to redress the “withdrawal”

Hedonic allostasis:

  • hedonic set point is affected by drug
  • progressive decrease in the set point with drug intake: opponent process
  • addict escalates drug seeking to redress withdrawal
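
The opponent-process account above can be sketched as a toy simulation. This is only an illustration under stated assumptions: every dose produces the same raw high, the opponent process strengthens by a fixed increment per dose, and the set point sinks in proportion to the opponent strength. The function name and all parameter values are hypothetical and do not come from the lecture.

```python
def simulate_allostasis(n_doses, high=1.0, opponent_gain=0.1, setpoint_drift=0.5):
    """Toy opponent-process model (all parameters are illustrative assumptions).

    Returns a list of (net_high, set_point) pairs, one per dose.
    """
    opponent = 0.0   # strength of the opponent process
    set_point = 0.0  # hedonic set point; 0 is the drug-free norm
    history = []
    for _ in range(n_doses):
        # Felt high: set point plus raw drug effect minus the opponent process.
        net_high = set_point + high - opponent
        history.append((net_high, set_point))
        # The opponent process grows with each dose, capped at the raw high.
        opponent = min(high, opponent + opponent_gain)
        # Between doses the set point settles below the drug-free norm.
        set_point = -setpoint_drift * opponent
    return history


for dose, (net_high, set_point) in enumerate(simulate_allostasis(5), start=1):
    print(f"dose {dose}: net high = {net_high:.2f}, set point = {set_point:.2f}")
```

In this sketch the felt high shrinks and the set point falls with every dose, which is the pattern that would push an addict to escalate intake just to get back to the norm.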

BUT the most addictive drugs are not necessarily the most hedonic (ex: cocaine less addictive than nicotine)

Negative reinforcement is key (the absence of the drug makes the addict feel bad, so taking it brings relief)

Incentive sensitization

You should separate liking (about hedonics) and wanting (about motivation)

Addicts want to take the drug, irrespective of whether they like it or not

Reinforcement learning: the drug creates a pathological motivation, because it acts as a pharmacological positive reinforcer

Habit motivation hypothesis

At the beginning, taking the drug is under motivational control, but progressively, it gets out of control and becomes a habit

As drug taking continues, control gets transferred from motivational/reward circuits to action circuits (the behavior is then performed automatically)

Biological basis

Key point: dopamine → the dopamine (DA) circuit is the common target of drugs. Drugs boost dopamine release.


  • cocaine: inhibits DA reuptake

    • habitual cocaine taking in rats can boost average dopamine as much as 300%
    • leads to dose escalation: the amount of cocaine needed to produce the same effect increases from day to day
  • nicotine: acts on DA neurons

  • Ventral pathway: critic (evaluates whether something is rewarding or not)
  • Dorsal pathway: actor (generates action policy)
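
The critic/actor split above can be illustrated with a minimal single-state actor-critic learner. This is a sketch under stated assumptions, not a model from the lecture: the task is a hypothetical bandit with one fixed reward per action, the critic stores a single value, the actor stores one preference per action, and both are updated from the same prediction error. All names and parameter values are illustrative.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into choice probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_actor_critic(rewards, n_trials=200, alpha=0.1, beta=0.1, seed=0):
    """Single-state actor-critic on a bandit with one fixed reward per action."""
    rng = random.Random(seed)
    value = 0.0                    # critic: expected reward of the state
    prefs = [0.0] * len(rewards)   # actor: one preference per action
    for _ in range(n_trials):
        probs = softmax(prefs)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        delta = rewards[action] - value   # critic's prediction error
        value += alpha * delta            # critic update ("is this rewarding?")
        prefs[action] += beta * delta     # actor update (action policy)
    return value, prefs


value, prefs = run_actor_critic(rewards=[1.0, 0.0])
print("critic value:", round(value, 3))
print("actor choice probabilities:", [round(p, 3) for p in softmax(prefs)])
```

The design point is that the actor never sees the reward directly; it only receives the critic's error signal, which is the role the notes assign to the ventral pathway.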

Behavioral models of addiction

Addiction: really hard to model as a whole

  • Gambling tasks (humans)
  • Linear-track motor sensitization task in rodents

    • mouse on one side, something (food, other mice, drugs, etc.) on the other side: does the mouse head toward this “something”?
    • cocaine increases the speed at which the mouse travels the distance
  • “reward” threshold
  • self-administration task

    • reduced model for drug taking/using

Reward threshold task

The rat can self stimulate the reward, by pushing a lever/rolling a roller ⟶ reinforcement

Now: adjust the strength of the stimulation and find the threshold at which the rat stops pushing the lever

Then: make the rat take a drug, and see if the threshold has changed: if it has decreased ⟶ the drug is a booster (ex: it is the case for cocaine)
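
The procedure above can be sketched as a descending search over stimulation intensities. This is a hypothetical illustration: the simulated rat presses whenever the intensity reaches an internal threshold, and the drug is assumed to lower that threshold. The intensity values, units, and function names are invented for the example.

```python
def measure_threshold(presses_lever, intensities):
    """Step down through the intensities and return the last one that still
    sustains lever pressing, or None if the animal never presses."""
    threshold = None
    for intensity in sorted(intensities, reverse=True):
        if not presses_lever(intensity):
            break
        threshold = intensity
    return threshold


def make_rat(internal_threshold):
    """Hypothetical rat: presses only if stimulation reaches its threshold."""
    return lambda intensity: intensity >= internal_threshold


intensities = [10, 20, 30, 40, 50, 60]   # arbitrary units
baseline = measure_threshold(make_rat(35), intensities)
on_drug = measure_threshold(make_rat(25), intensities)   # assumed drug effect
print("baseline threshold:", baseline)   # 40
print("threshold on drug:", on_drug)     # 30
```

A lower measured threshold on the drug is what the notes describe as the drug acting as a booster.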


Self-administration task

Have the rat push a lever/poke its nose in a hole: the rat has the choice between 2 types of levers ⟶ which one does the rat end up choosing the most?

Computational models of self-administration

  • Positive Reinforcement: Reinforcement Learning (RL)

  • Negative Reinforcement Learning

  • Dynamical Model: progression to habit

  • Economic model of the rationality of addiction

Positive Reinforcement: DA rewards

  • Phasic DA represents reward error
  • Normal: error is learned away
  • Cocaine persistently provokes phasic DA response
  • Excessive TD error

Ex: dopamine self-administration by rats whenever a green light is switched on ⟶ positive prediction error all the time for this action!

\[\begin{aligned} V(t) &= \int_{τ=t}^∞ γ^{τ-t} E(R(τ)) \, dτ \\ δ(t) &= γ^d [R(S_l) + V(S_l)] - V_k \end{aligned}\]
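
As a quick worked check on the value definition, assume (purely for illustration) a constant expected reward $E(R(τ)) = R$ and a discount factor $0 < γ < 1$. Substituting $s = τ - t$:

\[V(t) = \int_{τ=t}^∞ γ^{τ-t} R \, dτ = R \int_{0}^∞ e^{s \ln γ} \, ds = \frac{R}{\ln(1/γ)}\]

For $γ = 0.9$ this gives $V ≈ 9.49\,R$. The value is finite only because $γ < 1$ makes distant rewards count less and less.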

Action selection is value driven:

\[δ = \max \lbrace γ^d [R(S_l) + V(S_l)] - V_k + D(S_l), D(S_l)\rbrace\]

This model says “In a self-administration paradigm, drugs will always win” ⟶ not true in real life
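
The difference between the two error signals above can be seen in a small simulation. This is a minimal sketch under stated assumptions: a single cue state $S_k$ leads after one step ($d = 1$) to a terminal outcome state $S_l$ with $V(S_l) = 0$, and the drug term $D(S_l)$ is a fixed positive constant. The function name and parameter values are hypothetical.

```python
def train_cue_value(n_trials, reward, drug_surge=0.0, gamma=0.9, alpha=0.1):
    """Learn the value V_k of a cue that leads to a terminal outcome state.

    drug_surge plays the role of D(S_l); 0.0 means a natural reward.
    Returns the final cue value and the list of TD errors.
    """
    v_cue = 0.0
    deltas = []
    for _ in range(n_trials):
        # Ordinary TD error with d = 1 and V(S_l) = 0 (terminal outcome).
        delta = gamma * (reward + 0.0) - v_cue
        if drug_surge > 0.0:
            # Drug case: the error can never fall below D(S_l).
            delta = max(delta + drug_surge, drug_surge)
        v_cue += alpha * delta
        deltas.append(delta)
    return v_cue, deltas


v_nat, d_nat = train_cue_value(200, reward=1.0)
v_drug, d_drug = train_cue_value(200, reward=1.0, drug_surge=0.5)
print(f"natural reward: V_k = {v_nat:.3f}, last error = {d_nat[-1]:.6f}")
print(f"drug:           V_k = {v_drug:.3f}, last error = {d_drug[-1]:.3f}")
```

With a natural reward the error is learned away and the cue value settles at $γR$. With the drug term the error stays at or above $D$ on every trial, so the cue value keeps growing for as long as training continues.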

Weird phenomenon: present saccharose (sucrose) to cocaine-addicted rats ⟶ they will favour saccharose, and take cocaine only from time to time.

\[δ_t^c = \max \lbrace r_{t+1} + V_{S_{t+1}} - V_{S_t} + D_{S_t}, D_{S_t}\rbrace - ρ_t\]

where $ρ_t$ tracks the average rewarding value of the environment

  • Solves the problem of infinite values
  • Explains why addicts become more impulsive in delay discounting tasks (DDT) after long-term drug exposure
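
A minimal sketch of why subtracting $ρ_t$ keeps values finite. The notes do not give the update rule for $ρ_t$, so this example assumes the standard average-reward TD update $ρ ← ρ + σ\,δ$. It also assumes the simplest possible environment: one state in which the animal takes the drug and returns to the same state, so $V_{S_{t+1}} = V_{S_t}$. The function name and parameter values are hypothetical.

```python
def train_average_reward(n_trials, reward, drug_surge, alpha=0.1, sigma=0.05):
    """Average-reward TD learning in a single self-looping drug state.

    rho tracks the average rewarding value of the environment.
    Its update rule (rho += sigma * delta) is an assumption of this sketch.
    """
    v = 0.0
    rho = 0.0
    deltas = []
    for _ in range(n_trials):
        v_next = v  # the state loops back onto itself
        delta = max(reward + v_next - v + drug_surge, drug_surge) - rho
        v += alpha * delta
        rho += sigma * delta
        deltas.append(delta)
    return v, rho, deltas


v, rho, deltas = train_average_reward(500, reward=1.0, drug_surge=0.5)
print(f"V = {v:.3f}, rho = {rho:.3f}, last error = {deltas[-1]:.6f}")
```

Here $ρ$ rises until it cancels the drug-driven error, so the error decays to zero and the state value converges instead of growing without bound. A raised $ρ$ also means every other reward in the environment is judged against a higher baseline.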

Problem: this model has no term that evaluates the future consequences of a given action

  • conflict between what addicts say or want and what they actually do

  • Sense of loss of control

Speculation: prefrontal cortical control systems need to work harder to inhibit drug taking.
