Lecture 4: Neuromimetic Navigation Strategies

Teacher: Benoît Girard

Navigation: the communities of roboticists and neuroscientists are intermingled

Robotics/Neuroscience: more or less the same algorithms/concepts, but not the same motivations

Ideas more general than navigation ⇒ decision making, etc… (it could have nothing to do with vision)

Neural basis for navigation:

Hippocampus
Basal ganglia: for Reinforcement Learning
Thalamus
Superior Colliculus: several layers, one of them being a map of the visual field ⟹ there’s a mapping between the surface of the SC and the visual field
- communicates with basal ganglia

Taxonomy: different kinds of strategies:

Target/Beacon approach: go straight toward the target
Stimulus-triggered response
Route-based
Map-based: place-triggered (from place cells ⟶ model-free, you don’t have the transition function) VS topological/metric (model-based)

⟶ Complexity of spatial information processing is increasing.

Simpler taxonomy: response-based (at least 6 different strategies) VS map-based navigation

Target approach

Ex: a rodent in a pool, looking for the platform (so that it no longer has to swim). In this case, reaching the platform is rewarding in itself.

Hidden platform with a cue (ex: flag) to indicate it ⟹ aim = to select some cues among all, focus on the relevant ones (perceptual discrimination among all the visual cues).

The cues are not themselves linked to the relevant actions (the actions are up to the SC for instance)

Path integration

We remove all the allocentric information: the animal looks for food in a dark room. When it finds food, it goes straight back to the nest ⇒ the animal has done path integration.

⟹ there’s no learning involved (you mentally sum your accumulated movements) BUT you accumulate errors → not reliable for navigation (to be used only when you have no other choice)
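The summation described above can be sketched as follows. This is a minimal illustration, not a model of the animal: the `(turn, distance)` step format, the `heading_noise` parameter and the function names are assumptions made for the example. With zero noise the homing vector is exact; with noise on the heading, the error grows as steps accumulate.

```python
import math
import random

def integrate_path(steps, heading_noise=0.0, rng=None):
    """Sum egocentric (turn, distance) steps into a position estimate.

    heading_noise is the standard deviation (radians) of a hypothetical
    error added to each turn; it is what makes the estimate drift.
    """
    rng = rng or random.Random(0)
    x = y = heading = 0.0
    for turn, distance in steps:
        heading += turn + rng.gauss(0.0, heading_noise)
        x += distance * math.cos(heading)
        y += distance * math.sin(heading)
    return x, y

def homing_vector(position):
    """Distance and direction of the straight path back to the start (nest)."""
    x, y = position
    return math.hypot(x, y), math.atan2(-y, -x)

# Outbound trip: forward, turn left and forward, turn left and forward.
steps = [(0.0, 1.0), (math.pi / 2, 1.0), (math.pi / 2, 1.0)]
estimate = integrate_path(steps)
distance_home, direction_home = homing_vector(estimate)
```

Calling `integrate_path` with a non-zero `heading_noise` on a long list of steps shows the accumulated error that makes this strategy unreliable on its own.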

Grid cells are likely to be involved in this

Praxis

1936: mutilated rats (deaf, blind, etc… → no allocentric input anymore), already trained to follow a complicated path in a maze by path integration ⟶ they still end up finding their way through it

Praxis ⇒ supervised learning first, then use of an automated routine and idiothetic information

TD-learning with a discount factor → curse of dimensionality, too many repetitions needed

But advantage ⟶ easy computation (the price to pay for it is the long convergence time)

Other main drawback: relearning something takes even longer than the initial learning
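To make the "easy computation, slow convergence" point concrete, here is a minimal sketch of one TD method, tabular Q-learning, on a hypothetical 1-D corridor with a reward at the far end. The environment, the parameter values and the function name are assumptions for illustration only. Each update is a single arithmetic step, but the reward information only moves back about one state per visit, which is why many repetitions are needed.

```python
import random

def q_learning_corridor(n_states=6, episodes=200, alpha=0.5, gamma=0.9,
                        epsilon=0.1, seed=0):
    """Tabular Q-learning on a corridor; reward 1 on reaching the last state."""
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 = step left, action 1 = step right
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = min(max(s + moves[a], 0), n_states - 1)
            reward = 1.0 if s_next == n_states - 1 else 0.0
            # TD update: move Q(s, a) toward reward + discounted best next value.
            target = reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q

q = q_learning_corridor()
```

After training, the "step right" value is the larger one in every non-terminal state, with values decaying by the discount factor `gamma` as the distance to the reward grows.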

Other possibility: learn the graph of the locations, then use this information to go from $A$ to $B$ ⟶ problem: the shortest-path search is linear in the number of vertices and edges ⇒ costly, compared to accessing the relevant values in a vector.
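A minimal sketch of the graph-based alternative, assuming the learned map is an unweighted graph stored as an adjacency dictionary (the maze layout and the node names other than $A$ and $B$ are made up for the example). Breadth-first search visits each vertex and edge at most once, which is the linear cost mentioned above, to be compared with a single table lookup per step for learned values.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search on an unweighted graph given as an adjacency dict."""
    parents = {start: None}  # also serves as the set of visited nodes
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None  # goal not reachable from start

# Hypothetical topological map of a small maze.
maze = {
    "A": ["C", "D"],
    "C": ["A", "B"],
    "D": ["A", "E"],
    "E": ["D", "B"],
    "B": ["C", "E"],
}
path = shortest_path(maze, "A", "B")
```

The upside of paying this cost is flexibility: if an edge is removed from `maze`, the next search finds a detour immediately, without relearning any values.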

Planning

Q-learning is not practical here → there is no reasoning on the structure of space: you may need a long backpropagation of values to learn that a path you are exploring leads to a state you already know is not advantageous.

Egocentric sequential strategy

Learn based on sequences of your sensory input, not the places/locations you encounter.

You no longer use location as an input, but sensory information.

Then: tradeoff to meet between location and sensory information.
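One way to read "sequences of sensory input" is that the learner's state is the window of its most recent sensations rather than a location. The sketch below illustrates only that reading; the class name, the window length and the sensation labels are assumptions, not something stated in the lecture.

```python
from collections import deque

class SensoryHistory:
    """Builds a state from the last `length` sensory observations."""

    def __init__(self, length=3):
        self._window = deque(maxlen=length)

    def observe(self, sensation):
        # Older sensations fall out of the window automatically.
        self._window.append(sensation)
        return tuple(self._window)

history = SensoryHistory(length=2)
states = [history.observe(s) for s in ["corridor", "T-shape", "corridor"]]
```

The returned tuples can be used as keys of a value table in place of place identifiers. A longer window disambiguates more situations but multiplies the number of states to learn.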

Interaction models

Strategy combination: combine the Q-value estimates of different systems by summing them.

Ex: rat in a maze: the input might be, classically, the intersections in the maze, but also the “shape of the corridor” (ex: T-shape intersection, straight corridor, etc…). Then, apply Q-learning to both of these and merge them.
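The merging step can be sketched as below, assuming each system keeps its own Q-table keyed by `(observation, action)` and that both share the same action set. The table contents, the action names and the function name are hypothetical values chosen for the example.

```python
ACTIONS = ("left", "right", "forward")

def combined_action(q_tables, observations, actions=ACTIONS):
    """Sum the Q-values of several systems and pick the best action.

    q_tables[i] is a dict keyed by (observation, action) for system i;
    observations[i] is what system i currently perceives.
    """
    totals = {a: 0.0 for a in actions}
    for q_table, obs in zip(q_tables, observations):
        for a in actions:
            totals[a] += q_table.get((obs, a), 0.0)  # unseen pairs count as 0
    best = max(actions, key=lambda a: totals[a])
    return best, totals

# One system keyed by maze intersection, one by corridor shape.
q_place = {("junction_3", "left"): 0.2, ("junction_3", "right"): 0.6}
q_shape = {("T-shape", "left"): 0.7, ("T-shape", "right"): 0.1}
best, totals = combined_action([q_place, q_shape], ["junction_3", "T-shape"])
```

Here the place-based system alone would turn right, but the summed values favour turning left, so the combination can override either system taken individually.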
