A decades-old model of learning is under fire, with implications for AI.
The buzz of a notification or the ding of an email might inspire excitement - or dread.
In a famous experiment, Ivan Pavlov showed that dogs can be taught to salivate at the tick of a metronome or the sound of a harmonium.
This connection of cause to effect - known as associative, or reinforcement, learning - is central to how most animals deal with the world.
Since the early 1970s the dominant theory of what is going on has been that animals learn by trial and error.
Associating a cue (a metronome) with a reward (food) happens as follows.
When a cue comes, the animal predicts when the reward will occur.
Then, it waits to see what arrives.
After that, it computes the difference between prediction and result - the error.
Finally, it uses that error estimate to update its expectations, so that it makes better predictions in future.
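The four steps above can be sketched as a simple error-driven update loop. This is a minimal illustration rather than a model taken from any particular paper: the function names `update_prediction` and `run_trials`, the learning rate of 0.5 and the delays in seconds are all assumptions made for the example.

```python
def update_prediction(predicted_delay, actual_delay, learning_rate=0.5):
    """One trial of error-driven learning (hypothetical helper).

    Returns the prediction error and the updated prediction.
    """
    # The error is the gap between what arrived and what was predicted.
    error = actual_delay - predicted_delay
    # Nudge the prediction a fraction of the way towards the outcome.
    new_prediction = predicted_delay + learning_rate * error
    return error, new_prediction


def run_trials(initial_prediction, actual_delay, n_trials, learning_rate=0.5):
    """Repeat cue -> predict -> wait -> compare -> update for n_trials."""
    prediction = initial_prediction
    history = []
    for _ in range(n_trials):
        error, prediction = update_prediction(prediction, actual_delay, learning_rate)
        history.append((error, prediction))
    return history


if __name__ == "__main__":
    # Assumed scenario: food always arrives 4 seconds after the metronome,
    # but the animal starts out expecting it immediately (0 seconds).
    for trial, (error, prediction) in enumerate(run_trials(0.0, 4.0, 5), start=1):
        print(f"trial {trial}: error={error:.3f}, new prediction={prediction:.3f}")
```

With these assumed numbers the error halves on every trial, so the prediction closes in on the true delay without ever being told it directly.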
Belief in this approach was itself reinforced in the late 20th century by two things.
One of these was the discovery that it is also good at solving engineering problems related to artificial intelligence (AI).
Deep neural networks learn by minimizing the error in their predictions.
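As a rough illustration of what minimizing prediction error means in engineering terms, the sketch below fits a single weight by gradient descent on squared error. A real deep network has many layers and vastly more weights; the data, the learning rate and the function name `fit_weight` are assumptions chosen to keep the example small.

```python
def fit_weight(inputs, targets, learning_rate=0.05, steps=200):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0
    n = len(inputs)
    for _ in range(steps):
        # Gradient of mean((w*x - y)^2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(inputs, targets)) / n
        # Move the weight against the gradient to shrink the error.
        w -= learning_rate * grad
    return w


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]  # made-up data generated by y = 2x
    print(fit_weight(xs, ys))
```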
The other reinforcing observation was a paper published in Science in 1997.
It noted that fluctuations in the brain's levels of dopamine, a chemical which carries signals between some nerve cells and was known to be associated with the experience of reward, looked like prediction-error signals.
Dopamine-generating cells are more active when the reward comes sooner than expected or is not expected at all, and are inhibited when the reward comes later or not at all - precisely what would happen if they were indeed such signals.
A nice story, then, of how science works.
But if a new paper, also published in Science, turns out to be correct, that story is wrong.
Researchers have known for a while that some aspects of dopamine activity are inconsistent with the prediction-error model.
But, in part because it works so well for training artificial agents, these problems have been swept under the carpet.
The new study, by Huijeong Jeong and Vijay Namboodiri of the University of California, San Francisco, and a team of collaborators, has turned the world of neuroscience on its head.
It proposes a model of associative learning which suggests that researchers have got things backwards.
Their suggestion, moreover, is supported by an array of experiments.
The old model looks forward, associating cause with effect.
The new one does the opposite.
It associates effect with cause.
The authors think that when an animal receives a reward (or punishment), it looks back through its memory to work out what might have prompted this event.
Dopamine's role in the model is to flag events meaningful enough to act as causes for possible future rewards or punishments.
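To make the contrast concrete, here is a toy sketch of the looking-back idea: when a reward arrives, the learner scans its recent memory and credits the events that preceded it. It is not the model from the paper, whose details are not given here. The class name `RetrospectiveLearner`, the five-second window and the scoring rule (the fraction of rewards preceded by each event) are assumptions made purely for illustration.

```python
from collections import Counter, deque


class RetrospectiveLearner:
    """Toy learner that links rewards back to the events preceding them."""

    def __init__(self, window=5.0):
        self.window = window          # how far back (in seconds) to look
        self.memory = deque()         # recent (time, event) pairs
        self.preceded = Counter()     # number of rewards each event preceded
        self.reward_count = 0

    def observe(self, time, event):
        """Store an event in memory."""
        self.memory.append((time, event))

    def reward(self, time):
        """On reward, look back through memory for candidate causes."""
        self.reward_count += 1
        # Forget events that are too old to count as candidates.
        while self.memory and time - self.memory[0][0] > self.window:
            self.memory.popleft()
        # Credit each distinct remembered event once per reward.
        for event in {e for _, e in self.memory}:
            self.preceded[event] += 1

    def score(self, event):
        """Fraction of rewards so far that were preceded by this event."""
        if self.reward_count == 0:
            return 0.0
        return self.preceded[event] / self.reward_count


if __name__ == "__main__":
    learner = RetrospectiveLearner(window=5.0)
    # Made-up timeline: a metronome precedes every reward, a door slam only one.
    learner.observe(0.0, "metronome")
    learner.observe(1.0, "door")
    learner.reward(3.0)
    learner.observe(20.0, "metronome")
    learner.reward(23.0)
    print(learner.score("metronome"), learner.score("door"))
```

In this made-up timeline the metronome precedes both rewards and the door slam only one, so the metronome ends up with the higher score. The learner does no work at all until a reward arrives, which is the essential difference from the forward-looking loop.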