DOPAMINE: A SHATTERPROOF SIGNAL FOR LEARNING
by Neir Eshel
February 8th, 2016
Dopamine plays an outsized role in the public imagination, acting as a ‘happiness’ chemical, the drug that causes psychosis, or the pill that allows frozen people to move again, as in Oliver Sacks’ famous book, Awakenings. But 20 years ago, experiments with monkeys revealed a more specific role for dopamine: comparing outcomes with expectations. When an outcome is better than expected, dopamine neurons increase their activity. When an outcome is completely expected, dopamine neurons do not respond. And when an outcome is worse than expected, dopamine neurons go silent. This pattern of responses is termed ‘reward prediction error’ and is thought to be a crucial way that we learn from our experiences. Positive prediction errors reinforce actions that lead to reward, while negative prediction errors prevent actions that lead to punishment.
In our new study, published this week in Nature Neuroscience, we explore how individual dopamine neurons make this calculation. Surprisingly, we discover that each neuron calculates prediction error in exactly the same way. Such a system is exceptionally robust and redundant, ensuring that the prediction error signal can be exploited by the broadest possible array of brain circuits to help us learn.
We recorded from neurons deep in the brain while thirsty mice performed simple tasks for water reward. Sometimes we delivered water out of the blue, completely unexpectedly. Other times we presented an odor that predicted water delivery. Every time this odor was presented, the mouse learned to expect water at a particular time in the future. By delivering different amounts of water, with or without the preceding odor, we could measure the precise method that dopamine neurons used to calculate prediction error. We then compared this method from neuron to neuron.
We found that dopamine neurons calculate prediction error through simple subtraction. This is consistent with previous computational theories, but quite rare to find in the brain. In most other settings, neurons appear to work through multiplication or division, rather than addition or subtraction. In this case, though, subtraction is the best method for a precise calculation, and the brain appears to have evolved accordingly.
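To make the contrast between subtraction and division concrete, here is a minimal sketch in Python. The response values, the expectation of 2.0, and the function names are hypothetical, chosen only for illustration; they are not data or code from the study.

```python
def subtractive_response(reward_response, expectation):
    """Expectation shifts the whole response curve down by a fixed amount."""
    return reward_response - expectation


def divisive_response(reward_response, gain):
    """Expectation rescales the response curve by a constant factor."""
    return reward_response / gain


# Hypothetical responses (arbitrary units, relative to baseline) to
# increasing amounts of unexpected water. These are not measured values.
unexpected = [1.0, 2.0, 4.0, 8.0]

subtracted = [subtractive_response(r, 2.0) for r in unexpected]
divided = [divisive_response(r, 2.0) for r in unexpected]

print(subtracted)  # [-1.0, 0.0, 2.0, 6.0]
print(divided)     # [0.5, 1.0, 2.0, 4.0]
```

In this toy example, subtraction pushes the response to the smallest reward below zero (that is, below baseline), while division only shrinks responses toward zero and never changes their sign.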
Moreover, each neuron appears to perform this subtraction in exactly the same way. This is even true for dopamine neurons recorded on different days, from different mice. The only difference between neurons was in the magnitude of their responses to unexpected rewards. Given this information, the rest of that neuron’s response was perfectly predictable. Indeed, even the ‘noise’ in dopamine neurons’ responses—that is, the different activity they exhibit from trial to trial, when the stimuli remain the same—was correlated from neuron to neuron. This has two profound implications: 1) that different dopamine neurons likely have overlapping inputs, and 2) that the targets of dopamine release likely receive similar information, regardless of which dopamine neurons they contact.
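One way to picture this result is as a single shared response pattern that each neuron scales up or down. The sketch below assumes exactly that; the template values, trial types, and function names are hypothetical, and the real analysis in the paper is more involved.

```python
# Hypothetical shared response template (arbitrary units) for four trial types:
# unexpected small, unexpected large, expected small, expected large reward.
COMMON_SIGNAL = [2.0, 6.0, 0.0, 4.0]


def neuron_response(scale):
    """A neuron's responses are modeled as the shared template times its own scale."""
    return [scale * x for x in COMMON_SIGNAL]


def predict_from_unexpected(observed_unexpected_large):
    """Infer a neuron's scale from one unexpected-reward response,
    then predict its responses on all trial types."""
    scale = observed_unexpected_large / COMMON_SIGNAL[1]
    return [scale * x for x in COMMON_SIGNAL]


actual = neuron_response(1.5)
predicted = predict_from_unexpected(actual[1])

print(actual)     # [3.0, 9.0, 0.0, 6.0]
print(predicted)  # [3.0, 9.0, 0.0, 6.0]
```

Under this assumption, knowing how strongly a neuron responds to unexpected reward is enough to recover everything else about its responses.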
The homogeneity of dopamine neuron responses reinforces the idea that dopamine neurons broadcast a common signal to the rest of the brain: namely, prediction error. Even if a group of dopamine neurons were to die, the signal would persist. Thus, the system beautifully ensures our ability to perform one fundamental task: learning from our experience.
PATHWAY FOR DISAPPOINTMENT
September 10th, 2015
Imagine you are a child hoping to get a teddy bear from your parents as a birthday gift. What if they gave you a box of candies instead? Or, worse, what if they forgot your birthday entirely? Naturally, you might feel disappointed. On the other hand, you might be quite pleased if your parents gave you the same candies as a surprise on another day. In this case, your response to a gift is dramatically influenced by your expectation. Our brains always compare the rewards we get with what we expected.
But how does this comparison happen in our brains? Neurons that use dopamine as a neurotransmitter ("dopamine neurons") seem to represent the difference between actual reward and expectation. For instance, dopamine neurons transiently pause their spontaneous firing when an expected reward is omitted. Interestingly, this response occurs when reward was expected but not delivered. In other words, when nothing happened! This signal -- a dip in activity -- occurs at exactly the time the reward was expected. More generally, dopamine neurons are known to signal errors in reward prediction, also known as reward prediction errors. When the outcome is better than expected, dopamine neurons increase their firing rates. When the outcome is worse than expected, their firing rates decrease. How dopamine neurons generate these prediction errors remains unknown.
In our study published in Neuron, we examined the contribution of a region of the brain called the habenula to dopamine prediction error signals. The habenula has long been a mysterious area, located at the very center of the brain, bridging the forebrain and the midbrain. Recent studies revealed that neurons in the lateral habenula signal prediction errors, although the direction of the responses (excitation versus inhibition) was opposite to that of dopamine neurons. Given the existence of an inhibitory projection from the lateral habenula onto dopamine neurons, it has been hypothesized that dopamine neurons may relay prediction error signals from the habenula. To test this hypothesis, we removed input from the habenula by making an electrolytic lesion and examined what aspects of prediction error signals were affected in dopamine neurons. We found that, in animals with habenula lesions, the dip caused by reward omission was largely diminished. Surprisingly, the dip caused by aversive stimuli (e.g. an air puff) was either unaffected or enhanced. Note that negative prediction error can occur, for example, (1) when not receiving an expected reward (disappointment) or (2) when receiving an unexpected negative outcome (punishment). Our study showed that these two types of negative prediction error are regulated by different mechanisms.
In our previous study (Eshel et al., 2015), we found that reward expectation reduces reward responses in a subtractive fashion. While divisive gain changes are common in the nervous system, subtraction is rarely found in the brain and its mechanisms are unknown. A key feature of subtractive computation is that dopamine neurons reduce their activity below baseline when reward is smaller than expected. In this new study, we found a key mechanism that pushes down dopamine neuron firing below baseline.
Our study also opens doors for future research. We found that many aspects of prediction error signals in dopamine neurons remain intact after large lesions in the habenula. This implies that other inputs to dopamine neurons are also making important contributions to prediction error coding. Based on our anatomical mapping of dopamine inputs (Menegas et al., 2015; Watabe-Uchida et al., 2012), areas such as the striatum, lateral hypothalamus, and tegmental areas are at the top of the list for future investigation.
PREDICTIONS AND THE BRAIN
August 31st, 2015
Say you’re at a supermarket, staring at two cartons of ice cream: chocolate and caramel. Before making your choice, you try to predict which will be more delicious. Wasn’t the caramel a bit too sweet last time? Wait, wasn’t the chocolate a little bitter? You hem and haw, and then choose the one you expect to be better.
Our new study demonstrates how the brain makes this type of prediction and uses it to optimize decisions.
We recorded from neurons deep in the brain while mice performed simple tasks. The animals had to learn the association between different odors and different rewards. Rather than ice cream, we used water, which was rewarding to the thirsty mice. Usually, the mice would receive the reward they expected. Occasionally, however, the reward would be bigger or smaller. In those cases when the outcome was different from predicted, the chemical dopamine became especially important. If reward was bigger than predicted, dopamine neurons increased their activity. If reward was smaller than predicted, dopamine neurons decreased their activity. And if reward was the same as predicted, the neurons made no changes. In this way, dopamine neurons calculated the difference between expected and actual reward.
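The comparison described above can be sketched in a few lines of Python. The update rule shown is a simple Rescorla-Wagner-style rule from classic learning theory, used here only as an illustration; the reward sizes, the learning rate of 0.5, and the function names are assumptions, not details of the experiment.

```python
def prediction_error(actual_reward, expected_reward):
    """Positive if reward is bigger than predicted, negative if smaller, zero if equal."""
    return actual_reward - expected_reward


def learn(expected_reward, actual_reward, learning_rate=0.5):
    """Move the expectation a fraction of the way toward the reward received."""
    return expected_reward + learning_rate * prediction_error(actual_reward, expected_reward)


# The three cases, with a hypothetical expectation of 2.0 units of water.
print(prediction_error(3.0, 2.0))  # 1.0  (bigger than predicted)
print(prediction_error(1.0, 2.0))  # -1.0 (smaller than predicted)
print(prediction_error(2.0, 2.0))  # 0.0  (same as predicted)

# Repeating the same reward: the error shrinks as the expectation catches up.
expectation = 0.0
errors = []
for _ in range(5):
    errors.append(prediction_error(1.0, expectation))
    expectation = learn(expectation, 1.0)

print(errors)  # [1.0, 0.5, 0.25, 0.125, 0.0625]
```

In this sketch, a reward that starts out fully surprising produces smaller and smaller errors as it becomes expected, which is the sense in which a prediction error can serve as a teaching signal.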
This pattern of responses is called ‘reward prediction error’, and dopamine neurons have been known to calculate it for over 20 years. It is thought that this signal is crucial for animals, including humans, to improve their predictions over time, allowing us to maximize reward (and the chance for a truly delicious ice cream dessert). However, it was never known how dopamine neurons make this calculation. In particular, how do dopamine neurons know how much reward to expect?
In our paper, published this week in the journal Nature, we discovered that a group of neurons intermingled with dopamine neurons provide the expectation signal. A previous paper from our lab had shown that when reward was expected, these inhibitory neurons (called GABA neurons) became active. But it was unknown whether dopamine neurons use this signal to calculate prediction error. In the paper published this week, we artificially increased the activity of GABA neurons, using a technique called optogenetics that makes neurons sensitive to light shone through an optical fiber in the brain. When we did so, we found that dopamine neuron activity was reduced, as if reward was expected, even though it was not. Conversely, if we artificially decreased the activity of the GABA neurons, dopamine neuron activity was increased, as if the previously expected reward had become surprising. In other words, shifting the level of activity in GABA neurons appeared to shift the level of expectation reflected by the dopamine neurons.
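A toy model can illustrate the logic of these manipulations. It assumes that a dopamine neuron's firing rate is its baseline plus a reward-driven input minus an inhibitory expectation signal from GABA neurons. All names and numbers below are hypothetical and are not measurements from the paper.

```python
BASELINE = 5.0      # hypothetical spontaneous firing rate (spikes/s)
REWARD_DRIVE = 4.0  # hypothetical excitation caused by a water reward


def dopamine_rate(reward_drive, gaba_expectation):
    """Baseline plus reward input minus GABA inhibition; rates cannot go below zero."""
    return max(0.0, BASELINE + reward_drive - gaba_expectation)


# Natural conditions: GABA activity is low when reward is unexpected
# and high when reward is expected.
unexpected = dopamine_rate(REWARD_DRIVE, gaba_expectation=0.0)
expected = dopamine_rate(REWARD_DRIVE, gaba_expectation=4.0)

# Artificially raising GABA activity during an unexpected reward,
# or lowering it during an expected reward (by 2.0 in both cases).
gaba_boosted = dopamine_rate(REWARD_DRIVE, gaba_expectation=0.0 + 2.0)
gaba_suppressed = dopamine_rate(REWARD_DRIVE, gaba_expectation=4.0 - 2.0)

print(unexpected, gaba_boosted)   # 9.0 7.0
print(expected, gaba_suppressed)  # 5.0 7.0
```

In this sketch, boosting GABA activity makes an unexpected reward look partly expected (a smaller response), and suppressing it makes an expected reward look partly surprising (a response above baseline).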
These manipulations also affected mouse behavior. When we artificially increased GABA neuron activity on both sides of the brain, thereby artificially increasing the level of expectation, mice acted as if they were disappointed by the reward they got. The same reward that used to cause high levels of anticipation no longer elicited any anticipation when GABA activity was increased.
Finally, we designed an experiment to understand exactly how this prediction error calculation is made. We gave the mice different sizes of reward and plotted how dopamine neurons respond to these different sizes. Then we taught the mice to expect reward, and watched how expectation shifts the dopamine response. It turns out that dopamine neurons simply subtract the expectation signal, which we now know comes from GABA neurons. This is consistent with classic learning theories, but actually quite surprising in the brain. There are very few other examples where neurons seem capable of pure addition or subtraction; instead, the brain generally works through multiplication or division. In this case, though, subtraction allows for a precise and consistent calculation, and appears to be exactly what the brain evolved to do.
Together, our experiments demonstrate how a small circuit deep in the brain makes a simple calculation that enables a crucially important behavior: learning what’s good and what isn’t.
"PERSONALIZED LESSON" MAY NOT BE DOPAMINE'S WAY
September 1st, 2015
Dopamine, originally referred to as a pleasure molecule, is now one of the best-known neurotransmitters. Dopamine neurons are thought to broadcast a teaching signal for reinforcement learning throughout the brain. Dopamine neurons in the midbrain encode reward prediction error, which is the discrepancy between our expectation and reality. This signal potentially guides our behavior to maximize rewards in the future.
In a previous study (Watabe-Uchida et al., 2012), we used a genetically modified rabies virus to label all of the monosynaptic inputs to dopamine neurons. We reasoned that finding the inputs to these neurons would help us understand how they function. We found that many brain areas project directly onto dopamine neurons, but wanted to further refine our map of this circuit.
In our new study, led by Mitsuko Watabe-Uchida (Menegas et al., 2015), we labeled the inputs to dopamine neurons based on their projection target. The main projection target of midbrain dopamine neurons is the striatum. However, dopamine neurons also project to other brain areas such as the amygdala, habenula, and much of the cortex. If dopamine encodes a teaching signal that guides behavior, then each brain area might improve its “behavior” in parallel to other brain areas. The simplest way of doing this would be for each brain area to send an expectation signal to dopamine neurons and receive an error signal back from that same population.
Instead, we found that most populations of dopamine neurons (defined by their projection targets) have a surprisingly similar distribution of inputs and are not embedded in parallel circuits. So, each brain area probably does not learn independently.
However, we also found that dopamine neurons projecting to the tail of the striatum differ dramatically from other populations. While most dopamine neurons receive many inputs from regions involved in reinforcement learning, addiction, and appetitive behavior (such as the ventral part of the striatum and hypothalamus), dopamine neurons projecting to the tail of the striatum receive inputs preferentially from regions involved in motor function and arousal (such as the globus pallidus, subthalamic nucleus, and zona incerta). This result suggests that dopamine release in the tail of the striatum might have a unique function, while most other dopamine neurons may encode a teaching signal.
This new study (Menegas et al., 2015) used CLARITY, a method for making tissue optically transparent, to allow the brains to be imaged as whole volumes using a light-sheet microscope. These brains were then aligned in 3D so that they could be compared to each other using a standard set of region boundary definitions. This is a technical benchmark for future anatomical studies, demonstrating that intact whole-brain imaging can be used to compare the inputs of different populations of cells with high precision in an automated fashion.
In summary, we pioneered an automated imaging pipeline that helps to lower the hurdle for future systematic anatomy studies and increase their consistency and efficiency. Using this technique, we uncovered the organization of dopamine circuits and found a unique population of dopamine neurons: those projecting to the tail of the striatum. What is the function of this group of dopamine neurons? We hope that our study has opened the door to further investigation.
Congratulations to Marissa Shoji for winning a Hoopes Prize for her thesis in Neurobiology, entitled "Characterization of the Activity of Glutamatergic Neurons in the Pedunculopontine Tegmentum During Decision-Making."
FEELING GOOD OR BAD LATELY? LISTEN TO SEROTONIN
by Mackenzie Amoroso and Jeremiah Cohen
March 10th, 2015
Serotonin is one of the most widespread and mysterious neurotransmitters in the brain. It has been proposed to be involved in many aspects of behavior, including regulating mood and our responses to aversive environmental events. Furthermore, it has also been proposed that a deficiency in serotonin plays a central role in depression. One of the major challenges in testing hypotheses about serotonin's function has been observing the activity of serotonin-releasing neurons during behavior. Historically, when we placed a microelectrode into the midbrain structure that contains serotonin-releasing neurons, it was difficult to know whether the neuron under observation was releasing serotonin.
To address this problem, we used a combination of transgenic mice and optogenetics to identify serotonin neurons by their response to light stimulation. Then, we recorded the activity of these light-identified serotonin neurons as mice participated in a task in which the amount of reward or punishment available in the environment varied predictably over time. We found that 40% of serotonin neurons showed slow variations in their activity that correlated with the amount of reward in the environment. This was remarkable because light-identified dopamine neurons, which have long been thought to be involved in reward, did not signal information on these slow timescales when we recorded from them. Instead, all the dopamine neurons encoded only the immediate properties of the environment (for example, "I'm about to get a reward"), whereas only a fraction of the serotonin neurons signaled these immediately pending rewards. Taken together, these findings indicate that serotonin neurons can signal reward and punishment on both slow and fast timescales. These results suggest that serotonin signals could be important for regulating our behavior on slow timescales, and may be involved in generating emotional states like mood.
August 6th, 2014 - Cell Reports article
Serotonin and dopamine are major neuromodulators. Here, we used a modified rabies virus to identify monosynaptic inputs to serotonin neurons in the dorsal and median raphe (DR and MR). We found that inputs to DR and MR serotonin neurons are spatially shifted in the forebrain, and MR serotonin neurons receive inputs from more medial structures. Then, we compared these data with inputs to dopamine neurons in the ventral tegmental area (VTA) and substantia nigra pars compacta (SNc). We found that DR serotonin neurons receive inputs from a remarkably similar set of areas as VTA dopamine neurons apart from the striatum, which preferentially targets dopamine neurons. Our results suggest three major input streams: a medial stream regulates MR serotonin neurons, an intermediate stream regulates DR serotonin and VTA dopamine neurons, and a lateral stream regulates SNc dopamine neurons. These results provide fundamental organizational principles of afferent control for serotonin and dopamine.