In computational neuroscience, models of dopaminergic modulation address the physiological and computational functions of the neuromodulator dopamine (DA) by incorporating its effects into models of biological neurons and networks.
DA plays a highly important role in higher-order motor control, goal-directed behavior, motivation, reinforcement learning, and a number of cognitive and executive functions such as working memory, planning, attention, behavioral and cognitive flexibility, inhibition of impulsive responses, and time perception (Schultz, 1998, Nieoullon, 2003, Goldman-Rakic, 2008, Dalley and Everitt, 2009). DA's fundamental role in learning, cognition, and motor control is also reflected in the various serious nervous system diseases associated with impaired DA regulation, such as Parkinson’s disease, schizophrenia, bipolar disorder, Huntington’s disease, attention-deficit hyperactivity disorder (ADHD), autism, restless legs syndrome (RLS), and addictions (Meyer-Lindenberg, 2010, Egan and Weinberger, 1997, Dalley and Everitt, 2009).
From electrophysiological experiments, DA is known to affect a number of neuronal and synaptic properties in various target areas such as the striatum, the hippocampus, and motor and frontal cortical regions, via different types of receptors commonly grouped into the D1- and D2-receptor classes (D1R and D2R) (see Dopamine modulation). In single neurons, DA changes neuronal excitability and signal integration by virtue of its effects on a variety of voltage-dependent currents. DA also enhances or suppresses various synaptic currents such as AMPA-, GABA- and NMDA-type currents. With regard to both intrinsic and synaptic currents, the D1 and D2 receptor classes may function largely antagonistically (Trantham-Davidson et al. 2004, West and Grace 2002, Gulledge and Jaffe, 1998): D2 receptors decrease neuronal excitability with relatively short latency (in vitro), while D1R mediate a delayed and prolonged increase. Similarly, D1R enhance NMDA- and GABA-type currents, while D2R decrease them. These antagonistic physiological effects may be rooted in the differential regulation of intracellular signaling molecules such as adenylyl cyclase, cAMP, and DARPP-32 through D1R and D2R (Greengard, 2001).
In accordance with the different levels of abstraction at which neural systems models can be formulated, DA effects have been simulated either at a rather abstract computational level or at a detailed biophysical level: In detailed biophysical models based on Hodgkin-Huxley-type conductances and compartmentalized morphological structures, the model parameters are usually closely related to electrophysiological experiments, so DA effects can be implemented quite directly. For instance, a ~40% change in NMDA peak conductance observed in vitro can be translated into a ~40% change of the parameter that determines the NMDA peak conductance in the model. More abstract models, on the other hand, such as connectionist models, start from specific assumptions about the net effects of DA, e.g. changes in the input-output relation of a neuron, in the overall synaptic strength of particular connections, or DA’s presumed role as a prediction error signal in temporal difference learning.
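For illustration, such a one-to-one parameter mapping might look as follows in code (a minimal sketch: the magnesium block follows the standard Jahr-Stevens form, but the conductance values and the function itself are hypothetical, not taken from any specific published model):

```python
import numpy as np

def nmda_current(v, g_nmda, e_nmda=0.0, mg=1.0):
    """NMDA current with the standard Jahr-Stevens magnesium block.

    v: membrane potential (mV); g_nmda: peak conductance (arbitrary units);
    mg: extracellular Mg2+ concentration (mM)."""
    mg_block = 1.0 / (1.0 + mg / 3.57 * np.exp(-0.062 * v))
    return g_nmda * mg_block * (v - e_nmda)

g_nmda_control = 1.0      # baseline peak conductance (arbitrary value)
da_factor = 1.4           # ~40% D1-mediated enhancement, as in the example above
g_nmda_da = da_factor * g_nmda_control

v = -40.0                 # mV
print(nmda_current(v, g_nmda_control), nmda_current(v, g_nmda_da))
```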
Both modeling approaches may complement each other: While biophysical models make it possible to assess the detailed implications of DA for the dynamical properties of neurons and networks, investigating how DA affects complex cognitive tasks is much harder within this framework. Such effects may thus be studied more suitably at a higher level of model abstraction, perhaps directly derived, through steps of simplification, from a biophysical formulation.
One of the earliest proposals on the function of DA was that it may increase the signal-to-noise ratio in neural responses (Clarke, Geffen and Geffen, 1987, Foote, 1987). This was deduced from the early experimental observation that DA application increases the response to injected currents but appears to leave the spontaneous firing rate unaffected. Servan-Schreiber, Printz, and Cohen formalized this idea in a connectionist framework where neurons are described by a single difference equation which maps the weighted inputs from other cells onto an output “firing rate” via a monotonically increasing activation function (Servan-Schreiber, Printz and Cohen, 1990). Choosing a sigmoid activation function,
\[ f_G(x) = \frac{1}{1 + \exp(-(Gx + b))} \]
they proposed that DA would change the gain \(G\) of \(f_G(x)\), increasing the neuron’s response to strong inputs while diminishing it for weak or negative inputs, depending on where the bias \(b\) centers the curve (see Fig. 1A). They showed that increasing \(G\) would not change the performance of a single neuron in a signal detection task. However, in a network of neurons where the synaptic connections are also sources of noise, the contribution of this additional noise is decreased by increasing \(G\), and thus the performance of the network as a whole is enhanced.
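In code, this proposal amounts to changing a single parameter (a minimal sketch; the gain values are illustrative, not those used in the original study):

```python
import numpy as np

def f_G(x, G, b=0.0):
    """Sigmoid activation of Servan-Schreiber et al. (1990); G is the DA-modulated gain."""
    return 1.0 / (1.0 + np.exp(-(G * x + b)))

x = np.linspace(-5.0, 5.0, 11)
low_da, high_da = f_G(x, G=1.0), f_G(x, G=2.0)   # illustrative gain values
# Higher gain steepens the sigmoid around x = -b/G: responses to strong
# positive inputs grow, while responses to negative inputs shrink.
```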
This theory of DA gain modulation was influential for a number of connectionist models which addressed cognitive functions such as selective attention (Cohen and Servan-Schreiber, 1992, 1993), time perception (Shea-Brown, Rinzel, Rakitin, Malapani, 2006), or deficits in schizophrenia (Cohen and Servan-Schreiber, 1992, 1993, Braver and Cohen, 2000). For instance, one line of research proposed that DA sets a signal-to-noise ratio that is optimal for stochastic resonance, i.e. noise-induced increase of perceptual sensitivity (Li, von Oertzen and Lindenberger, 2006, Sikström, 2007).
The concept of DA gain modulation was recently re-evaluated in a biophysically highly realistic 189-compartment representation of a striatal medium spiny neuron (MSN) (Moyer, Wolf and Finkel, 2007). Simulated D1-class receptor activation led to an increased slope and right-shift of the single-cell I/O function, resembling the gain increase assumed in connectionist models (Fig. 1B). Recent experimental observations in prefrontal cortex (PFC) pyramidal neurons recorded in brain slices in vitro also appear to support a D1-induced increase of the single-neuron gain (Thurley, Senn and Lüscher, 2008). However, there were important differences compared to the gain increase proposed in connectionist models: The gain increase was observed only for weak inputs, with a concomitant left-shift of the neuronal f/I curve, while stronger inputs actually led to diminished outputs. This was because DA in fact induced a stronger curvature (stronger nonlinearity, Fig. 1C) in the I/O curve, rather than just manipulating the gain.
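The distinction can be made concrete with toy f/I curves (hypothetical sigmoidal parameterizations invented for illustration, not the fitted curves of the cited studies):

```python
import numpy as np

def f_of_I(i, f_max, i_half, slope):
    """Generic sigmoidal f/I curve (an illustrative functional form only)."""
    return f_max / (1.0 + np.exp(-(i - i_half) / slope))

i = np.linspace(0.0, 2.0, 201)
control = f_of_I(i, f_max=40.0, i_half=1.0, slope=0.25)
pure_gain = f_of_I(i, f_max=40.0, i_half=1.0, slope=0.125)  # connectionist-style gain increase
# D1-like pattern described above: left-shifted and more sharply curved, with
# a lower ceiling, so that weak-to-moderate inputs yield more output and
# strong inputs less than in the control curve.
d1_like = f_of_I(i, f_max=30.0, i_half=0.6, slope=0.15)
```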
Another proposed role for DA is the support of cellular bistability, which refers to the co-existence of two stable states in the single-cell dynamics, one of them usually being a quiescent resting state and the other a limit cycle, that is, maintained spiking activity. Bistability has been considered a neural substrate for working memory, that is, the active maintenance of task-related information through persistent activity over a delay period (Goldman-Rakic, 1995, Wang 1999, Durstewitz, Seamans and Sejnowski 2000b).
DA-induced bistability has been observed in synaptically isolated cells in the cerebellum, the entorhinal cortex and the striatum (Lavin et al. 2005, Sawaguchi and Goldman-Rakic, 1997). A biophysical model of striatal MSNs (Gruber, Solla, Surmeier, Houk, 2003) explained this single-cell bistability by the DA-mediated enhancement of two voltage-dependent currents, \(I_{KIR}\) and \(I_{CaL}\): the former hyperpolarizes the cell at low voltages, while the latter depolarizes it at higher voltages. However, single-cell bistability could not be reproduced in a 189-compartment model of the MSN (Moyer, Wolf and Finkel, 2007).
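A one-variable sketch can illustrate how jointly enhancing these two currents creates a second stable membrane state (a hypothetical caricature with hand-tuned parameters, not the Gruber et al. model itself; with the values below, mu = 1.0 yields a single stable "down" state and mu = 1.4 adds a depolarized "up" state):

```python
import numpy as np

def i_ss(v, mu):
    """Steady-state membrane current (positive = outward) of a one-variable MSN
    caricature, loosely after Gruber et al. (2003). mu jointly scales the KIR
    and L-type Ca conductances, standing in for D1-receptor activation. All
    parameters are hand-tuned for illustration; the ohmic Ca term with a low
    effective reversal potential stands in for the GHK rectification used in
    the original model."""
    m_kir = 1.0 / (1.0 + np.exp((v + 80.0) / 10.0))   # opens when hyperpolarized
    m_cal = 1.0 / (1.0 + np.exp(-(v + 45.0) / 5.0))   # opens when depolarized
    i_kir = mu * 1.0 * m_kir * (v + 90.0)             # inward-rectifier K+ current
    i_cal = mu * 0.18 * m_cal * (v + 20.0)            # L-type Ca2+ (inward below -20 mV)
    i_leak = 0.08 * (v + 65.0)
    return i_kir + i_cal + i_leak

v = np.linspace(-95.0, -20.0, 2000)
for mu in (1.0, 1.4):                                  # without / with "DA"
    i = i_ss(v, mu)
    # Stable fixed points: zero crossings where outward current grows with v.
    idx = np.where((i[:-1] < 0) & (i[1:] >= 0))[0]
    print(f"mu = {mu}: stable resting states near {np.round(v[idx], 1)} mV")
```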
At the network level, the mechanisms of DA modulation of persistent activity were first explicitly studied in biophysical model simulations which incorporated a number of electrophysiologically measured DA effects on both cellular and synaptic properties (Durstewitz et al. 1999, 2000a, Brunel and Wang 2001). The consistent result of these studies was that DA stabilizes persistent activity in a cell assembly against both synaptic noise and distractor stimuli represented by competing assemblies (Fig. 2). The synaptic contribution to this effect mainly comes from the DA-mediated enhancement of both NMDA and GABAA currents: NMDA currents enhance recurrent excitation and boost the persistence of activity by virtue of their slow inactivation time constant and voltage dependence, while increased GABAA currents help to suppress distracting activity by fostering synaptic competition. This stabilizing effect has received experimental support from animal and human studies at in-vitro, in-vivo, neuroimaging, and cognitive levels (Seamans et al. 2001, Durstewitz and Seamans, 2002, 2008, Lavin et al. 2005, Sawaguchi and Goldman-Rakic, 1997, Tost et al., 2006, Müller, von Cramon and Pollmann, 1998). It has also been incorporated into more abstract models aimed at cognitive effects of DA (Dreher et al. 2002, Tanaka 2006, Humphries et al. 2009).
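A two-unit rate-model caricature of this network mechanism is sketched below (all parameters are invented, and the single factor da scaling both the NMDA-like and GABA-like weights is a drastic simplification of the cited biophysical models):

```python
import numpy as np

def simulate(da, dt=0.001, t_max=3.0):
    """Two competing cell assemblies with slow recurrent (NMDA-like) excitation
    and mutual (GABA-A-like) inhibition; `da` jointly scales both weights.
    Protocol: cue to assembly 1 (t < 0.5 s), distractor to assembly 2 (1.5-2 s)."""
    f = lambda x: np.maximum(np.tanh(x), 0.0)     # simple rate nonlinearity
    w_nmda, w_gaba = 2.2 * da, 1.5 * da
    tau_r, tau_s = 0.01, 0.1                      # fast rates, slow NMDA-like gates (s)
    r, s = np.zeros(2), np.zeros(2)
    for step in range(int(t_max / dt)):
        t = step * dt
        cue = np.array([1.5, 0.0]) if t < 0.5 else 0.0
        distractor = np.array([0.0, 1.2]) if 1.5 < t < 2.0 else 0.0
        drive = w_nmda * s - w_gaba * r[::-1] + cue + distractor
        r += dt / tau_r * (-r + f(drive))
        s += dt / tau_s * (-s + r)
    return r                                       # rates at the end of the trial

# With high "DA" the cued assembly survives the distractor; with low "DA" the
# distractor overwrites the stored activity pattern.
print(simulate(da=1.0), simulate(da=0.6))
```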
Working memory is also connected to top-down driven focused attention, in the sense that a to-be-attended-to object is kept in active memory, thereby constantly biasing perception or action. Some connectionist studies have implemented this idea explicitly, e.g. by use of an “attention layer” which guides perception towards a specific spatial location (Servan-Schreiber, Carter, Bruno and Cohen, 1998). Sustained attention has also been considered in the context of attention-deficit/hyperactivity disorder (see below).
Efficient working memory performance requires the robust online maintenance of task-related information in the face of distraction, as enhanced by D1R stimulation according to the biophysical models and empirical studies cited above. On the other hand, flexible shifting between different stimuli, actions or rules may entail the opposite demand, namely to adapt to changes in the environment and to avoid perseverative responding. Since D2R modulation tends to be antagonistic to D1 receptor effects in the PFC (Trantham-Davidson et al. 2004, West and Grace 2002, Gulledge and Jaffe, 1998), it has been proposed that cognitive flexibility is enhanced in a state that is dominated by D2R activation, both in empirical (Winterer and Weinberger 2003, Wang et al. 2004, Floresco and Magyar, 2006) and computational studies (Cohen, Braver and Brown 2002, Durstewitz 2007, Durstewitz and Seamans 2008, Rolls et al. 2008). More specifically, it has been suggested that the ratio between D1R and D2R activation may be crucial for the balance between stabilization and flexibility (Durstewitz 2007, Durstewitz and Seamans 2008). The D1/D2 balance can be thought of as a control parameter that modifies the “energy landscape” of cortical networks (Fig. 3A, similar to the notion of energy in the Hopfield network): Minima in this landscape correspond to attractors of the neural dynamics, and the local slope of the energy near a minimum as well as the size of the “basin” surrounding the minimum determine the stability of the corresponding firing rate pattern: the deeper and steeper the “well” surrounding a given minimum (attractor), the harder it is to switch to another state. As apparent from Fig. 3A, those energy wells are much shallower in the D2-dominated state compared to D1-dominated states, allowing for switching and flexibility at the cost of stability. This “dual-state theory” of D1 and D2 effects (Durstewitz and Seamans 2008) is consistent with a large body of experimental evidence (Durstewitz and Seamans 2008, Rolls et al. 2008) and has also been used to explain deficits in schizophrenia (see below). It should be noted, however, that there is much less research on the function of D2 receptors, both experimentally and computationally, as compared to the D1 receptor.
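The following toy calculation illustrates this picture (a hypothetical quartic landscape; both the functional form and the identification of its control parameter with the D1/D2 balance are assumptions for illustration, not part of the cited models):

```python
import numpy as np

# Caricature of the dual-state picture: a one-dimensional "energy landscape"
# E(x) = x^4/4 - a*x^2/2 whose two wells (attractors) deepen as the control
# parameter a, standing in for the D1/D2 activation balance, increases.
def energy(x, a):
    return 0.25 * x**4 - 0.5 * a * x**2

for a, label in [(1.5, "D1-dominated"), (0.3, "D2-dominated")]:
    wells = np.sqrt(a)        # the two minima sit at x = +/- sqrt(a)
    barrier = 0.25 * a**2     # energy barrier separating the two attractors
    print(f"{label}: minima at +/-{wells:.2f}, barrier height {barrier:.3f}")
```

Consistent with the text, the barrier separating the attractors is far lower in the D2-dominated case, so state switches (flexibility) come cheaply but stored states are easily lost.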
Another unresolved question is how the balance between the D1 and D2 state is regulated, both at the biochemical and the functional level. Functionally, how does the brain know when to hold a given object in memory, and when to update the memory content to a new object? Biochemically, how could the same neuromodulator, DA, achieve differential activation of D1 vs. D2 receptors? With regard to the latter question, it appears that D1 and D2 receptors are preferentially activated at different DA concentrations (D2R at low and very high, D1R at intermediate concentrations, Kroener et al., 2009, Trantham-Davidson et al. 2004, Zheng et al. 1999) and with different time courses (D2R first, D1R later, Lapish et al. 2007, Schultz 2007). This led to the idea that the different states could be regulated by controlling the amount of DA release in a task-dependent manner, and that the temporal order of states may implement a kind of simulated annealing process, with a D2R-dominated ‘exploration phase’ followed by a D1R-dominated ‘exploitation phase’ (Durstewitz 2006). On a more abstract, control-theory-inspired level, there have been attempts to model the basal ganglia/DA midbrain-cortex system in a closed-loop fashion (Durstewitz et al. 1999; Tanaka 2006, O'Reilly and Frank 2006) such that the level of DA would be self-regulated by the system. Another idea put forward in connectionist models has been to interpret phasic DA release as a gating signal (Cohen, Braver and Brown 2002, Montague, Hyman and Cohen, 2004, Braver and Cohen 1999, 2000), because it predicts unexpected reward (see below) and may thus allow new information to enter working memory. However, others argued that using DA as a global gating signal would be too unspecific, and proposed instead a gating mechanism based on stimulus-specific reward associations in the striatum, which are in turn also learned using DA signals (Hazy, Frank and O'Reilly, 2010, O'Reilly and Frank, 2006, Frank, Loughry and O'Reilly, 2001).
One of the most prominent functions attributed to the DA system is its role in reward prediction and reward-based learning. This is a rather wide field which we only briefly review here (see Reinforcement learning, Reward Signals, Schultz 2002, 2007, Wörgötter and Porr 2005, Montague, Hyman and Cohen 2004 for more in-depth reviews).
While a connection between DA and incentive learning, motivation, reward and pleasure has been drawn for a long time, a series of studies in the early 1990s by Schultz and colleagues took this idea to a different level by showing that the timing of phasic DA-neuron bursts in relation to stimuli and reinforcement signals seems ideal to encode the prediction of rewarding stimuli, or more specifically, an error signal between the actual and the predicted reward (Montague, Dayan and Sejnowski, 1996, Schultz, Dayan and Montague, 1997). Since then, this idea about DA function has been used in a vast number of computational studies, building on algorithms originating within the field of machine learning, where phasic DA serves as the neural substrate for the prediction error signal that is needed in algorithms of reinforcement learning and temporal difference learning. While these models are typically formulated in quite abstract mathematical terms, researchers have more recently sought to establish connections between reinforcement and temporal difference learning and the experimental phenomenon of spike-timing dependent plasticity (STDP), arguing that all three forms of learning can be understood in a single theoretical framework (Izhikevich 2007, Wörgötter and Porr 2005). Thus, it may become possible to interpret the various elements of formal learning theories in biological terms.
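The core computation can be stated in a few lines (a standard tabular temporal-difference sketch; the task timing and parameters are illustrative, not those of any specific study):

```python
import numpy as np

# Minimal TD sketch of the DA reward-prediction error (cf. Schultz, Dayan and
# Montague, 1997). A conditioned stimulus (CS) appears at t = 5 and a reward
# follows at t = 15; states are "time since CS onset", and the pre-CS state is
# clamped to value 0 (before the CS, nothing predicts the reward).
T, cs_t, reward_t = 20, 5, 15
alpha, gamma = 0.2, 1.0                        # learning rate, temporal discount
V = np.zeros(T - cs_t + 2)                     # V[0]: pre-CS state, kept at 0
delta = np.zeros(T)                            # prediction errors of the last trial
for trial in range(300):
    for t in range(cs_t - 1, T):
        s = max(t - cs_t + 1, 0)               # current state index
        s_next = t - cs_t + 2 if t + 1 < T else None
        r = 1.0 if t == reward_t else 0.0
        v_next = V[s_next] if s_next is not None else 0.0
        delta[t] = r + gamma * v_next - V[s]   # the putative phasic DA signal
        if s > 0:                              # pre-CS value stays clamped at 0
            V[s] += alpha * delta[t]
# On the first trial, delta peaked at reward_t; after training it is ~1 at the
# step where the CS appears and ~0 at the now fully predicted reward,
# mirroring the shift of phasic DA responses from reward to predictor.
```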
Several researchers have also attempted to incorporate DA-based learning into network models of the basal ganglia, or into functional circuits involving several brain regions such as the thalamus, frontal cortex structures, striatum, and the cerebellum. For instance, Frank, O'Reilly, and colleagues (Frank 2005, 2006, Frank et al. 2007a, Cohen and Frank, 2009) proposed connectionist models aimed at understanding how the various components of the basal ganglia act together to promote action selection and decision making, based on reward associations in the striatum that are built up by DA-based reinforcement learning. The striatum is assumed to contain a separate set of “Go” and “NoGo” cells for each of a number of possible actions, which compete with each other via synaptic inhibition. Depending on which of the populations is dominant, one of those actions succeeds in driving frontal motor systems, resulting in a bias for action selection. The “Go” and “NoGo” associations are assumed to be learned separately, based on the preferential expression of D1 receptors in the former and D2 receptors in the latter.
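A heavily simplified, tabular sketch of this scheme might look as follows (a caricature of the cited connectionist network models, with invented reward probabilities, learning rates, and selection noise):

```python
import numpy as np

# Hypothetical Go/NoGo sketch: each action has a "Go" and a "NoGo" weight;
# DA bursts strengthen Go and weaken NoGo for the chosen action (D1-like),
# DA dips do the opposite (D2-like).
rng = np.random.default_rng(1)
n_actions = 2
p_reward = np.array([0.8, 0.2])        # action 0 is usually rewarded (invented)
go = np.full(n_actions, 0.5)
nogo = np.full(n_actions, 0.5)
alpha = 0.05                           # learning rate
for trial in range(1000):
    net = go - nogo                    # Go/NoGo competition per action
    a = int(np.argmax(net + 0.1 * rng.standard_normal(n_actions)))  # noisy selection
    da = 1.0 if rng.random() < p_reward[a] else -1.0   # DA burst (+) or dip (-)
    go[a] = np.clip(go[a] + alpha * da, 0.0, 1.0)      # D1-mediated update
    nogo[a] = np.clip(nogo[a] - alpha * da, 0.0, 1.0)  # D2-mediated update
print(go - nogo)   # action 0 ends up with the larger Go - NoGo difference
```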
DA is known to play a major role in time perception. Brain disorders that have been related to the DA system, like Parkinson's disease, schizophrenia, and Huntington’s disease, all result in specific impairments of time perception which cannot be attributed to lower-level perceptual or motor deficits (Buhusi and Meck, 2005). These impairments take the form of both increased variability in time estimates and a tendency to underestimate short durations (below one second) and to overestimate longer durations. Moreover, pharmacological studies in both animals (Buhusi and Meck, 2005) and humans (Rammsayer 1999) showed that the perceived duration of a temporal interval can be manipulated by dopaminergic drugs in a manner that depends on the drug's affinity for the D2 receptor (Meck, 1983).
Although there is a plethora of computational models of time perception (Gibbon et al. 1997, Buonomano and Karmarkar 2002, Durstewitz 2003, Hass et al. 2008), only a few attempt to account for the effects of DA. One of them is the “striatal beat model” (Matell and Meck, 2004), which bases time perception on a set of cortical oscillators with slightly different frequencies. These are assumed to project to striatal spiny neurons, which read out the “beats”, i.e. the coincidence patterns, of the oscillators. Phasic DA bursts serve both as a learning signal to associate certain patterns of cortical activity with specific intervals (see also Rivest, Kalaska and Bengio, 2010), and as a “starting gun” which resets the phase of the oscillators. Another study (Shea-Brown, Rinzel, Rakitin, Malapani, 2006), based on the idea that interval time is encoded through monotonically increasing firing rates (Durstewitz 2003, 2004), attempts to explain the effects of Parkinson's disease on time perception by changes in the gain of this increase, with reference to the early Servan-Schreiber model (Servan-Schreiber, Printz and Cohen, 1990). A more biological model of ramping firing rates related to interval timing (Durstewitz 2003) was recently re-evaluated building on the electrophysiological effects of D2 receptors, and could explain the biases in time estimation induced by up- and down-regulation of this receptor (Hass, Farkhooi and Durstewitz, 2010).
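The coincidence-detection principle behind the striatal beat model can be sketched as follows (illustrative oscillator frequencies and trained interval; comparing full phase vectors is a convenient stand-in for the thresholded striatal detectors of the original model):

```python
import numpy as np

# Cortical oscillators with slightly different frequencies are phase-reset at
# trial onset (the DA "starting gun"); a striatal readout compares the current
# oscillator state with the state stored at the trained interval.
freqs = np.array([5.0, 5.3, 5.7, 6.1, 6.5])   # Hz, all reset to phase 0 at t = 0
target = 2.0                                   # trained interval (s), invented

def state(t):
    """Oscillator state at time t (cosine and sine components of each phase)."""
    phases = 2.0 * np.pi * freqs * t
    return np.concatenate([np.cos(phases), np.sin(phases)])

template = state(target)                       # stored during DA-signaled reinforcement
t = np.linspace(0.0, 4.0, 4001)
similarity = np.array([state(ti) @ template for ti in t])
print(t[np.argmax(similarity)])                # recovers the trained 2.0 s
```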
The fact that DA dysregulation has been strongly implicated in a number of mental diseases such as schizophrenia and Parkinson's disease is arguably one of the strongest motivations to study DA function (Carlsson, 1977, 2001, Meyer-Lindenberg et al. 2005, Meyer-Lindenberg and Weinberger 2006, Poewe 2008, Meyer-Lindenberg 2010, see also Dopamine and schizophrenia). Consequently, a number of computational models of DA modulation have been developed specifically to explain the cognitive, motor and motivational impairments found in those diseases.
The connectionist models of Cohen and coworkers were among the first to simulate the cognitive deficits of schizophrenia. These authors linked the cognitive symptoms of schizophrenia to reduced cortical DA levels (Levin 1984) and proposed that this would lead to a reduced gain and signal-to-noise ratio. Implementing these effects in the model induced performance deficits similar to those observed in schizophrenia patients in attention-based tasks like the Stroop test (Cohen and Servan-Schreiber, 1992, 1993). In the “gating” version of the model (see above), they also explored the hypothesis that schizophrenia may be associated with a larger variability in DA signaling, rather than just with overall lower DA levels (Braver and Cohen 1999, 2000). Observations in biophysical models, on the other hand, led to the dual-state theory of D1R and D2R function (Durstewitz and Seamans 2008). In this framework, the cognitive deficits associated with positive symptoms in schizophrenia may be due to a pathological domination of the D2 state, resulting in problems of active memory stabilization and, potentially, the development of hallucinations, while negative symptoms could potentially be explained by a hyper-D1 state (Winterer and Weinberger 2003, Loh, Rolls and Deco, 2007, Durstewitz 2007, Durstewitz and Seamans 2008, Rolls et al. 2008). These ideas about D1- and D2-dominated states and their dependence on DA concentration have also been applied to explain the effects on cognitive function of various genetic polymorphisms related to the DA system and schizophrenia (Meyer-Lindenberg and Weinberger 2006; Meyer-Lindenberg 2010).
Several models have also dealt with the cognitive, motor, and learning deficits in Parkinson's disease (PD). For instance, the basal ganglia model of Frank and colleagues predicted that deactivating the subthalamic nucleus (STN) would improve motor deficits in PD but also increase impulsivity, defined as the tendency to act before sufficient evidence for choosing an action has accrued (Frank 2006, Frank et al. 2007a). These predictions indeed matched observations in PD patients whose STN had been temporarily deactivated by deep-brain stimulation. Another network model study (Humphries et al., 2009) examined the GABAergic microcircuit formed by striatal spiny neurons and fast-spiking interneurons, with parameters fitted to experimental data. These authors found that under DA depletion a large number of synchronized neuron clusters emerged, as compared to a normal range of DA concentrations, and hypothesized that such synchronized clusters occur in PD and could be the cause of its motor control problems. Models based on mean-field analysis of firing rates have also been proposed in this context (Shea-Brown, Rinzel, Rakitin, Malapani, 2006, van Albada and Robinson, 2009).
Finally, computational models have also been used to account for the symptoms of attention-deficit hyperactivity disorder (ADHD) and for those associated with DA-receptor decline in normal and pathological aging. ADHD models have largely focused on the aspect of attentional gating (Frank et al. 2007b) and on learning deficits due to impaired DA error signals (Williams and Dayan, 2005). Computational theories of aging, on the other hand, are based on the idea that DA regulates the neural signal-to-noise ratio in a manner that may be optimal for stochastic resonance (Li, von Oertzen and Lindenberger, 2006, Sikström, 2007). In this framework, DA depletion with age weakens the stochastic resonance effect and shifts its optimum to higher noise levels, thus impairing performance in perceptual discrimination.
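The basic stochastic resonance effect invoked by these aging models can be demonstrated in a few lines (a generic threshold-unit sketch with invented signal, threshold, and noise values, not the multi-layer networks of the cited studies; in those models, the DA-dependent gain determines where on this curve the system operates):

```python
import numpy as np

# Stochastic resonance: a subthreshold periodic signal crosses a firing
# threshold only with the help of noise, so signal transmission peaks at an
# intermediate noise level (the inverted-U described in the text).
rng = np.random.default_rng(0)
n = 50000
signal = 0.3 * np.sin(np.linspace(0.0, 100.0 * np.pi, n))   # weak periodic input
threshold = 0.5                                             # signal alone never crosses

for sigma in (0.1, 0.3, 0.7, 1.5, 3.0):
    spikes = (signal + sigma * rng.standard_normal(n)) > threshold
    # Correlation between input signal and spike train as a transmission index.
    quality = np.corrcoef(signal, spikes.astype(float))[0, 1]
    print(f"noise sd {sigma:>4}: transmission {quality:.3f}")
# Transmission is near zero for very low and very high noise and peaks in
# between, the signature of stochastic resonance.
```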
Acknowledgements: This work was funded by grants from the Deutsche Forschungsgemeinschaft (DFG, Du 354/5-1 & 6-1) to D.D. and the Bundesministerium für Bildung und Forschung (BMBF, 01GQ1003B).