Neural Networks 41 (2013) 212–224

Contents lists available at SciVerse ScienceDirect

Neural Networks

journal homepage: www.elsevier.com/locate/neunet

2013 Special Issue

A spiking neuron model of the cortico-basal ganglia circuits for goal-directed and habitual action learning

Fabian Chersi a,*, Marco Mirolli a, Giovanni Pezzulo b,a, Gianluca Baldassarre a

a Institute of Cognitive Sciences and Technologies, National Research Council, Via San Martino della Battaglia 44, 00185 Roma, Italy
b Institute of Computational Linguistics "Antonio Zampolli", National Research Council, Via Giuseppe Moruzzi 1, 56124 Pisa, Italy

article info

Keywords: Autonomous learning; Goal-directed and habitual actions; Motor sequences; Basal ganglia; Spiking neurons

abstract

Dual-system theories postulate that actions are supported either by a goal-directed or by a habit-driven response system. Neuroimaging and anatomo-functional studies have provided evidence that the prefrontal cortex plays a fundamental role in the first type of action control, while internal brain areas such as the basal ganglia are more active during habitual and overtrained responses. Additionally, it has been shown that areas of the cortex and the basal ganglia are connected through multiple parallel "channels", which are thought to function as an action selection mechanism resolving competitions between alternative options available in a given context. In this paper we propose a multi-layer network of spiking neurons that implements in detail the thalamo-cortical circuits that are believed to be involved in action learning and execution. A key feature of this model is that neurons are organized in small pools in the motor cortex and form independent loops with specific pools of the basal ganglia where inhibitory circuits implement a multistep selection mechanism.
The model was validated by using it to control the actions of a virtual monkey that has to learn to turn on briefly flashing lights by pressing the corresponding buttons on a board. Once the animal is able to execute the task fluently, the button–light associations are remapped so that it has to suppress its habitual behavior in order to execute goal-directed actions. The model shows how sensory-motor associations for action sequences are formed at the cortico-basal ganglia level and how goal-directed decisions may override automatic motor responses.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

A prominent feature of animal intelligence is the ability to process different types of stimuli at the same time and to effortlessly solve real-world tasks of varying complexity simultaneously or in rapid sequence. In order to do this the brain has developed two distinct mechanisms for action selection, with separate neurological bases. One process, termed "goal-directed" (Dickinson & Balleine, 2000; Hommel, 2003), motivates action by integrating an expectation that a given action will have a specific outcome with a desire for that outcome. Stimulus-driven "habitual" actions, on the other hand, occur as an automatic response to sensory inputs with which the action has become associated, for example through reinforcement learning (Balleine & Dickinson, 1998). Although apparently simple, the latter mechanism can produce very complex behavioral patterns by combining basic learned responses (Donahoe, Burgos, & Palmer, 1993).

* Corresponding author. Tel.: +39 06 44595206. E-mail address: fabian.chersi@istc.cnr.it (F. Chersi).

In the first mechanism, the tendency to select a particular action depends on the currently predicted value of the outcome. In contrast, once a habitual action has been learned, it is elicited in response to a stimulus irrespective of the value of the outcome.
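The distinction drawn above between the two mechanisms can be illustrated with a toy sketch. It is not the spiking network proposed in this paper: the action names, outcome names, and numerical values are all hypothetical and chosen only to show that a goal-directed choice tracks the current value of the predicted outcome, whereas a habitual choice depends only on a learned stimulus-response strength.

```python
# Illustrative toy only; NOT the spiking model described in the paper.
# All names and numbers are hypothetical assumptions.

def goal_directed_choice(action_outcome, outcome_value):
    """Select the action whose predicted outcome currently has the highest value."""
    return max(action_outcome, key=lambda a: outcome_value[action_outcome[a]])

def habitual_choice(sr_strength, stimulus):
    """Select the action most strongly associated with the stimulus,
    without consulting the value of its outcome."""
    options = sr_strength[stimulus]
    return max(options, key=options.get)

# Assumed action -> outcome predictions and current outcome values.
action_outcome = {"press_left": "juice", "press_right": "water"}
outcome_value = {"juice": 1.0, "water": 0.3}

# Assumed stimulus-response strengths acquired through past reinforcement.
sr_strength = {"light_on": {"press_left": 0.9, "press_right": 0.1}}

before = (goal_directed_choice(action_outcome, outcome_value),
          habitual_choice(sr_strength, "light_on"))

# Devalue the outcome of the habitual action (e.g. the animal is sated on juice).
outcome_value["juice"] = 0.0

after = (goal_directed_choice(action_outcome, outcome_value),
         habitual_choice(sr_strength, "light_on"))

print("before devaluation:", before)
print("after devaluation: ", after)
```

In this sketch, devaluing the outcome changes the goal-directed choice but leaves the habitual choice untouched, which is the behavioral signature usually used to tell the two systems apart.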
In healthy individuals, the recognition that the current situation can no longer be handled in the habitual way prompts the generation of an alternative plan. In contrast, adults with damage to brain areas such as the frontal lobes tend to keep performing, unsuccessfully, habitual actions elicited by known stimuli. Such patients provide a clear demonstration of how processes governing automatic action execution can operate independently of decision-making processes.

0893-6080/$ – see front matter. doi:10.1016/j.neunet.2012.11.009

Recently, experimental studies, in particular on rodents, have begun to elucidate the neural substrates underlying these types of behavioral control. More specifically, a series of studies has shown that the dorsomedial striatum and the prelimbic cortex subserve goal-directed actions (Corbit & Balleine, 2003; Killcross & Coutureau, 2003; Miyachi, Hikosaka, & Lu, 2002; Yin, Knowlton, & Balleine, 2005), whereas habit formation is reflected in a shift in control toward the dorsolateral striatum (Yin & Knowlton, 2006; Yin, Knowlton, & Balleine, 2004). Importantly,