The Role of Amygdala in Devaluation: A Model Tested with a Simulated Rat

Francesco Mannella, Marco Mirolli, Gianluca Baldassarre
Laboratory of Autonomous Robotics and Artificial Life, Istituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche (LARAL-ISTC-CNR), Via San Martino della Battaglia 44, I-00185 Roma, Italy
{francesco.mannella, marco.mirolli, gianluca.baldassare}@istc.cnr.it

Abstract

This paper presents an embodied, biologically plausible model investigating the relationships between Pavlovian and instrumental conditioning. The model is validated by successfully reproducing the primary outcomes of instrumental-conditioning devaluation tests conducted with normal and amygdala-lesioned rats. These experiments are particularly important as they show how the sensitivity to motivational states exhibited by the Pavlovian system can transfer to instrumentally acquired behaviors. The results presented are relevant not only for neuroscience but also for robotics, as they start to investigate how internal motivational systems, such as those found in real organisms, might modulate the learning and performance of goal-directed actions in artificial machines, so as to improve their behavioral flexibility.

1 Introduction

Undoubtedly, the behavior of living organisms is characterized by a degree of autonomy and flexibility that far exceeds that of current robots. One way to tackle this problem is to try to understand the mechanisms underlying these properties so as to use them in designing robot controllers. This is particularly true of the mechanisms of motivational and emotional regulation of behavior, which play a central role in the behavior of humans and other organisms but are often overlooked by cognitive science.
Recently, the machine learning and robotics communities have devoted increasing effort to the study of autonomous development and learning in robots (Zlatev and Balkenius, 2001; Weng et al., 2001; Barto et al., 2004; Schembri et al., 2007, in press). Most of this literature builds upon the machine learning framework of reinforcement learning (Sutton and Barto, 1998), which is intended to provide machines with the capacity to learn new behaviors on the basis of rewarding stimuli. Interestingly, reinforcement learning algorithms have attracted increasing interest within the empirical literature on animal behavior, as they represent theoretical models that can furnish coherent explanations of several key empirical findings (Dayan and Balleine, 2002; Schultz, 2002).

Notwithstanding their importance, standard reinforcement learning models suffer from several limitations. From the machine learning point of view, they require a careful specification of task-specific extrinsic reward functions, and this limits their degree of autonomy (Barto et al., 2004). From the scientific point of view, they have been criticized for at least two reasons. (1) They do not take into account the role of internal motivations in modulating the effects of external rewards: if an agent, be it a real organism or a robot, has to engage in several different activities, it needs to be endowed with a complex motivational system that is able not only to guide its learning processes but also to modulate its behavior on the fly; one of the most important empirical phenomena challenging the standard reinforcement learning framework, ‘devaluation’, demonstrates just this kind of effect.
(2) They conflate the notions of classical/Pavlovian conditioning and instrumental/operant conditioning, although accumulating empirical evidence indicates that these are different processes that rely on distinct neural systems and that interact in complex ways overlooked by standard reinforcement learning models (as demonstrated, for example, by the empirical phenomena of ‘Pavlovian-Instrumental Transfer’ and ‘incentive learning’; Dayan and Balleine, 2002).

Berthouze, L., Prince, C. G., Littman, M., Kozima, H., and Balkenius, C. (2007). Proceedings of the Seventh International Conference on Epigenetic Robotics: Modeling Cognitive Development in Robotic Systems. Lund University Cognitive Studies, 135.

This paper presents a novel computational model that is strongly rooted in the anatomy and physiology of the mammalian brain and starts to address some of these issues. In particular, the model presented here reproduces the results of an empirical experiment (Balleine et al., 2003) that demonstrates the phenomenon of devaluation in an instrumental conditioning task, and proposes a coherent picture