14

Reinforcement Learning, High-Level Cognition, and the Human Brain

Massimo Silvetti and Tom Verguts
Ghent University
Belgium

1. Introduction

Reinforcement learning (RL) has a long history that runs through the history of psychology itself. As early as the late 19th century, Edward Thorndike proposed that if a stimulus is followed by a successful response, the stimulus-response bond will be strengthened. Consequently, the response will be emitted with greater likelihood upon later presentation of that same stimulus. This proposal already contains the two key principles of RL.

The first principle concerns associative learning, the learning of associations between stimuli and responses. This theme was developed by John Watson. Building on the work of Ivan Pavlov, Watson investigated the laws of classical conditioning, in particular how a stimulus and a response become associated after repeated pairing. In the classic "Little Albert" experiment, Watson and Rayner (1920) repeatedly presented a white rat together with a loud sound to an infant (Little Albert); the rat initially evoked a neutral response, whereas the loud sound initially evoked a fear response. After a while, presentation of the rat alone also evoked a fear response in the subject. In the same paper, the authors proposed that this principle of learning by association is more generally responsible for shaping (human) behavior. According to psychology handbooks, Watson thereby laid the foundation for behaviorism.

The second principle is that reinforcement is key to human learning. Actions that are successful for the organism will be strengthened and therefore repeated by the organism. This aspect was developed into a systematic research program by the second founder of behaviorism, Burrhus Skinner (e.g., Skinner, 1938).

The importance of RL for explaining human behavior came under debate from the late 1940s onward. Scientific criticism of RL arrived from two main fronts.
The first was internal, deriving from experimental findings and theoretical considerations within psychology itself. The second was external, deriving from developments outside psychology, in particular advances in information theory and control theory. These criticisms led to a loss of interest in RL that lasted several decades. In recent years, however, RL has been revived, leading to a remarkable interdisciplinary confluence of computer science, neurophysiology, and cognitive neuroscience.

In the current chapter, we describe the relevant mid-20th century criticisms and developments, and how these were considered and integrated in current versions of RL. In particular, we focus on how RL can be used as a model for understanding high-level cognition. Finally, we link RL to the broader framework of neural Darwinism.