Learning Epistemic Actions in Model-Free Memory-Free Reinforcement Learning: experiments with a neuro-robotic model

Dimitri Ognibene¹, Nicola Catenacci Volpi⁴, Giovanni Pezzulo²,³, and Gianluca Baldassare²

¹ Personal Robotics Laboratory, Imperial College London, UK
² Istituto di Scienze e Tecnologie della Cognizione, CNR, Italy
³ Istituto di Linguistica Computazionale “Antonio Zampolli”, CNR, Italy
⁴ IMT Institute for Advanced Studies, Lucca, Italy

Abstract. Passive sensory processing is often insufficient to guide biological organisms in complex environments. Rather, behaviourally relevant information can be accessed by performing so-called epistemic actions that explicitly aim at unveiling hidden information. However, it is still unclear how an autonomous agent can learn epistemic actions and how it can use them adaptively. In this work, we propose a definition of epistemic actions for POMDPs that derives from their characterizations in the cognitive science and classical planning literature. We give theoretical insights into how partial observability and epistemic actions can affect the learning process and performance in the extreme conditions of model-free and memory-free reinforcement learning, where hidden information cannot be represented. We finally investigate these concepts using an integrated eye-arm neural architecture for robot control, which can use its effectors to execute epistemic actions and can exploit the actively gathered information to efficiently accomplish a seek-and-reach task.

1 Introduction

When an agent executes a task in a non-deterministic and partially observable environment, its behavior is affected by its limited knowledge. Recent evidence in neuroscience [1–3] indicates that living organisms can take into account the confidence in their knowledge and execute actions that reduce uncertainty, provided these actions satisfy a value/cost trade-off.
These actions are named epistemic actions in cognitive science and in the planning literature, and information-gathering actions in operations research. In robotics, epistemic actions have been applied in several tasks such as navigation (e.g., moving to positions where sensors can perceive landmarks [4, 5]), active vision (e.g., moving the camera to acquire information given the limited

This research was funded by the European Projects EFAA (G.A. FP7-ICT-270490), Goal-Leaders (G.A. FP7-ICT-270108), and IM-CLeVeR (G.A. FP7-ICT-IP-231722).