Eurographics Conference on Visualization (EuroVis) 2020
M. Gleicher, T. Landesberger von Antburg, and I. Viola (Guest Editors)
Volume 39 (2020), Number 3

DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning

T. Jaunet 1, R. Vuillemot 2, and C. Wolf 1,3
1 LIRIS, INSA-Lyon, France
2 LIRIS, École Centrale-Lyon, France
3 CITI, INRIA, France

[Figure 1 image. Panel labels: Field of View, Action Distribution, Current Time, Selected Memory, Trajectories, t-SNE Projection, Episodes, Derived Metrics Timeline, Memory, Time-steps, Elements; callouts ➀ to ➃.]

Figure 1: DRLViz displays a trained agent's memory, which is a large temporal vector, as a horizontal heat-map ➀. Analysts can browse this memory following its temporal construction; filter it according to the agent's movements and to derived metrics that we calculated ➁ (e.g., when an item is in the field of view ➂); and select parts of the memory to filter elements and compare them ➃.

Abstract
We present DRLViz, a visual analytics interface to interpret the internal memory of an agent (e.g., a robot) trained using deep reinforcement learning. This memory is composed of large temporal vectors that are updated as the agent moves in an environment. It is not trivial to understand due to the number of dimensions, dependencies on past vectors, spatial/temporal correlations, and co-correlation between dimensions. It is often referred to as a black box, as only inputs (images) and outputs (actions) are intelligible to humans. DRLViz assists experts in interpreting decisions using memory reduction interactions, and in investigating the role of parts of the memory when errors have been made (e.g., wrong direction). We report on DRLViz applied in the context of video game simulators (ViZDoom) for a navigation scenario with item gathering tasks.
We also report on an evaluation of DRLViz by experts, on the applicability of DRLViz to other scenarios and navigation problems beyond simulation games, and on its contribution to the interpretability and explainability of black box models in the field of visual analytics.

CCS Concepts
• Human-centered computing → Visual analytics; • Theory of computation → Reinforcement learning;

© 2020 The Author(s)
Computer Graphics Forum © 2020 The Eurographics Association and John Wiley & Sons Ltd. Published by John Wiley & Sons Ltd.
arXiv:1909.02982v2 [cs.LG] 25 May 2020