Why, Who, What, When and How about Explainability in Human-Agent Systems
JAAMAS Track

Avi Rosenfeld, Jerusalem College of Technology, Jerusalem, rosenfa@jct.ac.il
Ariella Richardson, Jerusalem College of Technology, Jerusalem, richards@jct.ac.il

ABSTRACT
This paper presents a survey of issues relating to explainability in Human-Agent Systems. We consider fundamental questions about the Why, Who, What, When and How of explainability. First, we define explainability and its relationship to the related terms of interpretability, transparency, explicitness, and faithfulness. These definitions allow us to answer why explainability is needed in the system, whom it is geared to, and what explanations can be generated to meet this need. We then consider when the user should be presented with this information. Last, we consider how objective and subjective measures can be used to evaluate the entire system. This last question is the most encompassing, as it needs to evaluate all other issues regarding explainability.

KEYWORDS
Human-agent systems; XAI; Machine learning interpretability; Machine learning transparency

ACM Reference Format:
Avi Rosenfeld and Ariella Richardson. 2020. Why, Who, What, When and How about Explainability in Human-Agent Systems. In Proc. of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2020), B. An, N. Yorke-Smith, A. El Fallah Seghrouchni, G. Sukthankar (eds.), Auckland, New Zealand, May 2020, IFAAMAS, 4 pages.

1 OVERVIEW
As the field of Artificial Intelligence matures and becomes ubiquitous, there is a growing emergence of systems where people and agents work together. These systems, often called Human-Agent Systems or Human-Agent Cooperatives, have moved from theory to reality in many forms, including digital personal assistants, recommendation systems, training and tutoring systems, service robots, chat bots, planning systems and self-driving cars [2-4, 7, 10-12, 14-18, 20-23, 25, 26, 28].
One key question surrounding these systems is the type and quality of the information that must be shared between the agents and the human users during their interactions. We focus on one aspect of this human-agent interaction: the internal level of explainability that agents using machine learning must have regarding the decisions they make. Our overall goal is to provide an extensive study of this issue in Human-Agent Systems. Towards this goal, our first step is to formally and clearly define explainability, as well as the concepts of interpretability, transparency, explicitness, and faithfulness that make a system explainable. Through using these definitions, we provide a clear taxonomy regarding the Why, Who, What, When, and How about explainability and stress the relationship of interpretability, transparency, explicitness, and faithfulness to each of these issues.

Proc. of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2020), B. An, N. Yorke-Smith, A. El Fallah Seghrouchni, G. Sukthankar (eds.), May 2020, Auckland, New Zealand. © 2020 International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.

This paper's first contribution is a clear definition for explainability and for the related terms: interpretability and transparency. In defining these terms we also define how explicitness and faithfulness are used within the context of Human-Agent Systems. A summary of these definitions is found in Table 1. In defining these terms, we focus on the features and records that are used as training input in the system, the supervised targets that need to be identified, and the machine learning algorithm used by the agent. We define L as the machine learning algorithm that is created from a set of training records, R. Each record contains values for a tuple of ordered features, F. Each feature is defined as f_j ∈ F. Thus, the entire training set consists of R × F.
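To make this data model concrete, the sketch below represents the training records, ordered feature tuple, and supervised targets as plain Python structures. The variable names and the stand-in threshold rule acting as the learner are our own illustration, not anything defined in the paper:

```python
# Hypothetical illustration of the data model: a set of training records,
# each a tuple of values over an ordered feature tuple, with labeled targets.
features = ("age", "income", "owns_home")   # ordered feature tuple (F)
records = [                                 # training records (R)
    (25, 40_000, False),
    (47, 85_000, True),
    (33, 52_000, True),
]
targets = [0, 1, 1]                         # supervised targets (T)

# The learner must fit the records-by-features training set to the targets;
# here a trivial income-threshold rule stands in for it, purely to keep the
# sketch self-contained.
def learner(record):
    return 1 if record[1] > 50_000 else 0

predictions = [learner(r) for r in records]
assert predictions == targets  # the stand-in rule fits this tiny training set
```

The same record/feature structure carries over to the non-tabular inputs discussed next: for text each feature value would be a string, and for images a pixel value.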
While this model naturally lends itself to tabular data, it can just as easily be applied to other forms of input, such as texts, whereby the features are strings, or images, whereby the features are pixels. The objective of L is to properly fit R × F with regard to the labeled targets, T.

To help visualize the relationship between explainability, interpretability and transparency, please note Figure 1. Note that interpretability includes six methods: transparent models, plus the non-transparent possibilities of model and outcome tools, feature analysis, visualization methods, and prototype analysis. Feature analysis can serve as a basis for creating transparent models, on its own as a method of interpretability, or as an interpretable component within model, outcome and visualization tools. Similarly, visualization tools can help explain the entire model as a global solution, or serve as a localized interpretable element for specific outcomes of L. Prototype analysis uses R, and not F, as the basis for interpretability, and can be used for visualization and/or outcome analysis of L. Interpretability is a means for providing explainability, as per these terms' definitions in Table 1.

To date, many reasons have been suggested for making systems explainable [1, 5, 6, 8, 9, 13, 24]: to justify the agent's decisions so the human participant can decide whether to accept them (provide control), to explain the agent's choices and guarantee that safety concerns are met, to build trust in the agent's choices, especially if a mistake is suspected or the human operator does not have experience with the system, to explain the agent's choices so as to ensure that fair, ethical, and/or legal decisions are made, and to explain the agent's choices in order to better evaluate or debug the system in previously unconsidered situations.
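As a concrete illustration of one of the interpretability methods above, prototype analysis explains a prediction by pointing to the training record most similar to the queried instance, i.e. the explanation is drawn from the records rather than from an analysis of the features. The data, two-feature records, and squared-distance metric below are hypothetical, chosen only to keep the sketch self-contained:

```python
# Hedged sketch of prototype analysis: the explanation for an outcome is the
# nearest training record (the "prototype"), not a feature-level rationale.
records = [(25, 40.0), (47, 85.0), (33, 52.0)]   # training records (values scaled)
targets = ["deny", "approve", "approve"]

def nearest_prototype(query):
    """Return the training record closest to `query` and its label."""
    dist = lambda r: sum((a - b) ** 2 for a, b in zip(r, query))
    idx = min(range(len(records)), key=lambda i: dist(records[i]))
    return records[idx], targets[idx]

proto, label = nearest_prototype((30, 50.0))
# `proto` serves as the human-readable explanation for predicting `label`:
# "this case was treated like that previously seen case".
```

A real prototype method would select representative records more carefully (e.g. per-class medoids), but the explanatory contract is the same: the outcome is justified by exhibiting a similar past record.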