Conceptual Explanations of Neural Network Prediction for Time Series

Ferdinand Küsters, IAV GmbH, Gifhorn, Germany (ferdinand.kuesters@iav.de)
Peter Schichtel, IAV GmbH, Gifhorn, Germany (peter.schichtel@iav.de)
Sheraz Ahmed, DFKI GmbH, Kaiserslautern, Germany (sheraz.ahmed@dfki.de)
Andreas Dengel, DFKI GmbH, Kaiserslautern, Germany (andreas.dengel@dfki.de)

Abstract—Deep neural networks are black boxes by construction. Explanation and interpretation methods are therefore pivotal for trustworthy application. Existing methods are mostly based on heatmapping and focus on locally determining the relevant input parts that trigger the network prediction. However, these methods struggle to uncover global causes. While this is a rare case in the image or NLP modality, it is of high relevance in the time series domain. This paper presents a novel framework, Conceptual Explanation, designed to evaluate the effect of abstract (local or global) input features on the model behavior. The method is model-agnostic and allows incorporating expert knowledge. On three time series datasets, Conceptual Explanation demonstrates its ability to pinpoint the causes inherent in the data that trigger the correct model prediction.

Index Terms—Machine Learning, Deep Learning, Interpretability, Explainability, Time Series

I. INTRODUCTION

Deep Neural Networks (DNNs) have been applied successfully in various domains on tasks like regression, classification, and anomaly detection. Due to their ability to extract important features of the input data automatically, they can be easily adapted to new problems [1].

By construction, DNNs are black boxes. Understanding the reason for a specific network decision, or even the overall model behavior, is therefore difficult. This lack of transparency significantly hampers the applicability of DNNs in many sectors, e.g. health care, finance, and Industry 4.0.
It has already been pointed out in the literature that network explanations are required to fully exploit the potential of DNNs [2]. Explainability of DNNs is an active field of research, and a variety of interpretation methods have been proposed [3]. The methods differ strongly in the resulting explanations, which may refer to input parts [4], to relevant training samples [5], or to concepts relevant for the network decision [6].

Most interpretation methods try to assign relevance to individual input parts. There are various variants of such heatmapping methods, for example Integrated Gradients [7], Layerwise Relevance Propagation [8], SmoothGrad [9], and Guided Backpropagation [10]. Other methods, like LIME [11] or Meaningful Perturbation [12], also point out the relevant input parts. These heatmapping methods are especially popular in natural language processing (NLP) and the image domain, as pointing to a specific shape or object in the input image, or to certain words, makes the network decision more intelligible. However, the usefulness of heatmapping methods suffers greatly if the important input aspect cannot be localized but is spread over the whole signal. While this is rarely the case for images, and certainly not meaningful for language processing, it is often an inherent property of time series: trend, seasonality, and frequency ranges, to name a few, are inherently non-local.

Conceptual Explanation has been developed specifically for describing global input properties and is one of the few works directly addressing neural network interpretation in the time series domain. A concept is an abstract (local or global) input property that can be manipulated by a suitable filter. Conceptual Explanation evaluates the effect that preprocessing the network input with different filters has on the network performance. This makes the method model-agnostic and ensures easily intelligible results.
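The core idea, filtering a concept out of the input and measuring the effect on model performance, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the low-pass "concept filter", the accuracy metric, and the `model_predict` interface are all assumptions made here for concreteness.

```python
import numpy as np

def lowpass_filter(x, keep_frac=0.1):
    # Hypothetical concept filter: removes high-frequency content,
    # a global property that local heatmaps cannot point to.
    X = np.fft.rfft(x)
    cutoff = max(1, int(len(X) * keep_frac))
    X[cutoff:] = 0.0
    return np.fft.irfft(X, n=len(x))

def concept_effect(model_predict, inputs, labels, concept_filter):
    # Accuracy drop when the concept is removed from every input.
    # A large drop suggests the model relies on that concept;
    # a drop near zero suggests the concept is irrelevant to it.
    base_acc = np.mean(model_predict(inputs) == labels)
    filtered = np.stack([concept_filter(x) for x in inputs])
    filt_acc = np.mean(model_predict(filtered) == labels)
    return base_acc - filt_acc
```

Because `concept_effect` only queries the model through its predictions, the sketch is model-agnostic in the same sense as the paper's method: any classifier exposing a predict function can be probed with any family of concept filters.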
The main contribution of this work is the introduction and characterization of Conceptual Explanation (Sec. III) as well as its evaluation on different datasets (Sec. IV).

II. RELATED WORK

Conceptual Explanation is a mask-based interpretation approach. In contrast to [4], [11], [12], it does not mask input regions but input properties. While region-based masking usually adds unwanted side effects to the input, e.g. jumps and seasonality breaks, this problem does not occur for global filter-based masking.

Heatmapping methods [7], [8], [9], [10] are, as described above, suitable for finding relevant local, but not global, input properties. A drawback of these methods is that they are sample-based. The relevant information is not the position of the important pixels (which has no dataset-wide meaning), but the object parts these pixels refer to. Therefore, manual inspection of the highlighted areas and aggregation over many samples is necessary; an automatic extraction together with a statistical evaluation is not possible.

TSXplain [13] combines heatmapping methods for finding the relevant input segments with the computation of statistical time-series properties to provide the user with a more insightful interpretation of the relevant input. As it is still based

978-1-7281-6926-2/20/$31.00 ©2020 IEEE