Parallel Coordinates for Discovery of Interpretable Machine Learning Models

Dustin Hayes, Boris Kovalerchuk
Dept. of Computer Science, Central Washington University, USA
Dustin.Hayes@cwu.edu, BorisK@cwu.edu

Abstract—This work uses visual knowledge discovery in parallel coordinates to advance methods of interpretable machine learning. The graphic representation of data in parallel coordinates makes the concepts of hypercubes and hyperblocks (HBs) simple for end users to understand. The proposed classification algorithm, Hyper, uses mixed and pure hyperblocks. It is shown that Hyper models generalize decision trees. The algorithm is presented in several settings, with options to discover overlapping or non-overlapping hyperblocks either interactively or automatically. Additionally, the use of hyperblocks in conjunction with language descriptions of visual patterns is demonstrated. Benchmark data from the UCI ML repository were used to evaluate the Hyper algorithm, which discovered mixed and pure HBs that were evaluated using 10-fold cross-validation. Connections among hyperblocks, dimension reduction, and visualization have been established. The capability of end users to find and observe hyperblocks, as well as the ability of side-by-side visualizations to make patterns evident, are among the major advantages of hyperblock technology and the Hyper algorithm. A new method is proposed to visualize incomplete n-D data with missing values, which traditional parallel coordinates do not support. The ability of HBs to prevent both overgeneralization and overfitting better than decision trees is demonstrated as another benefit of hyperblocks. The features of the VisCanvas 2.0 software tool, which implements the Hyper technology, are presented.

Keywords—Interpretable machine learning, parallel coordinates, hypercube, hyperblock, decision tree, missing data.

1. INTRODUCTION

Acceptance, interpretability, and comprehensibility of different classifiers are crucial for future Machine Learning (ML) advancements. For many machine learning models this is a significant challenge, because their black-box nature makes them incomprehensible. Users are reluctant to deploy such models for high-risk, high-stakes decisions. A promising solution to this problem is visual knowledge discovery [5-7, 15, 18, 22]. Below we outline a visual knowledge discovery approach based on parallel coordinates that includes supervised learning, data and model visualization, dimensionality reduction, and model simplification. Supervised classification models are the main topic of this work. Often, developing a reliable, interpretable, explainable, and comprehensible ML model necessitates placing the end user in control of the creation of the model. The end users are frequently