A Visual Data Mining Framework for Convenient Identification of Useful Knowledge 1, 2

1 Parts of the work are under patent applications. For most recent advances please contact the authors.
2 We would like to thank Tom Babin, Paul DeClerck, Dan DeClerck, Jeffrey Benkler and Michael Kramer for many useful discussions and suggestions. Thank you also to Mike Kotzin of Motorola’s Mobile Devices Business for supporting this project through the Illinois Manufacturing Research Center.

Kaidi Zhao, Bing Liu
Department of Computer Science
University of Illinois at Chicago
851 S. Morgan Street
Chicago, IL 60607. USA
{kzhao, liub}@cs.uic.edu

Thomas M. Tirpak, Weimin Xiao
Motorola Labs
1301 E. Algonquin Rd.
Schaumburg, IL 60196. USA
{T.Tirpak, awx003}@motorola.com

Abstract

Data mining algorithms usually generate a large number of rules, which may not always be useful to human users. In this project, we propose a novel visual data-mining framework, called Opportunity Map, to identify useful and actionable knowledge quickly and easily from the discovered rules. The framework is inspired by the House of Quality from Quality Function Deployment (QFD) in Quality Engineering. It associates discovered rules, related summarized data and data distributions with the application objective using an interactive matrix. Combined with drill down visualization, integrated visualization of data distribution bars and rules, visualization of trend behaviors, and comparative analysis, the Opportunity Map allows users to analyze rules and data at different levels of detail and quickly identify the actionable knowledge and opportunities. The proposed framework represents a systematic and flexible approach to rule analysis. Applications of the system to large-scale data sets from our industrial partner have yielded promising results.

1. Introduction

Data mining algorithms usually generate a large number of patterns or rules [2] [20] that are hard to comprehend.
Most of the discovered rules are not actually useful. A number of techniques have been proposed to help the user find interesting rules [1] [12] [18] [19] [21], using either objective measures or subjective measures such as unexpectedness and actionability [1] [15] [18].

In our work, we propose a visual data mining framework called Opportunity Map. It integrates a set of visual data mining techniques to quickly identify interesting and actionable knowledge. The visualization layout is inspired by the House of Quality (HOQ) in Quality Function Deployment [6] [23] from Management Sciences, specifically the Interrelationships Matrix of the HOQ. In the Opportunity Map, Customer Requirements in the HOQ are mapped to application requirements, expressed as classes in data mining. Technical Requirements in the HOQ are mapped to attributes and values. In this way, the framework can make use of well-established methodologies and business practices in product design and manufacturing, such as quickly identifying important activities and prioritizing them.

An initial prototype of the proposed Opportunity Map system is reported in [28]. In this paper, we enhance the previous methods and extend the framework with a number of novel visual mining methods that significantly improve the usability of the system. We introduce them briefly below.

In the proposed Opportunity Map system, we develop a method that integrates the visualization of data mining rules with distribution map visualization. Distribution maps (Distribution Bars/Correlation Charts) [5] [11] are used in traditional statistical data analysis. This technique plots the data distribution of