IEEE TRANSACTIONS ON ELECTRONICS PACKAGING MANUFACTURING, VOL. 23, NO. 4, OCTOBER 2000

Decomposition in Data Mining: An Industrial Case Study

Andrew Kusiak, Member, IEEE

Abstract—Data mining offers tools for the discovery of relationships, patterns, and knowledge in large databases. The knowledge extraction process is computationally complex, and therefore a subset of all data is normally considered for mining. In this paper, numerous methods for the decomposition of data sets are discussed. Decomposition enhances the quality of knowledge extracted from large databases by simplifying the data mining task. The ideas presented are illustrated with examples and an industrial case study. In the case study reported in this paper, a data mining approach is applied to extract knowledge from a data set. The extracted knowledge is used for the prediction and prevention of manufacturing faults in wafers.

Index Terms—Data mining, decision making, decomposition, integrated circuit, quality engineering.

I. INTRODUCTION

THE VOLUME of data is growing at an unprecedented rate, both in the number of features (attributes) and objects (instances). For example, many databases with genetic information may contain thousands of features for a large number of patients. In technology applications, quantitative (e.g., from sensors) and qualitative (e.g., from the manufacturing environment) data from diverse sources may be linked, thus significantly increasing the number of features. For example, upstream chemistry information has proved useful in analyzing the quality of wafers in the semiconductor industry. Such information, combined with the manufacturing process data, results in a large number of features and objects (cases) for which information is collected. Data mining offers tools for the discovery of patterns, associations, changes, anomalies, rules, and statistically significant structures and events in data.
The patterns and hypotheses are automatically extracted from data rather than being formulated by a user, as is done in traditional modeling approaches, e.g., statistical or mathematical programming models. As a new discipline, data mining draws from other areas such as statistics, machine learning, databases, and high-performance computing.

In many applications, data is generated automatically, and therefore the number of objects available for mining can be large. The time needed to extract knowledge from such large data sets is an issue, as it may easily run from seconds to days and beyond. One way to reduce the computational complexity of knowledge discovery with data mining algorithms, and of decision making based on the acquired knowledge, is to reduce the volume of data processed at a time, which can be accomplished by decomposition. In this paper, numerous decomposition approaches are defined and applied for effective knowledge discovery and decision making. Besides easing computation, decomposition offers an added benefit: it facilitates dynamic knowledge extraction that can be coupled with decision making, thus resulting in real-time autonomous systems able to continuously improve their predictive accuracy.

________________
Manuscript received February 7, 2000; revised October 2, 2000. The author is with the Intelligent Systems Laboratory, The University of Iowa, Iowa City, IA 52242-1527 USA (e-mail: andrew-kusiak@uiowa.edu). Publisher Item Identifier S 1521-334X(00)11075-4.
________________

The research reported in this paper is based on developments in machine learning and data mining discussed, for example, in [1]–[3]. Some of the best-known learning algorithms are listed next.

ID3: The Induction Decision Tree, a supervised learning algorithm developed by Quinlan [4].

AQ15: An inductive learning system that generates decision rules whose conditional part is a logical formula [5]. Domain knowledge is used to generate new attributes that are not present in the input data.
Naïve Bayes: A simple induction algorithm that computes the conditional probabilities of the classes and, given an instance, selects the class with the highest posterior probability [6].

OODG: An oblivious read-once decision graph induction algorithm that builds oblivious decision graphs using a bottom-up approach [7].

Lazy decision trees: An algorithm, developed by Friedman et al. [8], that builds the best decision tree for every test instance.

C4.5: The decision-tree induction algorithm by Quinlan [9].

CN2: The direct rule induction algorithm by Clark and Boswell [10]. This algorithm combines the best features of ID3 [4] and AQ [5]: it uses pruning techniques similar to those of ID3 and conditional rules related to those of AQ.

IB: The instance-based learning algorithms by Aha [11].

OC1: The oblique decision-tree algorithm by Murthy and Salzberg [12].
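To make the Naïve Bayes entry above concrete, the following is a minimal sketch of the idea it describes — estimating class priors and per-feature conditional probabilities from training data, then selecting the class with the highest posterior for a new instance. The function names, the toy wafer data, and the add-one smoothing are illustrative choices, not the formulation of [6].

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Count class frequencies and per-(feature, class) value frequencies."""
    priors = Counter(labels)
    cond = defaultdict(Counter)  # (feature_index, class) -> Counter of values
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            cond[(i, y)][v] += 1
    return priors, cond

def predict_nb(priors, cond, row):
    """Return the class with the highest log posterior; add-one smoothing
    keeps unseen feature values from zeroing out a class."""
    total = sum(priors.values())
    best_class, best_score = None, float("-inf")
    for y, n in priors.items():
        score = math.log(n / total)  # log prior P(class)
        for i, v in enumerate(row):
            counts = cond[(i, y)]
            score += math.log((counts[v] + 1) / (n + len(counts) + 1))
        if score > best_score:
            best_class, best_score = y, score
    return best_class

# Toy wafer data: two qualitative features per wafer, pass/fail label.
rows = [("high", "A"), ("high", "B"), ("low", "A"), ("low", "B")]
labels = ["fail", "fail", "pass", "pass"]
priors, cond = train_nb(rows, labels)
print(predict_nb(priors, cond, ("high", "A")))  # fail
```

Despite its independence assumption across features, this simple posterior comparison is a common baseline in the data mining literature the paper draws on.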
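The decomposition idea motivated in the introduction — reducing the volume of data processed at a time by splitting it into smaller subsets — can likewise be sketched in two basic forms: splitting by objects (rows) and splitting by features (columns). The round-robin split and the function names below are illustrative assumptions, not the specific decomposition methods this paper goes on to define.

```python
def object_decompose(rows, n_blocks):
    """Object (row) decomposition: deal rows round-robin into
    n_blocks subsets, each small enough to mine independently."""
    blocks = [[] for _ in range(n_blocks)]
    for i, row in enumerate(rows):
        blocks[i % n_blocks].append(row)
    return blocks

def feature_decompose(rows, feature_groups):
    """Feature (column) decomposition: project the data set onto
    disjoint groups of feature indices."""
    return [[[row[j] for j in group] for row in rows]
            for group in feature_groups]

data = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(object_decompose(data, 2))               # two subsets of two rows each
print(feature_decompose(data, [[0], [1, 2]]))  # two column views of the data
```

Each subset or view can then be fed to a learning algorithm such as those listed above, and the extracted rules merged afterward.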