IEEE TRANSACTIONS ON ELECTRONICS PACKAGING MANUFACTURING, VOL. 23, NO. 4, OCTOBER 2000 345
Decomposition in Data Mining: An Industrial Case
Study
Andrew Kusiak, Member, IEEE
Abstract—Data mining offers tools for discovery of relation-
ships, patterns, and knowledge in large databases. The knowledge
extraction process is computationally complex and therefore a
subset of all data is normally considered for mining. In this paper,
numerous methods for decomposition of data sets are discussed.
Decomposition enhances the quality of knowledge extracted from
large databases by simplification of the data mining task. The
ideas presented are illustrated with examples and an industrial
case study. In the case study reported in this paper, a data mining
approach is applied to extract knowledge from a data set. The
extracted knowledge is used for the prediction and prevention of
manufacturing faults in wafers.
Index Terms—Data mining, decision making, decomposition, in-
tegrated circuit, quality engineering.
I. INTRODUCTION
THE VOLUME of data is growing at an unprecedented
rate, both in the number of features (attributes) and
objects (instances). For example, many databases with genetic
information may contain thousands of features for a large number
of patients. In technology applications, quantitative (e.g., from
sensors) and qualitative (e.g., from the manufacturing environment)
data from diverse sources may be linked, thus significantly
increasing the number of features. For example, to analyze
the quality of wafers in the semiconductor industry, upstream
chemistry information has proved useful. Combining such information
with the manufacturing process data results in a large number
of features for the objects (cases) about which the information
is collected.
Data mining offers tools for discovery of patterns, associations,
changes, anomalies, rules, and statistically significant structures
and events in data. The patterns and hypotheses are automatically
extracted from the data rather than being formulated by a user, as
is done in traditional modeling approaches, e.g., statistical
or mathematical programming modeling. As a new discipline,
data mining draws from other areas such as statistics, machine
learning, databases, and high performance computing.
In many applications, data is automatically generated and
therefore the number of objects available for mining can be
large. The time needed to extract knowledge from such large
data sets is an issue, as it may easily run from seconds to days
and beyond. One way to reduce the computational complexity of
knowledge discovery with data mining algorithms, and of decision
making based on the acquired knowledge, is to reduce the volume
of data to be processed at a time, which can be accomplished by
decomposition. In this paper, numerous decomposition approaches
are defined and applied for effective knowledge discovery and
decision making. Besides easing computation, decomposition offers
an added benefit: it facilitates dynamic knowledge extraction that
can be coupled with decision making, thus resulting in real-time
autonomous systems able to continuously improve their predictive
accuracy.

Manuscript received February 7, 2000; revised October 2, 2000.
The author is with the Intelligent Systems Laboratory, The University of Iowa, Iowa City, IA 52242-1527 USA (e-mail: andrew-kusiak@uiowa.edu).
Publisher Item Identifier S 1521-334X(00)11075-4.
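As a hypothetical illustration (not the specific decomposition approaches defined later in the paper), a data set can be decomposed horizontally into subsets of objects or vertically into subsets of features before mining. The sketch below assumes a data set represented simply as a list of rows:

```python
# Illustrative sketch of two basic decomposition schemes for a data set
# represented as a list of objects (rows), each a list of feature values.
# Function names and the toy data are hypothetical, for exposition only.

def object_decomposition(data, block_size):
    """Split the data set horizontally into blocks of objects."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def feature_decomposition(data, feature_groups):
    """Split the data set vertically into subsets of features.

    feature_groups is a list of column-index lists, e.g. [[0, 2], [1]].
    """
    return [[[row[j] for j in group] for row in data]
            for group in feature_groups]

# A toy data set: four objects, three features.
data = [[1, 'a', 0.5],
        [2, 'b', 0.7],
        [3, 'c', 0.9],
        [4, 'd', 1.1]]

blocks = object_decomposition(data, 2)               # two blocks of two objects
views = feature_decomposition(data, [[0, 2], [1]])   # two feature views
```

Each block or view can then be mined independently, so that no single run of a learning algorithm has to process the full data set at once.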
The research reported in this paper is based on developments
in machine learning and data mining discussed, for example,
in [1]–[3]. Some of the best-known learning algorithms are
listed next.
ID3: Induction Decision Tree is a supervised learning
algorithm developed by Quinlan [4].
AQ15: An inductive learning system that generates decision
rules, where the conditional part is a logical for-
mula [5]. Domain knowledge is used to generate
new attributes that are not present in the input
data.
Naïve-Bayes: A simple induction algorithm that computes con-
ditional probabilities of the classes. Given the in-
stance, it selects the class with the highest poste-
rior probability [6].
OODG: The Oblivious read-Once Decision Graph induction
algorithm, which builds oblivious decision graphs
using a bottom-up approach [7].
Lazy decision trees: An algorithm for building the best decision tree
for every test instance, developed by Friedman et
al. [8].
C4.5: The decision-tree induction algorithm by
Quinlan [9].
CN2: The direct rule induction algorithm by Clark and
Boswell [10]. This algorithm combines the best
features of ID3 [4] and AQ [5]: it uses pruning
techniques similar to those used in ID3 and rule
conditions related to those used in AQ.
IB: The instance-based learning algorithms by Aha
[11].
OC1: The Oblique decision-tree algorithm by Murthy
and Salzberg [12].
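To make the Naïve-Bayes entry above concrete, the following is a minimal sketch (not the implementation cited in [6]) of selecting the class with the highest posterior probability under the conditional-independence assumption, for categorical features; the function name and Laplace smoothing are choices made here for illustration:

```python
import math
from collections import Counter

def naive_bayes_predict(train, labels, x):
    """Return the class with the highest posterior probability for instance x.

    train  : list of training instances (lists of categorical feature values)
    labels : class label of each training instance
    x      : the instance to classify
    """
    class_counts = Counter(labels)
    n = len(labels)
    best_class, best_score = None, float('-inf')
    for c, count in class_counts.items():
        score = math.log(count / n)  # log prior P(c)
        for j, value in enumerate(x):
            matches = sum(1 for xi, yi in zip(train, labels)
                          if yi == c and xi[j] == value)
            distinct = len({xi[j] for xi in train})
            # log likelihood P(x_j = value | c), with Laplace smoothing
            # to avoid zero probabilities
            score += math.log((matches + 1) / (count + distinct))
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```

Working in log space avoids numerical underflow when many conditional probabilities are multiplied, which is why the posteriors are accumulated as sums of logarithms rather than products.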
1521–334X/00$10.00 © 2000 IEEE