International Journal of Computer Applications (0975 – 8887), Volume 69 – No. 22, May 2013

A Data Mining Approach for Developing Quality Prediction Model in Multi-Stage Manufacturing

Fahmi Arif, Nanna Suryana, Burairah Hussin
Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka
Hang Tuah Jaya, 76100 Durian Tunggal, Melaka, Malaysia

ABSTRACT
Quality prediction models have been developed in various industries to realize faultless manufacturing. However, most quality prediction models are developed for single-stage manufacturing. Previous studies show that a single-stage quality system cannot solve quality problems in multi-stage manufacturing effectively. This study proposes a combination of multiple PCA+ID3 algorithms to develop a quality prediction model for multi-stage manufacturing systems (MMS). The technique is applied to a semiconductor manufacturing dataset using the cascade prediction approach. The results show that the combination of multiple PCA+ID3 manages to produce a more accurate prediction model in terms of classifying both positive and negative classes.

General Terms
Data Mining, Prediction Model.

Keywords
Principal Component Analysis, ID3, Quality Prediction, Data Mining, Multi-stage Manufacturing.

1. INTRODUCTION
To realize on-line quality monitoring, the ability to predict finished product quality from the manufacturing operation conditions is required. This ability can be provided by a formulation or mathematical model that relates the manufacturing operation conditions to the product quality [1]. Such a model is called a quality prediction model. Using a quality prediction model, process engineers are able to monitor the product quality level by evaluating the manufacturing operation. Recently, various data mining techniques have been employed to develop quality prediction models from historical manufacturing datasets.
For example, clustering [2], [3], classification [4–16], association rules [17], [18], and regression have been applied in various industries. These techniques have been implemented in the injection molding industry, semiconductor manufacturing, slider manufacturing, machining processes, hard disk manufacturing, loudspeaker manufacturing, and the food processing industry. Most of these prediction models were developed for a Single-stage Manufacturing System (SMS).

Recently, the multi-stage manufacturing system (MMS) has become more common in real-world industrial settings [19]. MMS refers to a manufacturing system that involves more than one workstation to produce a complex product [20], [21]. As customers' tastes have become more sophisticated, product structures have grown more complex, and MMS has therefore become more popular. Various products, such as printed circuit boards (PCB), semiconductors, automotive products, and aerospace devices, need several stages to be produced because of their complex structures [22].

In trying to achieve faultless manufacturing, quality prediction models are also developed for MMS. Most quality prediction models in MMS are developed using the SMS approach. In MMS, the final product is produced through a series of manufacturing operations performed at several workstations. Therefore, using the SMS approach to measure quality in MMS can be misleading and ineffective, because the measurements at a workstation carry the cumulative effect of the manufacturing operations performed at preceding workstations [23]. Reference [24] proposed a framework called the Cascade Quality Prediction Method (CQPM) for developing quality prediction models. However, the accuracy of prediction models developed using CQPM has not been demonstrated, and the techniques that can be employed within this framework have not been investigated. This study aims to propose a data mining technique for developing a quality prediction model for MMS based on CQPM.

2. RELATED WORKS
In developing a quality prediction model in MMS, there are two alternative approaches. The first is to develop one prediction model for the whole manufacturing line. This approach, called the single-point approach, treats the manufacturing operations performed at every workstation as if they happened at a single workstation. Various data mining techniques such as classification [12], [14], clustering [3], and association rules [17], [18] have been employed to develop quality prediction models using this approach. The other approach is to develop one prediction model for every workstation. This is called the multi-point approach. With this approach, there are several prediction models for the whole manufacturing line. Clustering [2], Principal Component Analysis (PCA) [25], and Partial Least Squares (PLS) [26], [27] are some of the techniques employed to develop prediction models using this approach.

The single-point approach, as applied using multivariate statistics or data mining techniques, assumes that each manufacturing workstation has an independent effect on the product quality level. Moreover, this model has difficulty revealing the correlation between manufacturing operations from workstation to workstation [2], [25], [26]. From the point of view of partial and total quality as explained by [15], this approach can only explain the partial quality at the last workstation.

The multi-point approach is able to model the behavior of a particular workstation. In other words, this approach produces a model that is able to explain the relationship among the manufacturing operation variables within a workstation. However, this approach can be misleading and ineffective, considering that the measurements at a workstation are probably confounded by the cumulative effect from the previous workstations [23].
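
The difference between the two approaches can be made concrete by looking at how the input variables would be arranged before any model is trained. The following is a minimal sketch, not the authors' implementation: the record layout, the workstation names, and all function names are hypothetical. The third function is only an assumed illustration of how a cumulative effect from preceding workstations might be carried forward; it is not claimed to be the CQPM procedure of [24].

```python
from typing import Dict, List

# Hypothetical record layout: operation variables grouped by workstation.
Record = Dict[str, List[float]]


def single_point_features(record: Record, order: List[str]) -> List[float]:
    """Single-point: flatten every workstation into one vector,
    as if all operations happened at a single workstation."""
    return [value for ws in order for value in record[ws]]


def multi_point_features(record: Record, order: List[str]) -> Dict[str, List[float]]:
    """Multi-point: one separate input vector (and hence one model)
    per workstation, with no information from other workstations."""
    return {ws: list(record[ws]) for ws in order}


def cumulative_features(record: Record, order: List[str]) -> Dict[str, List[float]]:
    """Assumed cascade-style variant (illustration only): each workstation's
    vector also carries the variables of all preceding workstations."""
    features: Dict[str, List[float]] = {}
    carried: List[float] = []
    for ws in order:
        carried = carried + list(record[ws])  # new list, earlier entries untouched
        features[ws] = carried
    return features


if __name__ == "__main__":
    # Made-up values for three hypothetical workstations.
    record = {"ws1": [1.0, 2.0], "ws2": [3.0], "ws3": [4.0, 5.0]}
    order = ["ws1", "ws2", "ws3"]
    print(single_point_features(record, order))
    print(multi_point_features(record, order))
    print(cumulative_features(record, order))
```

In the single-point arrangement the workstation boundaries disappear, while in the multi-point arrangement each model sees only its own workstation. That is why neither arrangement, on its own, exposes the influence of earlier workstations on later ones.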