Volume 2, No. 08, October 2013
ISSN – 2278-1080
The International Journal of Computer Science & Applications (TIJCSA)
RESEARCH PAPER
Available Online at http://www.journalofcomputerscience.com/
© 2013, http://www.journalofcomputerscience.com - TIJCSA All Rights Reserved

A Hybrid Approach Using I 2 and MC4.5 Algorithms for Mining Multidimensional Data Sets (HIIMC4.5)

S.Santhosh kumar
Research Scholar, PRIST University, Thanjavur
Lecturer, Department of Computer Science, Government College for Women (A), Kumbakonam, Tamil Nadu, India
Santhoshsundar@yahoo.com

Dr.E.Ramaraj
Director, Computer Center, Alagappa University, Karaikudi, India
eramaraj@rediffmail.com

Abstract

This paper presents an approach that combines clustering and classification, a form of semi-supervised learning. In this work we develop a hybrid model by combining two algorithms from our earlier contributions. For large databases, a primary categorisation is needed before searching so that the data can be mined efficiently. The proposed hybrid model is compared with the existing hybrid model, which is also our earlier work, to identify the closest data patterns in large databases. The new hybrid technique addresses the limitations of the existing hybrid model. Implementation on different data sets yields accurate classification predictions with a lower error rate.

Key Terms: C4.5 Classifier, k-means, MC4.5, I 2 Clustering.

1. Introduction

Semi-supervised learning (SSL) [1] is a type of machine learning technique that handles labelled and unlabelled data simultaneously. It is an emerging field of data mining that has become popular since 2005. The primary advantage of SSL is its cost effectiveness. Processing labelled data requires knowledge, skill and technique, and is therefore expensive. SSL allows information to be extracted from unlabelled data with the help of a small amount of labelled data.
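The idea of labelling many unlabelled records from a few labelled ones can be illustrated with a minimal cluster-then-label sketch. This is not the hybrid model of this paper; it is a generic illustration using plain k-means and a majority vote. The function names, the toy points, the labels "low" and "high", and the choice of initial centroids are all assumptions made for the example.

```python
from collections import Counter
from math import dist


def kmeans(points, initial_centroids, iterations=10):
    """Plain Lloyd's k-means; returns the cluster index of every point."""
    centroids = list(initial_centroids)  # copy so the caller's list is untouched
    assignment = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid (Euclidean distance).
        assignment = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members; keep it if empty.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment


def cluster_then_label(points, known_labels, initial_centroids):
    """known_labels maps a point index to its label (the few labelled records)."""
    assignment = kmeans(points, initial_centroids)
    # Count the known labels that fall inside each cluster.
    votes = {}
    for idx, label in known_labels.items():
        votes.setdefault(assignment[idx], Counter())[label] += 1
    # Each cluster takes its majority label; clusters with no labelled member get None.
    cluster_label = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    return [cluster_label.get(a) for a in assignment]


# Hypothetical toy data: six records, only two of which carry a label.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
known_labels = {0: "low", 3: "high"}
predicted = cluster_then_label(points, known_labels, [points[0], points[3]])
print(predicted)
```

In this sketch the two labelled records are enough to name both clusters, so the four unlabelled records inherit a label without manual annotation. Seeding the centroids from the labelled points is only a convenience for the toy data.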
Large databases, such as those in banking and medicine, contain huge amounts of data. Extracting the particular (labelled) data is more expensive and more complex, whereas acquiring unlabelled data is relatively inexpensive. In such situations, semi-supervised learning can be of great practical value. Many SSL models have been developed for different data types. The generative model is a familiar one that combines classification and clustering techniques based on the joint distribution of the data. Based on this combinational approach, we have taken our proposed hybrid model, which is a combination of the k-means algorithm and C4.5