Journal of Engineering Science and Technology
Vol. 12, No. 6 (2017) 1446 - 1459
© School of Engineering, Taylor’s University
AN ADABOOST OPTIMIZED CCFIS BASED
CLASSIFICATION MODEL FOR BREAST CANCER DETECTION
CHANDRASEKAR RAVI, NEELU KHARE*
School of Information Technology and Engineering (SITE), VIT University, Vellore
Campus, Vellore-632014, Tamil Nadu, India
*Corresponding Author: neelu.khare@vit.ac.in
Abstract
Classification is a data mining technique that builds a model of the data
behaviour, with which unseen data can be classified into one of the
defined classes. Several researchers have proposed classification techniques,
but most of them placed little emphasis on misclassified instances and
storage space. In this paper, a classification model is proposed that takes into
account the misclassified instances and storage space. The classification
model is developed efficiently using a tree structure, which reduces the storage
complexity, and it requires only a single scan of the dataset. During the training phase,
Class-based Closed Frequent ItemSets (CCFIS) are mined from the training
dataset in the form of a tree structure. The classification model has been
developed using the CCFIS and a similarity measure based on Longest
Common Subsequence (LCS). Further, the Particle Swarm Optimization
algorithm is applied on the generated CCFIS, which assigns weights to the
itemsets and their associated classes. Most classifiers correctly classify
the common instances but misclassify the rare instances. In view of that,
the AdaBoost algorithm is used to boost the weights of the instances
misclassified in the previous round, so that they are included in the
training phase and the rare instances can be classified. This improves the
accuracy of the classification model. During the testing phase, the classification model is
used to classify the instances of the test dataset. The Breast Cancer dataset
from the UCI repository is used for the experiment. Experimental analysis shows
that the accuracy of the proposed classification model exceeds that of the
PSO-AdaBoost-Sequence classifier by 7% and is superior to other approaches
such as the Naïve Bayes Classifier, Support Vector Machine Classifier, Instance
Based Classifier, ID3 Classifier and J48 Classifier.
Keywords: Association mining, Class-based closed frequent itemset, Tree structure,
Ensemble classifier, Optimization.
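
The two building blocks named in the abstract, an LCS-based similarity measure and the AdaBoost re-weighting of misclassified instances, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names are hypothetical, the normalisation of the LCS length by the longer sequence is an assumption (the abstract does not state the exact similarity formula), and the weight update is the standard discrete AdaBoost rule, which may differ in detail from the variant used in the paper.

```python
import math


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(instance, itemset):
    """Similarity in [0, 1] between a test instance and a mined itemset.

    Assumption: the LCS length is normalised by the longer sequence.
    """
    longest = max(len(instance), len(itemset))
    return lcs_length(instance, itemset) / longest if longest else 0.0


def adaboost_reweight(weights, misclassified):
    """One round of the standard discrete AdaBoost weight update.

    `weights` is assumed to sum to 1; `misclassified` holds one boolean
    per instance. Returns the normalised new weights and the round's alpha.
    """
    err = sum(w for w, m in zip(weights, misclassified) if m)
    if err <= 0.0 or err >= 0.5:
        # Perfect or no-better-than-chance round: leave weights unchanged.
        return list(weights), 0.0
    alpha = 0.5 * math.log((1.0 - err) / err)
    updated = [w * math.exp(alpha if m else -alpha)
               for w, m in zip(weights, misclassified)]
    total = sum(updated)
    return [w / total for w in updated], alpha


if __name__ == "__main__":
    print(lcs_similarity(["a", "b", "c"], ["a", "c"]))
    new_weights, alpha = adaboost_reweight([0.25] * 4,
                                           [True, False, False, False])
    print(new_weights, alpha)
```

In the usage example, the single misclassified instance ends the round with a larger weight than the correctly classified ones, which is how rare instances gain influence in the next training round.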