I.J. Intelligent Systems and Applications, 2014, 06, 37-45
Published Online May 2014 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijisa.2014.06.04
Copyright © 2014 MECS

Hierarchical Clustering Algorithm based on Attribute Dependency for Attention Deficit Hyperactive Disorder

J Anuradha, B K Tripathy
School of Computing Science and Engineering, VIT University, Vellore, Tamilnadu, India
Email: januradha@vit.ac.in, tripathybk@vit.ac.in

Abstract: Attention Deficit Hyperactive Disorder (ADHD) is a disruptive neurobehavioral disorder characterized by abnormal behavioral patterns of inattention, hyperactivity, impulsivity, and their combination. It is predominant among school-going children, and it is difficult to differentiate between an active child and a child with ADHD. Misdiagnosed and undiagnosed cases are very common. Behavior patterns are identified by mentors in the academic environment, who lack the skills to screen these children. Hence, an unsupervised learning algorithm can be used to cluster the behavioral patterns of children at school for the diagnosis of ADHD. In this paper, we propose a hierarchical clustering algorithm that partitions the dataset based on attribute dependency (HCAD). HCAD forms clusters of data based on the highly dependent attributes and their equivalence relation. It is capable of handling large volumes of data and clusters reasonably faster than most of the existing algorithms. It can work on both labeled and unlabeled data sets. Experimental results reveal that this algorithm has higher accuracy than other algorithms. HCAD achieves 97% cluster purity in diagnosing ADHD. An empirical analysis of the application of HCAD to different data sets from the UCI repository is provided.

Index Terms: Hierarchical Clustering, Attribute Dependency, ADHD, Cluster Purity.

I. INTRODUCTION

Attention Deficit Hyperactive Disorder, also known as hyperkinetic disorder, is caused by an unknown factor that disrupts the processing of the brain. It makes it difficult for an individual to sustain attention until a task is completed, to wait for their turn, and to avoid acting recklessly, which in turn affects academic skills such as reading, writing and communicating, and lowers their confidence. It is most noticeable among school-going children between 6 and 13 years of age. Children with these symptoms are often regarded as unworthy; they tend to have low learning motivation and poor self-esteem, and they face social rejection, including rejection by their peers [1] [2]. The literature reports that around 10% of school-going children suffer from ADHD [3]. Diagnostic reports reveal that the percentage of boys reported to have ADHD is comparatively higher than that of girls. The symptoms are very difficult to differentiate from those of other disorders, which increases the risk of cases being misdiagnosed or remaining undiagnosed. Correct and early diagnosis is vital for overcoming the academic challenges these children face, whereas a lack of diagnosis may worsen their behavior. Proper diagnosis also helps teachers and parents handle these children in a different way.

The DSM-IV [4][5] diagnostic criteria are used for diagnosing ADHD. Figure 1.1 (see Appendix - I) shows the criteria and their subtypes for identifying ADHD. The subtypes are inattentive, hyperactive, impulsive, and combined. The behavioral disorders are studied through 13 characteristics (attributes): careless mistakes, sustaining attention, listening, following instructions, organizing tasks, loss of attention, distraction, forgetting activities, fidgeting, engaging in activities, talking excessively, blurting out answers, and interrupting. Each of these characteristics takes a yes/no value.
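To make the yes/no attribute representation concrete, the following minimal sketch encodes each child's record as a 13-element binary vector and groups records into equivalence classes over a chosen subset of attributes. The attribute identifiers, the sample records, and the helper names `encode` and `equivalence_classes` are hypothetical and chosen only for illustration; this is not the HCAD algorithm itself, only the kind of grouping that an equivalence relation on attributes induces.

```python
from collections import defaultdict

# Hypothetical identifiers for the 13 behavioral attributes listed above.
ATTRIBUTES = [
    "careless_mistakes", "sustaining_attention", "listening",
    "following_instruction", "organizing_tasks", "loss_of_attention",
    "distraction", "forgetting_activities", "fidgets",
    "engaging_activities", "talking_excessively", "blurting_answers",
    "interrupts",
]

def encode(answers):
    """Map a dict of yes/no answers to a tuple of 0/1 values.

    Attributes missing from `answers` are treated as "no" (an assumption
    made for this sketch).
    """
    return tuple(1 if answers.get(name, "no") == "yes" else 0
                 for name in ATTRIBUTES)

def equivalence_classes(records, subset):
    """Group record indices that agree on every attribute in `subset`."""
    positions = [ATTRIBUTES.index(name) for name in subset]
    classes = defaultdict(list)
    for idx, record in enumerate(records):
        key = tuple(record[p] for p in positions)
        classes[key].append(idx)
    return dict(classes)

# Made-up sample records, not data from the study.
records = [
    encode({"careless_mistakes": "yes", "fidgets": "yes"}),
    encode({"careless_mistakes": "yes"}),
    encode({"fidgets": "yes", "interrupts": "yes"}),
    encode({"careless_mistakes": "yes", "listening": "yes"}),
]

# Records 1 and 3 agree on both chosen attributes, so they share a class.
classes = equivalence_classes(records, ["careless_mistakes", "fidgets"])
```

Two records fall into the same class exactly when they give identical answers on the selected attributes, so choosing a different subset yields a coarser or finer partition of the same data.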
Clustering, a major focus of research in unsupervised learning, is applied in various fields including, but not limited to, data mining, pattern recognition, statistics, machine learning, image processing, medical diagnosis and digital signal processing. The proposed hierarchical clustering algorithm (HCAD) is applied to medical diagnosis, to identify students with ADHD in an academic environment. Classification and clustering techniques play vital roles in identifying hidden patterns in a given data set. Clustering is a technique for grouping data with similar properties. The main families of clustering techniques are partitioning approaches, hierarchical methods, density-based methods and grid-based methods.

Partitioning approaches, such as centroid-based clustering and representative-object-based techniques, fix a centroid for each of the desired number of clusters and assign each data object to a cluster based on its distance from the centroid. In subsequent iterations, the best centroid is selected within each cluster and cluster membership is refined based on intra-cluster similarity and inter-cluster dissimilarity.

Hierarchical methods like BIRCH, Chameleon and probabilistic hierarchical clustering use agglomerative and divisive methods [6]. The agglomerative method follows a bottom-up approach: the data objects are initially placed in separate groups, which are merged with other groups in each iteration until the desired number of clusters is formed. The divisive method follows a top-down approach: the entire data set is initially treated as one cluster, which is partitioned in each iteration until the desired number of clusters is formed.
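The bottom-up (agglomerative) approach described above can be sketched in a few lines. This is a generic single-linkage illustration over binary vectors using Hamming distance, not HCAD and not BIRCH or Chameleon; the distance measure, the linkage rule, the function names and the sample data are all assumptions made for the example.

```python
def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def single_linkage(c1, c2, data):
    """Distance between two clusters: the closest pair of members."""
    return min(hamming(data[i], data[j]) for i in c1 for j in c2)

def agglomerative(data, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Bottom-up start: every object is its own cluster.
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = single_linkage(clusters[a], clusters[b], data)
                # Strict comparison: ties keep the first pair found.
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Made-up binary records: objects 0 and 1 are similar, as are 2 and 3.
data = [(1, 1, 0, 0), (1, 1, 1, 0), (0, 0, 1, 1), (0, 0, 0, 1)]
result = agglomerative(data, 2)
```

A divisive method would run in the opposite direction, starting from a single cluster that holds every index and splitting it at each iteration. The exhaustive pairwise search used here is quadratic in the number of clusters per merge, so it is only suitable for small illustrative inputs.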