I.J. Intelligent Systems and Applications, 2014, 06, 37-45
Published Online May 2014 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijisa.2014.06.04
Copyright © 2014 MECS
Hierarchical Clustering Algorithm based on
Attribute Dependency for Attention Deficit
Hyperactive Disorder
J Anuradha, B K Tripathy
School of Computing Science and Engineering, VIT University, Vellore, Tamilnadu, India
Email: januradha@vit.ac.in, tripathybk@vit.ac.in
Abstract— Attention Deficit Hyperactive Disorder (ADHD) is a
disruptive neurobehavioral disorder characterized by abnormal
behavioral patterns of inattention, hyperactivity, impulsivity,
or a combination of these. It is predominant among
school-going children, and it is difficult to differentiate an
active child from a child with ADHD. Misdiagnosed and undiagnosed
cases are very common. Behavior patterns are usually observed by
mentors in the academic environment, who lack the skills to
screen such children. An unsupervised learning algorithm
can therefore be used to cluster the behavioral patterns of
children at school to support the diagnosis of ADHD. In this
paper, we propose a hierarchical clustering algorithm that
partitions the dataset based on attribute dependency (HCAD).
HCAD forms clusters of data based on the highly dependent
attributes and their equivalence relation. It is capable of
handling large volumes of data and clusters reasonably faster
than most of the existing algorithms. It can work on both
labeled and unlabeled data sets. Experimental results reveal
that this algorithm has higher accuracy than the other
algorithms compared. HCAD achieves 97% cluster purity in
diagnosing ADHD. An empirical analysis of the application of
HCAD to different data sets from the UCI repository is provided.
Index Terms— Hierarchical Clustering, Attribute Dependency,
ADHD, Cluster Purity.
I. INTRODUCTION
Attention Deficit Hyperactive Disorder (ADHD), also known
as hyperkinetic disorder, is caused by an unknown factor
that disrupts processing in the brain. It makes it
difficult for an individual to sustain attention until
a task is completed or to wait for their turn, and it leads
to reckless behavior, which in turn affects academic
skills such as reading, writing and communicating, and
lowers confidence. It is commonly observed among
school-going children between 6 and 13 years of age.
Children with these symptoms are often regarded as unworthy
and suffer from low learning motivation, poor self-esteem
and social rejection by their peers [1] [2].
The literature reports that around 10% of school-going
children suffer from ADHD [3]. Diagnostic
reports reveal that the percentage of boys reported to have
ADHD is higher than that of girls. The
symptoms are very difficult to differentiate from those of
other disorders, which increases the risk of a child being
misdiagnosed or remaining undiagnosed. Correct and early
diagnosis is vital for overcoming academic
challenges, and a lack of diagnosis may worsen
behavior. Proper diagnosis also helps teachers and
parents to handle these children appropriately.
The DSM-IV [4][5] diagnostic criteria are used for
diagnosing ADHD. Figure 1.1 (see Appendix - I) shows
the criteria and their subtypes for identifying ADHD.
The subtypes are inattention, hyperactivity and
impulsivity, and their combined type. The behavioral
disorders are studied through 13 characteristics (attributes):
careless mistakes, sustaining attention, listening,
following instruction, organizing tasks, loss of attention,
distraction, forgetting activities, fidgets, engaging
activities, talking excessively, blurting answers and
interrupts. Each characteristic takes a yes/no value.
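
As an illustration of how such yes/no records might be prepared for a
clustering algorithm, the following minimal sketch encodes one record as a
13-element 0/1 vector. The attribute identifiers, the `encode` helper and
the sample record are hypothetical names and values chosen for this
example; they are not taken from the authors' dataset or implementation.

```python
# Hypothetical identifiers for the 13 characteristics listed in the text.
ATTRIBUTES = [
    "careless_mistakes", "sustaining_attention", "listening",
    "following_instruction", "organizing_tasks", "loss_of_attention",
    "distraction", "forgetting_activities", "fidgets",
    "engaging_activities", "talking_excessively", "blurting_answers",
    "interrupts",
]


def encode(answers):
    """Map a dict of 'yes'/'no' answers to a 13-element tuple of 1/0."""
    vector = []
    for name in ATTRIBUTES:
        if name not in answers:
            raise ValueError(f"missing answer for {name!r}")
        value = answers[name]
        if value not in ("yes", "no"):
            raise ValueError(f"answer for {name!r} must be 'yes' or 'no'")
        vector.append(1 if value == "yes" else 0)
    return tuple(vector)


# Invented record, for illustration only (not real observations).
child = {name: "no" for name in ATTRIBUTES}
child["fidgets"] = "yes"
child["interrupts"] = "yes"
vector = encode(child)
```

A fixed attribute order keeps the vectors of different children comparable
position by position, which is what distance-based or equivalence-based
grouping of the records would rely on.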
Clustering, which has become a major focus of research
in unsupervised learning, is applied in various
fields, including data mining, pattern recognition,
statistics, machine learning, image processing, medical
diagnosis and digital signal processing. The proposed
hierarchical clustering algorithm (HCAD) is applied to
medical diagnosis, to identify students with ADHD in an
academic environment. Classification and clustering
techniques play vital roles in identifying hidden patterns
in a given data set. Clustering is a technique for
grouping data with similar properties. Clustering
techniques include partitioning approaches, hierarchical
methods, density-based methods and grid-based methods.
Partitioning approaches, such as centroid-based clustering
and representative-object-based techniques, fix a centroid
for each of the desired number of clusters, and each data
object is added to a cluster based on its distance from the
centroid. In further iterations the best centroid is selected
within each cluster, and the cluster members are refined based
on similarity within clusters and dissimilarity between
clusters. Hierarchical methods like
BIRCH, Chameleon and probabilistic hierarchical
clustering use agglomerative and divisive methods [6]. The
agglomerative method follows a bottom-up approach,
where the data objects are initially placed in separate
groups, which are iteratively combined with other
groups until the desired number of clusters is formed. The
divisive method follows a top-down approach, where the
entire data set is treated as one cluster, which is iteratively
partitioned into smaller clusters until the desired
number of clusters is formed.
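
The bottom-up (agglomerative) procedure described above can be sketched
as follows. This is a generic textbook illustration, not the HCAD
algorithm proposed in this paper: it assumes single linkage and Hamming
distance on 0/1 vectors, and the function names and toy records are
invented for the example.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def agglomerative(points, k):
    """Merge the two closest clusters (single linkage) until k remain.

    Returns a list of clusters, each a sorted list of indices into points.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[i] for i in range(len(points))]  # each point starts alone
    while len(clusters) > k:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest members.
            d = min(hamming(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge cluster b into a
        del clusters[b]  # b > a, so position a is unaffected
    return [sorted(c) for c in clusters]


# Toy yes/no records (1 = yes, 0 = no); values invented for illustration.
records = [
    (1, 1, 1, 0, 0),
    (1, 1, 0, 0, 0),
    (0, 0, 0, 1, 1),
    (0, 0, 1, 1, 1),
]
result = agglomerative(records, 2)  # [[0, 1], [2, 3]]
```

A divisive method would run in the opposite direction, starting from one
cluster containing all records and splitting it repeatedly. The pairwise
search above is quadratic in the number of clusters at every merge, which
is one reason scalability is a concern for hierarchical methods on large
data sets.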