Neha Reddy Bonthu et al., International Journal of Advanced Trends in Computer Science and Engineering, 9(2), March - April 2020, 2443 – 2446

ABSTRACT

This paper concerns autism spectrum disorder (ASD) and its prevalence by age. Three datasets were collected from different sources, and an additional dataset comprising several attributes was obtained from the NSCH. All datasets were preprocessed, and principal component analysis was applied in order to achieve the highest accuracy, approximately 99% for all the datasets used. The study also covers sub-grouping, i.e., the disorders under the spectrum of ASD, and the correlation between age and the disorder.

Key words: Autism Spectrum Disorder (ASD), DSM-5 (The Diagnostic and Statistical Manual of Mental Disorders), NSCH (National Survey of Children's Health).

1. INTRODUCTION

Autism is defined as a developmental disorder involving difficulties with cognition, physical impairment, and related problems. Research on autism has been conducted in several different areas; our concern here is the types of disorders within the spectrum of ASD. Early studies reveal that autism itself was treated as schizophrenia until 1943. By referring to existing databases, we can provide an alternative diagnosis with the aid of machine learning techniques. This study is a continuation of our earlier study on sub-grouping the disorders using DSM-5, in which we considered several factors such as gender, history of disorders, and symptoms. Here we add a new attribute, age. The datasets we are dealing with cover three different age groups, and we have so far applied four machine learning classification techniques.
The proposed study aims at building a mobile screening tool that citizens can use in the comfort of their own homes. The tool is intended to assess the type of disorder, whether the disorder lies in the autism spectrum, and the treatments currently preferred by medical experts for the disorders.

2. LITERATURE SURVEY

The three datasets of different age groups were collected from the survey conducted by F. Abdeliaber in [1]. Several screening processes were conducted in [2,4,5,12]; however, we used different classifiers on the existing datasets. The same datasets, generated from a mobile screening app, were used in [1] and [4]. The former applied if-then rules with ten-fold cross-validation and obtained an accuracy of around 90%. The latter compared KNN (K-Nearest Neighbor) and LDA (Linear Discriminant Analysis), of which LDA performed best with an accuracy of 90%. A new screening process using MRI scans was introduced in [10], and [13] introduced a similar technique using a neural network classifier. The idea of considering age as an attribute arose because [8] notes that predicting early symptoms in children is difficult.

The role of the DSM in the decision making of clinicians and medical experts is discussed in [3,7]. In [7], DSM-IV and DSM-V are compared, and the stringent nature of DSM-V and its advancements with respect to the criteria for ASD diagnosis are described. In [3], the idea of sub-grouping ASD and its significance during diagnosis are acknowledged. The sub-grouping in [3] is based on clinical samples from [7,9] and some existing datasets. The DSM-5 specifiers that aid phenotypic characterization are also mentioned.

3. METHODOLOGY

In this study, data is collected from three different sources in raw formats. The datasets cover different age groups, i.e., adult, adolescent, and toddler. In addition, we collected a dataset from the NSCH.
All these datasets are preprocessed before the machine learning classification algorithms are applied. Four algorithms are then applied: Naive Bayes, neural networks, support vector machine, and random forest. To achieve better accuracy, principal component analysis is applied to the existing datasets.

Autism Detection and Sub-grouping
Neha Reddy Bonthu 1, Swarna Kuchibhotla 2, Prasuna Kotturu 3, Niranjan S. R. Mellacheruvu 4, B. Manjula Josephine 5
1,2,3,5 Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, India, nehareddybonthu725@gmail.com, drkswarna@kluniversity.in, prasukotturu@kluniversity.in, manjulajosephine@kluniversity.in
4 Department of Electronics and Communication Engineering, Vikas College of Engineering and Technology, Guntur, India, niranjanmsr@gmail.com
ISSN 2278-3091
Volume 9 No.2, March - April 2020
International Journal of Advanced Trends in Computer Science and Engineering
Available Online at http://www.warse.org/IJATCSE/static/pdf/file/ijatcse231922020.pdf
https://doi.org/10.30534/ijatcse/2020/231922020
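The classification step described in Section 3 (preprocessing, principal component analysis, then Naive Bayes, a neural network, a support vector machine, and a random forest) might be sketched as follows. This is an illustrative sketch using scikit-learn, not the authors' implementation. The synthetic data, the standard scaling used as the preprocessing step, the number of principal components, the classifier hyperparameters, and the five-fold cross-validation are all assumptions, since the paper does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for one screening dataset (assumption: NOT the paper's data).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, random_state=0
)

# The four classifier families named in the paper; hyperparameters are assumed.
classifiers = {
    "naive_bayes": GaussianNB(),
    "neural_network": MLPClassifier(
        hidden_layer_sizes=(16,), max_iter=2000, random_state=0
    ),
    "svm": SVC(kernel="rbf"),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}


def evaluate(X, y, n_components=10, folds=5):
    """Return the mean cross-validated accuracy of each classifier.

    Scaling and PCA are placed inside the pipeline so that they are
    fitted on the training folds only (n_components and folds are assumed).
    """
    scores = {}
    for name, clf in classifiers.items():
        model = make_pipeline(StandardScaler(), PCA(n_components=n_components), clf)
        fold_scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


scores = evaluate(X, y)
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Keeping the scaler and PCA inside the pipeline means the projection is learned without access to the held-out fold, which avoids optimistic accuracy estimates. The accuracies printed for this synthetic data say nothing about the figures reported in the paper.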