Evolutionary Intelligence
https://doi.org/10.1007/s12065-020-00498-2

SPECIAL ISSUE

Feature reduction using SVM-RFE technique to detect autism spectrum disorder

Priya Mohan 1 · Ilango Paramasivam 2

Received: 22 May 2020 / Revised: 10 September 2020 / Accepted: 24 September 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract
Autism Spectrum Disorder (ASD) is a developmental disorder characterized by difficulties in social interaction, communication, and restricted or repetitive patterns of thought and behaviour. Because ASD is a lifelong condition, early diagnosis is of great importance in controlling the disease. This research work analyses the features that are vital in diagnosing the symptoms of ASD in an individual, with the aim of supporting early identification of ASD. The autism dataset for this research work is taken from the UCI repository. The proposed method, SVMAttributeEval, assigns a weight to each feature, and the features are ranked by importance. The Recursive Feature Elimination method is applied, and the performance of the classification algorithms LibSVM, IBk, and Naïve Bayes is measured on the reduced feature subsets selected by the wrapper method. The empirical results show that removing the least significant features improves the accuracy of the classifiers, with a feature reduction of 60% achieved against the original feature set. The performance of the classification algorithms improves significantly for the reduced feature subset of ASD. For the selected feature subset, LibSVM achieves 93.26% accuracy, IBk 92.3%, and Naïve Bayes 91.34%, each higher than the value achieved with the whole feature set.
Keywords Autism spectrum disorder (ASD) · IBk (K-nearest neighbor) · Naïve Bayes · Recursive feature elimination (RFE) · LibSVM · SVMAttributeEval

1 Introduction

Autism or Autism Spectrum Disorder (ASD) is a developmental disorder of the brain that comprises a range of conditions, such as challenges in exhibiting social skills, repetitive behaviors, and a lack of speech and nonverbal communication, along with notable strengths and differences. According to data released by the Centers for Disease Control and Prevention (CDC) on the prevalence of autism, the survey study identified 1 in 59 children as having autism spectrum disorder (ASD) as of April 26, 2018 [1]. Given this prevalence, early diagnosis of ASD can lead to better outcomes by enabling families to access early intervention services for the affected individual between 18 and 24 months of age. Several screening instruments have been developed to gather quick information about a child's social and communicative development, viz., the Checklist for Autism in Toddlers (CHAT), the Modified Checklist for Autism in Toddlers (M-CHAT), the Screening Tool for Autism in Two-Year-Olds (STAT), and the Social Communication Questionnaire (SCQ) for children 4 years of age and older. These tools are administered by caregivers, parents, or teachers, and require responses to a large number of questions, which makes many of them lengthy and inefficient. Therefore, it is necessary to identify an influential set of features in the screening process, both to speed up the diagnostic procedures and to help in the referral of autistic individuals to early intervention programmes; this forms the basis of this research work [2]. The application of data preprocessing techniques can reduce the number of features required for prediction [3].
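As an illustration of how weight-based recursive feature elimination can reduce a feature set, the following is a minimal sketch in plain Python. It is not the Weka SVMAttributeEval implementation used in this work. The function names `train_linear_svm` and `svm_rfe`, the hyperparameters, the ±1 feature encoding, and the toy data are all assumptions made for illustration. The sketch ranks features by the squared weight of a linear SVM and repeatedly drops the lowest-ranked one.

```python
from itertools import product


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear SVM (hinge loss + L2 penalty) by full-batch subgradient descent.

    X is a list of feature rows, y a list of labels in {-1, +1}.
    Hyperparameters are illustrative assumptions, not values from the paper.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin: hinge subgradient
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, n_keep=1):
    """Return feature indices ordered from most to least important.

    At each round a linear SVM is retrained on the remaining features and
    the feature with the smallest squared weight is eliminated.
    """
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    # Survivors first, then eliminated features, latest elimination first.
    return remaining + eliminated[::-1]


# Toy data (hypothetical, not the UCI autism dataset): three +/-1 features,
# where the label equals feature 0 and features 1 and 2 carry no information.
X = [list(row) for row in product([-1, 1], repeat=3)]
y = [row[0] for row in X]

ranking = svm_rfe(X, y)
print(ranking)  # feature 0 is expected to be ranked first
```

In practice the reduced subsets produced at each elimination step would then be passed to the classifiers under study, and the subset giving the best accuracy retained.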
* Priya Mohan
  priya.vinoth13@gmail.com

1 Department of Computer Science, Bharathiar University, Coimbatore 641046, India
2 Department of Computer Science and Engineering, PSG Institute of Technology and Applied Research, Coimbatore 641062, India

Feature selection is a preprocessing technique commonly used in high-dimensional data, and its purposes include reducing dimensionality, removing irrelevant and redundant features, thereby reducing the amount of data needing