Mathematics

Article

Weighted Hybrid Feature Reduction Embedded with Ensemble Learning for Speech Data of Parkinson’s Disease

Zeeshan Hameed 1, Waheed Ur Rehman 2,3, Wakeel Khan 4, Nasim Ullah 5,* and Fahad R. Albogamy 6

1 Faculty of Information Technology, College of Computer Science, Beijing University of Technology, Beijing 100124, China; zeeshanhameed.zh@gmail.com
2 College of Mechanical Engineering and Applied Electronics Technologies, Beijing University of Technology, Beijing 100124, China; wrehman87@bjut.edu.cn
3 Swedish College of Engineering and Technology, Rahim Yar Khan 64200, Pakistan
4 Department of Electrical Engineering, Foundation University Islamabad, Islamabad 44000, Pakistan; wakeel.khan@fui.edu.pk
5 Department of Electrical Engineering, College of Engineering, Taif University KSA, P.O. Box 11099, Taif 21944, Saudi Arabia
6 Computer Sciences Program, Turabah University College, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia; f.alhammdani@tu.edu.sa
* Correspondence: nasimullah@tu.edu.sa

Citation: Hameed, Z.; Rehman, W.U.; Khan, W.; Ullah, N.; Albogamy, F.R. Weighted Hybrid Feature Reduction Embedded with Ensemble Learning for Speech Data of Parkinson’s Disease. Mathematics 2021, 9, 3172. https://doi.org/10.3390/math9243172

Academic Editors: Cornelio Yáñez Márquez, Yenny Villuendas-Rey and Miltiadis D. Lytras

Received: 7 October 2021
Accepted: 7 December 2021
Published: 9 December 2021

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Abstract: Parkinson’s disease (PD) is a progressive and long-term neurodegenerative disorder of the central nervous system.
Studies have reported that 90% of PD subjects have voice impairments, which are among the vital characteristics of PD patients and have been widely used for diagnostic purposes. However, the curse of dimensionality, high aliasing, redundancy, and small sample size in PD speech data make it very challenging to classify PD subjects. Feature reduction can efficiently address these issues. However, existing feature reduction algorithms ignore high aliasing, noise, and algorithmic stability, and thus fail to achieve substantial classification accuracy. To mitigate these problems, this study proposes a weighted hybrid feature reduction technique embedded with ensemble learning, which comprises (1) a hybrid feature reduction technique that increases inter-class variance, reduces intra-class variance, preserves the neighborhood structure of the data, and removes correlated features that cause high aliasing and noise in classification; (2) a weighted-boosting method to train the model precisely; and (3) a bagging strategy that enhances the stability of the algorithm. The experiments were performed on three datasets: two widely used datasets and a dataset provided by Southwest Hospital (Army Military Medical University), Chongqing, China. The experimental results indicate that, compared with existing feature reduction methods, the proposed algorithm always shows the highest accuracy, precision, recall, and G-mean for PD speech data. Moreover, the proposed algorithm not only performs well in classification but also handles imbalanced data and achieves the highest AUC in most cases. In addition, compared with state-of-the-art algorithms, the proposed method shows an improvement of up to 4.53%. In the future, this algorithm can be used for early and differential diagnoses, which are rated as challenging tasks.

Keywords: Parkinson’s disease; dimensionality reduction; ensemble learning; hybrid feature learning

1. Introduction

The use of machine learning techniques to control diseases is becoming popular nowadays [1–3]. Parkinson’s disease damages the nerve cells that are responsible for body movement [4]. As a symptom of Parkinson’s disease, speech impairment carries information about the pathogenesis of the disease. The convenience of voice acquisition makes remote monitoring of Parkinson’s disease possible. However, speech datasets are often noisy and exhibit high aliasing. This complicates the processing of speech