Journal of Ambient Intelligence and Humanized Computing
https://doi.org/10.1007/s12652-018-1031-9
ORIGINAL RESEARCH

Improved salp swarm algorithm based on particle swarm optimization for feature selection

Rehab Ali Ibrahim 1 · Ahmed A. Ewees 2,3 · Diego Oliva 4 · Mohamed Abd Elaziz 5 · Songfeng Lu 1,6

Received: 22 January 2018 / Accepted: 3 September 2018
© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Abstract
Feature selection (FS) is a machine learning process commonly used to reduce the dimensionality of datasets. It extracts the most representative information from large pools of data, reducing the computational effort of subsequent tasks such as classification. This article presents a hybrid optimization method for the FS problem; it combines the salp swarm algorithm (SSA) with particle swarm optimization (PSO). The hybridization of the two approaches yields an algorithm called SSAPSO, in which the efficacy of the exploration and exploitation steps is improved. The performance of the proposed algorithm is verified in two series of experiments. In the first, it is compared with other similar approaches using benchmark functions. In the second, SSAPSO is used to determine the best set of features on different UCI datasets, removing redundant or confusing features from the original dataset while keeping or improving the accuracy. The experimental results provide evidence that SSAPSO improves performance and accuracy without affecting the computational effort.
Keywords Salp swarm algorithm · Particle swarm optimization · Feature selection · Global optimization · Swarm techniques

* Songfeng Lu, lusongfeng@hust.edu.cn
Rehab Ali Ibrahim, rehab100r@yahoo.com
Ahmed A. Ewees, a.ewees@hotmail.com
Diego Oliva, diego.oliva@cucei.udg.mx
Mohamed Abd Elaziz, abd_el_aziz_m@yahoo.com

1 School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
2 University of Bisha, Bisha, Kingdom of Saudi Arabia
3 Department of Computer, Damietta University, Damietta, Egypt
4 Departamento de Ciencias Computacionales, Universidad de Guadalajara, CUCEI, Av. Revolucion 1500, Guadalajara, Jal, Mexico
5 Department of Mathematics, Faculty of Science, Zagazig University, Zagazig, Egypt
6 Shenzhen Huazhong University of Science and Technology Research Institute, Shenzhen 518063, China

1 Introduction

Feature selection (FS) has become the focus of much research in machine learning and data mining, across many application areas, for datasets with tens, hundreds, or even thousands of variables (Guyon and Elisseeff 2003). FS has been widely applied in areas such as text processing of internet documents (Al-Ayyoub et al. 2017; Chikh and Chikhi 2017; Saravanan and Rajesh Babu 2017), gene expression array analysis (Li and Wong 2002), text categorization (Yang et al. 2002), genomics (Kohane et al. 2002), cancer detection (Prabukumar et al. 2017), image classification (Ibrahim et al. 2018), computer vision (Arigbabu et al. 2016), signal processing (Kung et al. 2010), bioinformatics (Awada et al. 2012), image retrieval (El Aziz et al. 2018a; Li and Wang 2015; Wang et al. 2018), medical applications (Chang et al. 2012), combinatorial chemistry (Jensen et al. 2009) and others. Feature selection offers three benefits: (1) improving the prediction performance of the predictors, (2) providing a better and faster understanding of the underlying process that generates the data, and (3) yielding more cost-effective predictors (Han et al. 2011). These contributions cover a wide area of aspects