International Journal of Electrical and Computer Engineering (IJECE)
Vol. 9, No. 4, August 2019, pp. 3108~3114
ISSN: 2088-8708, DOI: 10.11591/ijece.v9i4.pp3108-3114
Journal homepage: http://iaescore.com/journals/index.php/IJECE

Empirical analysis of ensemble methods for the classification of robocalls in telecommunications

Meghna Ghosh, Prabu P
Department of Computer Science, Christ (Deemed to be University), India

Article Info
Article history: Received Oct 25, 2018; Revised Apr 1, 2019; Accepted Apr 8, 2019
Keywords: Ensemble method, Machine Learning, Random Forest, Robocalls, XGBoost

ABSTRACT
With the advent of technology, the use of cellular phones has become pervasive. Cellular phones have made life more convenient, but individuals and groups have also subverted telecommunication devices to deceive unwary victims. Robocalls are now prevalent; they can be legal, or they can be used by scammers to trick people out of their money. This paper experiments with two ensemble models on the dataset acquired from the Federal Trade Commission (DNC Dataset). The call records are analyzed so that, based on the patterns found, each call can be classified as a robocall or not a robocall. Two algorithms, Random Forest and XGBoost, are combined in two ways, and the resulting models are compared in terms of accuracy, sensitivity and time taken.

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Meghna Ghosh,
Department of Computer Science, Christ (Deemed to be University),
Hosur Main Road, Bangalore-560029, India.
Email: meghna.ghosh@cs.christuniversity.in, prabu.p@christuniversity.in

1. INTRODUCTION
In 2014, the Federal Trade Commission received over 22 million complaints about illegal and unwanted calls. Telephone spammers today are leveraging recent technical advances in the telephony ecosystem to distribute massively automated spam calls known as robocalls [1].
A robocall is a phone call that uses a computerized autodialer to deliver a pre-recorded message, as if it came from a robot. Once viewed as an inconvenience, robocalls have reached epidemic proportions. Some robocalls are legal. Permitted calls include campaigning for candidates, alerting students to campus closures, appointment reminders and flight cancellation notices. An illegal robocall is a non-emergency call that delivers a pre-recorded message without the consent of the consumer. It can come either from a registered business that has contravened the law or from a scammer who poses as a legal organization in order to steal the recipient's money, identity or both. Technology has made it easy to scrape personal information from public databases or the internet, harvest phone numbers and sell them to both legal and illegal spam callers.

In Canada, political parties legitimately used robocalls to reach voters during the 2011 Canadian federal election. An investigation showed, however, that robocalls were also used to divert people from casting their ballots by giving them inaccurate information about changed polling station locations.

There has been a steep rise in automated calls since 2009. According to the FTC report, the agency received over 375,000 complaints about automated robocalls, an increase compared with 2009. The report also stated that the increase in the number of robocalls is due to free or cheap access to internet calling services, which also helps scammers hide their identity.

Machine Learning is an application of artificial intelligence that gives a system the ability to recognize patterns, learn from data and improve with experience on some task, without being explicitly programmed. Machine Learning mainly focuses on learning from input data and predicting an outcome