David C. Wyld et al. (Eds) : ACITY, DPPR, VLSI, WiMNET, AIAA, CNDC - 2015
pp. 163–178, 2015. © CS & IT-CSCP 2015 DOI : 10.5121/csit.2015.51315
Babu Rajesh V, Phaninder Reddy, Himanshu P and Mahesh U Patil
Centre for Development of Advanced Computing
cdac.in
ABSTRACT
Android, being a widely used mobile platform, has witnessed an increase in the number of
malicious samples on its marketplace. The availability of multiple sources for downloading
applications has also contributed to users falling prey to malicious applications. Classification
of an Android application as malicious or benign remains a challenge, as malicious applications
disguise themselves as benign. This paper presents an approach which extracts
various features from an Android Application Package (APK) file using static analysis and
subsequently classifies the application using machine learning techniques. The contribution of this work
includes deriving, extracting and analyzing crucial features of Android applications that aid in
efficient classification. The analysis is carried out using various machine learning algorithms
with both weighted and non-weighted approaches. It was observed that the weighted approach
yields higher detection rates using fewer features. The Random Forest algorithm exhibited a high
detection rate and the lowest false positive rate.
KEYWORDS
Mobile Security, Malware, Static Analysis, Machine Learning, Android
1. INTRODUCTION
Android is a widely used mobile platform, and due to its dominance in the consumer space it has
become a lucrative target for malware developers, who exploit the popularity and
openness of the Android platform for various benefits. Malware developers use Android
marketplaces as entry points for introducing their malicious applications into the Android user space.
According to a RiskIQ report [1], malicious applications in the Play Store grew by 388 percent
from 2011 to 2013, while the share of such applications removed annually by Google
dropped from 60 percent in 2011 to 23 percent in 2013. As a large number of applications are
uploaded and updated regularly on these marketplaces, manual analysis of all the applications is
a difficult task. The scarcity of effective mechanisms to detect these malicious samples has fueled the
rise of malware applications on Android marketplaces. In this regard we present DroidSwan, a
system for classifying applications as malware or benign, based on static analysis of the Android
APK. DroidSwan extracts various crucial features from an Android application, assigns weights to
these features and builds a classifier model using machine learning algorithms. The classifier
model is trained using the malware data set of 1260 malware acquired from Genome Malware