Copyright © 2018 Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted
use, distribution, and reproduction in any medium, provided the original work is properly cited.
International Journal of Engineering & Technology, 7 (3.20) (2018) 502-506
Website: www.sciencepubco.com/index.php/IJET
Research paper
Malware Analysis Using APIs Pattern Mining
Nawfal Turki Obeis and Wesam Bhaya
University of Babylon, College of Information Technology, Babil, Iraq.
*E-mail:nawfalaljumaili@yahoo.com, wesambhaya@itnet.uobabylon.edu.iq.
Abstract
Malicious code threatens cybersecurity. Malware and its detection have challenged both the anti-malware industry and researchers for
decades.
We use a pattern mining technique to find frequent Windows Application Program Interface (API) calls and then use the frequent item sets
to build the feature sequences for subsequent analysis. Shingling techniques have proven effective for the problem of detection. For
verification, we cluster malware samples based on their frequent API call sequences.
We achieved a detection rate of 99.029% with accuracy as high as 98.8%. Thus, the proposed method improves on the state of the art
in several respects: accuracy and detection rate increased, and the false alarm rate decreased.
Experiments on a large API sequence dataset demonstrated that using frequent API call sequences achieves high accuracy for
malware clustering while reducing the computation time.
Keywords: Malicious Code; Malware Detection; Shingling; API Calls; Pattern Mining.
1. Introduction
Malware refers to programs that purposefully exploit
vulnerabilities in computing systems for a harmful purpose. Malware
can be classified by whether or not it requires a host
program to operate. Another way of classifying malware
is by whether or not it creates copies of itself [1,
2].
Malware authors routinely use various techniques to modify
existing malware into new polymorphic variants that evade
detection. The availability of sophisticated toolkits has made it
easier for malware authors to use techniques such as
dead-code insertion and register reassignment to perform this
transformation. Malware obfuscation can be classified into
metamorphism and polymorphism [3, 4].
A malware detector is a computer program that attempts to
detect and identify malware using a variety of techniques,
including matching malware signatures, applying heuristic rules,
and recognizing malware behavior or activities. Malware detectors can
run locally on the system being protected or provide protection
remotely over a computer network [5, 2].
Two types of data are required by malware detector
systems: knowledge of the malware behavior or
signature, which can be extended through a learning process,
and the program under inspection. Once both sources of
data are available, the malware detector uses its
detection techniques to determine whether the program is benign or
malicious [6].
A software program can be viewed as a sequence of API calls. Define a
k-shingle for a software program to be any contiguous run of k API
calls found within the program.
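
The k-shingle definition above can be illustrated with a minimal sketch. It assumes that a program is represented as an ordered list of API call names and that a shingle is a contiguous run of k calls (the usual shingling convention). The function name `k_shingles` and the sample trace are hypothetical and are not taken from the authors' implementation.

```python
from typing import List, Set, Tuple


def k_shingles(api_calls: List[str], k: int) -> Set[Tuple[str, ...]]:
    """Return the set of distinct k-shingles of an API call trace.

    Assumption: a shingle is a contiguous window of k API calls.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # Slide a window of width k over the trace; a trace shorter
    # than k yields an empty set.
    return {
        tuple(api_calls[i:i + k])
        for i in range(len(api_calls) - k + 1)
    }


# Hypothetical trace, used only for illustration.
trace = ["CreateFileW", "WriteFile", "CloseHandle", "CreateFileW", "WriteFile"]
shingles = k_shingles(trace, 2)
for shingle in sorted(shingles):
    print(shingle)
```

Because the result is a set, the repeated pair (CreateFileW, WriteFile) is counted once, so two traces can be compared by the shingles they have in common.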
2. Related Works
This section surveys some of the current algorithms and
techniques used for malware detection.
Fan, Ye, and Chen (2016) propose an effective sequence mining
method called "All-Nearest-Neighbor (ANN)" to detect
malware based on the discovered sequences. The main aims of this
article were to extract well-represented features from
Portable Executable (PE) files and to detect malware
with the help of the ANN technique [7].
Fan, Hsiao, Chou, and Tseng (2015) trace and analyze
malware by distinguishing malicious from benign programs
with the help of Application Programming Interface (API)
calls. The researchers chose several classification techniques,
namely Bayesian, decision tree, and Support Vector Machine, for
malware classification [8].
Guo et al. (2014) proposed a system behavior classification
model to detect mobile malware based on its behavioral
characteristics. This work comprises two phases:
analyzer training and network behavior detection [9].
Demme et al. (2013) presented a malware detector that can identify
minor variations in malware programs. In this work, fine-grained
run-time information was collected without slowing down the
applications. Subversion of the protection scheme was avoided through
secure updating of the anti-virus algorithms [10].