© IJARCSMS (www.ijarcsms.com), All Rights Reserved

e-ISJN: A4372-3114 ISSN: 2321-7782 (Online)
p-ISJN: A4372-3115 ISSN: 2347-1778 (Print)
Impact Factor: 7.327
Volume 8, Issue 2, February 2020
International Journal of Advance Research in Computer Science and Management Studies
Research Article / Survey Paper / Case Study
Available online at: www.ijarcsms.com

A Survey of Machine Learning Techniques for Identifying and Classifying Malwares

Umesh V. Nikam 1
Department of Computer Science & Engineering, P. R. M. I. T&R, Badnera. Amravati, India

Dr. V. M. Deshmukh 2
Department of Computer Science & Engineering, P. R. M. I. T&R, Badnera. Amravati, India

Abstract: Malware is a serious threat on the internet today. As malware propagates, it changes its code. Nowadays attackers create polymorphic and metamorphic malware. Traditional signature-based detection techniques are ineffective against modern malware threats. Different malware families exhibit different behavior patterns that reflect their origin and purpose. These patterns can be used to detect unknown malware and classify it into families using machine learning techniques. This survey paper provides an overview of various techniques for detecting malware and classifying it into its respective families.

Keywords: Malware, Machine learning, Classification.

I. INTRODUCTION

Malware is a computer program written with the purpose of causing harm to the operating system. Its basic purpose is to fulfill the harmful intent of an attacker by gathering personal information about a user or host system, thus hampering the availability, integrity and privacy of the user's data. There is a wide range of malware, such as Worm, Virus, Trojan horse, Rootkit, Backdoor, Botnet, Spyware, Adware, etc. Modern antivirus software detects known software threats effectively but is ineffective at detecting novel malware.
A study by AusCERT found that 80 percent of new malware was not detected by the latest antivirus software [1]. Detection, mitigation and classification of malware is a major problem on the internet today. Malware is continuously growing in volume, variety and velocity.

A. LIMITATIONS OF TRADITIONAL ANTIVIRUS

Traditional signature-based antivirus systems are reactive in nature. In earlier days, to detect malware, an analyst would manually generate a signature or a hash and build a database of those signatures. During every scan, the antivirus system checks each scanned file against this database and reports malware if there is a match. Because of the polymorphic nature of malware, however, this signature-based technique fails to identify many security threats. To create a more reliable and robust system, we need an alternative to traditional signature-based detection.

To overcome the drawbacks of signature-based systems, malware analysis techniques are used, which can be either static or dynamic. These techniques help the analyst understand the risk associated with malicious code. In static analysis, malicious software is analyzed without being executed. Before static analysis, it is necessary to unpack and decrypt the executable. The detection patterns used include byte sequences, n-grams, syntactic library calls, control flow graphs, string signatures, etc.
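
The contrast between hash-based signatures and byte n-gram patterns described above can be illustrated with a minimal Python sketch. It is not taken from any of the surveyed works: the function names, the choice of SHA-256 as the signature, the 64-byte stand-in "sample", and the use of Jaccard similarity over n-gram sets are all assumptions made only for illustration. The sketch shows that changing a single byte defeats an exact hash match, while most byte n-grams remain shared between the two variants.

```python
import hashlib
from collections import Counter
from typing import Set


def sha256_signature(data: bytes) -> str:
    """Whole-file SHA-256 digest, used here as a hypothetical signature."""
    return hashlib.sha256(data).hexdigest()


def is_known_malware(data: bytes, signature_db: Set[str]) -> bool:
    """Signature-based check: exact match against a set of known digests."""
    return sha256_signature(data) in signature_db


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count every overlapping n-byte sequence in the data."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def jaccard(a: Counter, b: Counter) -> float:
    """Jaccard similarity between the sets of distinct n-grams."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 1.0


# Stand-in for a known malicious file (assumed data, not a real sample).
original = bytes(range(64))

# A "polymorphic" variant: the same bytes with one byte flipped.
mutated = bytearray(original)
mutated[10] ^= 0xFF
variant = bytes(mutated)

signature_db = {sha256_signature(original)}

hit_original = is_known_malware(original, signature_db)
hit_variant = is_known_malware(variant, signature_db)
similarity = jaccard(byte_ngrams(original), byte_ngrams(variant))

print("original matched:", hit_original)
print("variant matched:", hit_variant)
print("bigram similarity:", round(similarity, 3))
```

In a learning-based detector, n-gram counts of this kind would typically be turned into a feature vector for a classifier rather than compared pairwise; the pairwise similarity here only makes the robustness difference visible.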