Review Article
Use of Security Logs for Data Leak Detection: A Systematic
Literature Review
Ricardo Ávila,¹ Raphaël Khoury,¹ Richard Khoury,² and Fábio Petrillo¹
¹Département d’informatique et de Mathématique, Université du Québec à Chicoutimi, Québec, Canada
²Department of Computer Science and Software Engineering, Université Laval, Québec, Canada
Correspondence should be addressed to Ricardo Ávila; ricardo.lims@gmail.com
Received 26 October 2020; Revised 19 January 2021; Accepted 19 February 2021; Published 11 March 2021
Academic Editor: Flavio Lombardi
Copyright © 2021 Ricardo Ávila et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Security logs are widely used to monitor data, networks, and computer activities. By analyzing them, security experts can pick out
anomalies that reveal the presence of cyber attacks or information leaks and stop them quickly before serious damage occurs. This
paper presents a systematic literature review on the use of security logs for data leak detection. Our findings are fourfold: (i) we
propose a new classification of information leaks, which uses the GDPR principles; (ii) we identify the twenty most widely used
publicly available datasets in threat detection; (iii) we describe twenty types of attacks present in public datasets; and (iv) we
describe thirty algorithms used for data leak detection. The selected papers point to many opportunities that can be investigated by
researchers interested in contributing to this area of research.
1. Introduction
Cybercriminals seek to access, modify, or delete confidential
information for financial gain, fame, personal revenge, or to
disrupt organizational services [1]. These attackers exploit
software vulnerabilities, the high workload and inexperience
of employees, and the heterogeneity of security solutions
implemented in an organization to carry out their attacks. In
this context, organizations must develop strategies that allow
them to be resilient in the face of malicious attacks. Security
mechanisms create layers of protection and generate event
logs that can be analyzed to detect and react to possible
intrusions. By studying these logs, security analysts can
detect and respond to attacks as they occur, rather than
relying on forensic investigation weeks or months after an incident.
Security logs are thus a crucial tool in the detection of
attacks and information leaks. However, using them comes
with several important challenges. It requires attention to
detail in order to pick out anomalous elements in a long list
of events. The massive size of modern security logs makes it
necessary to analyze a very high volume of data in a short
amount of time. The heterogeneity of devices and systems in
a current corporate computer ecosystem, and thus the
heterogeneity of logs they generate, also contributes to log
analysis complexity.
In this paper, we present a Systematic Literature Review
(SLR) of 33 published, peer-reviewed studies on the use of
security logs to detect information leaks. We mapped the
state of the art on this topic and its implications for future
research, with the aim of informing approaches to detect and react to attacks.
More specifically, this review makes four key contributions:
(1) A description of the different types of information leaks,
based on the recommendations of the GDPR
(2) The identification and description of the 20 most
commonly used publicly available benchmark
datasets of security logs
(3) The identification and description of 20 types of
attacks that can be detected through an analysis of
security logs
(4) The identification and description of 30 algorithms
used for data leak detection
The remainder of this study is organized as follows: in
Section 2, we describe the protocol used in this systematic
literature review. In Sections 3, 4, 5, 6, and 7, we present the
Hindawi
Security and Communication Networks
Volume 2021, Article ID 6615899, 29 pages
https://doi.org/10.1155/2021/6615899