International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 04 Issue: 10 | Oct -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 1544
Big Data Security Challenges: An Overview and Application of User
Behavior Analytics
Tanya Akutota1, Swarnava Choudhury2
1UG Student, Computer Science Department,
National Institute of Technology-Silchar
2UG Student, Electronics and Communication Department,
National Institute of Technology-Silchar
---------------------------------------------------------------------------***---------------------------------------------------------------------------
Abstract- Automation and digitization of activities have
resulted in the generation of huge volumes of data, called Big Data.
Big Data helps many organizations gain useful insights, but it
also involves two types of risk: security risks to the Big Data
itself, and privacy risks to users and individuals. This paper
discusses the characteristics of Big Data, its applications, and
the security and privacy challenges that come with it. It also
explores a novel Big Data security analytics method called User
Behavior Analytics (UBA), covering its functioning, use cases, and
advantages.
Keywords: Analytics, Big Data, Challenges, Security, SIEM,
UBA
1. INTRODUCTION
The 21st century has seen human life shift towards
digitalization: automated machines in industry, cellular phones,
social networks, and so on have all led us here. Digitization on
this scale means that huge, often complex, sets of data are
generated every day. These data may come from sensors, browsing
reports, users' statistics, or any other source, and they are
growing exponentially with each passing day. As Tim Berners-Lee,
the inventor of the World Wide Web, said, "Data is a precious thing
and will last longer than the systems themselves." Big Data
Analytics (BDA) is the tool that helps us realize the power of such
large and complex datasets. Conventional database tools cannot
process this amount of heterogeneous data, whereas Big Data
Analytics uses parallel processing to extract an enormous amount of
valuable information, such as future market trends and developments
in life science, from data gathered from all available sources.
Big Data has many unique characteristics that set it apart from
the data held in a conventional database system. The types of data
involved vary. There are three major classes of data, namely:
1. Structured data- These data are stored in rigid
relational models, with specific data types and
sizes. Conventional database techniques are
efficient at this level.
2. Semi-structured data- A form of structured data that
is hierarchical in nature and organized with tags and
markers. XML data is a typical example.
3. Unstructured data- These data do not follow a
predefined model and vary widely; this is where Big
Data Analytics comes in.
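To make the three classes concrete, the following is a minimal Python sketch. It is an illustration only: the sample records, field names (user_id, age, name, email) and values are hypothetical and do not come from any dataset used in this paper.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: fixed columns with known types, as in a relational table.
structured_src = "user_id,age\n1,34\n2,27\n"
rows = [
    {"user_id": int(r["user_id"]), "age": int(r["age"])}
    for r in csv.DictReader(io.StringIO(structured_src))
]

# Semi-structured: hierarchical and tagged, but fields may be absent.
semi_src = """
<users>
  <user id="1"><name>Asha</name><email>asha@example.com</email></user>
  <user id="2"><name>Ravi</name></user>
</users>
""".strip()
root = ET.fromstring(semi_src)
users = [
    {"id": u.get("id"), "name": u.findtext("name"), "email": u.findtext("email")}
    for u in root.findall("user")
]

# Unstructured: free text with no predefined model. Any structure
# (here, a naive list of lowercase tokens) has to be imposed by the analysis.
unstructured_src = "Login failed twice for Ravi, then succeeded from a new device."
tokens = unstructured_src.lower().replace(",", " ").replace(".", " ").split()

print(rows)
print(users)
print(tokens)
```

The structured rows can be type-checked up front, the XML records must tolerate a missing field (the second user has no email, so that value is None), and the free text carries no schema at all until one is imposed on it.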
Big Data is best described by five characteristics, popularly
known as "the 5 V's":
• Volume- the scale of the data, from exabytes to
zettabytes.
• Velocity- the rate at which streaming data is generated
and analysed.
• Variety- the different forms of data, drawn from various
external or internal sources.
• Veracity- the uncertainty of the data, i.e., the different
probabilities with which a value may occur.
• Value- analysis and visualization of all the above
components yield the final output: the precious
information referred to as the Value.