International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 04 | Apr-2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 2212
Terror Attack Identifier: Classify using KNN, SVM, Random Forest
algorithm and alert through messages
Abhishek Barve¹, Manali Rahate², Ayesha Gaikwad³, Priyanka Patil⁴
¹Assistant Professor, Vidyalankar Institute of Technology, Mumbai, India
²,³,⁴Students, Vidyalankar Institute of Technology, Mumbai, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - A system that helps prevent terrorist attacks by
relaying emergency alerts to all phones is set to begin. The
system could warn people of terrorist strikes by broadcasting
text messages to everyone in the nearby location. With the
popularity of social networks, most news providers now share
their news on various social networking sites and web blogs.
Machine learning techniques are used to train classifiers on
this data. To create the instances, the words of each short
message were considered, and a bag-of-words approach was used
to build the feature vector. The classifiers were trained using
the KNN (K-Nearest Neighbor), Support Vector Machine, and
Random Forest machine learning techniques.
Key Words: Data Mining, Twitter, news, text analysis,
terrorist attack, tweets.
1. INTRODUCTION
Nowadays in India, many news groups share their news
headlines as short messages on micro blogging services such
as Twitter. Authors of these messages write about their
lives, share opinions on a variety of topics, and discuss
current issues. Because of the free format of messages and
the easy accessibility of micro blogging platforms, Internet
users tend to shift from traditional communication tools
(such as traditional blogs or mailing lists) to micro
blogging services.
As more and more users post about the products and services
they use, or express their political and religious views,
micro blogging websites become valuable sources of people’s
opinions and sentiments. We use a dataset of messages
collected from Twitter. Twitter contains a very large number
of very short messages created by the users of this micro
blogging platform. The contents of the messages vary from
personal thoughts to public statements. The system classifies
each short message into a news group such as
war-terrorist-crime.
1.1 Objectives
The objective is to develop a system that extracts live
tweets from the Twitter site, classifies them, and displays
each news item under its section, helping news seekers keep
track of the news. Developing such a system requires
selecting a suitable classifier, which is done by comparing
the results of different classifiers on the collected tweets.
1.2 Scope
The scope of this work is to develop a system that collects
short messages from the Twitter social networking site. The
collected messages are used to train classifiers with the
SVM, Random Forest, and KNN data mining techniques, and each
resulting classifier assigns messages to a news group (e.g.
war-terrorist). The performance of each classification
technique is measured to assess the effectiveness of the
system: precision and recall are calculated for each
classifier, and F_β is calculated to obtain a single-value
measurement. The results of all three classifiers are
compared, and the classifier that performs best for most
groups is considered the best classifier for the messages
extracted from Twitter, so that users or analysts in a
specific field can follow the relevant news.
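The evaluation measures named above can be sketched in a few
lines. The sketch below assumes the standard definition
F_β = (1 + β²)·P·R / (β²·P + R), where P is precision and R is
recall for one news group. The function name and the toy labels
are hypothetical and are not taken from the paper's experiments.

```python
def precision_recall_fbeta(true_labels, predicted_labels, group, beta=1.0):
    """Return (precision, recall, F_beta) for one news group."""
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives and false negatives for the group.
    tp = sum(1 for t, p in pairs if t == group and p == group)
    fp = sum(1 for t, p in pairs if t != group and p == group)
    fn = sum(1 for t, p in pairs if t == group and p != group)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0

    b2 = beta * beta
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


# Toy labels, invented for illustration only (not data from the paper).
true_labels = ["war-terrorist-crime", "sports", "war-terrorist-crime", "health"]
predicted_labels = ["war-terrorist-crime", "war-terrorist-crime", "sports", "health"]

precision, recall, f1 = precision_recall_fbeta(
    true_labels, predicted_labels, "war-terrorist-crime"
)
```

With β = 1 precision and recall are weighted equally; a larger β
gives more weight to recall, which may suit an alerting system
where a missed attack report is costlier than a false alarm.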
1.3 Proposed system
We use the K-Nearest Neighbour data mining method to classify
Twitter messages into news groups. This section draws on a
detailed study of Twitter, web mining, data gathering
techniques for tweet extraction, feature selection
techniques, and the classification algorithms used.
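As a rough illustration of K-Nearest Neighbour classification
over bag-of-words features, consider the minimal sketch below.
The paper does not state its tokenisation, distance measure, or
value of k, so whitespace tokenisation, cosine similarity, and
k = 3 are assumptions made here. All function names and training
messages are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(message):
    """Map a short message to word counts (lowercased, whitespace-split)."""
    return Counter(message.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(message, training_set, k=3):
    """Label a message by majority vote among its k most similar neighbours."""
    query = bag_of_words(message)
    ranked = sorted(
        training_set,
        key=lambda item: cosine_similarity(query, bag_of_words(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training messages, invented for illustration only.
training_set = [
    ("bomb blast kills five in city market", "war-terrorist-crime"),
    ("militants attack army camp near border", "war-terrorist-crime"),
    ("police arrest suspect after bomb threat", "war-terrorist-crime"),
    ("team wins cricket final by six wickets", "sports"),
    ("striker scores twice in football league", "sports"),
    ("captain praises team after cricket win", "sports"),
]

label = knn_classify("bomb attack near city market", training_set, k=3)
```

Here the three nearest neighbours of the query all belong to the
war-terrorist-crime group, so the vote returns that label.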
2. IMPLEMENTATION
This section shows how the system is implemented. The first
module extracts tweets from trusted news channels; these
tweets are the input to the system. The output module gives
the result as tweets classified into news groups:
war-terrorist-crime, economy-business, health, sports,
development-government, politics, accident, entertainment,
disaster-climate, education, society and international. For
classification, KNN, SVM and Random Forest are used. The
tweets are classified, and the results from all three
algorithms are analysed in order to find the best classifier
for tweet classification.
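One way to read "best classifier" is the classifier with the
highest score in the largest number of news groups. The sketch
below assumes that rule and assumes a per-group F score is
available for each classifier. The names clf_a, clf_b and clf_c
and all score values are placeholders invented for illustration;
they are not results from this paper.

```python
def best_classifier(scores_by_classifier):
    """Return (name, wins): the classifier with the top score in most groups.

    Ties are resolved in favour of the classifier listed first.
    """
    groups = next(iter(scores_by_classifier.values())).keys()
    wins = {name: 0 for name in scores_by_classifier}
    for group in groups:
        # Find which classifier has the highest score for this group.
        winner = max(
            scores_by_classifier,
            key=lambda name: scores_by_classifier[name][group],
        )
        wins[winner] += 1
    return max(wins, key=wins.get), wins


# Placeholder scores, invented for illustration; NOT results from the paper.
placeholder_scores = {
    "clf_a": {"war-terrorist-crime": 0.70, "sports": 0.80, "health": 0.60},
    "clf_b": {"war-terrorist-crime": 0.75, "sports": 0.85, "health": 0.55},
    "clf_c": {"war-terrorist-crime": 0.72, "sports": 0.78, "health": 0.58},
}

best, wins = best_classifier(placeholder_scores)
```

Counting group wins is only one possible comparison; averaging
the per-group scores is an alternative that can rank the
classifiers differently.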
Data Gathering
The classification is applied to short news messages from the
Twitter micro blog, so these messages need to be collected.
The Twitter API provides the ability to retrieve such short
messages for a given user in XML file format. Each XML file
can carry up to 200 short messages at once.
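Reading the message text out of one such XML file could look
like the sketch below. The element names (statuses, status,
text) and the sample document are assumptions, since the file
layout is not given here; only the limit of 200 messages per
file comes from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real file layout is an assumption.
SAMPLE_XML = """<statuses>
  <status><id>1</id><text>Blast reported near city market</text></status>
  <status><id>2</id><text>Team wins cricket final</text></status>
</statuses>"""

MAX_MESSAGES_PER_FILE = 200  # per-file limit stated in the text


def extract_messages(xml_text):
    """Return the text of each status in one XML document, up to the limit."""
    root = ET.fromstring(xml_text)
    texts = [
        status.findtext("text", default="").strip()
        for status in root.iter("status")
    ]
    return texts[:MAX_MESSAGES_PER_FILE]


messages = extract_messages(SAMPLE_XML)
```

Collecting more than 200 messages for a user would then require
fetching and parsing several files and concatenating the results.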