International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 05 | May 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1230
Categorization of Illegal Activities on Dark Web using Classification
Hrushikesh Thorat¹, Shubham Thakur², Amit Yadav³
¹⁻³Department of Computer Engineering, PHCET, Rasayani
------------------------------------------------------------------------***-----------------------------------------------------------------------
Abstract: The Dark Web is a part of the World Wide Web that is accessible only through special software, which allows users and
website owners to remain anonymous. The dark web corpus contains activities that are illegal under the Indian Penal Code
and the IT Amendment Act 2008. This anonymity leads to the growth of illegal activities on the web. Collecting and labelling
illegal dark web pages is difficult and time consuming. The method proposed in this project can effectively classify and
visualize illegal activities on the dark web. We select laws and regulations related to each type of illegal activity and
use them to train the classifiers. For the categories of drugs, gambling, weapons, child pornography and counterfeit credit
cards, we pick the corresponding legal documents from the Indian Penal Code (IPC) for supervised training. Classification
algorithms such as the Naive Bayes classifier then classify the illegal content on the web pages. This can help the Indian
Cyber Crime Department to monitor potential illegal activities and their corresponding websites in a timely manner. This
classification defines a new way of categorizing illegal activities on the dark web.
Keywords: Dark Web, Categorization, Illegal Activities, Visualization
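To make the approach in the abstract concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier that is trained on one legal text per category and then applied to page text. It is an illustration under stated assumptions, not the implementation used in this work: the tokenizer, the add-one smoothing, the class name NaiveBayesTextClassifier, the label names and the short training strings are all hypothetical, and the strings are stand-ins rather than quotations from the IPC.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter(labels)        # label -> number of documents
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior of each label for the text."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # skip words never seen during training
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        """Return the label with the highest log posterior."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical stand-ins for legal text; NOT quotations from the IPC.
training_texts = [
    "possession sale and transport of narcotic drugs and psychotropic substances",
    "keeping a common gaming house betting and wagering on games of chance",
    "manufacture sale and possession of firearms and ammunition without licence",
    "forgery of credit cards and use of counterfeit payment instruments",
]
training_labels = ["drugs", "gambling", "weapons", "counterfeit_cards"]

classifier = NaiveBayesTextClassifier().fit(training_texts, training_labels)
predicted = classifier.predict("cheap firearms and ammunition shipped worldwide")
print(predicted)
```

In this sketch the legal texts play the role of labelled training data, so no dark web page has to be labelled by hand. A realistic system would need far longer legal documents per category, stop-word removal, and a rule for pages that match no category.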
I. INTRODUCTION
The Dark Web, also known as the ‘Dark Net’, is online content consisting of web pages and forums that are encrypted so that
there is little possibility of tracing the exact location of the servers hosting the content. Dark Web content also cannot be
indexed by typical search engines such as Google, Bing and Yahoo. On the Dark Web, the identities of the user and the website
owner, as well as the server locations, remain anonymous. Today, various open source software such as Tor, Freenet and I2P is
freely available for accessing Dark Web content. According to the Tor Metrics Project [1], there were more than 200K
registered .onion addresses as of May 2020, and on average 2 million users use Tor on a daily basis.
Fig. 1. Divisions of World Wide Web [2]
The Tor network was developed by the United States Naval Research Laboratory in the 1990s for military purposes, as the
onion routing principle provided anonymity for transmitting confidential data with encryption. Later, in 2004, the original
‘The Onion Routing Project’ was released to the public under a free and open source license under the name ‘Tor Project’.