International Journal of Innovative Technology and Exploring Engineering (IJITEE)
ISSN: 2278-3075, Volume-9 Issue-1, November 2019
Published By:
Blue Eyes Intelligence Engineering
& Sciences Publication
Retrieval Number: A5262119119/2019©BEIESP
DOI: 10.35940/ijitee.A5262.119119
Visualization of Optimal Product Pricing using E-Commerce Data
N Greeshma, C Raghavendra, K Rajendra Prasad
ABSTRACT: With the number of e-commerce websites increasing
rapidly, online shopping has become a trend. Although online
shopping is easy, selecting a product is tedious and
time-consuming, because the buyer must identify which online site
gives the best price and offers. Comparing products and filtering
them across online sites takes a buyer considerable time. This
paper uses web scraping with Python libraries such as Beautiful
Soup, Requests, and Matplotlib to identify the best prices and the
best product deal for the customer across different online
websites. Web scraping is an automated technique for extracting
data from websites. In this paper, real-time data is extracted
from various e-commerce sites and compared automatically. Finally,
the results are displayed graphically so that the customer can
make an appropriate decision.
Keywords: Web Scraping, e-commerce data extraction,
Python libraries.
I. INTRODUCTION
Generally, a web browser is used to search for information
on the internet. Browsers offer a simple way to view and
access different websites. Websites contain huge amounts of
data in unstructured form, and the useful data is mixed with
a lot of junk. To obtain only the useful and appropriate
information from a website, the relevant data has to be
extracted. This can be achieved with web scraping
techniques. The method of extracting information from
websites is known as “Web scraping” [6].
Web scraping is also termed “Web Harvesting”, “Web Data
Extraction”, or “Web Data Mining”. It is an automated
technique for extracting large amounts of data quickly and
easily. The collected data is stored in a structured format
(such as CSV files or databases).
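The idea of storing scraped data in a structured format can be sketched as follows. This is a minimal illustration using Python's standard csv module, not the paper's own code; the site names, product name, prices, and the helper write_records are hypothetical.

```python
import csv
import io

# Hypothetical scraped records; site names and prices are made up for illustration.
records = [
    {"site": "site-a.example", "product": "Phone X", "price": 24999.0},
    {"site": "site-b.example", "product": "Phone X", "price": 23499.0},
]


def write_records(rows, stream):
    """Write scraped rows to a text stream as CSV with a header line."""
    writer = csv.DictWriter(stream, fieldnames=["site", "product", "price"])
    writer.writeheader()
    writer.writerows(rows)


# An in-memory buffer stands in for a file opened with open(path, "w", newline="").
buffer = io.StringIO()
write_records(records, buffer)
csv_text = buffer.getvalue()
```

Writing to a stream rather than a fixed path keeps the sketch independent of the file system; the same function works unchanged with a real file object.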
Across the world, some commercial web page administrators
consider web scraping legal and some don't. The legality of
web scraping depends entirely on the web page
administrators: if they agree, they allow people to access
the data of a particular website [1].
Revised Manuscript Received on November 08, 2019.
N Greeshma*, Dept. of CSE, Institute of Aeronautical Engineering,
Dundigal, Hyderabad, India. Email: greeshmanalla@gmail.com
C Raghavendra*, Asst. Professor, Dept. of CSE, Institute of
Aeronautical Engineering, Dundigal, Hyderabad, India. Email:
crg.svch@gmail.com
Dr. K Rajendra Prasad*, Professor and Head, Dept. of CSE, Institute
of Aeronautical Engineering, Dundigal, Hyderabad, India. Email:
krprgm@gmail.com
Fig 1: Data from unstructured to structured format
through Web Scraping
Internet businesses are easy to start and carry low risk to
maintain. People prefer to establish an online store because
of low taxes, no crowds, more variety, early updates, and so
on. But as the number of e-commerce services increases,
customers tend to spend a lot of time comparing the price,
rating, features, and delivery time of a product.
According to [2], 54% of Internet use is by people looking
for information about merchandise or services, 48% of
searches are for educational purposes, 40% are for health
and clinical data, 28% are job-seeking actions, and 24% are
for government and law administrative organizations. This
paper discusses one way of extracting data from e-commerce
websites and presenting it on the customer's screen, which
helps the customer filter out a huge amount of irrelevant
data. Web scraping can be implemented in many programming
languages, such as Python, Node.js, PHP, Ruby, and C++. This
paper uses Python for web scraping because Python is well
suited to further data processing, is easy to implement in,
and has many open-source frameworks and libraries such as
Beautiful Soup, Requests, Pandas, and Matplotlib.
II. METHODS
Python is an open-source, general-purpose language with a
good interactive environment. It supports object-oriented,
procedural, and functional programming and offers a large
number of modules and libraries [3].
The Requests [10] library is used for making HTTP requests in
Python to access web pages. It returns the raw HTML of a web
page, which can then be parsed to retrieve the data.
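Fetching the raw HTML of a page with Requests might look like the sketch below. The function name fetch_html, the User-Agent string, and the timeout value are assumptions made for illustration and do not come from the paper; only Session.get, raise_for_status, and the text attribute are Requests calls.

```python
def fetch_html(url, session=None, timeout=10):
    """Return the raw HTML of `url` as text, raising on HTTP errors."""
    if session is None:
        # Imported lazily so the function can also be exercised offline
        # by passing in a stand-in session object.
        import requests

        session = requests.Session()
    response = session.get(
        url,
        headers={"User-Agent": "price-compare-sketch/0.1"},  # hypothetical value
        timeout=timeout,
    )
    response.raise_for_status()  # raises an exception for 4xx/5xx responses
    return response.text
```

A call such as fetch_html("https://shop.example/product/123"), where the URL is a placeholder, would return the page source as a string ready for parsing. Whether a given site permits automated access should be checked before running such a request.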
Beautiful Soup [9] is a popular Python library that parses a
web page and provides a convenient interface for navigating
its content. It pulls data out of HTML and XML files. With
simple commands, Beautiful Soup can parse content from
within an HTML container [7]. It is considered an advanced
library for web scraping and can be installed by issuing
‘pip install beautifulsoup4’ at the command prompt.
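Parsing content out of an HTML container with Beautiful Soup can be sketched as follows. The HTML snippet, its class names (product, title, price), and the helper parse_product are hypothetical; real e-commerce pages use their own markup, so the selectors would have to be adapted per site.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical product markup; real sites use different tags and class names.
SAMPLE_HTML = """
<div class="product">
  <span class="title">Phone X</span>
  <span class="price">Rs. 24,999</span>
</div>
"""


def parse_product(html):
    """Extract the product title and numeric price from a product container."""
    soup = BeautifulSoup(html, "html.parser")  # parser bundled with Python
    container = soup.find("div", class_="product")
    title = container.find("span", class_="title").get_text(strip=True)
    price_text = container.find("span", class_="price").get_text(strip=True)
    # Take the first run of digits (with optional thousands separators and decimals).
    match = re.search(r"\d[\d,]*(?:\.\d+)?", price_text)
    price = float(match.group().replace(",", ""))
    return {"title": title, "price": price}


product = parse_product(SAMPLE_HTML)
```

Converting the price text to a number at parse time makes later comparison across sites a simple numeric comparison.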
Matplotlib is a free, open-source, user-friendly
visualization library in Python for 2D plots of data arrays. It
is a multi-platform data visualization library built on NumPy
arrays. One of the greatest benefits of visualization is that
it allows us visual access to