Recent Patents on Engineering
Recent Patents on Engineering, 2022, 16, e150721194823
RESEARCH ARTICLE
Convolutional Neural Network Based Intelligent Advertisement Search Framework for Online English Newspapers
Pooja Jain¹,*, Kavita Taneja¹ and Harmunish Taneja²
¹Department of Computer Science and Applications, Panjab University, Chandigarh 160014, Punjab, India;
²Department of Computer Science and Information Technology, DAV College, Chandigarh, Punjab, India
Abstract: Background: Instant access to desired information is a key element in building an
intelligent environment that creates value for people and steers towards Society 5.0. Online
newspapers are one such example: they provide instant access to information anywhere and anytime
on mobiles, tablets, laptops, desktops, etc. However, online newspapers do not provide easy options
for searching for a specific advertisement. There are also no specialized search portals that
provide keyword-based advertisement search across multiple online newspapers. As a result, finding
a specific advertisement requires a sequential manual search across a range of online newspapers.
Objective: This research paper proposes a keyword-based advertisement search framework that
provides instant access to relevant advertisements from online English newspapers in a category
of the reader’s choice.
Methods: First, an image extraction algorithm is proposed that identifies and extracts images
from online newspapers without relying on any rules about advertisement placement or size. Next,
a proposed deep learning Convolutional Neural Network (CNN) model named ‘Adv_Recognizer’ separates
advertisement images from non-advertisement images. A second CNN model, ‘Adv_Classifier’,
classifies the advertisement images into four pre-defined categories. Finally, an Optical Character
Recognition (OCR) technique is used to perform keyword-based advertisement searches in various
categories across multiple newspapers.
Results: The proposed image extraction algorithm can extract all types of well-bounded images
from different online newspapers and is used to create the ‘English newspaper image dataset’ of
11,000 images, including advertisements and non-advertisements. The proposed ‘Adv_Recognizer’
model separates advertisement and non-advertisement images with an accuracy of around 97.8%, and
the proposed ‘Adv_Classifier’ model classifies the advertisements into four pre-defined categories
with an accuracy of around 73.5%.
Conclusion: The proposed framework will help newspaper readers perform exhaustive advertisement
searches across a range of online English newspapers in a category of their own interest. It
will also help in carrying out advertisement analysis and studies.
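The stages summarized in the abstract (recognize advertisements, classify them, then search their OCR text by keyword) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: `AdRecord`, `build_index`, and `search` are hypothetical names, the three callables are stand-ins for ‘Adv_Recognizer’, ‘Adv_Classifier’, and an OCR engine, and the matching is a simple case-insensitive substring test that the paper may or may not use.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class AdRecord:
    """One advertisement after recognition, classification, and OCR (hypothetical type)."""
    newspaper: str
    category: str
    text: str


def build_index(
    images: Iterable[Tuple[str, object]],
    is_advertisement: Callable[[object], bool],  # stand-in for 'Adv_Recognizer'
    classify: Callable[[object], str],           # stand-in for 'Adv_Classifier'
    run_ocr: Callable[[object], str],            # stand-in for an OCR engine
) -> List[AdRecord]:
    """Run the three stages over (newspaper, image) pairs and keep only advertisements."""
    index: List[AdRecord] = []
    for newspaper, image in images:
        if not is_advertisement(image):
            continue  # non-advertisement images are discarded
        index.append(AdRecord(newspaper, classify(image), run_ocr(image)))
    return index


def search(index: List[AdRecord], keyword: str,
           category: Optional[str] = None) -> List[AdRecord]:
    """Return records whose OCR text contains the keyword (case-insensitive),
    optionally restricted to a single category."""
    needle = keyword.casefold()
    return [
        record for record in index
        if (category is None or record.category == category)
        and needle in record.text.casefold()
    ]
```

Passing the models in as callables keeps the sketch independent of any particular deep learning or OCR library; in practice each callable would wrap a trained model or OCR call, and the category labels would be the four pre-defined categories of the paper.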
ARTICLE HISTORY
Received: January 31, 2021
Revised: April 02, 2021
Accepted: May 18, 2021
DOI: 10.2174/1872212115666210715163919
Keywords: Advertisement image classification, convolutional neural networks (CNN), newspaper advertisements, newspaper
layout segmentation, optical character recognition (OCR), residual networks (ResNet), transfer learning.
*Address correspondence to this author at the Department of Computer Science and Applications, Panjab University, Chandigarh 160014, Punjab, India; E-mail: poojajain9199@gmail.com
2212-4047/22 $65.00+.00 © 2022 Bentham Science Publishers
1. INTRODUCTION
India is one of the largest newspaper markets in the world, with private companies, government
departments, recruitment agencies, educational institutes, etc. using newspaper advertisements as
a primary channel for advertising jobs, tenders, admission notices, sales, promotions, etc. Due to
the rapid increase in internet users every year, the trend has shifted from traditional newspaper
reading to online newspaper reading. Moreover, stay-at-home situations during unprecedented
circumstances such as pandemics, which restrict access to printed newspapers, have boosted this
trend manyfold. For traditional and online newspaper readers alike, the search for relevant
newspaper advertisements is crucial for people waiting for specific advertisements to appear in
the newspapers. Interested students, job seekers, contractors,