Research Article

Summarizing Online Movie Reviews: A Machine Learning Approach to Big Data Analytics

Atif Khan,1 Muhammad Adnan Gul,1 M. Irfan Uddin,2 Syed Atif Ali Shah,3 Shafiq Ahmad,4 Muhammad Dzulqarnain Al Firdausi,4 and Mazen Zaindin5

1 Department of Computer Science, Islamia College Peshawar, Peshawar, Pakistan
2 Institute of Computing, Kohat University of Science and Technology, Kohat, Pakistan
3 Faculty of Engineering and Information Technology, Northern University, Nowshehra, Pakistan
4 King Saud University, College of Engineering, Department of Industrial Engineering, Riyadh, Saudi Arabia
5 King Saud University, College of Science, Department of Statistics and Operations Research, Riyadh, Saudi Arabia

Correspondence should be addressed to Shafiq Ahmad; ashafiq@ksu.edu.sa

Received 23 February 2020; Accepted 7 May 2020; Published 1 August 2020

Academic Editor: Shaukat Ali

Copyright © 2020 Atif Khan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Information on the web is growing at an exponential pace, and online movie reviews are becoming a substantial information resource for online users. However, users post millions of movie reviews on a regular basis, and it is not feasible for users to summarize these reviews manually. Movie review classification and summarization is one of the challenging tasks in natural language processing. Therefore, an automatic approach is needed to summarize the vast amount of movie reviews, so that users can quickly distinguish the positive and negative aspects of a movie. This study proposes an approach for movie review classification and summarization.
For movie review classification, a bag-of-words feature extraction technique is used to extract unigrams, bigrams, and trigrams as a feature set from the given review documents and to represent the review documents as a vector space model. Next, the Naïve Bayes algorithm is employed to classify the movie reviews (represented as feature vectors) into positive and negative reviews. For the task of movie review summarization, a Word2vec feature extraction technique is used to extract features from the classified movie review sentences, and then a semantic clustering technique is used to cluster semantically related review sentences. Different text features are used to calculate the salience score of each review sentence in the clusters. Finally, the top-ranked sentences are chosen based on the highest salience scores to produce the extractive summary of the movie reviews. Experimental results indicate that the proposed machine learning approach outperforms other state-of-the-art approaches.

1. Introduction

With the expansion of Web 2.0, which emphasizes the involvement of users, many websites, such as the Internet Movie Database (IMDB) and Amazon, encourage their users to write reviews of the products they liked or purchased in order to enhance the shopping experience and satisfaction of customers. Online sellers often ask their customers to provide opinions or reviews on the products or services they purchased online. The number of reviews received by a product increases quickly as millions of customers post reviews about it, which results in information overload. This information overload makes it challenging for a potential customer to scan each review of a product in order to decide quickly whether to purchase it. At the same time, it is also hard for service providers, online merchants, and product manufacturers to keep track of the huge number of reviews posted by customers about their services or products [1].
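The classification step summarized in the abstract (n-gram bag-of-words features followed by Naïve Bayes) can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the multinomial variant with add-one smoothing, the names `ngrams` and `NaiveBayes`, and the toy reviews are all assumptions made here for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, max_n=3):
    """Return unigram, bigram, and trigram features for one review.

    Assumption: lowercase + whitespace tokenization, chosen for brevity.
    """
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (assumed variant)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # number of reviews per class
        self.feat_counts = defaultdict(Counter)   # n-gram counts per class
        for doc, label in zip(docs, labels):
            self.feat_counts[label].update(ngrams(doc))
        self.vocab = {f for c in self.feat_counts.values() for f in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feat_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for feat in ngrams(doc):
                if feat in self.vocab:  # skip n-grams never seen in training
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up reviews for illustration only (not the paper's dataset).
train_docs = [
    "a wonderful and moving film",
    "great acting and a great story",
    "boring plot and terrible acting",
    "a dull and terrible film",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(train_docs, train_labels)
prediction = model.predict("a great and moving story")
```

Working in log space avoids numerical underflow when many n-gram probabilities are multiplied, and the smoothing term keeps an n-gram that appears in only one class from driving the other class's probability to zero.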
In order to overcome the challenge of information overload, an automatic review classification and summarization system is needed [2]. In this study, we focus on the movie review domain. For movies, summarizing the thousands of reviews received by a movie can help the viewers (customers) to swiftly scan the summary and promptly make a decision

Hindawi, Scientific Programming, Volume 2020, Article ID 5812715, 14 pages, https://doi.org/10.1155/2020/5812715