International Journal of Artificial Intelligence and Applications (IJAIA), Vol.13, No.1, January 2022

DOI: 10.5121/ijaia.2022.13102

MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL APPROACHES

Manav Agarwal, Shreya Venugopal, Rishab Kashyap and R Bharathi

Department of CSE, PES University, Bangalore, India

ABSTRACT

Movies are among the most prominent contributors to the global entertainment industry today, and from a commercial standpoint they are among its biggest revenue generators. It is therefore valuable to classify films into two categories: successful and unsuccessful. To categorize the movies in this research, a variety of models were used, including regression models such as Simple Linear, Multiple Linear, and Logistic Regression, classification and clustering techniques such as SVM and K-Means, Time Series Analysis, and an Artificial Neural Network. These models were compared on a variety of factors, including their accuracy on the training, validation, and testing datasets, the availability of new movie characteristics, and a variety of other statistical metrics. During the course of this study, it was found that certain characteristics have a greater impact on the likelihood of a film's success than others. For example, the presence of the action genre may have a significant impact on the forecasts, whereas another genre, such as sport, may not. The testing dataset for the models and classifiers was taken from the IMDb website for the year 2020. The Artificial Neural Network, with an accuracy of 86 percent, is the best performing of all the models discussed.

KEYWORDS

Regression Models, Clustering Techniques, Time Series Model, Artificial Neural Network, Movie Success, Statistical Significance.

1.
INTRODUCTION

Movies are one of the most important contributors to the entertainment industry in the world today, and from a commercial perspective they are among the highest revenue-generating businesses [1]. Hence, from the movie industry's point of view, it is important to know whether a planned movie will be successful. Many factors, such as the genre of a movie, its popularity as measured by votes, and critical judgement, influence which movies people choose to watch. To identify a movie that is worth watching, people consult websites such as Metacritic, Rotten Tomatoes, IMDb and many others that publish movie ratings. The analysis therefore studies these user ratings together with the other factors that affect a movie, which helps us identify whether a movie is truly successful or not. This would help those who create movies make better movies and thus earn more revenue. It would also help ensure that the audience gets to watch movies that they enjoy.

We aim to apply the various techniques and tools used to uncover useful information to a variety of movie data, infer useful traits from it, and use those traits to build a movie success predictor. The data is used in predictive models, which include regressors, classifiers and time series models. Each of these models uses the attributes associated with the movies as independent variables to predict the outcome rating of the movie. Crossing a certain pre-set