International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 05 | May 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1733

Visualizing and Forecasting Stocks Using Machine Learning

Harshal Pujari, Akshata Ubale, Shubham Patil, Atharva Shrivastav, Prof. Vrushali Kondhalkar
Students, Dept. of Computer Engineering, Jayawantrao Sawant College of Engineering, Pune, Maharashtra, India

Abstract - Accurately predicting stock market returns is a very challenging task because of the market's unpredictable, erratic, and rapidly changing nature. With the introduction of Machine Learning techniques, automated prediction of stock market returns has proved to be more effective. Although various Machine Learning models can support prediction, this work focuses mainly on the use of the Regression model and the LSTM model for the prediction of stock market returns.

Key Words: LSTM, Prediction, Stock, Regression, Dataset

1. INTRODUCTION

The stock market is dynamic, unpredictable, and unsystematic in nature. Many factors affect stock prices, such as political conditions and the global economy, which makes stock prediction a challenging task. Using Machine Learning techniques to predict stock values by observing trends could therefore prove highly effective [1].

In Machine Learning, the dataset is the most significant part. Even a small change in the data can cause large changes in the results, so the data should be as clean and reliable as possible. For this work, the dataset is obtained from Yahoo Finance, a part of the Yahoo network that provides access to stock datasets for many companies. The regression model and the LSTM model are considered for this work.
Regression serves to keep the prediction error as low as possible, while LSTM retains a memory of earlier data and results so that they can be used over the long run. Graphs of the actual and predicted stock values are plotted for both the regression and the LSTM techniques.

The remainder of the paper is organized as follows. Section 2 discusses the related work. Section 3 presents the two models and the methods used in them in detail. Section 4 discusses the results, with plots for both models. Section 5 gives the conclusion, and the last section contains the references.

2. RELATED WORK

From the literature survey, it is clear that machine learning techniques are applied to stock market prediction across the world. Compared with other contemporary prediction techniques, these techniques are much more accurate.

The model developed by Kim and Han in [2] combines artificial neural networks (ANN) and genetic algorithms (GA), with discretized features, to predict the stock price index. Their data included technical indicators and the daily Korea stock price index (KOSPI), covering 2928 trading days from January 1989 to December 1998. They applied optimization of feature discretization, a technique akin to dimensionality reduction, and used the GA to enhance the ANN. A limitation of their work is that they focused on only two factors in the optimization. They still believed that the GA has substantial potential for optimizing feature discretization.

Qiu and Song in [3] also introduced a solution based on an optimized ANN model. The authors combined genetic algorithms with an artificial neural network and named the result the GA-ANN model.

For data mining applications, Piramuthu in [4] conducted an in-depth evaluation of various feature selection methods.
The datasets used included credit approval data and the Tam and Kiang data. The study compared how various feature selection methods optimized decision tree performance. The methods compared included the probabilistic distance measures: the Bhattacharyya measure, the Mahalanobis distance measure, the Matusita measure, the divergence measure, and the Patrick-Fisher measure. An advantage of this paper is that the author analyzed both kinds of feature selection methods, i.e., probabilistic distance-based methods and several inter-class methods. Another strength is that the evaluation was performed on different datasets. However, only a decision tree was used as the evaluation algorithm, so it is difficult to conclude whether the feature selection methods would behave the same way on a larger, more complex dataset or model.

Hassan and Nath in [5] forecasted the stock prices of four different airlines using a Hidden Markov Model (HMM). The states of the model were reduced to four: the opening price, the closing price, the highest price, and the lowest price. A strength of this paper is that the approach needs no expert knowledge to design a prediction system. On the other hand, the dataset used for training and testing the model is very small: the data range covers at most 2 years.
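
As a rough illustration of how an HMM assigns a likelihood to a price sequence, the sketch below implements the forward algorithm in Python. It is not the model of [5]: the two hidden states (bull, bear), the up/down observation symbols, the helper names, and every probability are assumptions chosen only to keep the example small.

```python
# Hypothetical two-state HMM; every number below is made up for illustration.
STATES = ("bull", "bear")
START = {"bull": 0.5, "bear": 0.5}
TRANS = {
    "bull": {"bull": 0.8, "bear": 0.2},
    "bear": {"bull": 0.3, "bear": 0.7},
}
EMIT = {
    "bull": {"up": 0.7, "down": 0.3},
    "bear": {"up": 0.4, "down": 0.6},
}


def closing_moves(closes):
    """Turn closing prices into 'up'/'down' symbols (ties count as 'down')."""
    return ["up" if later > earlier else "down"
            for earlier, later in zip(closes, closes[1:])]


def sequence_likelihood(observations):
    """Return P(observations) under the HMM above via the forward algorithm."""
    if not observations:
        return 1.0
    # alpha[s] = P(observations seen so far, current hidden state is s)
    alpha = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    for obs in observations[1:]:
        alpha = {
            s: EMIT[s][obs] * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())


if __name__ == "__main__":
    moves = closing_moves([101.0, 102.5, 102.0, 103.1])
    print(moves, sequence_likelihood(moves))
```

In practice the transition and emission probabilities would be estimated from historical data rather than fixed by hand, and a continuous-observation HMM would be needed to work directly with opening, closing, highest, and lowest prices.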