International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 05 | May 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1733
Visualizing and Forecasting Stocks Using Machine Learning
Harshal Pujari, Akshata Ubale, Shubham Patil, Atharva Shrivastav, Prof. Vrushali Kondhalkar
Students, Dept. of Computer Engineering, Jayawantrao Sawant College of Engineering, Pune, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Accurately predicting stock market returns is a
very challenging task because of the market's unpredictable, random, and
rapidly changing nature. With the introduction of Machine Learning
techniques, automated prediction of stock market
returns has proved to be more effective. Although there
are various Machine Learning models that support prediction,
this work mainly focuses on the use of the Regression
model and the LSTM model for the prediction of stock market
returns.
Key Words: LSTM, Prediction, Stock, Regression,
Dataset, etc.
1. INTRODUCTION
The stock market is characterized as dynamic,
random, and unsystematic in nature. Various
factors affect stock prices, such as political conditions and the
global economy, making stock prediction a challenging task.
Thus, using Machine Learning techniques to predict stock
values by observing trends could prove highly effective [1].
In Machine Learning, the dataset is the most significant part.
Even a small change in the data can cause large
changes in the results. So, the data should be as clean and reliable as
possible. For this work, the dataset is obtained from Yahoo
Finance. Yahoo Finance is a part of the Yahoo network that
provides access to stock datasets for several companies.
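As a rough illustration of the data preparation step, the sketch below parses a CSV in the column layout commonly seen in Yahoo Finance historical-data exports (Date, Open, High, Low, Close, Adj Close, Volume). The sample rows are made-up placeholder values, the function name `load_closes` is hypothetical, and the handling of missing values as non-numeric text is an assumption rather than the procedure used in this work.

```python
import csv
import io

# Made-up sample rows in a Yahoo-Finance-style column layout;
# the numbers are placeholders, not real quotes.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2022-01-03,100.0,102.5,99.5,101.0,101.0,1000000
2022-01-04,101.0,103.0,100.0,102.0,102.0,1200000
2022-01-05,102.0,102.5,98.0,null,null,0
2022-01-06,99.0,101.0,98.5,100.5,100.5,900000
"""


def load_closes(text):
    """Return (date, close) pairs, skipping rows whose close is not numeric."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        try:
            rows.append((rec["Date"], float(rec["Close"])))
        except ValueError:
            # Assumed convention: missing values appear as text such as "null".
            continue
    return rows


closes = load_closes(SAMPLE_CSV)
```

Dropping incomplete rows before training is one simple way to keep the data clean; interpolation would be an alternative when gaps are frequent.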
The regression model and LSTM model are considered for
this work. Regression aims to keep the prediction error
as low as possible, while LSTM retains memory of earlier data and
results so that they can be used over the long run. The actual
and predicted stock values are plotted for both the
regression and LSTM techniques. The remainder of the paper
is organized as follows: Section 2 discusses the related
work. Section 3 presents the two models and the
methods used in them in detail. Section 4 discusses the
results, with plots for both models, in
detail. Section 5 presents the conclusion, and the last section
contains the references.
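To make the idea of regression keeping errors low more concrete, the following minimal sketch fits an ordinary least-squares line that predicts the next closing price from the previous one. The price series is illustrative, the helper name `fit_line` is hypothetical, and the single lagged feature is an assumption made for brevity, not the feature set used in this work.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Illustrative closing prices (not real data).
closes = [100.0, 101.5, 101.0, 103.0, 102.5, 104.0]

# Predict each day's close from the previous day's close.
xs, ys = closes[:-1], closes[1:]
a, b = fit_line(xs, ys)
predicted = [a * x + b for x in xs]

# Mean squared error between predicted and actual values.
mse = sum((p - y) ** 2 for p, y in zip(predicted, ys)) / len(ys)
```

The lists `ys` and `predicted` are the two series one would plot against each other to compare actual and predicted values.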
2. RELATED WORK
From the literature survey, it is clear that machine
learning techniques are applied to stock market prediction
across the world. Compared to contemporary prediction
techniques, these techniques are reported to be much more accurate.
The model developed by Kim and Han in [2] is a blend of
artificial neural networks (ANN) and genetic algorithms
(GAs). They discretized the features for predicting the stock
price index. They incorporated data from technical indicators
and the daily Korea stock price index (KOSPI). The data
covered 2928 trading days, from January
1989 to December 1998. They applied optimization of feature
discretization, a technique akin to dimensionality
reduction, and used genetic algorithms (GAs) to
enhance the artificial neural network (ANN). A limitation
of their work is that they focused on only two factors in the
optimization. They believed that genetic algorithms have
substantial potential for optimizing feature
discretization.
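Feature discretization, as discussed above, maps a continuous indicator value to a small set of categories using cut points. The sketch below shows only this mapping step. The cut points and indicator values are hypothetical, and in a GA-based scheme the cut points would be among the quantities the search tunes rather than fixed by hand.

```python
from bisect import bisect_right


def discretize(value, thresholds):
    """Map a continuous value to a bin index given sorted cut points."""
    return bisect_right(thresholds, value)


# Hypothetical cut points for an indicator on a 0-100 scale.
thresholds = [30.0, 70.0]

# Illustrative indicator values (not real data).
values = [12.5, 30.0, 55.0, 71.2]
bins = [discretize(v, thresholds) for v in values]
```

With two cut points, each value falls into one of three bins (0, 1, or 2), which reduces the range of inputs the network has to handle.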
Qiu and Song in [3] also introduced a solution that was
based on an optimized artificial neural network (ANN) model.
In this work, the authors combined genetic algorithms with
an artificial neural network-based model and named it the
GA-ANN model.
For data mining applications, Piramuthu in [4] conducted
an in-depth evaluation of various feature selection methods.
The datasets used were credit approval data and the Tam and Kiang
data. The study compared how various feature selection
methods optimized decision tree performance. The feature
selection methods compared included probabilistic distance measures: the
Bhattacharyya measure, the Mahalanobis distance measure,
the Matusita measure, the divergence measure, and the
Patrick-Fisher measure. An advantage of
this paper is that the author analyzed both kinds of feature selection
methods, i.e., probabilistic distance-based methods and several
inter-class feature selection methods. Another strength is that the
evaluation was performed on different datasets.
However, only a decision tree was used as the
evaluation algorithm in this work, so it is difficult to conclude whether the
feature selection methods would work the same way on
larger and more complex datasets or models.
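As an illustration of how a probabilistic distance measure can rank features, the sketch below computes the Bhattacharyya distance between the per-class histograms of each feature and orders the features by class separation. The histograms and feature names are hypothetical, and the discrete form of the measure is used here for simplicity; it is not a reproduction of the procedure in [4].

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions.

    Assumes p and q each sum to 1 and overlap in at least one bin;
    fully disjoint distributions would make the logarithm undefined.
    """
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(coefficient)


# Hypothetical per-class histograms (class 0, class 1) for two features.
feature_hists = {
    "feature_a": ([0.7, 0.2, 0.1], [0.1, 0.2, 0.7]),
    "feature_b": ([0.4, 0.3, 0.3], [0.3, 0.3, 0.4]),
}

scores = {
    name: bhattacharyya_distance(p, q)
    for name, (p, q) in feature_hists.items()
}

# A larger distance means the feature separates the two classes better.
ranking = sorted(scores, key=scores.get, reverse=True)
```

A selection step would then keep the top-ranked features and pass only those to the learning algorithm.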
Hassan and Nath in [5] forecasted the stock
prices of four different airlines. They used the Hidden Markov
Model (HMM) for prediction. The states of the model were
reduced to four: the opening price, the closing
price, the highest price, and the lowest price. A strength of
this paper is that the approach requires no expert knowledge
to design a prediction system. On the other hand,
the dataset used for training and testing the
model is quite small: the training and testing data span a
maximum of 2 years.
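For readers unfamiliar with HMMs, the sketch below shows the forward algorithm, which computes the likelihood of an observation sequence under a given model. It is a simplified illustration with discrete observations (0 for a down day, 1 for an up day) and hypothetical parameters; the model in [5] works with price values rather than such symbols, so this is not a reproduction of that approach.

```python
def forward_likelihood(observations, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM."""
    n_states = len(start)
    # Initialization: start probability times emission of first symbol.
    alpha = [start[s] * emit[s][observations[0]] for s in range(n_states)]
    # Recursion: propagate through transitions, then weight by emission.
    for obs in observations[1:]:
        alpha = [
            sum(alpha[prev] * trans[prev][s] for prev in range(n_states))
            * emit[s][obs]
            for s in range(n_states)
        ]
    # Termination: sum over all possible final hidden states.
    return sum(alpha)


# Hypothetical two-state model; all probabilities are illustrative.
start = [0.5, 0.5]
trans = [[0.8, 0.2], [0.3, 0.7]]   # trans[i][j] = P(next = j | current = i)
emit = [[0.9, 0.1], [0.2, 0.8]]    # emit[s][o] = P(observation o | state s)

likelihood = forward_likelihood([1, 1, 0], start, trans, emit)
```

A likelihood-based predictor can compare the likelihood of the most recent observations with that of similar past sequences to produce a forecast.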