An introduction to the use of hidden Markov models for stock return analysis

Chun Yu Hong *, Yannik Pitcan †

December 4, 2015

Abstract

We construct two HMMs to model the stock returns for every 10-day period. Our first model treats volatility as the hidden state and uses a mean-zero Gaussian distribution as the emission distribution for the stock returns; it uses the Baum-Welch algorithm for inference about volatility. Our second model uses a spectral algorithm to forecast stock returns. We also analyze the tradeoffs between these two implementations.

1 Introduction

Hidden Markov models (HMMs) are known for their applications to speech processing and pattern recognition. They are attractive models for discrete time series analysis because of their simple structure. It is therefore not surprising that there has been research on the applications of HMMs to finance.

Hassan and Nath (2005) use an HMM to forecast the price of airline stocks. The goal is to predict the next day's closing price based on today's opening, closing, highest, and lowest prices. The performance of the HMM is similar to that of artificial neural networks (ANNs).

O et al. (2004) propose a three-level hierarchical HMM to model the dynamics of stock prices. The first level consists of the hidden states that describe the trend of the stocks: strong bear, weak bear, random walk, weak bull and strong bull. The second level consists of the hidden states responsible for the components of a Gaussian mixture. The third level consists of the outputs: the relative closing prices, defined as the percent change in closing price relative to the previous closing price.

Since many of these HMMs for stock returns focus on forecasting, we decide to introduce a very simple HMM for performing inference about volatility changes.
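As a rough illustration of the model class described in the abstract (hidden volatility states with mean-zero Gaussian emissions), the following Python sketch simulates returns from a small hidden Markov chain and computes filtered state probabilities with the normalized forward recursion. The two-state setup, the transition matrix, the volatility levels, and all function names are assumptions chosen for illustration; they are not the parameters or code used in this paper, and the sketch does not perform Baum-Welch estimation (the parameters are fixed by hand).

```python
import math
import random

# Hypothetical parameters: state 0 = low volatility, state 1 = high volatility.
INIT = [0.5, 0.5]                      # initial state distribution (assumed)
TRANS = [[0.95, 0.05], [0.10, 0.90]]   # row i: P(next state | current state i)
SIGMAS = [0.01, 0.03]                  # return standard deviation in each state


def simulate(n, trans, sigmas, seed=0):
    """Simulate n hidden states and the matching mean-zero Gaussian returns."""
    rng = random.Random(seed)
    state = 0
    states, returns = [], []
    for _ in range(n):
        states.append(state)
        # Emission: the return is N(0, sigma_state^2).
        returns.append(rng.gauss(0.0, sigmas[state]))
        # Transition: draw the next state from the current state's row.
        state = rng.choices(range(len(sigmas)), weights=trans[state])[0]
    return states, returns


def gaussian_pdf(x, sigma):
    """Density of N(0, sigma^2) evaluated at x."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def filter_states(returns, init, trans, sigmas):
    """Normalized forward recursion: P(state_t | returns observed up to t)."""
    k = len(init)
    prior = list(init)
    filtered = []
    for r in returns:
        # Weight each state's prior probability by its emission density.
        joint = [prior[j] * gaussian_pdf(r, sigmas[j]) for j in range(k)]
        total = sum(joint)
        post = [p / total for p in joint]
        filtered.append(post)
        # Propagate one step through the transition matrix.
        prior = [sum(post[i] * trans[i][j] for i in range(k)) for j in range(k)]
    return filtered


if __name__ == "__main__":
    states, returns = simulate(20, TRANS, SIGMAS, seed=1)
    for s, r, p in zip(states, returns, filter_states(returns, INIT, TRANS, SIGMAS)):
        print(f"true state={s}  return={r:+.4f}  P(high vol)={p[1]:.3f}")
```

With these assumed values, a return that is large in magnitude relative to the low-volatility standard deviation pushes the filtered probability toward the high-volatility state, which is the kind of interpretation a volatility-as-hidden-state model is meant to support.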
The idea of using HMMs for volatility analysis is not new: there are a few existing papers on HMM-GARCH (generalized autoregressive conditional heteroskedasticity) models for volatility forecasts (for example, Zhuang and Chan, 2004; Rossi and Gallo, 2005). However, these models are often too complex to be interpreted properly. Our model allows simple and natural interpretations yet provides important insight into the heteroskedastic nature of

* Department of Statistics, UC Berkeley: jcyhong@berkeley.edu
† Department of Statistics, UC Berkeley: pitcany@berkeley.edu