A Machine Learning Approach for Stock Price Prediction

Carson Kai-Sang Leung, Richard Kyle MacKinnon, Yang Wang
University of Manitoba, Winnipeg, MB, Canada
kleung@cs.umanitoba.ca

ABSTRACT
Data mining and machine learning approaches can be incorporated into business intelligence (BI) systems to support user decision making in many real-life applications. In this paper, we propose a machine learning approach for BI applications. Specifically, we apply structural support vector machines (SSVMs) to perform classification on complex inputs such as the nodes of a graph structure. We connect collaborating companies in the information technology sector in a graph structure and use an SSVM to predict positive or negative movement in their stock prices. The complexity of the SSVM cutting plane optimization problem is determined by the complexity of the separation oracle. We show that (i) the separation oracle performs a task equivalent to maximum a posteriori (MAP) inference and (ii) a minimum graph cutting algorithm can solve this problem in the stock price case in polynomial time. Experimental results show the practicability of our proposed machine learning approach in predicting stock prices.

Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications - data mining; I.2.6 [Artificial Intelligence]: Learning - parameter learning; J.1 [Computer Applications]: Administrative data processing - financial

General Terms
Algorithms; Experimentation; Validation

Keywords
Business intelligence (BI), data mining, finance, graph labeling, machine learning, minimum graph-cuts, stock price prediction, structural support vector machine (SSVM), support vector machine (SVM)

Corresponding author: C.K.-S. Leung.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.
IDEAS’14, July 07–09, 2014, Porto, Portugal.
Copyright is held by the owner/author(s). Publication rights licensed to ACM.
ACM 978-1-4503-2627-8/14/07 ...$15.00.
http://dx.doi.org/10.1145/2628194.2628211

1. INTRODUCTION
Data mining [4, 10, 11, 12, 18] and machine learning [13, 14, 17] approaches can be incorporated into business intelligence (BI) systems to support user decision making in many real-life applications. One interesting BI application is stock price prediction. In general, making predictions [3], including stock price predictions, is a difficult problem. A recent statistic showed that, from 2008 to 2012, only 9.84% of Canadian equity fund managers achieved better returns than passively managed funds based on the S&P/TSX Composite Index [6]. In other words, more than 90% of the funds in which stocks were actively selected by fund managers failed to outperform the market as a whole. Improvement in this area is desirable.

Algorithms that predict stock prices more accurately offer a huge financial incentive to professionals who have access to stock prices. Aside from the potential for creating multi-millionaires, such an algorithm would have other benefits as well. One such benefit is to root out bad investments that are destined to fail, reducing the chance of major disruptions and market crashes when they do.
Another benefit is that a successful algorithm could be adapted to other domains with similar problem requirements.

The key contribution of this paper is our proposal of a machine learning approach based on a structural support vector machine (SSVM), with maximum a posteriori (MAP) inference calculated using minimum graph cuts.

The remainder of this paper is organized as follows. Background is provided in Section 2. Section 3 describes our proposed SSVM with MAP inference calculated using minimum graph cuts. Specifically, we describe the graph structure, feature vectors, and training labels. Experimental results in Section 4 show the effectiveness (especially the accuracy) of our proposal. Finally, conclusions are given in Section 5.

2. BACKGROUND & RELATED WORKS
Structural support vector machines (SSVMs) allow us to classify data that are not linearly separable using a linear hyperplane, by means of a max-margin approach together with a slack variable for each training example. The general problem can be expressed mathematically by the following equations [5]:

$$\min_{w,\,\xi \ge 0} \left( \frac{1}{2}\|w\|^2 + \frac{C}{n}\sum_{i=1}^{n}\xi_i \right), \tag{1}$$

and

$$\forall i,\ \forall y \in \mathcal{Y} \setminus y_i:\quad w^T \Psi(x_i, y_i) \ge w^T \Psi(x_i, y) + \Delta(y_i, y) - \xi_i, \tag{2}$$

where (i) $w$ is the parameter learnt by the SSVM during training, (ii) $C$ is a parameter set by the user that describes the magnitude