Adaptive stock trading with dynamic asset allocation using reinforcement learning

Jangmin O a,*, Jongwoo Lee b, Jae Won Lee c, Byoung-Tak Zhang a

a School of Computer Science and Engineering, Seoul National University, San 56-1, Shillim-dong, Kwanak-gu, Seoul 151-742, Republic of Korea
b Department of Multimedia Science, Sookmyung Women’s University, Chongpa-dong, Yongsan-gu, Seoul 140-742, Republic of Korea
c School of Computer Science and Engineering, Sungshin Women’s University, Dongsun-dong, Sungbuk-gu, Seoul 136-742, Republic of Korea

* Corresponding author. Tel.: +82 2 880 1847; fax: +82 2 875 2240.
E-mail addresses: jmoh@bi.snu.ac.kr (J. O), bigrain@sookmyung.ac.kr (J. Lee), jwlee@cs.sungshin.ac.kr (J.W. Lee), btzhang@cse.snu.ac.kr (B.-T. Zhang).

Information Sciences 176 (2006) 2121–2147
www.elsevier.com/locate/ins
doi:10.1016/j.ins.2005.10.009
0020-0255/$ - see front matter © 2005 Elsevier Inc. All rights reserved.

Received 4 December 2003; received in revised form 11 October 2005; accepted 14 October 2005

Abstract

Stock trading is an important decision-making problem that involves both stock selection and asset management. Though many promising results have been reported for predicting prices, selecting stocks, and managing assets using machine-learning techniques, considering all of them is challenging because of their complexity. In this paper, we present a new stock trading method that incorporates dynamic asset allocation in a reinforcement-learning framework. The proposed asset allocation strategy, called meta policy (MP), is designed to utilize the temporal information from both stock recommendations and the ratio of the stock fund over the asset. Local traders are constructed with pattern-based multiple predictors, and used to decide the purchase money per recommendation. Formulating the MP in the reinforcement learning framework is achieved by a compact design of the environment and the learning agent. Experimental results