To appear in Quantitative Finance, Vol. 00, No. 00, Month 20XX, 1–24

Gaussian Process-Based Algorithmic Trading Strategy Identification

Steve Y. Yang, Qifeng Qiao, Peter A. Beling, William T. Scherer, and Andrei A. Kirilenko

The University of Virginia, 151 Engineer’s Way, P.O. Box 400747, Charlottesville, VA 22904
MIT Sloan School of Management, 50 Memorial Drive, Cambridge, Massachusetts 02142

(DRAFT v1.2 released November 2012)

Many market participants now employ algorithmic trading, commonly defined as the use of computer algorithms to automatically make certain trading decisions, submit orders, and manage those orders after submission. Identifying and understanding the impact of algorithmic trading on financial markets has become a critical issue for market operators and regulators. Advanced data feeds and audit trail information from market operators now make full observation of market participants’ actions possible. A key question is the extent to which it is possible to understand and characterize the behavior of individual participants from observations of their trading actions.

In this paper, we consider the basic problems of categorizing and recognizing traders (or, equivalently, trading algorithms) on the basis of observed limit orders. These problems are of interest to regulators engaged in strategy identification for the purposes of fraud detection and policy development. Methods have been suggested in the literature for describing trader behavior using classification rules defined over a feature space consisting of summary trading statistics, such as volume and inventory, and derived variables that reflect the consistency of buying or selling behavior.
Our principal contribution is to suggest an entirely different feature space, one constructed by inferring key parameters of a sequential optimization model that we take as a surrogate for the decision-making process of the traders. In particular, we model trader behavior in terms of a Markov decision process (MDP). We infer the reward (or objective) function for this process from observations of trading actions using a technique from machine learning known as inverse reinforcement learning (IRL). The reward functions learned through IRL then constitute a feature space that can be the basis for supervised learning (for classification or recognition of traders) or unsupervised learning (for categorization of traders). Making use of a real-world data set from the E-Mini futures contract, we compare two principal IRL variants, linear IRL (LIRL) and Gaussian Process IRL (GPIRL), against a method based on summary trading statistics. Results suggest that IRL-based feature spaces support accurate classification and meaningful clustering. Further, we argue that, because they attempt to learn traders’ underlying value propositions under different market conditions, the IRL methods are more informative and robust than the summary-statistic-based approach and are well suited to discovering new behavior patterns of market participants.

Keywords: Inverse Reinforcement Learning; Gaussian Process; High Frequency Trading; Algorithmic Trading; Behavioral Finance; Markov Decision Process; Support Vector Machine

Corresponding authors. Email: yy6a@virginia.edu and ak67@mit.edu
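The pipeline the abstract outlines — model each trader as an MDP, recover a reward function by IRL, then treat recovered rewards as features for classification — can be illustrated in miniature. The sketch below is purely hypothetical: the toy two-state MDP, the reward vectors, and the nearest-centroid classifier stand in for the paper's actual order-book state space, GPIRL inference, and support vector machine, none of which are reproduced here.

```python
# Purely illustrative sketch; the paper's methods (GPIRL, LIRL, SVM,
# order-book state spaces) are far richer. The toy MDP, reward vectors,
# and class centroids below are hypothetical.

def value_iteration(n_states, actions, transition, reward, gamma=0.9, tol=1e-6):
    """Forward problem: given a reward vector, compute the optimal policy.

    transition[s][a] maps next-state -> probability.
    """
    V = [0.0] * n_states
    while True:
        V_new = [
            max(sum(p * (reward[s2] + gamma * V[s2])
                    for s2, p in transition[s][a].items())
                for a in actions)
            for s in range(n_states)
        ]
        if max(abs(u - v) for u, v in zip(V, V_new)) < tol:
            break
        V = V_new
    policy = [
        max(actions,
            key=lambda a: sum(p * (reward[s2] + gamma * V[s2])
                              for s2, p in transition[s][a].items()))
        for s in range(n_states)
    ]
    return V, policy


def nearest_centroid(reward_vec, centroids):
    """Classify a trader by the closest labelled centroid in reward space."""
    sq_dist = lambda u, v: sum((a - b) ** 2 for a, b in zip(u, v))
    return min(centroids, key=lambda label: sq_dist(reward_vec, centroids[label]))


# Toy two-state MDP: state 1 is the "good" state a rational agent should seek.
transition = {0: {"stay": {0: 1.0}, "move": {1: 1.0}},
              1: {"stay": {1: 1.0}, "move": {0: 1.0}}}
_, policy = value_iteration(2, ["stay", "move"], transition, reward=[0.0, 1.0])
print(policy)  # the optimal agent moves toward, then stays in, state 1

# IRL inverts the forward map: it searches for a reward vector whose optimal
# policy reproduces the observed actions. The recovered vectors then act as
# features; here nearest-centroid stands in for the paper's SVM classifier.
centroids = {"class_A": [0.0, 1.0], "class_B": [1.0, 0.0]}
print(nearest_centroid([0.1, 0.9], centroids))
```

The key design point the sketch makes concrete is the division of labor: `value_iteration` is the forward MDP solver, IRL is its inverse, and classification happens entirely in the recovered reward space rather than on raw trading statistics.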