To appear in Quantitative Finance, Vol. 00, No. 00, Month 20XX, 1–24

Gaussian Process-Based Algorithmic Trading Strategy Identification

Steve Y. Yang, Qifeng Qiao, Peter A. Beling, William T. Scherer, and Andrei A. Kirilenko

The University of Virginia, 151 Engineer’s Way, P.O. Box 400747, Charlottesville, VA 22904
MIT Sloan School of Management, 50 Memorial Drive, Cambridge, Massachusetts 02142

(DRAFT v1.2 released November 2012)

Many market participants now employ algorithmic trading, commonly defined as the use of computer algorithms to automatically make certain trading decisions, submit orders, and manage those orders after submission. Identifying and understanding the impact of algorithmic trading on financial markets has become a critical issue for market operators and regulators. Advanced data feeds and audit trail information from market operators now make full observation of market participants’ actions possible. A key question is the extent to which it is possible to understand and characterize the behavior of individual participants from observations of their trading actions.

In this paper, we consider the basic problems of categorizing and recognizing traders (or, equivalently, trading algorithms) on the basis of observed limit orders. These problems are of interest to regulators engaged in strategy identification for the purposes of fraud detection and policy development. Methods have been suggested in the literature for describing trader behavior using classification rules defined over a feature space consisting of summary trading statistics, such as volume and inventory, and derived variables that reflect the consistency of buying or selling behavior.
Our principal contribution is to suggest an entirely different feature space, one constructed by inferring key parameters of a sequential optimization model that we take as a surrogate for the decision-making process of the traders. In particular, we model trader behavior in terms of a Markov decision process (MDP). We infer the reward (or objective) function for this process from observations of trading actions using a technique from machine learning known as inverse reinforcement learning (IRL). The reward functions learned through IRL then constitute a feature space that can be the basis for supervised learning (for classification or recognition of traders) or unsupervised learning (for categorization of traders). Making use of a real-world data set from the E-Mini futures contract, we compare two principal IRL variants, linear IRL (LIRL) and Gaussian Process IRL (GPIRL), against a method based on summary trading statistics. Results suggest that IRL-based feature spaces support accurate classification and meaningful clustering. Further, we argue that, because they attempt to learn traders’ underlying value propositions under different market conditions, the IRL methods are more informative and robust than the summary-statistic-based approach and are well suited to discovering new behavior patterns of market participants.

Keywords: Inverse Reinforcement Learning; Gaussian Process; High Frequency Trading; Algorithmic Trading; Behavioral Finance; Markov Decision Process; Support Vector Machine

Corresponding authors. Email: yy6a@virginia.edu and ak67@mit.edu
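The pipeline the abstract outlines — model each trader as an MDP, recover a reward function by IRL, then treat recovered rewards as features for classification — can be illustrated in miniature. The sketch below is purely hypothetical: the toy two-state MDP, the reward vectors, and the nearest-centroid classifier stand in for the paper's actual order-book state space, GPIRL inference, and support vector machine, none of which are reproduced here.

```python
# Purely illustrative sketch; the paper's methods (GPIRL, LIRL, SVM,
# order-book state spaces) are far richer. The toy MDP, reward vectors,
# and class centroids below are hypothetical.

def value_iteration(n_states, actions, transition, reward, gamma=0.9, tol=1e-6):
    """Forward problem: given a reward vector, compute the optimal policy.

    transition[s][a] maps next-state -> probability.
    """
    V = [0.0] * n_states
    while True:
        V_new = [
            max(sum(p * (reward[s2] + gamma * V[s2])
                    for s2, p in transition[s][a].items())
                for a in actions)
            for s in range(n_states)
        ]
        if max(abs(u - v) for u, v in zip(V, V_new)) < tol:
            break
        V = V_new
    policy = [
        max(actions,
            key=lambda a: sum(p * (reward[s2] + gamma * V[s2])
                              for s2, p in transition[s][a].items()))
        for s in range(n_states)
    ]
    return V, policy


def nearest_centroid(reward_vec, centroids):
    """Classify a trader by the closest labelled centroid in reward space."""
    sq_dist = lambda u, v: sum((a - b) ** 2 for a, b in zip(u, v))
    return min(centroids, key=lambda label: sq_dist(reward_vec, centroids[label]))


# Toy two-state MDP: state 1 is the "good" state a rational agent should seek.
transition = {0: {"stay": {0: 1.0}, "move": {1: 1.0}},
              1: {"stay": {1: 1.0}, "move": {0: 1.0}}}
_, policy = value_iteration(2, ["stay", "move"], transition, reward=[0.0, 1.0])
print(policy)  # the optimal agent moves toward, then stays in, state 1

# IRL inverts the forward map: it searches for a reward vector whose optimal
# policy reproduces the observed actions. The recovered vectors then act as
# features; here nearest-centroid stands in for the paper's SVM classifier.
centroids = {"class_A": [0.0, 1.0], "class_B": [1.0, 0.0]}
print(nearest_centroid([0.1, 0.9], centroids))
```

The key design point the sketch makes concrete is the division of labor: `value_iteration` is the forward MDP solver, IRL is its inverse, and classification happens entirely in the recovered reward space rather than on raw trading statistics.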