Deep Neural Networks in High Frequency Trading

Prakhar Ganesh, Senior, Dept. of CSE, IIT Delhi, and Puneet Rakheja, Founder and CEO, WealthNet Advisors

Abstract: The ability to give precise and fast predictions of stock price movement is the key to profitability in High Frequency Trading. The main objective of this paper is to propose a novel way of modeling the high frequency trading problem with Deep Neural Networks at its heart, and to argue why Deep Learning methods have a lot of potential in the field of High Frequency Trading. The paper goes on to analyze the model's performance based on its prediction accuracy as well as its prediction speed across full-day trading simulations.

Index Terms: Deep Learning, Neural Networks, Multi Layer Perceptrons, Finance, High Frequency Trading

I. INTRODUCTION

Deep Learning methods are prophesied to revolutionize the field of AI and represent a step towards building autonomous systems. We live in an era in which an unbelievable amount of data is created every day. Neural networks can scale to problems that were previously unsolvable, causing a huge wave of interest in this field. For example, deep reinforcement learning is currently used for problems such as learning to play games directly from the pixels of an image, which would not have been considered a scalable problem until recent years [10], [1].

Deep learning has become increasingly popular [20] since the introduction of an effective new way of training deep neural networks [21], [22]. It has proved very effective for large-scale tasks, and this success has been based largely on the use of the back-propagation algorithm with rather standard, feed-forward multi-layer neural networks. In addition to improved learning procedures, the main factors that have contributed to the recent successes of deep neural networks are the availability of more computing power, the availability of more training data, and better software engineering.
The inference of predictive models from historical data is not new in quantitative finance; examples include coefficient estimation for the CAPM, the Fama and French factors [5], and related approaches. However, the special challenges that HFT presents for machine learning are twofold:

1) Microsecond-sensitive live trading - As the complexity of the model increases, it becomes more and more computationally expensive to keep up with the speed of live trading and actually use the information provided by the model.
2) Tremendous amount and fine granularity of data - The past data available in HFT is huge in size and extremely precise. However, there is a lack of understanding of how such low-level data, like the recent trading behavior, relates to actionable circumstances (such as profitable buying or selling of shares) [6]. There has been a lot of research on which "features" to use for prediction or modeling [6], [7], [8], [9], yet it is difficult to model the problem without a proper understanding of the underlying relations.

Copyright 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

II. RELATED WORK

One of the primary goals of the field of artificial intelligence (AI) is to produce fully autonomous agents that interact with their environments to learn optimal behaviours, possibly improving over time through trial and error. Currently, deep learning is enabling many other machine learning algorithms, for example reinforcement learning as mentioned earlier, to scale to problems that were previously intractable, such as learning to play video games directly from pixels.
Deep Learning has penetrated many fields, including finance. However, its reach in high frequency trading is limited [19], specifically because of computational constraints and primitive problem modeling methods. Many other machine learning algorithms have been tried and tested in high frequency trading. A lot of this work focuses on feature engineering in HFT using simpler models such as linear regression, multiple kernel learning, maximum margin, and traditional model-based reinforcement learning [6], [13].

Because of the computational complexity of Deep Learning models, less work has been done on incorporating such recent and more complex models; the focus has instead been on extracting useful features from the current trading book state and recent trading behavior. Common feature values such as bid-ask spread, percentage change in price, and weighted price [6], along with specialized features such as order imbalance [18], were among the many that we used in our model.

arXiv:1809.01506v2 [cs.LG] 29 Oct 2018

III. PROBLEM STATEMENT

We aim to create a pipeline that uses information about past trading behavior and the current snapshot of the order book to predict price movement in the near future. We then aim to use this prediction to make decisions in the market for maximum profitability.

A. Tick Data

Tick data refers to the most granular level of market information available from electronically traded markets. Every order request and every trade is reported as a "tick" event. The current state of the order book refers to the top few (five, in