Research Article

Modeling Investor Behavior Using Machine Learning: Mean-Reversion and Momentum Trading Strategies

Thiago Christiano Silva,1,2 Benjamin Miranda Tabak,3 and Idamar Magalhães Ferreira3

1 Universidade Católica de Brasília, Distrito Federal, Brazil
2 Department of Computing and Mathematics, Faculty of Philosophy, Sciences, and Literatures in Ribeirão Preto, Universidade de São Paulo, São Paulo, Brazil
3 FGV/EPPG Escola de Políticas Públicas e Governo, Fundação Getúlio Vargas, School of Public Policy and Government, Getulio Vargas Foundation, Distrito Federal, Brazil

Correspondence should be addressed to Benjamin Miranda Tabak; benjaminm.tabak@gmail.com

Received 27 August 2019; Revised 9 November 2019; Accepted 21 November 2019; Published 26 December 2019

Academic Editor: José Manuel Galán

Copyright © 2019 Thiago Christiano Silva et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We model investor behavior by training machine learning techniques with financial data comprising more than 13,000 investors of a large bank in Brazil from 2016 to 2018. We take high-frequency data on every sell or buy operation of these investors on a daily basis, allowing us to fully track these investment decisions over time. We then analyze whether these investment changes correlate with the IBOVESPA index. We find that investors decide their investment strategies using recent past price changes. There is some degree of heterogeneity in investment decisions. Overall, we find evidence of mean-reverting investment strategies. We also find evidence that female investors and investors with higher academic degrees exhibit less pronounced mean-reverting behavior than male investors and those with lower academic degrees.
Finally, this paper provides a general methodological approach to mitigate potential biases arising from ad hoc design decisions to discard or introduce variables in empirical econometrics. For that, we use feature selection techniques from machine learning to identify relevant variables in an objective and concise way.

1. Introduction

This paper studies the determinants of investors' behavior in the stock market using transaction-level data on buy and sell operations of investors. Our data contains detailed information on the investor's identity and her socioeconomic characteristics, the investment value, and the variation due to the buy or sell operation over time. The data is confidential and comes from a large and representative Brazilian bank. With this rich dataset, we are able to study how investors respond to changes in the Brazilian stock market due to variations of its market index, called IBOVESPA.

We use historical variations of the IBOVESPA index with different horizons (window lengths) to test which one better predicts the investors' behavior. To mitigate potential concerns due to subjective decisions by the analyst, and also to prevent discarding a potentially relevant predictor, we opt to use an objective approach to identify those horizons that best explain investors' buy or sell operations. For that, we use a robust feature selection technique borrowed from the machine learning literature called elastic net. The great advantage of the elastic net comes from the simplicity of its loss function (just like a regression) and also from its robustness in preventing overfitting by optimally using a convex combination of the Lasso and Ridge regularization methods. Overfitting can occur when the algorithm learns the dynamics of the variable of interest so closely that it fits the training dataset very well but predicts poorly on other datasets. Evaluating the potential for overfitting is essential for researchers because overfitting may undermine the model.
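To make the idea of a convex combination of the Lasso and Ridge penalties concrete, the following is a minimal sketch under stated assumptions, not the specification used in this paper. It minimizes (1/2n)·||y − Xw||² + alpha·l1_ratio·||w||₁ + 0.5·alpha·(1 − l1_ratio)·||w||₂², which is one common parametrization, by cyclic coordinate descent. The names `elastic_net_cd`, `alpha`, and `l1_ratio`, the three "horizon" columns, and all numbers are hypothetical; the toy predictors are orthogonal by construction so that the result can be checked by hand.

```python
# Toy illustration only: the data, names, and penalty settings below are
# assumptions made for this sketch, not the paper's data or specification.


def soft_threshold(value, threshold):
    """Shrink value toward zero; exactly 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_cd(X, y, alpha, l1_ratio, n_sweeps=50):
    """Elastic net by cyclic coordinate descent.

    Assumes y and the columns of X are already centered (no intercept).
    l1_ratio = 1.0 gives the Lasso penalty, l1_ratio = 0.0 the Ridge penalty.
    """
    n, p = len(y), len(X[0])
    l1 = alpha * l1_ratio           # weight on the Lasso (L1) term
    l2 = alpha * (1.0 - l1_ratio)   # weight on the Ridge (L2) term
    w = [0.0] * p
    residual = list(y)              # equals y - Xw for the current w
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            # Covariance of feature j with the residual that ignores feature j.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, residual)) / n
            scale = sum(c * c for c in col) / n
            new_w = soft_threshold(rho, l1) / (scale + l2)
            # Keep the residual consistent with the updated coefficient.
            residual = [r - (new_w - w[j]) * c for c, r in zip(col, residual)]
            w[j] = new_w
    return w


# Hypothetical centered predictors standing in for index variations over
# three horizons; the columns are mutually orthogonal by construction.
short_h = [1, 1, 1, 1, -1, -1, -1, -1]
medium_h = [1, 1, -1, -1, 1, 1, -1, -1]
long_h = [1, -1, 1, -1, 1, -1, 1, -1]
X = [list(row) for row in zip(short_h, medium_h, long_h)]

# The response depends strongly on the first horizon, weakly on the third.
y = [2.0 * s + 0.1 * l for s, l in zip(short_h, long_h)]

weights = elastic_net_cd(X, y, alpha=0.5, l1_ratio=0.5)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print(weights, selected)
```

In this toy design, the first coefficient is shrunk from 2 to about (2 − 0.25)/1.25 = 1.4, and the other two are set exactly to zero, so only the first horizon is selected. With `l1_ratio=0.0` (pure Ridge), the weak third coefficient would be shrunk but would remain nonzero, which is why the Lasso component is what performs the variable selection.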
Our method seeks to avoid, to some extent, the perils of overfitting. The Ridge and the Lasso algorithms impose penalties on large weights in the model [1]. In this way, they tend to reduce the

Hindawi, Complexity, Volume 2019, Article ID 4325125, 14 pages, https://doi.org/10.1155/2019/4325125