Research Article
Modeling Investor Behavior Using Machine Learning:
Mean-Reversion and Momentum Trading Strategies
Thiago Christiano Silva,1,2 Benjamin Miranda Tabak,3 and Idamar Magalhães Ferreira3
1 Universidade Católica de Brasília, Distrito Federal, Brazil
2 Department of Computing and Mathematics, Faculty of Philosophy, Sciences, and Literatures in Ribeirão Preto,
Universidade de São Paulo, São Paulo, Brazil
3 FGV/EPPG Escola de Políticas Públicas e Governo, Fundação Getúlio Vargas, School of Public Policy and Government,
Getulio Vargas Foundation, Distrito Federal, Brazil
Correspondence should be addressed to Benjamin Miranda Tabak; benjaminm.tabak@gmail.com
Received 27 August 2019; Revised 9 November 2019; Accepted 21 November 2019; Published 26 December 2019
Academic Editor: José Manuel Galán
Copyright © 2019 Thiago Christiano Silva et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
We model investor behavior by training machine learning models on financial data covering more than 13,000 investors
of a large bank in Brazil from 2016 to 2018. We use high-frequency data on every buy or sell operation of these investors on a daily
basis, allowing us to fully track these investment decisions over time. We then analyze whether these investment changes correlate
with the IBOVESPA index. We find that investors decide their investment strategies using recent past price changes. There is some
degree of heterogeneity in investment decisions. Overall, we find evidence of mean-reverting investment strategies. We also find
evidence that female investors and investors with higher academic degrees follow less pronounced mean-reverting strategies
than male investors and those with lower academic degrees. Finally, this paper provides a general methodological approach
to mitigate potential biases arising from ad-hoc design decisions of discarding or introducing variables in empirical econometrics.
For that, we use feature selection techniques from machine learning to identify relevant variables in an objective and concise way.
1. Introduction
This paper studies the determinants of investors’ behavior in
the stock market using transaction-level data on buy and sell
operations of investors. Our data contains detailed
information on the investor’s identity and her socioeconomic
characteristics, the investment value, and the variation due to
each buy or sell operation over time. The data is confidential
and comes from a large and representative Brazilian bank.
With this rich dataset, we are able to study how investors
respond to changes in the Brazilian stock market due to
variations of its market index, called IBOVESPA. We use
historical variations of the IBOVESPA index with different
horizons (window length) to test which one better predicts
the investors’ behavior.
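As an illustration of what historical variations of an index over different horizons could look like in practice, the following minimal sketch computes the relative change of a price series over several trailing window lengths. The function name `trailing_returns`, the chosen horizons, and the sample index levels are assumptions made for illustration only; they are not the paper's data or its exact variable definitions.

```python
from typing import Dict, List, Optional, Sequence


def trailing_returns(
    index_levels: Sequence[float], horizons: Sequence[int]
) -> Dict[int, List[Optional[float]]]:
    """For each horizon h, compute level[t] / level[t - h] - 1 for every t.

    Positions without enough history (t < h) are filled with None.
    """
    features: Dict[int, List[Optional[float]]] = {}
    for h in horizons:
        if h <= 0:
            raise ValueError("horizons must be positive integers")
        column: List[Optional[float]] = []
        for t, level in enumerate(index_levels):
            if t < h:
                # Not enough past observations for this window length.
                column.append(None)
            else:
                column.append(level / index_levels[t - h] - 1.0)
        features[h] = column
    return features


# Hypothetical daily closing levels (NOT real IBOVESPA data).
levels = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0]

# One candidate predictor per window length (1, 2, and 5 days here).
features = trailing_returns(levels, horizons=[1, 2, 5])
```

Each entry of `features` holds one candidate predictor, so the horizons can then be compared on how well they explain observed buy or sell operations.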
To mitigate potential concerns due to subjective decisions
by the analyst, and to avoid discarding a
potentially relevant predictor, we opt for an objective
approach to identify the horizons that best explain
investors’ buy or sell operations. For that, we use a robust
feature selection technique borrowed from the machine
learning literature called the elastic net. The great advantage of
the elastic net comes from the simplicity of its loss function
(just like a regression) and from its robustness in preventing
overfitting by optimally using a convex combination of the
Lasso and Ridge regularization methods. Overfitting can
occur when the algorithm learns the dynamics of the
variable of interest and fits the training dataset very well but
predicts poorly on other datasets. Evaluating the
potential for overfitting is essential for researchers, as it may
undermine the model. Our method
seeks to avoid, to some extent, the perils of overfitting. The
Ridge and the Lasso algorithms impose penalties for large
weights in the model [1]. In this way, they tend to reduce the
Hindawi
Complexity
Volume 2019, Article ID 4325125, 14 pages
https://doi.org/10.1155/2019/4325125