Biased or Limited: Modeling Sub-Rational Human Investors in Financial Markets

Penghang Liu ∗
University at Buffalo
Buffalo, New York, USA
penghang@buffalo.edu

Kshama Dwarakanath
J.P.Morgan AI Research
Palo Alto, California, USA
kshama.dwarakanath@jpmorgan.com

Svitlana S Vyetrenko
J.P.Morgan AI Research
Palo Alto, California, USA
svitlana.s.vyetrenko@jpmchase.com

ABSTRACT
Multi-agent market simulation is an effective tool to investigate the impact of various trading strategies in financial markets. One way of designing a trading agent in simulated markets is through reinforcement learning, where the agent is trained to optimize its cumulative rewards (e.g., maximizing profits, minimizing risk, improving equitability). While the agent learns a rational policy that optimizes the reward function, in reality, human investors are sub-rational, with their decisions often differing from the optimal. In this work, we model human sub-rationality as resulting from two possible causes: psychological bias and computational limitation. We first examine the relationship between investor profits and their degree of sub-rationality, and create hand-crafted market scenarios to intuitively explain the sub-rational human behaviors. Through experiments, we show that our models successfully capture human sub-rationality as observed in the behavioral finance literature. We also examine the impact of sub-rational human investors on market observables such as traded volumes, spread and volatility. We believe our work will benefit research in behavioral finance and provide a better understanding of human trading behavior.

KEYWORDS
human behavior, reinforcement learning, multi-agent systems, market simulations

ACM Reference Format:
Penghang Liu, Kshama Dwarakanath, and Svitlana S Vyetrenko. 2022. Biased or Limited: Modeling Sub-Rational Human Investors in Financial Markets. In Proceedings of 3rd ACM International Conference on AI in Finance (ICAIF ’2022). ACM, New York, NY, USA, 7 pages.
https://doi.org/XXXXXXX.XXXXXXX

∗ Work done while the author was interning at J.P.Morgan AI Research.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ICAIF ’2022, Nov 02–04, 2022, New York, NY
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-XXXX-X/22/09. . . $15.00

1 INTRODUCTION
Research in finance is well facilitated by versatile market simulations, which provide feasible experimental control and concrete market observations [14]. Multi-agent market simulators have been applied in financial research to reproduce the scaling laws for returns, assess the benefits of co-location, investigate the impact of large orders, and evaluate trading strategies [4, 8, 17]. These simulators promote the use of reinforcement learning (RL) algorithms to learn complex trading strategies. For example, [1] use RL to learn a trading strategy for daily investors, and [11, 22] use RL to design market makers that provide liquidity in the market. These RL agents are trained in market simulations to learn a trading strategy that optimizes the specified reward function (e.g., to make profits or to provide liquidity). In other words, the agent obtains an optimal trading strategy upon sufficient training. This is consistent with the notion of homo economicus, which assumes that humans are ideal decision-makers who are perfectly rational and have unlimited access to information.
However, humans in real life are complex entities that may not always make perfect decisions. Studies show that various psychological biases affect the human decision-making process [6, 10, 23]. Moreover, humans may attempt to make decisions that are satisfactory rather than optimal due to limited access to information and processing power [19, 20]. We refer to such behavior as being sub-rational, as opposed to perfectly rational decision-making. Although several models have been proposed to consider human sub-optimality in the RL setting, existing work mostly focuses on inferring the reward function from real human demonstration data, rather than on explaining the human decision-making process and evaluating the consequences of sub-rational decisions.

In this paper, we model and examine the behavior of sub-rational human investors in financial markets. We introduce two types of sub-rational human investors: psychologically myopic and bounded rational. For each type of human investor, we investigate the relation between the investor’s profits and the degree of sub-rationality. We also demonstrate the corresponding trading strategy in a hand-crafted market scenario to intuitively explain the strategy. In addition, our experimental analysis reveals the impact of sub-rational investors on the market. We believe our models will provide an effective framework that captures and examines human investors while aiding a better understanding of their influence in financial markets.

2 RELATED WORK
Multi-agent simulators have become increasingly prevalent for modeling financial markets. Lux et al. [17] introduced a multi-agent model of financial markets to support the time scaling law from mutual interactions of participants. In recent contributions, Byrd et al. [8] developed a discrete event simulator to investigate the market impact of a large market order. Additionally, Vyetrenko et al. [24] proposed realism metrics to evaluate the fidelity of the simulated markets.
arXiv:2210.08569v1 [cs.AI] 16 Oct 2022

While these multi-agent market simulators can be populated with rule-based trading agents, they allow for the use