Scenario-based Synthetic Dataset Generation for Mobile Money Transactions

Denish Azamuke, Marriette Katarahweire, and Engineer Bainomugisha
denishazamuke@gmail.com, kmarriette@gmail.com, baino@mak.ac.ug
Department of Computer Science, Makerere University
Kampala, Uganda

ABSTRACT

Mobile money transaction datasets from Sub-Saharan Africa are rarely available for research because transaction records are sensitive and raise privacy concerns. This has hindered the study of fraudulent patterns in mobile money transactions, and with it the ability to propose realistic, machine-learning-based mitigation measures for the financial fraud challenges prevailing in the region. This research presents the mobile money scenarios that should be considered in order to implement a simulator that can generate synthetic datasets of mobile money transactions from Sub-Saharan Africa for fraud detection research. These scenarios include the definition of a mobile money ecosystem, with the processes that actors such as mobile money agents, clients, merchants and banks use to interact with each other in mobile money operations. A real mobile money dataset is also needed, from which to extract statistical information, the diverse fraudulent behaviours of actors, and examples of fraud in mobile money markets. This research uses these design considerations to examine process-driven techniques, such as numerical simulation and agent-based modeling, and data-driven techniques, such as neural networks, that can be leveraged to generate synthetic datasets for mobile money transactions. Common data generation toolkits based on these techniques, namely PaySim, AMLSim, RetSim and ABIDES, have been examined.
The design considerations are used to design a realistic model, known as MoMTSim, based on real mobile money processes and agent-based modeling techniques. The model can be implemented to generate synthetic mobile money transaction datasets that contain fraud instances, which will facilitate fraud detection research. Synthetic datasets eliminate data privacy risks, are easier and faster to obtain, and are cheap to experiment with. With the proposed model, different research groups can move to the implementation stage to realise a synthetic data generator for mobile money transactions from the Sub-Saharan region.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
FAMECSE ’22, June 7–8, 2022, Cairo-Kampala, Egypt
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9663-9/22/06...$15.00
https://doi.org/10.1145/3531056.3542774

CCS CONCEPTS

• Computing methodologies → Machine learning; Model verification and validation; • Applied computing → Electronic funds transfer.

KEYWORDS

Mobile money, datasets, agent-based modeling, fraud detection, synthetic data

ACM Reference Format:
Denish Azamuke, Marriette Katarahweire, and Engineer Bainomugisha. 2022. Scenario-based Synthetic Dataset Generation for Mobile Money Transactions. In Federated Africa and Middle East Conference on Software Engineering (FAMECSE ’22), June 7–8, 2022, Cairo-Kampala, Egypt. ACM, New York, NY, USA, 9 pages.
https://doi.org/10.1145/3531056.3542774

1 INTRODUCTION

Mobile money systems enable access to financial services through feature phones or smartphones, without the need for a bank account [12]. Through the mobile phone, users are able to send or receive money and pay for goods and services such as domestic bills and in-store purchases. Most recently, during the COVID-19 pandemic, governments and non-governmental organisations have leveraged mobile money systems to disburse funds to vulnerable populations [11]. The Global System for Mobile Communications Association (GSMA) defines mobile money as all financial services that can be accessed using a phone. This definition thus includes mobile banking, which is concerned with individuals performing transactions between bank accounts and mobile money accounts [39].

Mobile money platforms are spurring financial inclusion in Sub-Saharan Africa (SSA), which has 548 million registered mobile money accounts [19] and a transaction value of US $490 billion (a growth of 23%). In East African countries such as Uganda, 43% of the population have mobile money accounts compared to 11% with bank accounts (BoU). In Kenya, 72% of the population have mobile money accounts compared to 29% with bank accounts.

Unfortunately, mobile money systems are vulnerable to financial fraud and money laundering that target end-users, mobile money agents, and mobile network operator systems. If not appropriately dealt with, these threats are likely to discourage usage among the population, potentially reversing years of progress on achieving financial inclusion. Millions of US dollars are reportedly lost to this fraud [7].

Mobile money financial fraud types range from simple to sophisticated, including split deposits and withdrawal of funds carried out by mobile money agents, parallel money transfers on the network, money laundering on mobile money financial service platform by