On Decentralizing Federated Learning

Akul Agrawal 1, Divya D Kulkarni 2, and Shivashankar B. Nair 3

Abstract— Federated Learning (FL), a distributed version of Deep Learning (DL), was introduced to tackle the problems of user privacy and the huge bandwidth required to send user data to the company servers that run DL models. FL enables on-device training of the models. Most FL approaches are entirely centralized and suffer from inherent limitations such as single-node failure and channel bandwidth bottlenecks. To circumvent these issues, we present an approach to decentralize FL using mobile agents coupled with the Federated Averaging (FedAvg) algorithm. A hybrid model that combines both centralized and decentralized approaches has also been presented. Results obtained by running the model on different network topologies indicate that the hybrid version proves to be the better option for an FL implementation.

I. INTRODUCTION

In recent years, the world has seen an enormous increase in the usage of handheld devices. A massive quantity of data is generated by the various applications and sensors onboard these devices. Further, the processing capability of handheld devices is increasing day by day, paving the way to train models on the device itself using the locally generated data. These locally learned models significantly improve the user experience, with an array of features derived from the learned data. The models can be further enhanced if the learning is performed by accumulating the data from several such devices on a central entity such as a server.

McMahan et al. [1] introduced a technique termed Federated Learning (FL) where, in lieu of data, the models generated at the individual devices are shared with a central server. Each device has a local dataset over which a model is trained, and the trained weights are shared with the central server in the form of an update [1].
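The server-side averaging step described above can be sketched as follows. This is a minimal illustration of the weighted averaging at the heart of FedAvg [1], not the paper's implementation; the function name `fedavg_aggregate` and the flat weight vectors are our own simplifying assumptions.

```python
# Hypothetical sketch of one FedAvg aggregation step: the server computes
# a weighted average of the clients' weight vectors, weighting each client
# by the size of its local dataset. (Names and data layout are illustrative,
# not taken from the paper.)

def fedavg_aggregate(client_weights, client_sizes):
    """Weighted average of per-client weight vectors."""
    total = sum(client_sizes)
    averaged = [0.0] * len(client_weights[0])
    for weights, n in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            averaged[i] += (n / total) * w
    return averaged

# Two clients with equally sized datasets holding weights [0, 2] and [2, 4]:
print(fedavg_aggregate([[0.0, 2.0], [2.0, 4.0]], [10, 10]))  # [1.0, 3.0]
```

In a real deployment each entry of `client_weights` would be the flattened parameters of a neural network, and the averaged vector would be sent back to every client as the new round's starting model.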
The central server, in turn, averages these model weights received from the various devices and shares the averaged weights with the individual devices. The averaged model received by a device is, thus, generally better than the one locally generated on that device. The final goal of learning the desired model is achieved over several such rounds of exchange of models between the local devices and the central server.

Such an FL model is centralized and thus suffers from the inherent drawbacks of any centralized system, including scalability, privacy issues, a central point of failure, maintenance costs, and large client-to-server bandwidth requirements, to name a few [2]. To circumvent this, we propose herein a decentralized version of FL that uses mobile agents [3], [4] to disseminate locally learned models to other devices. Mobile agents are capable of carrying data and code from one device to another in a network of devices. They can also execute the code, if required, using the processing resources available on the device. As opposed to a centralized system wherein the server performs the averaging of the model weights, the work presented in this paper uses mobile agents that move from one device to another and average the model weights they carry with the one available locally on each device. The client devices, thus, receive an update every time a mobile agent visits them.

1 Dept. of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India akulagrawal@iitg.ac.in
2 Dept. of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India divyadk@iitg.ac.in
3 Dept. of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India sbnair@iitg.ac.in
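The agent-based averaging just described can be sketched as a running average carried from device to device. This is an illustrative sketch under our own assumptions (names `agent_visit`/`agent_tour`, a single scalar-vector model per device, a fixed visiting order), not the paper's code.

```python
# Hypothetical sketch: a mobile agent hops across devices, averaging the
# weights it carries with each device's local weights and leaving the
# result behind as that device's update. (All names are illustrative.)

def agent_visit(carried, local):
    """Average the agent's carried weights with the device's local weights."""
    return [(c + l) / 2.0 for c, l in zip(carried, local)]

def agent_tour(devices, start=0):
    """One circuit over all devices; each visited device keeps the average."""
    carried = list(devices[start])
    order = list(range(start + 1, len(devices))) + [start]
    for idx in order:
        carried = agent_visit(carried, devices[idx])
        devices[idx] = list(carried)  # the device receives the update in place
    return devices

devices = [[0.0], [4.0], [8.0]]
agent_tour(devices)  # devices becomes [[2.5], [2.0], [5.0]]
```

Note that, unlike server-side FedAvg, each device ends up with a different model depending on when the agent visited it; repeated tours are what drive the local models toward consensus.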
This paper also discusses the merits of a hybrid (a fusion of centralized and decentralized) FL model, where the averaging is performed not only by a mobile agent migrating in the network but also by a central server whenever it receives the latest averaged model weights from a mobile agent. The mobile agent, thus, intermittently sends the model it carries to the server, which in turn averages and relays the same to all the other devices.

The following are the main contributions of this work:
1) A fully decentralized Federated Learning model with mobile agents communicating among the devices
2) A hybrid FL model combining the features of both centralized and decentralized models
3) Experiments comparing the decentralized and hybrid FL models while varying the number of mobile agents and incorporating various topologies, providing an overview of the performance of the models

II. RELATED WORK

McMahan et al. [1] first coined the term Federated Learning and described the FL procedure to train a DL model in a distributed fashion on decentralized data. They justified that FL is robust to the non-IID distribution of data among clients (i.e., the training data available with different clients is not identically distributed) and that the primary constraint in this approach is the communication cost of every communication round. A communication round is a process in which a client sends its weights to the server and receives the averaged weights from the server. FL also exhibits a significant improvement in these communication costs as compared to synchronized stochastic gradient descent techniques [5]. Kamp et al. [6] have proposed a dynamic averaging model to improve the performance of the state-of-the-art averaging algorithm FedAvg [1], proposed by McMahan et al. [1]. While a peer-to-peer FL approach has been used and reported in several papers [7], [8], most of them use a

2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), October 11-14, 2020.
Toronto, Canada
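The hybrid model introduced in Section I, in which the agent intermittently pushes its carried model to a central server for re-broadcast, can be sketched as follows. This is our own illustrative combination of the two earlier ideas; the names `hybrid_round` and `sync_every` are hypothetical, not from the paper.

```python
# Hypothetical sketch of the hybrid scheme: a mobile agent averages models
# device by device and, every `sync_every` hops, hands its carried model to
# a central server, which broadcasts it to all devices. (Names are ours.)

def average(a, b):
    return [(x + y) / 2.0 for x, y in zip(a, b)]

def hybrid_round(devices, sync_every=2):
    carried = list(devices[0])
    for hop, idx in enumerate(range(1, len(devices)), start=1):
        carried = average(carried, devices[idx])
        devices[idx] = list(carried)        # agent leaves an update locally
        if hop % sync_every == 0:           # intermittent sync with the server
            for i in range(len(devices)):   # server broadcasts the model
                devices[i] = list(carried)
    return devices

devices = [[0.0], [2.0], [6.0]]
hybrid_round(devices)  # devices becomes [[3.5], [3.5], [3.5]]
```

The periodic broadcast is what lets the hybrid scheme reach consensus faster than the purely agent-based tour, at the cost of reintroducing some dependence on the central server.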