Federated Learning With Blockchain for Autonomous Vehicles: Analysis and Design

Shiva Raj Pokhrel and Jinho Choi, Senior Member, IEEE

Abstract—We propose an autonomous blockchain-based federated learning (BFL) design for privacy-aware and efficient vehicular communication networking, where local on-vehicle machine learning (oVML) model updates are exchanged and verified in a distributed fashion. BFL enables on-vehicle machine learning without any centralized training data or coordination by utilizing the consensus mechanism of the blockchain. Relying on a renewal reward approach, we develop a mathematical framework that features the controllable network and BFL parameters (e.g., the retransmission limit, block size, block arrival rate, and the frame sizes) so as to capture their impact on the system-level performance. More importantly, our rigorous analysis of oVML system dynamics quantifies the end-to-end delay with BFL, which provides important insights into deriving the optimal block arrival rate by considering communication and consensus delays. We present a variety of numerical and simulation results highlighting various non-trivial findings and insights for adaptive BFL design. In particular, based on the analytical results, we minimize the system delay by exploiting the channel dynamics and demonstrate that the proposed idea of tuning the block arrival rate is provably online and capable of driving the system dynamics to the desired operating point. The analysis also identifies the improved dependency on other blockchain parameters for a given set of channel conditions, retransmission limits, and frame sizes.

Index Terms—on-Vehicle Machine Learning, Federated Learning, Blockchain, Delay Analysis, Consensus Delay, Low Delay

I. INTRODUCTION

Next-generation wireless networks are envisaged to guarantee low-delay and ultra-reliable connectivity anywhere, anytime and on-the-move [1], [2].
This will satisfy the real-time communication constraints of the impending autonomous vehicles. To this end, on-Vehicle Machine Learning (oVML) is a persuasive solution wherein each vehicle maintains its own best machine learning model and is thereby capable of making intelligent decisions, even when it loses connectivity for some time. Training such an oVML model may require more data samples than any single vehicle can collect on its own. As a result, it demands data trading (and knowledge exchange) with neighboring vehicles [3], [4]. In this paper, we address the challenge of training each oVML model by exploiting federated learning among neighboring vehicles [5]–[9].

Machine learning techniques have already been applied in various ways to improve the performance of autonomous vehicles [10]: computer vision for analyzing obstacles, and machine learning for adapting their pace to the environment (e.g., bumpiness of the road). Due to the potentially large number of anticipated autonomous cars and the need for them to quickly respond to real-world situations, the existing cloud-based learning approach is sluggish and may generate safety risks. Federated learning can represent a solution for limiting the volume of data transfer and accelerating the learning processes of autonomous vehicles. The anticipated outcome is a systematic design approach that transforms our existing vehicles into mobile data centers, performs federated learning, and reacts timely to their needs. The resulting benefits include better provision of seamless message transfer and low-delay Internet services on the move for such connected autonomous vehicles.

Shiva Raj Pokhrel and Jinho Choi are with the School of IT, Deakin University, Geelong, Australia. Email: {shiva.pokhrel@deakin.edu.au}
One main problem is that the locally collected data samples are owned by each vehicle. Thus, their trading and knowledge sharing should keep the raw data private from other neighboring vehicles. In this regard, as proposed in Google's federated learning (GFL) [6], each vehicle trades its locally trained model update (mainly, gradient parameters and learning weights) rather than the raw data. Besides, it is worth noting that regenerating the raw data from the traded model updates is not possible, thus guaranteeing privacy. Such trading in GFL is handled with the aid of a centralized server that produces a global model, which is an ensemble average of all the locally trained model updates. Thereafter, each vehicle downloads the globally updated model and computes its next local update until the global model training process is complete. Observe that, because of the closed-loop exchanges (a locally trained model update followed by a globally aggregated model update triggering the next iteration of local training), the delay incurred in completing GFL training may reach several minutes (10 or more), as reported recently for Google's keyboard application [11]. In addition, a centralized server may not even be available in oVML settings. To this end, we propose (and evaluate) a blockchain-based federated learning (BFL) model, as illustrated in Figure 1, for efficient communication of autonomous vehicles, since GFL is not straightforwardly applicable because of the following two fundamental shortcomings [12], [13]. i) Centralization: GFL depends on a single global server, which is vulnerable to server malfunction, is highly dependent on network connectivity, and suffers heavily from bottlenecks (as a result of the traffic of local model updates from each oVML).
Such a malfunction typically yields an inaccurate global model, increases the response time, and distorts the oVML updates.
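The centralized GFL aggregation step described above (a server forming the global model as an ensemble average of the locally trained updates) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function name, the list-based representation of model parameters, and the FedAvg-style weighting by local sample counts are all assumptions for exposition.

```python
def federated_average(updates, sample_counts):
    """Weighted ensemble average of per-vehicle model updates.

    updates: list of parameter vectors (one list of floats per vehicle).
    sample_counts: number of local training samples per vehicle, used
    as aggregation weights (a common FedAvg-style choice; illustrative).
    """
    total = sum(sample_counts)
    dim = len(updates[0])
    global_model = [0.0] * dim
    for params, n in zip(updates, sample_counts):
        weight = n / total
        for i, p in enumerate(params):
            global_model[i] += weight * p
    return global_model

# Three vehicles, each holding a 2-parameter local model update:
updates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
counts = [10, 10, 20]
print(federated_average(updates, counts))  # → [0.75, 0.75]
```

In GFL this averaging runs on the single global server, which is exactly the centralization that BFL removes: in the proposed design the update exchange and verification are instead carried out in a distributed fashion via the blockchain's consensus mechanism.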