SEU Sensitivity of Robust Communication Protocols

C. Lopez-Ongil, M. Portela-Garcia, M. Garcia-Valderas, A. Vaskova, L. Entrena
Electronic Technology Department, University Carlos III of Madrid, Leganes, Spain
celia@ing.uc3m.es

J. Rivas-Ábalo, A. Martín-Ortega, J. Martinez-Oter, S. Rodríguez-Bustabad, I. Arruego
INTA (National Institute for Aerospace Technique), Madrid, Spain
alberto.martinortega@insa.es

Abstract. Robust communication protocols are widely used in safety-critical applications, such as aerospace or automotive systems. Complex on-board systems working in harsh environments are composed of distributed modules with a high degree of interaction and replication of critical tasks. Single Event Upsets (SEUs) are likely to affect these modules and their communications. Robust protocols are usually designed to correct communication data errors, typically caused by problems in the physical medium (e.g. noise). They provide robustness in data transmission, but fault tolerance is not ensured in the interface control logic. In this work, the SEU sensitivity of a typical robust communication protocol, the CAN bus, is studied. The authors have run an extensive fault injection campaign on the internal modules of the circuit that implements this standard in order to perform an in-depth analysis. Experimental results show that robustness is not complete in the control part of this protocol. Selective hardening will enhance this robustness at low extra cost in terms of area or performance.

Keywords- CAN Bus, On-board Distributed Systems, SEUs, Transient Fault Emulation

I. INTRODUCTION

Distributed system architectures provide high performance by distributing the system tasks (control and processing) among several interconnected nodes. The correct behavior of the complete system depends on the reliability of data transfers. This is a main concern in safety-critical applications such as aviation, military, space or automotive systems.
In fact, distributed system architectures are very common in these kinds of applications. For example, in aerospace applications, on-board electronic systems usually distribute some functionality among different devices, and motor vehicles nowadays include more and more electronic tasks performed by several interconnected computing nodes. Therefore, communication protocols with error detection mechanisms, such as the Controller Area Network (CAN) [1] or SpaceWire [2] standards, are used to ensure correct data transfers and data consistency across all nodes. The available error detection mechanisms provide robustness against faults at the physical layer of the protocol, i.e. in data transmission. However, the standards do not define error detection mechanisms to ensure fault tolerance when faults affect the internal elements of the CAN module.

Radiation effects (Single Event Effects, SEEs) are a main concern in applications working in harsh environments, such as aerospace applications, and also in any safety-critical application, since the features of current technologies make digital circuits more and more sensitive to these effects, even at ground level [3]. These effects can be significantly more important in periods of maximum solar activity, when satellites may require a shutdown to prevent hard errors, flight plans may have to be modified, or power grids may suffer a blackout [4].

Ionizing particles and neutrons produce soft errors affecting memory elements or combinational logic in digital circuits. Soft errors in memory elements can modify the content of a single memory element, such as a flip-flop or a RAM cell (SEU), or can modify multiple bits (MCU, MBU). The affected memory element stores the faulty value until new data is written to it. This effect is modeled as a bit-flip in the faulty location. The bit-flip model is widely accepted by the scientific community.
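The distinction between faults on the link and faults inside the module can be illustrated with a minimal sketch. It assumes a toy sender that computes CAN's 15-bit CRC over an 8-bit payload; the names `tx_buffer`, `transmit`, `receiver_accepts` and `flip_bit` are hypothetical and do not come from the IP core studied here. The SEU is modeled as a single bit-flip, as described above.

```python
from typing import List, Tuple

CAN_CRC15_POLY = 0x4599  # generator polynomial of the CAN 15-bit CRC


def crc15(bits: List[int]) -> int:
    """Bit-serial CAN CRC-15 over a sequence of 0/1 values."""
    crc = 0
    for bit in bits:
        feedback = bit ^ ((crc >> 14) & 1)
        crc = (crc << 1) & 0x7FFF
        if feedback:
            crc ^= CAN_CRC15_POLY
    return crc


def flip_bit(value: int, position: int) -> int:
    """Bit-flip model of an SEU: invert one stored bit."""
    return value ^ (1 << position)


def int_to_bits(value: int, width: int) -> List[int]:
    """Most significant bit first."""
    return [(value >> i) & 1 for i in reversed(range(width))]


def transmit(tx_buffer: int, width: int) -> Tuple[List[int], int]:
    """Hypothetical sender: serialize the buffer and append its CRC."""
    bits = int_to_bits(tx_buffer, width)
    return bits, crc15(bits)


def receiver_accepts(bits: List[int], crc: int) -> bool:
    """Hypothetical receiver: accept the frame if the CRC matches."""
    return crc15(bits) == crc


if __name__ == "__main__":
    payload, width = 0xA5, 8

    # Fault on the physical medium: one bit is corrupted in transit.
    bits, crc = transmit(payload, width)
    bits[3] ^= 1
    print("link fault accepted:", receiver_accepts(bits, crc))

    # SEU inside the sender: the buffer is upset before the CRC is computed.
    bits, crc = transmit(flip_bit(payload, 4), width)
    print("internal upset accepted:", receiver_accepts(bits, crc))
    print("data still correct:", bits == int_to_bits(payload, width))
```

In this sketch the receiver rejects the frame corrupted on the bus but accepts the frame built from the upset buffer, because the CRC was computed after the corruption took place. Real CAN controllers add further checks (bit stuffing, frame format, acknowledgement) that are omitted here.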
Soft errors in combinational logic provoke an erroneous pulse in the logic value that can propagate through the circuit and may eventually be stored in a memory element. All these effects can produce a failure in a digital circuit [3].

In this paper, an analysis of the SEU sensitivity of a robust communication protocol is presented. In particular, the study has been performed on a CAN IP module developed by INTA (National Institute for Aerospace Techniques, Spain). This core is used in the distributed architecture of the on-board computer of the OPTOS satellite [5]. In order to qualify the final implemented system, the results presented in this paper must be considered together with the sensitivity of the chosen target technology. Fault injection campaigns have been performed to study the effect of single bit-flips in the memory elements that implement the CAN protocol and to detect the most sensitive areas. Hardening only these areas with a mitigation technique provides an optimal solution in terms of dependability and cost.

The paper is organized as follows. Section II summarizes works in the literature related to the dependability of communication protocols against radiation effects. Section III details the case study, the CAN module, covering the standard specifications as well as the IP core under test developed at INTA. Section IV describes the experiments performed and analyzes the results obtained. Finally, Section V states the conclusions of this work.

II. RELATED WORK

Robust protocols are in high demand in many applications where the working environment is subject to external interference, such as ionizing radiation, electromagnetic interference and other sources of noise. The authors have studied a subset of the protocols currently available on the telecommunications market for aerospace, avionics, automotive and data network applications. Some of these protocols are proprietary, while others are standardized by ISO or IEEE.
We have made a comparison regarding the type of communication, data transfer speed, fault tolerance and implementation possibilities (length, number of

This work has been partially supported by the Spanish Government. The project code is TEC2010-22095-C03-03.

978-1-4673-2085-6/12/$31.00 © 2012 IEEE