International Conference on Technology and Business Management April 10-12, 2017
ISBN: 978-1-943295-06-7

A Novel Partial Dynamic Reconfiguration Based Fault Tolerant System on Chip

Deepa Jose Nirmala Devi Vidhya B
KCG College of Technology, Karapakkam (deepa.ece@kcgcollege.com)

This paper presents reconfiguration-algorithm-driven intrinsic repair as a practical and efficient generic solution for Field Programmable Gate Array (FPGA) based Software Defined Radio (SDR) systems. The research implements a real-time adaptive system for fault tolerant (FT) mission critical applications that changes automatically according to the performance and reliability constraints of the application. The hardware implementation on a Virtex 6E FPGA offers power savings greater than 40% at reliability (R) values as high as 0.75; the savings fall to 30% for high-reliability systems (R > 0.90), while retaining fault-protection capabilities similar to those of triple modular redundancy (TMR). The partial dynamic reconfiguration (PDR) implementations provide an average improvement of 14.61% in power and 32.6% in resource utilization.

Keywords: Software Defined Radio; Partial Dynamic Reconfiguration; Fault Tolerant; Low Power; Reliability; Signal Processing; FPGA

1. Introduction

Software Defined Radios (SDR) are currently used for various applications such as navigation, geodetic science, atmospheric science and other space applications. Because an SDR implements its functions in software, its parameters can be changed at any time to suit the situation. FPGA based SDRs are used in space missions because they combine computational power with the ability to reconfigure, enabling a system to support new protocols while in orbit. To function correctly, the FPGA must be configured by loading configuration data into its configuration memory.
FPGAs are equipped with an SRAM configuration memory, which allows fast and frequent reconfiguration but also makes them very susceptible to single event upsets (SEUs). Such an error can have serious consequences and has to be mitigated before it causes the system to function incorrectly. If the FPGA is used outside the Earth's atmosphere, or if FPGA based designs are produced in massive numbers, the probability of an SEU increases rapidly. SEU mitigation is an actively researched problem that remains only partially solved.

Most real-time systems are based either on Application Specific Integrated Circuits (ASICs) or on reconfigurable hardware. Modern society has become reliant on complex System on Chip (SOC) systems, which must provide reliable service. Mission critical signal processing systems often have stringent reliability and performance requirements. Failures in these systems can result in data corruption and lower performance, and can ultimately be catastrophic. Reliable operation is vital not only for mission critical applications but also for regular mass-market applications. For sub-micron SOC systems, the effects of process variations (random dopant fluctuations, sub-wavelength lithography) and lower voltage/current thresholds make the designs susceptible to various types of faults [1,2]. Maintenance and repair are usually very expensive and time consuming for these systems. Hence, besides performance, reliability and FT have become important design metrics. Because of the higher probability of faults, reliability and design for testability (DFT) issues must be addressed when designing FT VLSI systems. FT systems require increased capability for autonomous fault tolerance and self-adaptation at run-time, especially as system complexities and interdependencies increase.
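To make the effect of an SEU concrete, the following minimal Python sketch models a single bit flip in one of three replicated outputs and shows how a bitwise majority voter, the basic mechanism behind triple modular redundancy, masks it. This is an illustrative software model only, not the hardware implementation described in this paper; the names `inject_seu`, `majority_vote`, and the 8-bit value `golden` are hypothetical.

```python
def inject_seu(word: int, bit: int) -> int:
    """Model a single event upset by flipping one bit of a word."""
    return word ^ (1 << bit)


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three replica outputs."""
    return (a & b) | (a & c) | (b & c)


# Hypothetical fault-free output of one processing module.
golden = 0b10110010

# One replica suffers an upset in bit 5; the other two are intact.
faulty = inject_seu(golden, 5)
voted = majority_vote(golden, faulty, golden)

# If two replicas are hit in the same bit, the voter is defeated.
double_fault = majority_vote(faulty, faulty, golden)

print(f"golden={golden:08b} faulty={faulty:08b} voted={voted:08b}")
print(f"double fault output={double_fault:08b}")
```

The voter hides a single upset but not two coincident ones, which is why a corrupted replica must also be repaired, not just outvoted.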
Signal processing systems should strike an optimal balance between the performance and reliability requirements of the FT system, and should provide an automated, adaptive and flexible online repair algorithm. Redundancy is the provision of functional capabilities that would be unnecessary in a fault-free environment, for example backup components that automatically 'kick in' should one component fail. However, redundancy brings a large number of penalties: increases in weight, size, power consumption and cost, as well as in the time needed to design, verify and test. Hence, in this research work, redundancy is incorporated using partial dynamic reconfiguration (PDR). Conventional hardware redundancy architectures are based on TMR or duplex systems [3-6]. The present approach is fundamentally different from TMR because it selects between the stages of redundancy (simplex, duplex, TMR) according to the relative importance of performance and reliability. TMR is a wise choice when reliability is the priority, whereas a simplex configuration is the better option when performance is the priority.

This paper implements various signal processing circuits for SDR systems, compares them on criteria such as size, critical path delay and power, and analyzes which fault tolerant combination is optimal. Additionally, to control error detection and recovery from various types of faults, a novel fault handling reconfiguration algorithm for FPGA based SDR systems is proposed and implemented.

2. Fault Handling Reconfiguration Algorithm - Control Flow of Detection and Repair Process

This section describes the proposed fault handling control algorithm, whose procedure is depicted in Figure 1. A fault (SEU or stuck-at-0/1) is active when it produces an error. To achieve an FT system, the proposed fault handling algorithm implements both system failure prevention mechanisms, aimed at increasing the mean time between errors (MTBE), and