Hardware Support for High Performance, Intrusion- and Fault-Tolerant Systems

G. P. Saggese, C. Basile, L. Romano, Z. Kalbarczyk, R. K. Iyer

University of Illinois at Urbana-Champaign, 1308 W. Main St., 61801 Urbana, Illinois
Università degli Studi di Napoli Federico II, Via Claudio 21, 80125 Napoli, Italy

Abstract: The paper proposes a combined hardware/software approach for realizing high performance, intrusion- and fault-tolerant services. The approach is demonstrated for (yet not limited to) an Attribute Authority server, which provides a compelling application due to its stringent performance and security requirements. The key element of the proposed architecture is an FPGA-based, parallel crypto-engine providing (1) optimally dimensioned RSA Processors for efficient execution of computationally intensive RSA signatures and (2) a KeyStore facility used as tamper-resistant storage for preserving secret keys. To achieve linear speed-up (with the number of RSA Processors) and deadlock-free execution in spite of resource-sharing and scheduling/synchronization issues, we have resorted to a number of performance enhancing techniques (e.g., use of different clock domains, optimal balance between internal and external parallelism) and have formally modeled and mechanically verified our crypto-engine with the Spin model checker. At the software level, the architecture combines active replication and threshold cryptography, but in contrast with previous work, the code of our replicas is multithreaded so it can efficiently use an attached parallel crypto-engine to compute an Attribute Authority partial signature (as required by threshold cryptography). The resulting replicated systems exhibit nondeterministic behavior, which cannot be handled with conventional replication approaches. Our architecture is based on a Preemptive Deterministic Scheduling algorithm to govern scheduling of replica threads and guarantee strong replica consistency.

I. INTRODUCTION

Combining intrusion and fault tolerance is an effective approach to handle security and reliability issues and has attracted significant research interest [1]–[4]. In meeting security and reliability requirements, however, existing solutions often sacrifice performance, a loss that is not acceptable for many critical applications (e.g., e-commerce, e-procurement). Also, most of the security mechanisms proposed are purely software based, which simplifies design and implementation but reduces resilience to security attacks [5]. In an attempt to improve security, smart-cards have been proposed as tamper-resistant devices to implement access control mechanisms [6]. Current smart-card technology, however, provides quite limited computational and storage capabilities; moreover, its tamper-resistance property has been questioned by experimental investigations [7].

This study leverages current research on intrusion- and fault-tolerant architectures and combines software approaches with the use of reconfigurable hardware devices to provide substantially improved performance and security. While it is clear that a hybrid approach can be superior to a software-only approach (e.g., our experiments show a speed-up of about an order of magnitude), the effects on the overall system architecture are less understood. Consider, for instance, that the efficient combination of parallel hardware with multithreaded software can result in systems exhibiting nondeterministic behavior, which cannot be handled with conventional replication approaches (such as the Byzantine dissemination quorums used in COCA [2]). Our approach is demonstrated in (yet not limited to) the context of attribute certification systems [8], [9], which provide a compelling application due to their stringent performance and security requirements.
Specifically, this paper presents the design, implementation, and evaluation of a distributed, RSA-based Certificate Engine (the core element of an Attribute Authority) that can tolerate both accidental and malicious faults yet provide high performance. (The concepts and the techniques we propose also apply to RSA-based Certification Authorities, since the procedures for assembling and signing certificates are quite similar [10], [11].)

The key component of our architecture is a hardware crypto-engine that integrates, in a single FPGA device, a large number of RSA Processors to accelerate computationally expensive RSA operations and a tamper-resistant KeyStore to preserve secret keys; this is done seamlessly with threshold-cryptography support. Implementing RSA Processors and the KeyStore in a single chip provides significant improvement in security and performance. (A secret key kept in the KeyStore is directly accessed by the RSA Processors without ever being transferred outside the FPGA device.) While the crypto-engine approach might seem straightforward in principle, serious technical challenges must be overcome to provide an actual implementation. A solid design must provide linear speed-up (with the number of RSA Processors) and deadlock-free execution in spite of resource-sharing and scheduling/synchronization of the multiple units executing concurrently. To achieve these goals, we have resorted to a number of performance enhancing techniques (e.g., use of different clock domains, optimal balance between internal and external parallelism) and have formally modeled and mechanically verified our crypto-engine with the Spin model checker [12]. In addition, our crypto-engine design is general and can serve a broad range of security applications (e.g., SSL connection establishment, elliptic curve operations).
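To make the notion of a partial signature concrete, the following is a minimal sketch in Python, not the scheme implemented in this work. It assumes toy RSA parameters and a simple additive split of the private exponent in which every share is needed; practical threshold schemes tolerate missing or faulty shares and are considerably more involved. All function names (`sign`, `split_key`, `partial_sign`, `combine`) are hypothetical and chosen only for illustration.

```python
import random

# Toy parameters for illustration only; real RSA moduli are 1024 bits or more.
p, q = 61, 53
n = p * q                  # modulus: 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: 2753 (needs Python 3.8+)


def sign(m, d, n):
    """Plain RSA signature: one modular exponentiation with the full key."""
    return pow(m, d, n)


def split_key(d, phi, shares, seed=0):
    """Assumed additive sharing: the shares sum to d modulo phi."""
    rng = random.Random(seed)
    parts = [rng.randrange(phi) for _ in range(shares - 1)]
    parts.append((d - sum(parts)) % phi)
    return parts


def partial_sign(m, d_i, n):
    """Partial signature computed by one replica with its own key share."""
    return pow(m, d_i, n)


def combine(partials, n):
    """Multiply the partial signatures to recover the full signature."""
    s = 1
    for x in partials:
        s = (s * x) % n
    return s


if __name__ == "__main__":
    m = 65  # stands in for the hash of a certificate
    shares = split_key(d, phi, 3)
    partials = [partial_sign(m, d_i, n) for d_i in shares]
    signature = combine(partials, n)
    print(signature == sign(m, d, n))   # combined result equals plain RSA
    print(pow(signature, e, n) == m)    # verifies under the public key
```

Each call to `partial_sign` is a full-size modular exponentiation, the kind of operation that dominates the cost of RSA signing and that a hardware RSA Processor would be expected to accelerate. In this sketch no single share reveals the private exponent, which is the property that motivates distributing the key across replicas.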
At the software level, the proposed architecture combines active replication and threshold cryptography to detect and

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems (SRDS’04) 1060-9857/04 $20.00 IEEE