CUFFS: An Instruction Count Based Architectural Framework for Security of MPSoCs

Krutartha Patel, Sri Parameswaran, Roshan G. Ragel

School of Computer Science and Engineering, University of New South Wales, Sydney, Australia.
Department of Computer Engineering, University of Peradeniya, Sri Lanka.
{kpatel, sridevan}@cse.unsw.edu.au, roshanr@ce.pdn.ac.lk

Abstract—Multiprocessor System on Chip (MPSoC) architecture is rapidly gaining momentum for modern embedded devices. The vulnerabilities in software on MPSoCs are often exploited to mount software attacks, which are the most common type of attacks on embedded systems. Therefore, we propose an MPSoC architectural framework, CUFFS, for an Application Specific Instruction set Processor (ASIP) design that has a dedicated security processor called iGuard for detecting software attacks. The CUFFS framework instruments the source code in the application processors at the basic block (BB) level with special instructions that allow communication with iGuard at runtime. The framework also analyzes the code in each application processor at compile time to determine the program control flow graph and the number of instructions in each basic block, which are then stored in the hardware tables of iGuard. The iGuard uses its hardware tables to verify the applications' execution at runtime. For the first time, we propose a framework that probes the application processors to obtain their Instruction Count and employs an actively engaging security processor that can detect attacks even when an application processor does not communicate with iGuard. CUFFS relies on the exact number of instructions in the basic block to detect an attack, which is superior to the time-frame based measures proposed in the literature. We present a systematic analysis of how CUFFS can thwart common software attacks. Our implementation of CUFFS on the Xtensa LX2 processor from Tensilica Inc.
had a worst case runtime penalty of 44% and an area overhead of about 28%.

Categories and Subject Descriptors
B.8.2 [Performance and Reliability]: Performance Analysis and Design Aids

General Terms
Design, Performance, Security

Keywords
Architecture, Instruction Count, MPSoC, Attacks, Tensilica

I. INTRODUCTION

A Multiprocessor System-on-a-Chip (MPSoC) has been widely accepted as an architecture for high performance embedded systems [15]. Multimedia devices such as portable music players and cell phones already deploy MPSoCs to exploit data processing parallelism and provide multiple functionalities [8, 27]. As functionality grows, the complexity of the design increases, and with it the susceptibility of the system to attacks from adversaries.

Embedded systems designers often do not include security in their design objectives. Short design turnaround times, driven by the competitive pressure of getting a system to market, are often consumed by getting the functionality, performance and energy requirements correct [22]. Weaknesses in system implementation inevitably remain and are often exploited by attackers in the form of physical, software or side-channel attacks.

Software attacks that exploit vulnerabilities in software code or weaknesses in the system design are the most common type of attacks [2]. Stack and heap based buffer overflows are the most common type of software attacks [19]. Buffer overflow vulnerabilities in application programs have been exploited since 1988 [14] and continue to be exploited. On average, nearly 10.7% of the vulnerabilities listed in the US-CERT vulnerability reports pertain to buffer overflows. Figure 1 shows the percentage of buffer overflow vulnerabilities reported in each month of 2006 and 2007.

Figure 2 shows an example of a stack buffer overflow attack. Figure 2(a) shows a snippet of vulnerable C code, and Figure 2(b) shows the layout of the stack when function g is called from function f.
As part of writing data to the array buffer in g, the attacker may supply malicious code in array buf before the call to g is made. Passing a value sufficiently larger than K (which is 50) in len ensures that the copy overflows buffer and overwrites the return address, as shown in Figure 2(c). Thus the control flow of the program is changed to execute malicious code. This change in behavior violates code integrity and causes incorrect program behavior.

Fig. 1. US-CERT reported buffer overflow vulnerabilities (x-axis: Months, Jan to Dec; y-axis: Buffer Overflow %, 0 to 20; series: 2006 and 2007).

Recent literature suggests that newer security threats targeting portable electronics like mobile phones and music players may pose significant risks [4, 6]. Given that such devices already employ MPSoC architectures, it is imperative that security is considered at design time rather than applied as a reactive measure. Incorporating security in the design definitely increases overheads, but given the ability of attacks to cause fraud, disrupt activity or threaten the confidentiality of data, the overheads are worth the cost [2, 21].

Fig. 2. A stack based buffer overflow attack.
(a) Vulnerable C code:

    #define K 50
    int f() {
        ...
        g(buf,len);
        ...
    }
    int g(void *s, size_t len) {
        char buffer[K];
        memcpy(buffer, s, len);
        ...
    }

(b) Stack layout when g is called from f, listed from lower to higher addresses: buffer[0], buffer[1], ..., buffer[K-1], local variables g(), saved FP g(), return address g(), arguments, local variables f(), saved FP f(), return address f(). The figure marks the stack frames of g() and f() and the direction of stack growth.
(c) Stack layout after the overflow, listed from lower to higher addresses: attacker's code, ..., return address g(), arguments, local variables f(), saved FP f(), return address f().

In this paper, we propose an architectural framework, CUFFS, for detection of software attacks. We design an MPSoC with a dedicated security processor called iGuard.
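The overwrite illustrated in Figure 2 can be reproduced without touching a real stack. The following is a minimal sketch, not code from the paper: a flat byte array stands in for the stack frame of g(), and the helper names (sim_frame, copy_overwrites_return), the slot size, the placeholder address values and the absence of padding or other locals are all assumptions made for illustration. Real stack layouts are compiler and ABI dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define K 50                        /* size of buffer[] in g(), as in Figure 2(a) */
#define SLOT sizeof(uintptr_t)      /* assumed size of one saved-FP / return-address slot */
#define FRAME_SIZE (K + 2 * SLOT)   /* buffer + saved FP + return address (no padding assumed) */

/* Hypothetical placeholder values for the two saved slots. */
static const uintptr_t SAVED_FP    = 0x1000;
static const uintptr_t RETURN_ADDR = 0x2000;

/* Simulated frame of g(): index 0 is the lowest address, as in Figure 2(b). */
struct sim_frame {
    unsigned char bytes[FRAME_SIZE];
};

/* Lay out buffer[K], then the saved FP, then the return address. */
static void frame_init(struct sim_frame *fr)
{
    memset(fr->bytes, 0, FRAME_SIZE);
    memcpy(fr->bytes + K, &SAVED_FP, SLOT);
    memcpy(fr->bytes + K + SLOT, &RETURN_ADDR, SLOT);
}

/* Read back the return-address slot of the simulated frame. */
static uintptr_t frame_return_addr(const struct sim_frame *fr)
{
    uintptr_t ra;
    memcpy(&ra, fr->bytes + K + SLOT, SLOT);
    return ra;
}

/*
 * Models memcpy(buffer, s, len) from g() on the simulated frame.
 * Returns 1 if the return-address slot was modified, 0 otherwise.
 * len is capped at FRAME_SIZE so the simulation itself stays in bounds.
 */
int copy_overwrites_return(const void *s, size_t len)
{
    struct sim_frame fr;
    assert(len <= FRAME_SIZE);
    frame_init(&fr);
    memcpy(fr.bytes, s, len);       /* unchecked copy, as in Figure 2(a) */
    return frame_return_addr(&fr) != RETURN_ADDR;
}
```

Under these assumptions, a call with len equal to K leaves the return-address slot untouched, whereas a call with len equal to FRAME_SIZE overwrites it with attacker-supplied bytes, which corresponds to the situation in Figure 2(c).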
Each basic block in the application processors of the MPSoC has one or two check-points, which are instrumented with a special instruction that reports to iGuard. Our static analysis tool extracts the control flow of the program and the number of instructions between two sequential check-points. Both the control flow and the number of instructions between two sequential check-points are recorded in the hardware tables of the iGuard. This information is generated at compile time and written to the hardware tables at load time. At runtime, the application processors use the special instructions to report to the iGuard which basic block they are executing and the value of the processor's Instruction Counter (IC) register. The iGuard uses the communicated information to check that the control flow is correct and that the number of instructions executed from one check-point to the next agrees with the information stored in its tables. If the iGuard finds that the control flow is incorrect, or that the number of instructions between two check-points does not match the value in its hardware tables, it sends an interrupt to all the processors to abort execution.

978-3-9810801-5-5/DATE09 © 2009 EDAA

One of the novel contributions of this paper is the “active” iGuard processor in our architectural framework. By “active” we mean that