Published in IET Circuits, Devices & Systems
Received on 11th January 2012
Revised on 17th May 2012
doi: 10.1049/iet-cds.2012.0011
ISSN 1751-858X

Low-power processor architecture exploration for online biomedical signal analysis

A.Y. Dogan 1, J. Constantin 2, D. Atienza 1, A. Burg 2, L. Benini 3

1 Embedded Systems Laboratory (ESL) – EPFL, Lausanne – 1015, Switzerland
2 Telecommunications Circuits Laboratory (TCL) – EPFL, Lausanne – 1015, Switzerland
3 UNIBO-Micrel Laboratory, Viale Risorgimento 2, Bologna 40136, Italy
E-mail: ahmed.dogan@epfl.ch

Abstract: In this study, the authors explore sequential and parallel processing architectures, utilising a custom ultra-low-power (ULP) processing core, to extend the lifetime of health monitoring systems, where slow biosignal events and highly parallel computations exist. To this end, a single- and a multi-core architecture are proposed and compared. The single-core architecture is composed of one ULP processing core, an instruction memory (IM) and a data memory (DM), while the multi-core architecture consists of several ULP processing cores, individual IMs for each core, a shared DM and an interconnection crossbar between the cores and the DM. These architectures are compared with respect to power/performance trade-offs for different target workloads of online biomedical signal analysis, while exploiting near threshold computing. The results show that with respect to the single-core architecture, the multi-core solution consumes 62% less power for high computation requirements (167 MOps/s), while consuming 46% more power for extremely low computation needs when the power consumption is dominated by leakage. Additionally, the authors show that the proposed ULP processing core, using a simplified instruction set architecture (ISA), achieves energy savings of 54% compared to a reference microcontroller ISA (PIC24).
1 Introduction and related work

According to the World Health Organization, cardiovascular and modern human behaviour-related diseases are the major cause of mortality worldwide [1]. Close and potentially continuous medical supervision is strongly needed to control these types of diseases. They are thus expected to soon require healthcare costs and medical management needs that are unsustainable for traditional healthcare delivery systems. Personal health monitoring systems are poised to offer large-scale and cost-effective solutions to this problem.

Wireless body sensor networks (WBSNs) are the enabling technology for such personal health systems [2, 3]. A WBSN for health monitoring consists of a number of light-weight sensor nodes attached to the human body, where each node is responsible for processing a specific low-rate physiological signal. For instance, one of the most important physiological signals is the electrocardiogram (ECG), which is typically acquired at sampling rates between 125 Hz and 1 kHz to capture the often important details of the waveform. In order to monitor the heart rate for extended periods of time (up to multiple days or weeks), an ultra-low-power (ULP) design with embedded biomedical signal processing for feature extraction on the sensor node is necessary [4], so that costly signal storage or transmission [5] is reduced to the essential information.

An effective technique to reduce the computational power consumption is supply voltage scaling, potentially all the way to sub-threshold operation. In the literature, voltage scaling and its limitations and disadvantages, such as performance loss, the risk of functional failure and performance variability, have been analysed extensively [6–9], and various low-power architectures have been presented. For example, Chen et al. [10] proposed a sensor platform capable of nearly-perpetual operation by harvesting energy from solar cells.
Their single-processor architecture has an ARM Cortex-M3 core with both retentive and non-retentive SRAM and a power management unit that controls the active and ultra-low-power sleep modes. In another work, Hanson et al. [11] presented a new ultra-low-energy processor with low-voltage operation for wireless monitoring systems. They optimised the standby power consumption of the processor by means of new low-leakage memory macros, adjustments to the memory size and instruction set, and power gating.

However, the main issue with low-voltage operation is the performance loss, which, for a given processing requirement, can limit the degree to which voltage scaling can be used. Parallel computing using multiple cores can alleviate this issue, provided that the algorithms to be executed can be parallelised. To this end, Dreslinski et al. [12] proposed a near threshold computing (NTC), cluster-based multi-processor architecture with a shared cache that operates at a higher supply voltage to be able to serve multiple cores at the same time. Finally, Pu et al. [13] introduced a sub/near threshold co-processor for low energy mobile image