An Overlap-Contention Free True-Single-Phase Clock Dual-Edge-Triggered Flip-Flop

Andrea Bonetti, Adam Teman and Andreas Burg
Telecommunications Circuits Lab (TCL), École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
Email: {andrea.bonetti, adam.teman, andreas.burg}@epfl.ch

Abstract—Dual-edge-triggered (DET) synchronous operation is a very attractive option for low-power, high-performance designs. Compared to conventional single-edge synchronous systems, DET operation is capable of providing the same throughput at half the clock frequency. This can lead to significant power savings on the clock network that is often one of the major contributors to total system power. However, in order to implement DET operation, special registers need to be introduced that sample data on both clock-edges. These registers are more complex than their single-edge counterparts, and often suffer from a certain amount of clock-overlap between the main clock and the internally generated inverted clock. This overlap can cause contention inside the cell and lead to logic failures, especially when operating at scaled power supplies and under process variations that characterize nanometer technologies. This paper presents a novel, static DET flip-flop (DET-FF) with a true-single-phase clock that completely avoids clock overlap hazards by eliminating the need for an inverted clock edge for functionality. The proposed DET-FF was implemented in a standard 40 nm CMOS technology, showing full functionality at low-voltage operating points, where conventional DET-FFs fail. Under a near-threshold, 500 mV supply voltage, the proposed cell also provides a 35% lower CK-to-Q delay and the lowest power-delay-product compared to all considered DET-FF implementations.

I. INTRODUCTION

The design of energy-efficient circuits remains one of the main challenges in the field of digital integrated systems [1].
A large portion of the power dissipated in VLSI architectures is attributed to clock distribution, consuming as much as 45% of the total system power [2]. Clock networks are characterized by a 100% activity factor, charging and discharging their parasitic capacitors during each cycle, and thereby leading to power dissipation that is directly proportional to clock frequency. For this reason, among the different applications, high-speed, high-throughput designs are especially affected by this issue.

One well-known approach for reducing the clock power is dual-edge-triggered (DET) synchronous operation. By sampling data on both the rising and falling edges of the clock, the clock frequency can be reduced by 50% without changing the system throughput. This directly cuts the power dissipation of the clock network in half, leading to significant overall system power savings. However, implementation of DET operation requires the introduction of registers that sample, store, and propagate their input at both clock edges. While these dual-edge-triggered flip-flops (DET-FFs) are more complex and generally larger than their single-edge-triggered (SET) counterparts, they can be designed to be more energy-efficient [3], thereby providing additional power savings.

The implementation of storage cells that are triggered on both clock edges is a well-researched subject. Many solutions for the design of DET-FFs have been proposed [4]–[8]. The most popular of these cells is the transmission-gate latch-MUX (DET-TGLM) [4] due to its simple implementation that is based on two latches and an output multiplexer (MUX). An alternative configuration can be assembled by replacing the transmission gates with C²MOS gates [11], resulting in the C²MOS latch-MUX (DET-C²MOSLM) [5]. A different approach is to generate a short pulse on each clock edge, thereby realizing a pulse-triggered DET-FF, as shown in [6].
More advanced DET-FFs that limit the switching activity through pulse generation and precharge conditions are the conditional discharge flip-flop (DET-CDFF) [7] and the symmetric pulse generator flip-flop (DET-SPGFF) [8].

While these topologies have been demonstrated on various applications, few of them have been examined in deeply scaled process technologies under voltage scaling, which is commonly used for the implementation of energy-efficient systems. In particular, in the presence of considerable process variation, the use of both clock phases usually introduces some extent of clock-overlap, which can lead to race conditions and other detrimental circuit behavior. For example, when considering the traditional DET-TGLM, process, voltage and temperature (PVT) variations can cause this overlap to increase to the point where the currently held data is overwritten, resulting in a fatal logic error.

Contribution: in this paper, we solve this clock overlap problem by presenting the first static true-single-phase-clock (TSPC) DET-FF. By implementing the cell with TSPC circuits and an internal dual-feedback mechanism, completely static operation is achieved, enabling robust operation under voltage scaling and process variations. To demonstrate its functionality in nanoscaled technologies, the cell was implemented in a 40 nm CMOS process, showing full functionality at a near-threshold, 500 mV supply voltage (V_DD) under extensive Monte Carlo (MC) statistical simulations. In addition to being the only topology to continue to operate robustly under these conditions, the proposed cell also provides the lowest CK-to-Q delay (t_cq) and the best power-delay product (PDP) when compared to other leading DET-FF solutions.

Outline: the rest of this paper is organized as follows: Section II presents the clock-overlap hazard in the traditional DET-TGLM circuit.
The proposed static dual-edge-triggered flip-flop with true-single-phase clock (SDET-TSPCFF) is presented in Section III to address this hazard and enable low-voltage operation. Section IV provides simulation results for the proposed cell and a comparison with other popular DET-FF implementations. Finally, the conclusions are reported in Section V.
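
As an illustration of the DET principle outlined in this introduction, the following Python sketch models an idealized latch-MUX DET-FF at the behavioral level and evaluates the standard dynamic-power relation P = αCV²f for a clock network. It is not part of the presented design or its simulation flow. The class and function names, the 1 pF clock capacitance, and the 100 MHz clock frequency are hypothetical values chosen only for illustration, and the model ignores delays, clock overlap, and all transistor-level effects.

```python
def clock_power(c_clk, vdd, f_clk, activity=1.0):
    """Dynamic power P = alpha * C * V^2 * f of a switched capacitance.

    A clock net toggles every cycle, hence the default activity of 1.0.
    """
    return activity * c_clk * vdd ** 2 * f_clk


class LatchMuxDetFF:
    """Idealized latch-MUX DET flip-flop (behavioral, zero-delay model).

    Two level-sensitive latches work on opposite clock levels; the output
    MUX always selects the latch that is currently opaque (holding).
    """

    def __init__(self):
        self.latch_hi = 0  # transparent while ck == 1, holds while ck == 0
        self.latch_lo = 0  # transparent while ck == 0, holds while ck == 1

    def step(self, ck, d):
        """Apply the current clock level and data input, return Q."""
        if ck:
            self.latch_hi = d
        else:
            self.latch_lo = d
        # While ck is high, latch_lo holds the value captured at the rising
        # edge; while ck is low, latch_hi holds the falling-edge value.
        return self.latch_lo if ck else self.latch_hi


# Hypothetical numbers: 1 pF clock load, 500 mV supply, 100 MHz SET clock.
C_CLK, VDD, F_SET = 1e-12, 0.5, 100e6
p_set = clock_power(C_CLK, VDD, F_SET)        # single-edge-triggered clocking
p_det = clock_power(C_CLK, VDD, F_SET / 2.0)  # DET clocking, same throughput

# Drive the flip-flop with (ck, d) pairs; D is captured on both clock edges.
ff = LatchMuxDetFF()
stimulus = [(0, 1), (1, 1), (1, 0), (0, 0), (0, 1), (1, 1)]
q_trace = [ff.step(ck, d) for ck, d in stimulus]

if __name__ == "__main__":
    print("clock power ratio DET/SET:", p_det / p_set)
    print("Q trace:", q_trace)
```

Because the model assumes an ideal, overlap-free clock, it cannot reproduce the contention hazard discussed in Section II; it only shows why sampling on both clock edges allows the clock frequency, and with it the clock network power, to be halved at constant throughput.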