Micro and Nanosystems

Aloke Saha 1,*, Rahul Pal 2 and Jayanta Ghosh 3

1 Department of Electrical and Computer Engineering, Dr. B. C. Roy Engineering College, Durgapur, India; 2 Department of Electrical and Computer Engineering, Bengal College of Engineering, Durgapur, India; 3 Department of Electrical and Computer Engineering, NIT Patna, Patna, India

Abstract:

Background: The present study explores a novel self-pipelining strategy that can enhance the speed-power efficiency as well as the reliability of a binary multiplier compared with state-of-the-art register pipelining and wave-pipelining.

Method: Proper synchronization with efficient clocking between successive self-pipelining stages is ensured in the design of the self-pipelined multiplier. Each self-pipelining stage consists of self-latching leaf cells that are designed, optimized and evaluated in TSMC 0.18μm CMOS technology with a 1.8V supply rail at 25°C. The T-Spice transient response and simulated results for the designed circuits are presented. The proposed idea is applied to design a 4-b×4-b self-pipelined Wallace-tree multiplier. The multiplier was validated for all possible test patterns and its transient response was evaluated. The circuit performance in terms of propagation delay, average power and Power-Delay-Product (PDP) is recorded. Next, decomposition logic is applied to design higher-order multipliers (i.e., 8-bit×8-bit and 16-bit×16-bit) based on the proposed strategy using the 4-bit×4-bit self-pipelined multiplier. The designed multipliers were also validated through extensive T-Spice simulation for all the required test patterns using W-Edit, and the evaluated performance is presented. All designs, optimizations and evaluations are based on the BSIM3 device parameters of TSMC 0.18μm CMOS technology with a 1.8V supply rail at 25°C using S-Edit of Tanner EDA.
Results: The reliability of the proposed 4-b×4-b multiplier was investigated over the temperature range -40°C to 100°C for maximum PDP variation.

Conclusion: A benchmarking analysis of speed-power performance against recent competitive designs reveals the preeminence of the proposed technique.

ARTICLE HISTORY
Received: May 05, 2019
Revised: July 27, 2019
Accepted: August 06, 2019
DOI: 10.2174/1876402911666190916155445

Keywords: Decomposition logic, Power-Delay-Product (PDP), reliability, self-latching, self-pipelining, parallel multiplier.

*Address correspondence to this author at the Department of Electrical and Computer Engineering, Dr. B. C. Roy Engineering College, Durgapur, India; Tel: +91 8967627938; E-mail: saha81@gmail.com

1. INTRODUCTION

Speed-power efficiency, as well as reliability, is an important design constraint for modern digital VLSI designers [1-3]. A multiplier acts as a major bottleneck for most digital computing systems in terms of operating speed, power dissipation and reliability [4-21]. The high-throughput characteristic of the parallel multiplier has long attracted researchers [22, 23]. After the introduction of a parallel multiplier by C. S. Wallace in 1964 [22] and subsequently by L. Dadda in 1965 [23], a number of structural modifications have been investigated to achieve high-speed multiplication along with low power dissipation.

Rapid technological advancements in modern nano-scale MOS devices and the increase in function/device density in a single monolithic IC also make reliability a major design issue. An increase in function density with complex device interconnection leads to the generation of localised heat (hot-spots) and increases the rate of IC failure [24].

Conventional register pipelining [18] is a well-known and widely accepted structural-level design strategy that can improve the throughput of a parallel multiplier.
However, in this approach, a large number of intermediate latches are used to achieve pipelining. As a result, the throughput of a pipelined microelectronic circuit increases at the cost of more power, area and latency compared with the non-pipelined system. Wave-pipelining was investigated by Burleson et al. in 1998 [25] as a feasible alternative to overcome the problems associated with conventional register pipelining. Wave-pipelining [12, 13, 25] improves overall efficiency by removing intermediate latches/flip-flops from the circuit. The maximum throughput of a wave-pipelined system depends not on the critical path delay but on the difference between the worst-case and best-case path delays [25]. Hence, delay equalization among all data paths from input to output is a critical design challenge for effective wave-pipelining. However, a wave-pipelined system suffers from reliability issues, and its efficiency depends mostly on the designer’s skill. For more detail, readers are directed to references [12, 13, 25].

Micro and Nanosystems, 2020, 12, 149-158. RESEARCH ARTICLE: Novel Self-Pipelining Approach for Speed-Power Efficient Reliable Binary Multiplication. 1876-4037/20 ©2020 Bentham Science Publishers
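The wave-pipelining timing argument in the introduction (throughput set by the spread between the worst and best path delays rather than by the critical path) can be illustrated with a minimal Python sketch. It is a simplified first-order model, not the analysis of [25] or of this paper: the function names, the `margin_ns` parameter (a lumped stand-in for setup/hold time, clock skew and similar overheads) and the delay values are all hypothetical.

```python
def non_pipelined_min_period(path_delays_ns):
    """Without pipelining, a new input must wait for the slowest path,
    so the minimum period is the critical (longest) path delay."""
    if not path_delays_ns:
        raise ValueError("at least one path delay is required")
    return max(path_delays_ns)


def wave_pipelined_min_period(path_delays_ns, margin_ns=0.0):
    """First-order wave-pipelining bound (simplifying assumption):
    the minimum period is the spread between the longest and shortest
    path delays, plus a lumped timing margin (hypothetical parameter)."""
    if not path_delays_ns:
        raise ValueError("at least one path delay is required")
    spread = max(path_delays_ns) - min(path_delays_ns)
    return spread + margin_ns


# Hypothetical input-to-output path delays in nanoseconds.
delays = [2.0, 2.5, 3.0]

print(non_pipelined_min_period(delays))         # limited by the 3.0 ns path
print(wave_pipelined_min_period(delays, 0.25))  # limited by the 1.0 ns spread

# Perfectly equalized paths leave only the margin as the limit,
# which is why delay equalization matters for wave-pipelining.
print(wave_pipelined_min_period([3.0, 3.0, 3.0], 0.25))
```

In this model, narrowing the gap between the fastest and slowest paths directly shortens the achievable period, which matches the delay-equalization challenge noted in the text.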