1549-7747 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCSII.2020.2966373, IEEE Transactions on Circuits and Systems II: Express Briefs

A 7.1-GHz 0.7-mW Programmable Counter with Fast EOC Generation in 65-nm CMOS

Indranil Som, Santunu Sarangi, and T K Bhattacharyya*

Abstract—This paper presents an improved high-frequency programmable counter for realizing the divider circuit in frequency synthesizers. The core improvement is based on a modification of the control architecture in the End of Count (EOC) configuration. The overall architecture of the counter shows a speed improvement of more than 24% compared to its predecessor. The design is implemented in 65 nm CMOS technology with a silicon occupancy of 42 × 78 μm², consuming a total power of 0.68 mW at its highest operating frequency of 7.1 GHz from a 1.2 V supply. The divider achieves a full modulus of 255 with a minimum division ratio of 5 at its highest frequency.

Index Terms—Phase Locked Loop (PLL), dual modulus prescaler (DMP), Programmable Counter (PC), End of Count (EOC), Critical Delay Path.

I. INTRODUCTION

The frequency divider plays a critical role in high-frequency synthesizers. Because it is an essential component, an adequate range of programmable division ratios, low power consumption, and high-frequency operation are important design criteria. Various programmable high-frequency dividers have been reported in [1]–[5], where architectural and/or topological modifications have been incorporated in realizing a Programmable Divider (PDIV) with improved power efficiency.
For example, the work in [3] achieves a power efficiency of 6.25 GHz/mW by incorporating low-leakage TSPC flip-flops (FFs) in cascaded dual modulus prescalers (DMP). The work reported in [4] employs cascaded 2/1 cells in realizing the PDIV and attains a power efficiency of 5.5 GHz/mW at a maximum clock frequency of 5.5 GHz. A single Programmable Counter (PC) serving as both program counter and pulse swallow counter in a DMP-based PDIV has been reported in [5] to save power. The work in [5] attains a highest operable clock frequency of 0.875 GHz and achieves a power efficiency of 0.97 GHz/mW. When the works in [1] and [2] are recreated in the 65 nm node, they achieve maximum operating frequencies of 4.5 and 5.7 GHz with better power efficiencies of 9.78 and 9.82 GHz/mW, respectively.

This work is supported by Chip to System Development (C2SD) under the Special Manpower Development Programme (SMDP), Ministry of Electronics and Information Technology (MeiTy), under Grant 5(6)/2015-MDD, dated 07-01-2016. The authors are with the Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology Kharagpur, Kharagpur, 721302 India (e-mail: indranilsom@ece.iitkgp.ac.in; santunu.sarangi@ece.iitkgp.ac.in; tkb@ece.iitkgp.ac.in). *Corresponding Author

Fig. 1. (a) Generic block diagram of PC with (b) EOC timing.

Figure 1(a) shows a generic block diagram of such a PC with N divide-by-two stages. In every counting cycle of this kind of PC, the preset integer $[M]_d$ decreases progressively with each clock cycle (Fig. 1(b)), where $[M]_d = \sum_{k=1}^{N} IN_k 2^{k-1} \le 2^N - 1$. Inputs $\{IN_k;\ k = 1:N\}$ are used for reloading the preset number $[M]_d$. Outputs from each flip-flop ($Q_N$ to $Q_1$) are fed to the EOC detector. Figure 1(b) shows the timing diagram for the last two clock cycles ($T_1$ and $T_0$) before the beginning of a fresh count cycle at $T_{M-1}$. After the last clock edge at $T_0$, detecting EOC (i.e. $\{Q_N \ldots Q_1\} = [0]_b$) and turning RELOAD = ‘1’ must be completed within time $t_{d1}$.
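
The preset relation and the count-down/reload behaviour described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the authors' circuit: the function names `preset_value` and `simulate_counter` are hypothetical, the model is purely cycle-based (it ignores the delays within the clock cycle), and it assumes the counter steps through the states labelled $T_{M-1}, \ldots, T_1, T_0$ in Fig. 1(b), so that one EOC event occurs every M input clocks.

```python
def preset_value(in_bits):
    """Return [M]_d = sum(IN_k * 2**(k-1)), k = 1..N.

    in_bits[0] is IN_1 (the LSB) and in_bits[-1] is IN_N (the MSB).
    """
    return sum(bit << k for k, bit in enumerate(in_bits))


def simulate_counter(in_bits, n_clocks):
    """Cycle-based model of a presettable down counter (illustrative assumption).

    Returns the clock indices at which end of count (all Q outputs zero)
    is detected and the preset is reloaded.
    """
    m = preset_value(in_bits)
    if m < 1:
        raise ValueError("preset value must be at least 1")
    count = m - 1                  # a fresh count cycle starts at state T_{M-1}
    eoc_edges = []
    for clk in range(n_clocks):
        if count == 0:             # state T_0: {Q_N ... Q_1} = 0, EOC detected
            eoc_edges.append(clk)
            count = m - 1          # RELOAD restores the preset for the next cycle
        else:
            count -= 1             # ordinary down-count step
    return eoc_edges


# Example: IN_1 = 1, IN_2 = 0, IN_3 = 1 gives M = 5, the minimum ratio quoted in the abstract.
edges = simulate_counter([1, 0, 1], 15)
print(preset_value([1, 0, 1]), edges)
```

With all eight inputs set to 1, `preset_value` returns 255, matching the upper bound $2^N - 1$ for N = 8. The spacing between consecutive EOC events equals M, which is what makes the counter act as a divide-by-M stage.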
The time for completing the loading of the preset number $[M]_d$ and changing RELOAD to ‘0’ ($t_{RELOAD}$) must be within $t_{d2}$. Signal RELOAD should be equal to ‘0’ with a timing margin $t_{d3}$ from the fresh count cycle edge of $T_{M-1}$. Accommodating $t_{d1} + t_{d2} + t_{d3}$ within the last clock cycle $T_0$ (Fig. 1(b)) is the primary speed bottleneck of the ripple PC. The works in [1], [2] reduce the timing overhead of EOC detection ($t_{d1}$) and of pre-setting the PC ($t_{d2} + t_{d3}$) by employing architectural improvements in End of Count (EOC) detection in the ripple PC. To the best of the authors' knowledge, no investigation of clock speed improvement through the EOC generation scheme has been reported since [2].

In this brief, we describe a new EOC generation algorithm that raises the operating clock speed to 7.1 GHz (24% higher than the recreated version of the previous architecture [2]) with improved power efficiency. Moreover, the core EOC generator operates with a half-rate clock (the input clock divided by two), reducing input clock loading. Section II describes the essentials of previous EOC generation algorithms, and Section III elucidates the proposed EOC generation scheme. Measurement results and performance comparisons are presented in Section IV. Conclusions are drawn in Section V.

II. PREVIOUS WORKS

The architecture for EOC detection as reported in [1] is redrawn in Fig. 2(a). End of counting detection (i.e. $D_0$) is