High-Performance Global Routing with Fast Overflow Reduction*

Huang-Yu Chen, Chin-Hsiung Hsu, and Yao-Wen Chang
Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan
Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan

Abstract: Global routing is an important step in physical design. In this paper, we develop a new global router, NTUgr, that contains three major steps: prerouting, initial routing, and enhanced iterative negotiation-based rip-up/rerouting (INR). The prerouting employs a two-stage technique of congestion-hotspot historical cost pre-increment followed by small bounding-box area routing. The initial routing is based on efficient iterative monotonic routing. Traditional INR has become the mainstream approach in state-of-the-art global routers, which reflects its strong ability to reduce congestion and overflow. As pointed out by recent works, however, traditional INR may get stuck at local optima as the number of iterations increases. To remedy this deficiency, we replace INR with enhanced iterative forbidden-region rip-up/rerouting (IFR), which features three new techniques: (1) multiple forbidden regions expansion, (2) critical subnet rerouting selection, and (3) look-ahead historical cost increment. Experimental results show that NTUgr achieves high-quality results for the ISPD’07 and ISPD’08 benchmarks in both overflow and runtime.

I. INTRODUCTION

Very-large-scale circuit designs have brought new challenges for modern routers. Global routing is the first stage to tackle these stringent routing challenges; theoretically, detailed routing cannot complete if the global router cannot generate an overflow-free solution. A good global router can systematically guide a detailed router to avoid congestion and achieve high routability, thus speeding up the time-consuming detailed routing process.
Although many routing techniques have been studied and developed, such as maze routing [14], A*-search routing [7], pattern routing [13], monotonic routing [18], multicommodity flow [1], and integer linear programming (ILP) [10], it is not clear whether these traditional methods are capable of handling the upcoming design challenges. To encourage the development of effective global routing solutions, the ACM International Symposium on Physical Design (ISPD) held two global routing contests in 2007 and 2008. Driven by this worldwide competition, effective and efficient global routers have been developed in these two years [2], [5], [9], [16]–[19].

Iterative negotiation-based rip-up/rerouting (INR) [15], adopted by the state-of-the-art routers [5], [9], [17], [19], has shown a strong ability to spread out congestion and to reduce overflow, and thus INR has become the mainstream approach for developing modern global routers. In [19], the Lagrangian relaxation (LR) mathematical basis for INR was further explored. As pointed out by recent works [9], [17], however, INR may get stuck at local optima as the number of iterations increases, thus requiring additional schemes to resolve this problem. Archer [17] used a history scale factor to split INR into the initiation, negotiation, and convergence phases and developed an LR-based bounded-length min-cost topology improvement algorithm to improve INR. In [9], NTHU-Route developed a refinement process to further reduce overflow when INR gets stuck at a local optimum.

* This work was supported in part by the National Science Council of Taiwan under Grant Nos. NSC 96-2752-E-002-008-PAE, NSC 96-2628-E-002-248-MY3, NSC 96-2628-E-002-249-MY3, and NSC 96-2221-E-002-245.

In this paper, we develop a new global router, NTUgr, that contains three major steps: prerouting, initial routing, and enhanced INR.
The prerouting employs two new techniques: (1) congestion-hotspot historical cost pre-increment and (2) small bounding-box area routing. Most notably, traditional INR is replaced by enhanced iterative forbidden-region rip-up/rerouting (IFR), which features three new techniques: (1) multiple forbidden regions expansion, (2) critical subnet rerouting selection, and (3) look-ahead historical cost increment. Experimental results show that NTUgr achieves high-quality results for the ISPD’07 and ISPD’08 benchmarks in both overflow and runtime, demonstrating the effectiveness of the proposed flow. In particular, our router obtains the best routing solution for the most difficult instance of the ISPD’07 benchmarks, newblue3 (with only 31024 overflows), and achieves 10.8x–74.8x runtime speedups (with similar total wirelength), making it one of the fastest global routers reported in the literature.

The rest of this paper is organized as follows. Section II describes the routing model and the problem formulation. Section III presents our global routing flow. Experimental results are reported in Section IV, and conclusions are given in Section V.

II. PROBLEM FORMULATION

For global routing, the routing region is partitioned into tiles (also called global cells), and the region is modeled by a 2D or 3D routing graph composed of nodes (called global tile nodes) and edges (called global edges), where each global tile node represents a tile and each global edge models the adjacency between two neighboring tiles. Each global edge is associated with a capacity that models a limited routing resource, such as the number of available detailed routing tracks on the tile boundary or the maximum allowable via count between adjacent layers. The main objective of global routing is to minimize the total overflow, which is the amount of routing demand that exceeds the capacity, summed over all edges.
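As an illustration of the total-overflow objective defined above, the following minimal Python sketch builds the global edges of a small 2D routing graph and sums the demand in excess of capacity over all edges. The function names (`grid_edges`, `total_overflow`), the dictionary-based representation, and the toy capacities and demands are assumptions made for this example only; they are not taken from NTUgr.

```python
from typing import Dict, List, Tuple

Tile = Tuple[int, int]      # (column, row) of a global tile node
Edge = Tuple[Tile, Tile]    # global edge between two adjacent tiles


def grid_edges(cols: int, rows: int) -> List[Edge]:
    """List the global edges of a cols x rows 2D routing graph."""
    edges: List[Edge] = []
    for x in range(cols):
        for y in range(rows):
            if x + 1 < cols:
                edges.append(((x, y), (x + 1, y)))  # horizontal neighbor
            if y + 1 < rows:
                edges.append(((x, y), (x, y + 1)))  # vertical neighbor
    return edges


def total_overflow(demand: Dict[Edge, int], capacity: Dict[Edge, int]) -> int:
    """Sum, over all edges, the routing demand that exceeds the capacity."""
    return sum(max(0, demand.get(e, 0) - cap) for e, cap in capacity.items())


# Toy 2x2 graph: 4 global edges, each with capacity 2 (made-up numbers).
edges = grid_edges(2, 2)
capacity = {e: 2 for e in edges}
demand = {edges[0]: 3, edges[1]: 2, edges[2]: 5}
print(total_overflow(demand, capacity))  # prints 4 (overflows of 1 + 0 + 3 + 0)
```

Edges whose demand is at or below capacity contribute nothing, so only the congested edges determine the objective value.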
The ISPD’07 metrics evaluate a global routing solution in the following prioritized order: (1) the total overflow, (2) the maximum overflow, and (3) the total wirelength (each via connection counts as three units of wirelength). The ISPD’08 contest allows contestants to use up to 4 CPUs in parallel, and the prioritized order of the ISPD’08 metrics is (1) the total overflow, (2) the maximum overflow, and (3) the weighted total wirelength (each via connection counts as one unit of wirelength). The weighted total wirelength is equal to the original total wirelength multiplied by $1 + \min\{0.1,\ 0.04 \log_2(\text{cpu time} / \text{median cpu time})\}$; e.g., a router gets a 4% wirelength reduction for each 2x faster runtime, and the maximum wirelength reduction is 10%.

978-1-4244-2749-9/09/$25.00 ©2009 IEEE

III. ROUTING METHODOLOGY

The flow of our global router is shown in Fig. 1. For a better trade-off between runtime and quality, we do not apply the time-consuming 3D routing approach but adopt the paradigm of 3D-to-2D capacity mapping followed by planar (2D) routing and 2D-to-3D layer assignment, similar to previous routers [9], [19]. Unlike the aforementioned works, our planar routing features a few new techniques incorporated in the three major steps: (1) prerouting, (2) initial iterative monotonic routing, and (3) iterative forbidden-region rip-up/rerouting (IFR) (see Fig. 1). We detail these techniques in this