Performance-driven Global Placement via Adaptive Network Characterization

Mongkol Ekpanyapong and Sung Kyu Lim
School of Electrical and Computer Engineering, Georgia Institute of Technology
{pop,limsk}@ece.gatech.edu

Abstract

Delay minimization continues to be an important objective in the design of high-performance computing systems. In this paper, we present an effective methodology that guides the delay optimization process of mincut-based global placement via adaptive sequential network characterization. The contribution of this work is a fully automated approach for determining the critical parameters of performance-driven, multi-level, partitioning-based global placement with retiming. We validate our approach by incorporating this adaptive method into GEO, a state-of-the-art global placer. A-GEO, the adaptive version of GEO, achieves a 67% maximum and 22% average delay improvement over GEO.

1. Introduction

The placement problem can be divided into two classes: global placement and detailed placement. Global placement identifies the region in which each group of cells should be located, whereas detailed placement assigns an exact location to each cell while preserving the global placement solution. Global placement has recently taken on a more significant role as circuit constraints and complexity increase. There are three major approaches to global placement: mincut-based approaches [4,13,27,2,5], analytical approaches [10,15], and simulated annealing approaches [24,25]. Mincut-based placement works top-down: it recursively partitions the circuit into sub-netlists and assigns gates to tiles. Because of its fast running time and its flexibility in handling various constraints, it has been adopted in many modern state-of-the-art placers, including a state-of-the-art timing-driven placer [8].

With the rapidly increasing demand for high-performance computing, improving circuit performance during physical design has attracted considerable interest.
During physical planning, gate locations can be identified and then used to calculate wire delay accurately. With both gate and wire delays known, the total delay of the entire circuit can be computed. Circuit optimizations at the physical design level can exploit this knowledge and achieve better performance than the same optimizations performed without it. One optimization that can exploit this advantage is retiming [17]. Retiming is a logic optimization technique that shifts the positions of flip-flops (FFs) to minimize delay or to reduce the number of FFs [17]. Recently, retiming has become more attractive in physical design, where wire delay grows in importance with deep submicron technology [23,12]. Exploiting geometric information lets us further enhance retiming with floorplanning: since location information is available, wire delay can be calculated more accurately.

Retiming in physical design can be classified into two approaches: iterative and simultaneous. The iterative approach [26,18,19] first performs placement or floorplanning and then performs retiming. The simultaneous approach [8,6,22,9] performs placement or floorplanning together with retiming by incorporating retiming information during placement or floorplanning. In [9], the authors suggest that the simultaneous approach is better than the iterative one with respect to retiming delay improvement.

In [8], GEO, a state-of-the-art approach for mincut-based placement with retiming, was proposed. It adopts the concepts of Sequential Arrival Time (SAT) [21] and Sequential Required Time (SRT). The slack value, used to identify critical gates/clusters, is computed as the difference between SRT and SAT. An ε-network, which contains the set of critical cells, can then be identified. Assigning an additional delay weight α to the ε-network tends to group those critical cells closer together during circuit partitioning.
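The slack computation and the α weighting described above can be illustrated with a minimal sketch. It is not the implementation from [8]: the function names, the rule that a cell belongs to the ε-network when its slack is at most ε, and the rule that a net receives weight 1 + α when it connects at least two critical cells are all assumptions made here for illustration. The SAT and SRT values are taken as given inputs rather than computed from a netlist.

```python
from typing import Dict, List, Set


def compute_slack(sat: Dict[str, float], srt: Dict[str, float]) -> Dict[str, float]:
    """Slack of each cell: sequential required time minus sequential arrival time."""
    return {cell: srt[cell] - sat[cell] for cell in sat}


def epsilon_network(slack: Dict[str, float], eps: float) -> Set[str]:
    """ASSUMPTION: a cell is treated as critical when its slack is at most eps."""
    return {cell for cell, s in slack.items() if s <= eps}


def weight_nets(nets: Dict[str, List[str]], critical: Set[str],
                alpha: float) -> Dict[str, float]:
    """ASSUMPTION: a net gets weight 1 + alpha if it joins two or more critical cells."""
    weights = {}
    for name, cells in nets.items():
        n_critical = sum(1 for c in cells if c in critical)
        weights[name] = 1.0 + alpha if n_critical >= 2 else 1.0
    return weights


# Toy data (hypothetical values, not taken from the paper).
sat = {"a": 4.0, "b": 6.0, "c": 2.0}
srt = {"a": 4.0, "b": 6.5, "c": 9.0}
nets = {"n1": ["a", "b"], "n2": ["b", "c"]}

slack = compute_slack(sat, srt)
critical = epsilon_network(slack, eps=1.0)
weights = weight_nets(nets, critical, alpha=5.0)
```

In this toy case, cells a and b fall inside the ε-network, so cutting net n1 costs more than cutting n2, and a cutsize-minimizing partitioner would tend to keep a and b in the same partition.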
Cong et al. [9] extend the work of [8] by generalizing the model to handle gates/clusters with multiple outputs. However, both approaches pass the solution with the best weighted cutsize among all runs, i.e., the cutsize that incorporates retiming information, to the next floorplanning level as the best result.

In this paper, we show that although weighted cutsize correlates highly with retiming delay, there is no guarantee that the best weighted cutsize yields the best retiming delay among all runs. Next, we propose a methodology for determining the weight α assigned to critical cells/nets, instead of using a fixed constant value as in [8]. Furthermore, we suggest a way to properly identify critical cells, a role played by the ε parameter in [8,9], by the K paths in [1], or by a threshold of at least 90% of the critical path delay in [3]. In [22], the authors propose a way to identify critical edges using a criticality distribution. However, their method requires an iterative search for the factor value at which the edges with 90-100% criticality make up no more than 5% of the total number of edges. This is time