Support-Reducing Decomposition for FPGA Mapping

Lucas Machado and Jordi Cortadella, Fellow, IEEE

Abstract—Decomposition is a technology-independent process, in which a large complex function is broken into smaller, less complex functions. The costs of two-level or factored-form representations (cubes and literals) are used in most decomposition methods, as they have a high correlation with the area of cell-based designs. However, this correlation is weaker for field-programmable gate arrays (FPGAs) based on look-up tables. Furthermore, local optimizations have limited power due to the structural bias of the circuit descriptions. This paper tries to reduce the structural biasing by remapping the LUT network and decomposing the derived functions using the support as cost function. The proposed method improves the FPGA mapping results of a commercial tool for the 20 largest MCNC benchmarks, with gains of 28% in delay plus 18% in area when targeting delay, and a reduction of 28% in area plus 14% in delay with area as cost function. Results with 23% less area and 6% less delay are obtained after physical synthesis (post place-and-route). Moreover, 12 of the best known results for delay (and 3 for area) of the EPFL benchmarks are improved.

Index Terms—Logic decomposition, logic synthesis, performance optimization, support reducing, technology mapping.

I. INTRODUCTION

FIELD-PROGRAMMABLE gate arrays (FPGAs) are integrated circuits consisting of programmable logic blocks and interconnections. FPGAs can be reprogrammed multiple times, and have a much smaller initial cost and production time in comparison with application-specific integrated circuits (ASIC). For these reasons, FPGAs are largely used for ASIC prototyping and low-volume applications. However, when compared to ASICs, the flexibility given by FPGAs comes at the expense of larger area, power consumption, and delay [2].
Recently, FPGAs started to be employed in the optimization of specific tasks in data centers, with technology leaders making efforts in hybrid solutions with ASICs and FPGAs [3], [4].

The FPGA implementation process inherited many techniques from the ASIC design flow. The use of well-established methods enabled the fast growth and wide usage of FPGAs, but these algorithms generally have cost functions customized for cell-based designs, in which the area is proportional to the number of transistors. Usual cost functions in logic synthesis are cubes in sum-of-product (SOP) forms, literals in Boolean function expressions, or nodes and levels of And-Inverter Graphs (AIGs). On the other hand, FPGAs based on look-up tables (LUTs) are composed of logic blocks with k inputs (typically 4 to 6), and each LUT can implement any logic function of up to k inputs. A study on this miscorrelation is presented in [5], showing that the reduction of nodes and levels in AIGs does not necessarily translate to fewer LUTs or less logic depth in the generated FPGA mapping.

This work was performed with the support of CNPq, Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brasil, in part by the Spanish Ministry for Economy and Competitiveness and the European Union (FEDER funds) under grant TIN2017-86727-C2-1-R, and in part by the Generalitat de Catalunya (2017 SGR 786). This paper was recommended by Associate Editor R. Drechsler. (Corresponding author: Lucas Machado.)
The authors are with the Computer Science Department, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain (e-mail: lmachado@cs.upc.edu; jordi.cortadella@upc.edu).

A. Previous work

Several works on FPGA mapping are based on cut enumeration, performing a covering of the subject graph using k-cuts [6]–[8]. These cut-based techniques vary in the algorithms, parameters, and cost functions used for cut enumeration and covering.
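To make the cut-based approach concrete, the following sketch enumerates the k-feasible cuts of a tiny subject graph. It is an illustrative Python example and not the algorithm of [6]–[8]: the function name `enumerate_cuts`, the dictionary-based graph encoding, and the example network are assumptions introduced here, and practical mappers add cut pruning and cost-driven cut selection that are omitted.

```python
from itertools import product


def enumerate_cuts(graph, k):
    """Enumerate all k-feasible cuts of every node.

    graph: dict mapping a node name to a tuple of its fanin names
           (empty tuple for a primary input), listed in topological order.
           This encoding is a hypothetical one chosen for the example.
    Returns a dict mapping each node to a set of cuts (frozensets of leaves).
    """
    cuts = {}
    for node, fanins in graph.items():
        node_cuts = {frozenset([node])}  # trivial cut: the node itself
        if fanins:
            # Combine one cut from each fanin and keep merges with <= k leaves.
            for combo in product(*(cuts[f] for f in fanins)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts


# Example subject graph: f = AND(AND(a, b), AND(c, d)).
GRAPH = {
    "a": (), "b": (), "c": (), "d": (),
    "n1": ("a", "b"),
    "n2": ("c", "d"),
    "f": ("n1", "n2"),
}

CUTS_K4 = enumerate_cuts(GRAPH, 4)
CUTS_K3 = enumerate_cuts(GRAPH, 3)
```

With k = 4 the root `f` has five cuts, including {a, b, c, d}, so a single 4-input LUT can cover the whole graph. With k = 3 that cut is discarded and at least two LUTs are required. In both cases the candidate cuts are dictated by the nodes that exist in the graph.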
Nevertheless, the quality of the solution heavily depends on the structure of the subject graph.

A second group of works relies on Binary Decision Diagrams (BDDs) to perform FPGA mapping [9]–[12]. BDDs usually provide per se a good starting point for FPGA mapping, as the redundant variables are removed and the structure size is reduced. Also, BDDs enable the use of functional techniques, reducing the structural bias. However, the complexity of BDDs increases significantly with the number of variables, which makes them computationally infeasible for large designs. Thus, BDD-based methods are often applied to portions of the circuit (partial collapsing), but these methods are also structurally biased. This work proposes to combine these two strategies, using both functional decomposition and cut-based mapping.

The idea of performing decomposition while reducing the support (and targeting FPGAs) has already been proposed. The support is minimized using don’t cares in [13], as explained in [14]. In [15], a complex decomposition aiming at support minimization is proposed, which identifies the compatibility of all variables (or classes) in the bound-set. BoolMap [9] uses the decomposition proposed in [15]. Our work proposes the restructuring of the LUT network using the support size as cost function, with the aid of simple and fast decompositions.

The support-reducing techniques presented in this paper are well-known methods, with the exception of the abstraction-based decompositions (see Section IV-E). Other decomposition methods could be considered, such as [15]–[18], which are slower, but could improve the quality of results. Still, the key idea is to consider the support size as the cost function for decomposition, which restructures the subject graph targeting LUT-based FPGAs, and not the techniques incorporated.

In this paper, a decomposition of a function F is considered support-reducing if the decomposing functions have a support size smaller than that of F.
This definition differs from [19], which limits the term support-reducing to disjoint-support decompositions (DSD).

© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. DOI 10.1109/TCAD.2018.2878187
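As a minimal illustration of the support-reducing definition given above, the sketch below computes the support of small Boolean functions by exhaustive enumeration and checks whether a candidate decomposition qualifies. It is only a sketch under stated assumptions: the helper names `support` and `is_support_reducing`, the encoding of functions as Python callables over a variable assignment, and the example functions F, G, and H are all introduced here and are not taken from the paper. The brute-force enumeration is exponential in the number of variables and is meant for toy examples only.

```python
from itertools import product


def support(f, variables):
    """Return the set of variables that f actually depends on."""
    sup = set()
    for v in variables:
        others = [x for x in variables if x != v]
        for values in product([0, 1], repeat=len(others)):
            env = dict(zip(others, values))
            # v is in the support if flipping it changes f for some assignment.
            if f({**env, v: 0}) != f({**env, v: 1}):
                sup.add(v)
                break
    return sup


def is_support_reducing(f, parts, variables):
    """True if every decomposing function has a smaller support than f."""
    size = len(support(f, variables))
    return all(len(support(g, variables)) < size for g in parts)


VARS = ["a", "b", "c", "d"]

# Hypothetical example: F = a*b + a*c*d, decomposed as F = G + H.
F = lambda e: (e["a"] & e["b"]) | (e["a"] & e["c"] & e["d"])
G = lambda e: e["a"] & e["b"]            # support {a, b}
H = lambda e: e["a"] & e["c"] & e["d"]   # support {a, c, d}

# Check that G and H really recompose F under OR for every assignment.
RECOMPOSES = all(
    F(env) == (G(env) | H(env))
    for env in (dict(zip(VARS, vals)) for vals in product([0, 1], repeat=4))
)

REDUCING = is_support_reducing(F, [G, H], VARS)
```

In this example G and H share the variable a, so the decomposition is not disjoint-support, yet both supports (sizes 2 and 3) are smaller than the support of F (size 4). Under the definition used in this paper it counts as support-reducing, whereas a definition restricted to DSD would exclude it.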