ACS: an Addressless Configuration Support for Efficient Partial Reconfigurations

Jenny Yi-Chun Kuo, Anderson Kuei-An Ku, Jingling Xue, Oliver Diessel, Usama Malik*

School of Computer Science and Engineering, University of New South Wales, NSW 2052, Australia
{kuoy, kua, jingling, odiessel}@cse.unsw.edu.au
* Endace Technology Limited, Hamilton 3204, New Zealand
usama.malik@endace.com

Abstract

This paper presents a complete design of a reconfigurable architecture support system, called ACS (an Addressless Configuration Support), which provides efficient access to non-contiguous reconfigurable locations in reconfigurable systems. ACS reduces the amount of partial reconfiguration information required by removing a large amount of addressing information and padding as found in Virtex-4 bitstreams. ACS improves significantly on the distTree architecture previously proposed by us. ACS introduces the selector block, which connects the leaf nodes to a consecutive block of reconfiguration locations called a frame set. The system allows any number of leaf nodes customised to the size of the device, thereby providing much more flexibility. The hardware costs have also been reduced significantly over the distTree design. Together with the new marker loading mechanism, ACS is readily applicable to SRAM-based FPGAs. This new ACS system is benchmarked using eight real-world applications against a Virtex-4 device and the results show 6.83%-15.07% speedups when the reconfiguration granularity is set to a Virtex-4 frame.

1. Introduction

Reconfigurable systems are systems which consist of arrays of reconfigurable hardware such as Field Programmable Gate Arrays (FPGAs). The two main components of an FPGA are the logic blocks and the switch boxes that provide routing between these logic blocks. These switch boxes, as well as the contents of the logic blocks, are reprogrammable after fabrication to perform different tasks at different times.

Applications nowadays often have more intricate functionalities and require more hardware resources than are available. As a result, designs often cannot fit onto a single FPGA device [8]. The usual solutions are to obtain more logic either by networking multiple FPGAs or by performing some form of run-time reconfiguration [7]. He et al. [9] proposed a solution that finds optimal strategies for networking multiple FPGA devices using crossbars. However, having multiple FPGAs increases not only cost but also power consumption. The on-chip/off-chip latencies and synchronization between the devices are also of concern. In this paper, we take the second approach and present a new hardware support system, called ACS (an Addressless Configuration Support), which aims to facilitate fast run-time partial reconfigurations with little extra hardware cost.

1.1. Background

A run-time reconfiguration can be either a full or a partial reconfiguration. Each configuration is referred to as a context, and the context is stored in the form of a bitstream. A full reconfiguration requires a complete configuration of the whole system, while a partial reconfiguration requires only the differences between the old and the new configurations, resulting in a shorter bitstream and a smaller reconfiguration latency. The bitstreams are swapped in and out of an FPGA whenever new functionalities are required. Although run-time reconfigurable systems offer great flexibility and the ability to accommodate designs larger than the physical capacity of the device, a reconfiguration can take up a large proportion of the total execution time, reducing throughput. This delay is called the reconfiguration latency and is proportional to the length of the bitstream, which depends on the area to be reconfigured and the sparsity of the blocks being reconfigured.
This leads to one of the major shortcomings in current FPGA technologies [13] [14] [15].

1.2. Motivation

In reconfigurable systems, an unused reconfiguration context is usually stored in a memory external to the working hardware and is only swapped in when required. As a