IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

A Hardware Implementation of a Run-Time Scheduler for Reconfigurable Systems

Juan Antonio Clemente, Javier Resano, Carlos González, and Daniel Mozos

Abstract—New generation embedded systems demand high performance, efficiency, and flexibility. Reconfigurable hardware can provide all these features. However, the costly reconfiguration process and the lack of management support have prevented a broader use of these resources. To solve these issues we have developed a scheduler that deals with task-graphs at run-time, steering their execution in the reconfigurable resources while carrying out both prefetch and replacement techniques that cooperate to hide most of the reconfiguration delays. In our scheduling environment, task-graphs are analyzed at design-time to extract useful information. This information is used at run-time to obtain near-optimal schedules, escaping from local-optimum decisions, while only carrying out simple computations. Moreover, we have developed a hardware implementation of the scheduler that applies all the optimization techniques while introducing a delay of only a few clock cycles. In the experiments our scheduler clearly outperforms conventional run-time schedulers based on as-soon-as-possible techniques. In addition, our replacement policy, specially designed for reconfigurable systems, achieves almost optimal results both regarding reuse and performance.

Index Terms—Field-programmable gate arrays (FPGA), reconfigurable architectures, task scheduling.

I. INTRODUCTION

In the last few years embedded devices have become more and more complex, including functionality initially developed for general purpose platforms such as multimedia support (sound processing, texture rendering, image and video display).
In fact, the new generation of portable devices has inherited the area and energy constraints of embedded systems, and at the same time they must achieve the performance required by multimedia applications. The best way to meet these constraints is to include some hardware (HW) support that can speed up the execution, and even reduce the energy consumption.

Manuscript received August 25, 2009; revised January 11, 2010 and April 25, 2010; accepted April 29, 2010. This work was supported by the Spanish Department of Science and Innovation under Grant TIN2009-09806 and Grant AYA2009-13300.
J. A. Clemente is with the Computer Architecture Department, Universidad Complutense de Madrid, Madrid, Spain (e-mail: ja.clemente@fdi.ucm.es).
J. Resano was with the Computer Architecture Department, Universidad Complutense de Madrid, Madrid, Spain. He is now with the Computer Engineering Department, Universidad de Zaragoza, Zaragoza, Spain (e-mail: jresano@unizar.es).
C. González and D. Mozos are with Universidad Complutense de Madrid, Madrid, Spain (e-mail: carlosgonzalez@fdi.ucm.es; mozos@fis.ucm.es).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2010.2050158

Traditionally this migration was carried out by developing application-specific integrated circuits (ASICs), which are silicon circuits customized for a particular use. However, although using ASICs is a very efficient option, three important drawbacks prevent their use as a general solution. First, the HW area in embedded/portable devices is very constrained. Hence only very critical functionality can be migrated to HW. Second, developing a new ASIC involves an increase in time-to-market, which is frequently a key factor for the success of a platform.
Finally, their functionality is fixed, and cannot be updated in order to fix detected bugs or improve the efficiency of the system.

One interesting option to overcome these three limitations is to include reconfigurable HW resources: run-time reconfiguration allows reusing the same HW for different functionalities in order to meet the area constraints; the time-to-market is considerably shorter for reconfigurable HW than for ASICs, because the physical platform has already been tested, and the new functionality can be tested on the target board from the beginning of the design cycle; finally, it offers an interesting tradeoff between performance and flexibility. Thus, this technology is especially suitable for applications that have dynamic and/or unpredictable behavior. In fact, Sony has developed its own reconfigurable architecture, and has included it in some portable devices.¹

In embedded systems, applications are often represented as one or several directed acyclic graphs (DAGs), where the nodes specify computational tasks and the edges represent precedence constraints. Managing the execution of these graphs efficiently is critical for embedded systems, so it is essential to optimize it. When dealing with reconfigurable systems, the following issues must be taken into account in order to handle DAGs efficiently. The system must manage the task-graph information and must guarantee that the execution meets the precedence constraints. It must schedule the task execution attempting to achieve the required performance. It must also efficiently schedule the run-time reconfigurations. Most current reconfigurable devices include only one reconfiguration circuit, and the reconfiguration latencies are frequently very significant (on the order of milliseconds); hence, if many reconfigurations are demanded in a short period of time, the performance of the system can be seriously affected.
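The interaction between precedence constraints and a single reconfiguration circuit can be illustrated with a minimal sketch. It is not the scheduler proposed in this paper: the task graph, the execution times, the 4 ms reconfiguration latency, and the names `schedule` and `makespan` are all hypothetical, and each task is assumed to have its own reconfigurable unit, so no replacement is modelled. The sketch loads tasks in a given order through one reconfiguration port, either on demand (a load starts only when the task's predecessors have finished) or with a naive prefetch (a load starts as soon as the port is free).

```python
# Hypothetical task graph: task -> set of predecessors (edges = precedence constraints).
PREDECESSORS = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
EXEC_MS = {"A": 5, "B": 3, "C": 4, "D": 2}  # assumed execution times (ms)
RECONFIG_MS = 4                              # assumed latency of one reconfiguration (ms)
ORDER = ["A", "B", "C", "D"]                 # assumed load order (a topological order)


def schedule(order, preds, exec_ms, reconfig_ms, prefetch=False):
    """Return {task: (load_start, exec_start, exec_end)} in milliseconds.

    A single reconfiguration circuit is modelled, so loads never overlap.
    Each task is assumed to own a reconfigurable unit (no replacement).
    """
    port_free = 0  # time at which the reconfiguration circuit becomes idle
    times = {}
    for task in order:
        # Precedence check: every predecessor must already be scheduled.
        missing = [p for p in preds[task] if p not in times]
        if missing:
            raise ValueError(f"{task} scheduled before {sorted(missing)}")
        # The task may execute only after all of its predecessors have finished.
        ready = max((times[p][2] for p in preds[task]), default=0)
        # On demand: wait until the task is ready. Prefetch: load as early as possible.
        load_start = port_free if prefetch else max(ready, port_free)
        load_end = load_start + reconfig_ms
        port_free = load_end
        exec_start = max(load_end, ready)
        times[task] = (load_start, exec_start, exec_start + exec_ms[task])
    return times


def makespan(times):
    """Completion time of the last task."""
    return max(end for _, _, end in times.values())
```

With these assumed numbers, `makespan(schedule(ORDER, PREDECESSORS, EXEC_MS, RECONFIG_MS))` is 27 ms, against a critical path of 11 ms when reconfiguration is free: every load is exposed, and task C additionally waits for the port while B is being loaded. Passing `prefetch=True` reduces the makespan to 18 ms, because most loads overlap with the execution of earlier tasks.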
When this happens the reconfiguration

¹ [Online]. Available: www.sony.net/Products/SC-HP/cx_news/vol42/pdf/sideview42.pdf

1063-8210/$26.00 © 2010 IEEE