IEEE EMBEDDED SYSTEMS LETTERS, VOL. 1, NO. 3, OCTOBER 2009

Efficient Software Synthesis for Dynamic Single Appearance Scheduling of Synchronous Dataflow

Weichen Liu, Student Member, IEEE, Zonghua Gu, Member, IEEE, and Jiang Xu, Member, IEEE

Abstract: Synchronous dataflow (SDF) is a widely used model of computation for digital signal processing and multimedia applications. In this letter, we propose an automatic approach to synthesize efficient software from SDF models with improved runtime efficiency. Our synthesis technique is based on dynamic single-appearance scheduling (dynSAS), which generates software with the same minimized code size as a traditional single-appearance schedule (SAS) while requiring much less buffer memory space. We enhance dynSAS systematically to reduce control flow overhead and increase memory utilization. Experimental results show that our approach can generate efficient software with enhanced runtime performance compared to related techniques.

Index Terms: Genetic algorithms, scheduling, software synthesis, synchronous dataflow.

I. INTRODUCTION

Synchronous dataflow (SDF) is a widely used model of computation for a broad class of DSP applications such as signal processing, multimedia, and wireless communications [1]. SDF models can be statically analyzed and scheduled, and can be used to generate implementations that are efficient in terms of memory size and runtime performance on embedded systems with limited resources. A traditional single-appearance schedule (SAS) [2], in which each actor invocation appears exactly once in the program body, minimizes code memory but may require very large data memory. Some authors [3], [4] have developed efficient quasi-static scheduling techniques that construct a dynamic SAS (dynSAS) to minimize data memory. The challenge is how to reduce the control flow overhead induced by runtime decisions, since a large number of runtime decisions can have a negative impact on performance.
In this letter, given any feasible buffer memory setting, we provide a solution that constructs a dynSAS with minimized control flow overhead.

Bjorklund [5] first developed an approach to synthesize SAS code from a given non-SAS using dynamic runtime decisions. Gu [4] enhanced Bjorklund's approach by minimizing the number of runtime decisions. Ko [6] provides a generalized technique for compact schedule representations. Oh [3] developed the dynamic loop count single-appearance schedule (dlcSAS) by computing loop counts dynamically at runtime. This technique incurs some performance overhead from the dynamic computation of loop counts in addition to runtime decision overheads.

In this letter, we develop a software synthesis technique that improves upon [5], [3], and [4] to further reduce runtime overheads while avoiding any runtime loop count calculations. We propose an accurate model to measure control flow overhead, and use genetic algorithms to search for an optimized actor appearance order (AAO) (not necessarily a topological sort) that minimizes runtime overhead, together with a sophisticated synthesis approach.

This letter is structured as follows: we present the bounded greedy algorithm (BGA) for SDF scheduling in Section II; the software synthesis technique is presented in Section III; performance evaluation results are given in Section IV, and conclusions in Section V.

Manuscript received November 23, 2009; revised December 27, 2009. First published January 08, 2010; current version published February 05, 2010. This manuscript was recommended for publication by P. Tabuada.
W. Liu and J. Xu are with the Hong Kong University of Science and Technology, Hong Kong, China (e-mail: weichen@ust.hk; eexu@ust.hk).
Z. Gu is with Zhejiang University, Hangzhou, China (e-mail: zgu@zju.edu.cn).
Digital Object Identifier 10.1109/LES.2009.2039851

II. THE BOUNDED GREEDY ALGORITHM

SDF graphs can be scheduled infinitely by repeating a periodic schedule following the repetition vector.
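As a brief illustration of the repetition vector referred to above, the following sketch computes it for a small connected SDF graph. It relies on the standard balance condition that, for every edge from actor u to actor v with production rate p and consumption rate c, q(u)·p = q(v)·c, and returns the smallest positive integer solution. The function name `repetition_vector`, the edge tuple layout `(src, dst, prod, cons)`, and the three-actor example are assumptions made for illustration only; they are not taken from this letter. Python 3.9 or later is assumed for the multi-argument `math.gcd` and `math.lcm`.

```python
from fractions import Fraction
from math import gcd, lcm


def repetition_vector(actors, edges):
    """Return the smallest positive integer solution of the balance equations.

    actors: list of actor names (hypothetical representation).
    edges:  list of (src, dst, prod, cons) tuples (hypothetical layout).
    Raises ValueError if the graph is disconnected or the rates are inconsistent.
    """
    # Undirected adjacency list; each entry stores the ratio q[nbr] / q[node]
    # implied by the balance condition q[src] * prod == q[dst] * cons.
    adj = {a: [] for a in actors}
    for src, dst, prod, cons in edges:
        adj[src].append((dst, Fraction(prod, cons)))
        adj[dst].append((src, Fraction(cons, prod)))

    # Propagate exact rational firing counts from an arbitrary start actor.
    q = {actors[0]: Fraction(1)}
    stack = [actors[0]]
    while stack:
        node = stack.pop()
        for nbr, ratio in adj[node]:
            value = q[node] * ratio
            if nbr not in q:
                q[nbr] = value
                stack.append(nbr)
            elif q[nbr] != value:
                raise ValueError("inconsistent rates: no repetition vector")
    if len(q) != len(actors):
        raise ValueError("graph is not connected")

    # Scale to integers, then defensively divide out any common factor.
    scale = lcm(*(f.denominator for f in q.values()))
    ints = {a: int(f * scale) for a, f in q.items()}
    common = gcd(*ints.values())
    return {a: v // common for a, v in ints.items()}


if __name__ == "__main__":
    # Hypothetical chain: A produces 2 / B consumes 3, B produces 1 / C consumes 2.
    print(repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)]))
```

For the hypothetical chain in the example, A must fire 3 times, B twice, and C once per period, so every edge returns to its initial token count after one pass of the periodic schedule.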
We adopt the BGA, which tries to fire every actor as many times as possible, sequentially in the AAO (greedy), while respecting an upper bound equal to the actor's entry in the repetition vector (bounded). It is a scheduling policy with deterministic behavior, developed for generating periodic schedules with minimized control flow overhead in the synthesized code.

Let $b(e)$ be the buffer size of edge $e$, $t(e)$ be the current number of tokens on edge $e$, and $p(e)$ and $c(e)$ be the production and consumption rates of edge $e$ (the number of tokens produced or consumed in one firing). The repetition vector $q$ can be obtained by solving the balance equation [1]. The loop count $l(a)$ of actor $a$, defined as the number of firings of $a$ in an iteration, is constrained by three factors: the number of tokens in its input buffers, the number of free spaces in its output buffers, and its remaining number of firings $r(a)$, which is defined as $a$'s entry in the repetition vector minus its accumulated number of firings:

$$ l(a) = \min\left( \min_{e \in \mathrm{in}(a)} \left\lfloor \frac{t(e)}{c(e)} \right\rfloor,\ \min_{e \in \mathrm{out}(a)} \left\lfloor \frac{b(e) - t(e)}{p(e)} \right\rfloor,\ r(a) \right) \tag{1} $$

After actor $a$ fires $l(a)$ times, the changes in the number of tokens on its set of incoming edges $\mathrm{in}(a)$ and its set of outgoing edges $\mathrm{out}(a)$ are

$$ \Delta t(e) = \begin{cases} -\,l(a)\,c(e), & e \in \mathrm{in}(a) \\ +\,l(a)\,p(e), & e \in \mathrm{out}(a) \end{cases} \tag{2} $$

1943-0663/$26.00 © 2009 IEEE

Alg. 1 shows the pseudocode for BGA. It was originally proposed for the schedulability test of an SDF graph [7], while we