IEEE EMBEDDED SYSTEMS LETTERS, VOL. 1, NO. 3, OCTOBER 2009

Efficient Software Synthesis for Dynamic Single Appearance Scheduling of Synchronous Dataflow

Weichen Liu, Student Member, IEEE, Zonghua Gu, Member, IEEE, and Jiang Xu, Member, IEEE

Abstract—Synchronous dataflow (SDF) is a widely used model of computation for digital signal processing and multimedia applications. In this letter, we propose an automatic approach to synthesize efficient software from SDF models with improved runtime efficiency. Our synthesis technique is based on dynamic single-appearance scheduling (dynSAS), which generates software with the same minimized code size as a traditional single-appearance schedule (SAS) while requiring much less buffer memory. We systematically enhance dynSAS to reduce control flow overhead and increase memory utilization. Experimental results show that our approach generates efficient software with better runtime performance than related techniques.

Index Terms—Genetic algorithms, scheduling, software synthesis, synchronous dataflow.

I. INTRODUCTION

Synchronous dataflow (SDF) is a widely used model of computation for a broad class of DSP applications such as signal processing, multimedia, and wireless communications [1]. SDF graphs can be statically analyzed and scheduled, and can be used to generate implementations that are efficient in memory size and runtime performance on embedded systems with limited resources. A traditional single-appearance schedule (SAS) [2], in which each actor invocation appears exactly once in the program body, minimizes code memory but may require very large data memory. Some authors [3], [4] have developed efficient quasistatic scheduling techniques that construct a dynamic SAS (dynSAS) to minimize data memory. The challenge is to reduce the control flow overhead induced by runtime decisions, since a large number of runtime decisions can have a negative impact on performance.
In this letter, given any feasible buffer memory setting, we provide a solution that constructs a dynSAS with minimized control flow overhead.

Bjorklund [5] first developed an approach to synthesize SAS code from a given non-SAS using dynamic runtime decisions. Gu [4] enhanced Bjorklund’s approach by minimizing the number of runtime decisions. Ko [6] provides a generalized technique for compact schedule representations. Oh [3] developed the dynamic loop count single-appearance schedule (dlcSAS), which computes loop counts dynamically at runtime. This technique incurs performance overhead from the dynamic computation of loop counts in addition to the overhead of runtime decisions.

In this letter, we develop a software synthesis technique that improves upon [5], [3], and [4] to further reduce runtime overhead while avoiding any runtime loop count calculations. We propose an accurate model to measure control flow overhead, and use genetic algorithms together with a sophisticated synthesis approach to search for an optimized actor appearance order (AAO), not necessarily a topological sort, that minimizes runtime overhead.

This letter is structured as follows: we present the bounded greedy algorithm (BGA) for SDF scheduling in Section II, the software synthesis technique in Section III, performance evaluation results in Section IV, and conclusions in Section V.

Manuscript received November 23, 2009; revised December 27, 2009. First published January 08, 2010; current version published February 05, 2010. This manuscript was recommended for publication by P. Tabuada.
W. Liu and J. Xu are with the Hong Kong University of Science and Technology, Hong Kong, China (e-mail: weichen@ust.hk; eexu@ust.hk).
Z. Gu is with Zhejiang University, Hangzhou, China (e-mail: zgu@zju.edu.cn).
Digital Object Identifier 10.1109/LES.2009.2039851

II. THE BOUNDED GREEDY ALGORITHM

An SDF graph can be executed indefinitely by repeating a periodic schedule that follows the repetition vector.
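As a concrete illustration of the repetition vector, the sketch below computes it for a small graph from the balance equations, which require that on every edge the tokens produced per period equal the tokens consumed per period. It is a minimal sketch, not the authors' implementation: the function name `repetition_vector`, the `(src, dst, prod, cons)` edge tuples, and the assumption of a connected graph are all illustrative choices that do not come from the letter.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Return {actor: firings per period} for a connected SDF graph.

    edges is a list of (src, dst, prod, cons) tuples (hypothetical format):
    src produces `prod` tokens per firing, dst consumes `cons` per firing.
    Raises ValueError if the rates are inconsistent or the graph is
    not connected.
    """
    # Undirected adjacency with the firing ratio implied by each edge:
    # q[src] * prod == q[dst] * cons  =>  q[dst] = q[src] * prod / cons.
    adj = {a: [] for a in actors}
    for src, dst, prod, cons in edges:
        adj[src].append((dst, Fraction(prod, cons)))
        adj[dst].append((src, Fraction(cons, prod)))

    # Propagate rational firing rates from an arbitrary starting actor.
    rate = {actors[0]: Fraction(1)}
    stack = [actors[0]]
    while stack:
        a = stack.pop()
        for b, ratio in adj[a]:
            expected = rate[a] * ratio
            if b not in rate:
                rate[b] = expected
                stack.append(b)
            elif rate[b] != expected:
                raise ValueError("inconsistent rates: no periodic schedule")
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")

    # Scale by the least common multiple of the denominators to get integers.
    scale = reduce(lambda x, y: x * y // gcd(x, y),
                   (r.denominator for r in rate.values()), 1)
    counts = {a: int(r * scale) for a, r in rate.items()}
    # Defensive reduction so the result is the smallest positive solution.
    g = reduce(gcd, counts.values())
    return {a: c // g for a, c in counts.items()}


if __name__ == "__main__":
    # A -> B (produce 2, consume 3), B -> C (produce 1, consume 2)
    print(repetition_vector(["A", "B", "C"],
                            [("A", "B", 2, 3), ("B", "C", 1, 2)]))
```

For this example graph the balance equations are 2·q(A) = 3·q(B) and q(B) = 2·q(C), so one period fires A three times, B twice, and C once.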
We adopt the BGA, which tries to fire every actor as many times as possible, sequentially in the AAO (greedy), while respecting an upper bound equal to the actor’s entry in the repetition vector (bounded). It is a deterministic scheduling policy, developed to generate periodic schedules with minimized control flow overhead in the synthesized code.

Let $\mathit{buf}(e)$ be the buffer size of edge $e$, $\mathit{tok}(e)$ be the current number of tokens on edge $e$, and $\mathit{prd}(e)$ and $\mathit{cns}(e)$ be the production and consumption rates of edge $e$ (the number of tokens produced or consumed in one firing). The repetition vector $q$ can be obtained by solving the balance equation [1]. The loop count $\mathit{lc}(a)$ of actor $a$, defined as the number of firings of $a$ in an iteration, is constrained by three factors: the number of tokens in its input buffers, the number of free spaces in its output buffers, and its remaining number of firings $r(a)$, which is defined as $a$’s entry $q(a)$ in the repetition vector minus its accumulated number of firings:

$$\mathit{lc}(a) = \min\left( \min_{e \in \mathit{in}(a)} \left\lfloor \frac{\mathit{tok}(e)}{\mathit{cns}(e)} \right\rfloor,\; \min_{e \in \mathit{out}(a)} \left\lfloor \frac{\mathit{buf}(e) - \mathit{tok}(e)}{\mathit{prd}(e)} \right\rfloor,\; r(a) \right) \tag{1}$$

After actor $a$ fires $\mathit{lc}(a)$ times, the changes in the number of tokens on its set of incoming edges $\mathit{in}(a)$ and its set of outgoing edges $\mathit{out}(a)$ are

$$\mathit{tok}(e) \leftarrow \mathit{tok}(e) - \mathit{lc}(a) \cdot \mathit{cns}(e), \quad \forall e \in \mathit{in}(a); \qquad \mathit{tok}(e) \leftarrow \mathit{tok}(e) + \mathit{lc}(a) \cdot \mathit{prd}(e), \quad \forall e \in \mathit{out}(a) \tag{2}$$

Alg. 1 shows the pseudo-code for BGA. It was originally proposed for the schedulability test of an SDF graph [7], while we