Statistical Performance Analysis and Estimation for Parallel Multimedia Processing

Min Li · Tanja Van Achteren · Erik Brockmeyer · Francky Catthoor

Received: 4 April 2008 / Accepted: 4 November 2008 / Published online: 13 January 2009
© 2009 Springer Science + Business Media, LLC. Manufactured in The United States

Abstract When parallelizing complex multimedia processing on multiple processors, the stochastic timing behavior should be carefully studied. Although there are already many papers on the performance analysis of stochastic parallel systems, they do not target multimedia processing. In this paper, we first study an H.264/AVC encoder (running on x86) and a QSDPCM encoder (running on TI TMS32C62) to characterize important aspects of the stochastic timing behavior in complex multimedia processing applications. It is shown that the variation and correlation are indeed very significant. To make systematic analysis feasible, we apply the Stochastic Timed Marked Graph (STMG) as a formal model to capture the essential timing-related behaviors of parallel multimedia processing systems. Then, we show how local timing variations and correlations interact and propagate to the global timing behavior; from this we derive general parallelization guidelines. Furthermore, we develop an analytical performance estimation technique to derive the probability distribution of the timing behavior of parallel multimedia processing systems that exhibit correlated stochastic timing behaviors. The estimation technique is based on principal component analysis and approximations.

Keywords Statistical analysis · Parallel signal processing · Stochastic Timed Marked Graph

1 Introduction

Nowadays, parallelization has been widely recognized as an effective methodology to handle the challenges of low power and high performance in embedded systems. When parallelizing complex multimedia processing systems, the stochastic timing behavior has to be carefully handled.
Although some simple multimedia processing applications are relatively deterministic and static, many complex multimedia processing applications are inherently nondeterministic. First of all, many multimedia processing algorithms contain numerous conditions. Although some of them are manifest and can be analyzed statically, many depend on input data. For example, in many recently proposed motion estimation algorithms, the complexity depends on the motion field, and intensive object movement incurs high complexity. Moreover, many nondeterministic factors exist in the implementation platform, such as unpredictable cache misses, page misses, bus conflicts, wrong speculations, branch mispredictions, interrupts, coexisting applications, nondeterministic OS scheduling mechanisms, and so on. The importance of the stochastic behavior is expected to increase further in the near future, driven mainly by the drastically increasing complexity of multimedia processing and by technology scaling in the deep submicron era, which implies significant circuit-level variations.

When the stochastic behavior meets parallelization, significant problems arise. Because of intensive synchronization, concurrent threads with stochastic behaviors will block each other much more often and in a much more unpredictable way, so that the stochastic behavior will be propagated and

J Sign Process Syst (2010) 58:105–116
DOI 10.1007/s11265-008-0318-z

M. Li · T. V. Achteren · E. Brockmeyer · F. Catthoor (*)
IMEC, Kapeldreef 75, B-3000 Leuven, Belgium
e-mail: Francky.Catthoor@imec.be

M. Li
e-mail: limin@imec.be

T. V. Achteren
e-mail: Tanja.VanAchteren@imec.be

E. Brockmeyer
e-mail: Erik.Brockmeyer@imec.be