A Minimalistic Architecture for Reconfigurable WFS-Based Immersive-Audio

Dimitris Theodoropoulos, Georgi Kuzmanov, Georgi Gaydadjiev
D.Theodoropoulos@tudelft.nl, G.K.Kuzmanov@tudelft.nl, g.n.gaydadjiev@tudelft.nl
Computer Engineering Laboratory, EEMCS, TU Delft
P.O. Box 5031, 2600 GA Delft, The Netherlands
http://ce.et.tudelft.nl

Abstract: We propose a minimalistic processor architecture that tailors Wave Field Synthesis (WFS)-based audio applications to configurable hardware. Eleven high-level instructions provide the required flexibility for embedded WFS customization. We describe the implementation of the proposed instructions and apply them to a multi-core reconfigurable WFS architecture. Our approach combines software programming flexibility with improved hardware performance and low power consumption. Experimental results suggest that our Virtex4FX60-based FPGA prototype, running at 100 MHz, can provide a kernel speedup of up to 4.5 times compared to an OpenMP-annotated software solution implemented on a Core2 Duo at 3.0 GHz. Furthermore, when larger FPGAs are utilized, we estimate that our system can render up to 32 acoustic sources in real time when driving 64 loudspeakers. Finally, we estimate that the proposed system requires approximately 6 Watts, which is at least an order of magnitude less power than x86-based approaches.

I. INTRODUCTION

Wave Field Synthesis (WFS) is a technique that substantially improves sound quality over stereophony [1]. To reproduce the original acoustic wavefronts [2], it requires large loudspeaker arrays that must be properly driven. A survey of the literature reveals that the majority of experimental and commercial WFS systems are implemented using General Purpose Processors (GPPs). The primary reason is their high-level programming environment, which allows a system under development to be tested more rapidly.
For example, the designer can easily select different system parameters, such as the number of loudspeakers or the Finite Impulse Response (FIR) filter coefficient sets, in order to conduct various experiments. However, GPP-based systems have two drawbacks, namely limited processing capabilities and excessive power consumption.

To overcome these drawbacks, we propose a minimalistic processor architecture¹ for embedded WFS. The supporting programming model allows high-level interaction with a custom-hardware WFS processor, thus avoiding the long iterations otherwise needed to re-test a system that is under development. Moreover, our proposal combines the programming flexibility of software approaches with the high performance and comparatively low power consumption of contemporary reconfigurable hardware. The architecture implementation allows the use of a varying number of processing elements and is therefore suitable for mapping onto reconfigurable technology.

¹ Throughout this paper, we adopt the terminology from [3], according to which the computer architecture is the conceptual view and functional behavior of a computer system as seen by its immediate viewer, the programmer. The underlying implementation, also termed the micro-architecture, defines how the control and the datapaths are organized to support the architecture functionality.

More specifically, the contributions of this paper are the following:

∙ We propose a unique minimalistic processor architecture, which is specialized for WFS processing and consists of eleven instructions, a dedicated memory organization and a Special Purpose Register (SPR) file. The architecture is scalable and gives the programmer control over the underlying micro-architectural configuration. Thus, once written, the same program can be executed on various implementation configurations.
∙ We implement a hardware prototype of our architecture as an embedded Multi-Core WFS Processor (MC-WFSP) on a V4FX60 FPGA. Our prototype can render up to 32 sources in real time when driving 56 loudspeakers, while larger FPGAs could accommodate systems that support 64 loudspeakers.

∙ Experimental results suggest that our prototype can process data 4.5 times faster than an OpenMP-annotated software implementation on a Core2 Duo running at 3.0 GHz. Our hardware design also provides a power-efficient solution. It consumes approximately 6 Watts, which is an order of magnitude less power than x86-based systems, which require tens of Watts when in operation.

The rest of the paper is organized as follows: Section II provides a brief background on the WFS technique and references to systems that utilize it. In Section III, we propose our architecture, while Section IV presents its hardware implementation. In Section V, we compare our prototype against a software approach and related work. Finally, Section VI concludes the paper.

II. BACKGROUND AND RELATED WORK

Background: As mentioned in Section I, WFS utilizes loudspeaker arrays to render all acoustic sources. In order to drive the i-th element of an L-sized array with