Analysis/Synthesis Approaches for Creatively Processing Video Signals

Javier Villegas
School of Information: Science, Technology, and Arts
University of Arizona
javier.villegasp@gmail.com

Angus Graeme Forbes
University of Illinois at Chicago
Department of Computer Science
aforbes@uic.edu

ABSTRACT
This paper explores methods for the creative manipulation of video signals and the generation of animations through a process of analysis and synthesis. Our approach involves four distinct steps, and different creative outputs based on video inputs can be obtained by choosing different alternatives at each of the steps. First, we decide which features to extract from an input video sequence. Next, we choose a matching strategy to associate the features between a pair of video frames. Then, we choose a way to interpolate between corresponding features within these frames. Finally, we decide how to render these elements when resynthesizing the signal. We illustrate our approach with a range of different examples, including video manipulation experiments, animations, and real-time multimedia installations.

Categories and Subject Descriptors
I.3.3 [Computer Graphics]: Picture/Image Generation; I.4.9 [Image Processing and Computer Vision]: Applications; J.5 [Computer Applications]: Art and Humanities—fine arts, performing arts

General Terms
Algorithms, Design, Experimentation

Keywords
Analysis/Synthesis, resynthesis techniques, video processing, animation, media arts, computer graphics

1. INTRODUCTION
Signal alteration is a well established means for artistic expression in the visual arts. Popular tools such as Photoshop, Instagram, and After Effects enable a user to explore creative effects by, for instance, applying filters to an input image or video.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
MM’14, November 03 - 07 2014, Orlando, FL, USA.
Copyright is held by the owner/author(s). Publication rights licensed to ACM.
ACM 978-1-4503-3063-3/14/11 $15.00.
http://dx.doi.org/10.1145/2647868.2654944.

We introduce a powerful strategy for the manipulation of video signals that combines the processes of analysis and synthesis. After an analysis process, a signal is represented by a series of elements or features. This representation can be more appropriate than the original for a wide range of applications, including, for example, the compression and transmission of video signals [27]. As we describe in this paper, it can also be used to generate new, modified instances of the starting signal.

In the audio domain, Analysis/Synthesis (hereafter, A/S) strategies have been used extensively in creative applications. The phase vocoder is perhaps the best-known A/S audio processing algorithm [8]. With the phase vocoder it is possible to manipulate the duration and the pitch of a signal independently. Other popular effects include dispersion, robotization, whisperization, and automatic tuning [44]. However, although A/S techniques are sometimes used for processing videos, there is in general less emphasis on using A/S for creative, real-time processing of video signals.
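To make the four steps listed in the abstract concrete, the following is a minimal sketch and not the implementation used in this work. It assumes point features (the k brightest pixels of a grayscale frame), greedy nearest-neighbor matching, linear interpolation, and single-pixel rendering. All function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def extract_features(frame, k):
    """Analysis (assumed feature type): positions of the k brightest pixels."""
    flat = np.argsort(frame, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat, frame.shape)
    return np.stack([rows, cols], axis=1).astype(float)

def match_features(a, b):
    """Matching (assumed strategy): greedy nearest-neighbor pairing.

    Expects len(a) == len(b). Returns b reordered so that row i of the
    result is the partner of row i of a.
    """
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    order = np.empty(len(a), dtype=int)
    for _ in range(len(a)):
        # Take the closest remaining pair, then retire both features.
        i, j = np.unravel_index(np.argmin(dist), dist.shape)
        order[i] = j
        dist[i, :] = np.inf
        dist[:, j] = np.inf
    return b[order]

def interpolate(a, b_matched, t):
    """Interpolation: linear blend of matched positions, t in [0, 1]."""
    return (1.0 - t) * a + t * b_matched

def render(points, shape):
    """Synthesis (assumed renderer): one white pixel per feature."""
    out = np.zeros(shape)
    idx = np.rint(points).astype(int)
    rows = np.clip(idx[:, 0], 0, shape[0] - 1)
    cols = np.clip(idx[:, 1], 0, shape[1] - 1)
    out[rows, cols] = 1.0
    return out

def in_between(frame_a, frame_b, t, k=2):
    """Run all four stages to synthesize a frame between frame_a and frame_b."""
    a = extract_features(frame_a, k)
    b = extract_features(frame_b, k)
    return render(interpolate(a, match_features(a, b), t), frame_a.shape)
```

Swapping any one of the four functions (for example, a different feature detector or a different renderer) changes the visual result while leaving the other stages untouched, which is the kind of per-stage choice the abstract describes.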
In one sense, many effects applied to static images, including mosaicing, pointillism, and other non-photorealistic representations, can be thought of as A/S processes. In these processes, a particular set of features (e.g., regions, lines, or objects) is identified through an analysis of the input image. These features are then used to describe new elements, which are in turn synthesized into a modified version of the original image. Thus, despite potentially extreme modifications, the newly created, non-photorealistic image retains many aspects of the identity of the original input. In extending this technique to video input, a common problem with A/S techniques, and with many other non-photorealistic rendering approaches, is that when frames are analyzed independently the detected features can vary abruptly between consecutive frames. This is due to the nature of the detection algorithm or its sensitivity to noise. These rapid variations create distracting artifacts when the independently synthesized frames are put together in an animation. According to Bénard et al., this issue of temporal coherence has prevented non-photorealistic techniques, or stylized animations, from being more widely adopted for video manipulation [1].

We present a novel approach to the implementation of A/S techniques applied to video signals. Our approach involves matching elements between image pairs, i.e., video frames, and constructing video processing techniques over one or more of four distinct stages, each of which enables different creative decisions to be made. These matchings are not necessarily constrained to the image itself, but instead, as we show in Section 4, can take place in a different do-