Swarm Modulation: An algorithm for real-time spectral transformation

Ryan Janzen and Steve Mann
Department of Electrical and Computer Engineering, University of Toronto

Abstract—A novel class of modulation is introduced for frequency transformation and spectral synthesis in real time. Swarm modulation has potential applications in enhancing human hearing with extended frequency ranges; in medical diagnostics for electrocardiogram (ECG), electroencephalogram (EEG), and other medical signals; in RADAR analysis; in user-interface sonification; and in sound synthesis or non-synthetic sound transformation. Swarm modulation is a new way to transform signals, and is demonstrated here by transforming subsonic and ultrasonic sound into the audible range of human hearing.

Swarm modulation is based on the principle of phase-incoherent, frequency-roaming oscillation. Features in the frequency-time plane are reconstructed via a time-varying process, controllable with zero-latency reaction to new information. Swarm modulation allows prioritization of salient output spectral features for efficient processing, and overcomes the cyclic beating patterns that arise when Fourier- and wavelet-based methods are applied in a stationary manner.

Swarm modulation can flexibly re-map sound when a user expressively touches physical matter, creating vibration. By detecting subsonic, sonic, and ultrasonic vibrations, we can give materials a rich acoustic user-feedback response that can be adjusted, in real time, to sound like a bell, a xylophone, a dull piece of wood, or a variety of other objects. By dynamically controlling the output sound spectrum in response to the input spectrum, while maintaining a continuous, low-latency temporal response, the system imitates the physicality of touching a real object. Applied in control panels and expressive control surfaces, swarm modulation can create realistic sonic feedback for head-up operation of controls in critical applications.

I. INTRODUCTION

Swarm modulation will be mathematically defined in Section IV. First, we explain why there is a fundamental gap in real-time acoustic spectral transformations.

A. Why are Frequency Transformations Desired?

The human senses operate over various spectral bands: human hearing is typically sensitive from 20 Hz to 20 kHz, while vision can sense light from approximately 390 nm to 750 nm in wavelength [9]. To extend these abilities, computerized eyeglasses have been built that shift infrared light into the red part of the spectrum, all visible light to green, and ultraviolet light to blue, giving real-time vision beyond what a human alone can see [10]. Figure 2 illustrates.

In this work we attempt to do the same for hearing. This poses a greater challenge in real time: the durations of wave cycles are much closer to the minimum time increments we can perceive than they are for visible light, so there are complexities beyond simply "copying and mapping" sensor readings, as is done with pixels from video cameras.
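To make the principle named in the abstract concrete before the formal definition in Section IV, the following is a minimal, illustrative sketch (not the authors' algorithm) of phase-incoherent, frequency-roaming oscillation: a small bank of oscillators, each started at an independent random phase, whose instantaneous frequencies perform a bounded random walk within a target band. All names and parameters here (swarm_bank, roam_rate, the band edges) are hypothetical.

    import numpy as np

    def swarm_bank(n_osc, f_lo, f_hi, duration, fs=48000, roam_rate=200.0, seed=0):
        """Illustrative phase-incoherent, frequency-roaming oscillator bank.

        Each oscillator starts at an independent random phase (incoherence),
        and its instantaneous frequency performs a bounded random walk
        between f_lo and f_hi (roaming), so no fixed beating pattern forms.
        """
        rng = np.random.default_rng(seed)
        n = int(duration * fs)
        phase = rng.uniform(0.0, 2 * np.pi, n_osc)   # incoherent initial phases
        freq = rng.uniform(f_lo, f_hi, n_osc)        # initial roaming frequencies
        out = np.zeros(n)
        for i in range(n):
            # frequencies roam: bounded random walk within [f_lo, f_hi]
            freq += rng.normal(0.0, roam_rate / fs, n_osc) * (f_hi - f_lo)
            freq = np.clip(freq, f_lo, f_hi)
            phase += 2 * np.pi * freq / fs           # integrate frequency to phase
            out[i] = np.sin(phase).sum() / n_osc
        return out

    # Example: 16 oscillators roaming between 400 Hz and 800 Hz for 1 second.
    audio = swarm_bank(16, 400.0, 800.0, duration=1.0)

Because the phases are never aligned and the frequencies never settle, the summed output avoids the periodic constructive/destructive interference that stationary Fourier or wavelet resynthesis can exhibit.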
[Fig. 1 panels (axes: frequency vs. time): Fourier, Wavelet, Phase-Incoherent Cascaded Oscillators, and Swarm Modulation tilings of a desired broadband reconstruction zone; the Swarm Modulation panel shows fuzzy boundaries from dispatch functions, spreading of signal targets in the F-T plane by Range Density Function generation, zero-latency reactions from dispatch control, and non-repeating spectral painting.]

Fig. 1: Comparison of methods for tiling the frequency-time plane, with causal real-time reconstruction of a time-varying requested spectrum (shaped as '<' in this example). While Prolate Spheroidal Wave Functions [1–5] attempt to approach the precision limit of Heisenberg-Gabor uncertainty [6–8], swarm modulation overcomes coherent beating while enabling real-time, flexible spectral mapping with zero control latency, for highly realistic sound mapping.

The problem is especially acute when attempting to "hear" user interfaces, where we wish to create tactile input devices that are more expressive than a simple on/off button or switch, and which produce a rich acoustic feedback response beyond a simple "beep" whenever a button is pushed. Previous work has sonified [11, 12] user interfaces by triggering sound samples [13–15], which destroys most of the input information. Sensing the original acoustic sound allows the device to conform more closely to physical reality [10, 16–19]. The result allows a user to touch highly sensitive surfaces in order to expressively control a computer system, beyond binary on/off keys or switches [15, 19–23].

However, previous frequency transformations based on naturally occurring acoustic sound from human touch suffer from beating [21, 22], spectral dead zones in the input, and other artifacts that create a low-quality sonic response to touch. We therefore wish to create a new, flexible algorithm for frequency transformation. A key need exists in natural user interfaces [10]: transforming subsonic and ultrasonic vibrations into expressive sound in the audible range. We desire flexibility in mapping spectral ranges, and an audio rendering that is responsive to physical acoustics (as argued by Kapralos [16] and Parker & Heerema [17]).
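As a concrete illustration of the kind of spectral-range flexibility we are after (our own illustrative example, not the algorithm defined in Section IV), the sketch below remaps a detected frequency from an inaudible input band into a chosen audible band on a logarithmic frequency axis. The function name and band edges are hypothetical.

    import math

    def remap_frequency(f_in, in_band=(20e3, 80e3), out_band=(200.0, 2000.0)):
        """Map a frequency from an inaudible input band to an audible output band.

        The mapping is logarithmic, so a component's relative position on a
        log-frequency axis within the input band is preserved as the same
        relative position within the output band.
        """
        f_in = min(max(f_in, in_band[0]), in_band[1])  # clamp into the input band
        # position of f_in within the input band, on a log-frequency axis (0..1)
        t = math.log(f_in / in_band[0]) / math.log(in_band[1] / in_band[0])
        # same relative position within the audible output band
        return out_band[0] * (out_band[1] / out_band[0]) ** t

    # Example: a 40 kHz ultrasonic component sits halfway through the input
    # band on a log axis, so it lands halfway through the output band: ~632 Hz.
    print(remap_frequency(40e3))

Such a static mapping only chooses target frequencies; rendering them without beating or dead zones is the role of the swarm modulation process itself.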