Computationally Created Soundscapes with Audio Metaphor

Miles Thorogood and Philippe Pasquier
School of Interactive Art and Technology
Simon Fraser University
Surrey, BC V3T 0A3 CANADA
mthorogo@sfu.ca

Abstract

Soundscape composition is the creative practice of processing and combining sound recordings to evoke auditory associations and memories within a listener. We present Audio Metaphor, a system for creating novel soundscape compositions. Audio Metaphor processes natural language queries derived from Twitter to retrieve semantically linked sound recordings from online user-contributed audio databases. We used simple natural language processing to create audio file search queries, and we segmented and classified audio files based on general soundscape composition categories. We used our prototype implementation of Audio Metaphor in two performances, seeding the system with keywords of current relevance, and found that the system produced a soundscape that reflected Twitter activity and kept audiences engaged for more than an hour.

1 Introduction

Creativity is a preeminent attribute of the human condition, and one being actively explored in artificial intelligence systems that aim to endow machines with creative behaviours. Artificial creative systems have simulated or been inspired by human creative processes, including painting, poetry, and music. The aim of these systems is to produce artifacts that humans would judge as creative. Much of the successful research in musical creative systems has focussed on symbolic representations of music, often with corpora of musical scores. Non-symbolic forms of music, by contrast, have been explored in far less detail.

Soundscape composition is a type of non-symbolic music that aims to rouse a listener's memories and associations of soundscapes using sound recordings. A soundscape is the audio environment perceived by a person in a given locale at a given moment.
A listener brings a soundscape to mind with higher cognitive functions, such as template matching of the perceived world against known sound environments and deriving meaning from the triggered associations (Botteldooren et al. 2011). People communicate their subjective appraisal of soundscapes using natural language descriptions, revealing the semiotic cues of soundscape experiences (Dubois and Guastavino 2006).

Soundscape composition is the creative practice of processing and combining sound recordings to evoke auditory associations and memories within a listener. It is positioned along a continuum between concrete music, which uses found sound recordings, and electro-acoustic music, which uses more abstracted types of sounds. Central to soundscape composition is the processing of sound recordings, and there is a range of approaches to using them. One approach is to portray a realistic place and time by using untreated audio recordings, or recordings with only minor editing (such as cross-fades). Another is to evoke imaginary circumstances by applying more intensive processing. In some cases, manufactured sound environments appear imaginary through the combination of largely untreated recordings with more heavily processed ones. For example, the soundscape composition Island, by Canadian composer Barry Truax (Truax 2009), adds a mysterious quality to a recognizable sound environment by contrasting clearly discernible wave sounds against less recognizable background drone and texture sounds.

Soundscape composition requires many decisions about selecting and cutting audio recordings and combining them artistically. These processes become exceedingly time consuming when large amounts of audio data are available, as is now the case with online databases.
Accordingly, various generative soundscape composition systems have automated many sub-procedures of the composition process, but we have not found any systems in the literature to date that use natural language processing for generative soundscape composition. Likewise, automatic audio segmentation into soundscape-composition-specific categories is an area not yet explored.

The system described here searches online for the most recent Twitter posts about a small set of themes. Twitter provides an accessible platform for millions of discussions and shared experiences through short text-based posts (Becker, Naaman, and Gravano 2010). In our research, audio file search queries are generated from natural language queries derived from Twitter. However, these requests could equally be a memory described by a user, a phrase from a book, or a section of a research paper.

Audio Metaphor accepts a natural language query (NLQ), which our algorithm turns into audio file search queries. The system searches online for audio files semantically related to word features in the NLQ. The resulting audio file recommendations are classified and segmented based

Proceedings of the Fourth International Conference on Computational Creativity 2013
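To make the NLQ-to-query step concrete, the following is a minimal illustrative sketch, not the authors' published algorithm (which is not detailed in this excerpt). It assumes a simple pipeline of stopword removal followed by pairing adjacent content words into two-word search queries, the kind of "simple natural language processing" the abstract alludes to; the stopword list and the function name `make_search_queries` are our own inventions.

```python
# A hypothetical sketch of generating audio file search queries from a
# natural language query (NLQ): strip stopwords, then pair the remaining
# content words into two-word queries suitable for an online audio search.

STOPWORDS = {"a", "an", "the", "in", "on", "at", "of", "and", "with", "is", "to"}

def make_search_queries(nlq: str) -> list[str]:
    # Lowercase, drop surrounding punctuation, and keep only
    # alphabetic tokens that are not stopwords.
    words = [w.strip(".,!?").lower() for w in nlq.split()]
    content = [w for w in words if w.isalpha() and w not in STOPWORDS]
    # Pair adjacent content words; fall back to single words if too few remain.
    if len(content) < 2:
        return content
    return [f"{content[i]} {content[i + 1]}" for i in range(len(content) - 1)]

print(make_search_queries("rain on a tin roof in the city"))
# → ['rain tin', 'tin roof', 'roof city']
```

Each resulting two-word query would then be submitted to a user-contributed audio database, and the returned recommendations passed on to the classification and segmentation stage.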