The 26th Annual Conference of the Japanese Society for Artificial Intelligence, 2012

1K1-IOS-1a-5

Discovering Emotion Features in Symbolic Music

Rafael Cabredo, Paul Salvador Inventado, Roberto Legaspi, Masayuki Numao
The Institute of Scientific and Industrial Research, Osaka University

Contact: Rafael Cabredo, The Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka, 567-0047, Japan, tel: +81-6-6879-8426, fax: +81-6-6879-8428, cabredo@ai.sanken.osaka-u.ac.jp

Current music recommender systems use only basic information for recommending music to their listeners, typically artist, album, genre, tempo, and other song metadata. Online recommender systems may also include ratings and annotation tags contributed by other people. We propose a recommender system that recommends music depending on how the listener wants to feel while listening to it. The user-specific model we use is derived by analyzing the brain waves of the subject while she actively listened to emotion-inducing music. The brain waves are analyzed with an emotion spectral analysis method to derive the emotional state of the listener for different segments of the music. These emotional states are used to label segments of music, which are fed into a supervised machine learning technique to build an emotion model. The emotion model is then used to identify the music features that are important for recognizing specific emotional states.

1. Introduction

Music induces different kinds of emotions. From the research of [Gabrielsson 03, Kim 10, Livingstone 10], it is known that specific music features cause these changes in emotion. For example, songs with a fast tempo, in a major key, with simple harmony and high pitch generally make people feel happy and excited, while songs with the opposite features, such as a slow tempo, a minor key, low pitch, and complicated harmony, are considered songs that can elicit sadness, despair, or melancholy. By recognizing these music features, it becomes possible to anticipate or even change the emotion or mood of a listener. This can be done automatically by using machine learning techniques to learn the dependencies between music features and a ground truth of emotion labels.

A common problem encountered by previous work is the limited availability of emotion annotations, since annotating music takes a lot of time and resources. Lin et al. [Lin 11] review various work on music emotion classification and utilize the vast amount of online social tags to improve emotion classification. However, a personalized emotion model for labelling music would still be desirable: music that is relaxing for some people may be stressful for others.

Songs are also usually annotated with only the most prominent emotion (i.e., one emotion label per song). Multi-label classification [Trohidis 08] can be used to obtain richer emotion annotations, but these annotations are still discrete labels.

In this work, emotion changes across the entire song are recorded and analyzed to learn how the music features affect these changes. Instead of discrete labels, a continuous annotation is used to give a fine-grained description of emotion changes. One method to acquire continuous emotion labels is to use brain waves, similar to the work used to develop the Constructive Adaptive User Interface (CAUI), which can arrange [Legaspi 07, Numao 97] and compose [Numao 02] music based on one's impressions of music.
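As a concrete illustration of the supervised learning step described above, the short Python sketch below trains a classifier on hand-picked high-level features and inspects which features matter most for separating emotion labels. The feature set, the random-forest classifier, and the toy data are illustrative assumptions only, not the actual features, learner, or data used in this work.

    # Hedged sketch: a supervised classifier over hand-picked music features.
    # Feature names and values are illustrative assumptions, not the actual
    # features or data used in this paper.
    from sklearn.ensemble import RandomForestClassifier

    # Each row: [tempo (BPM), mode (1 = major, 0 = minor), mean pitch (MIDI number)]
    features = [
        [140, 1, 72],   # fast, major, high pitch  -> labelled "joyful"
        [ 60, 0, 48],   # slow, minor, low pitch   -> labelled "sad"
        [ 80, 1, 60],   # moderate, major          -> labelled "relaxing"
        [150, 0, 55],   # fast, minor              -> labelled "stressful"
    ]
    labels = ["joyful", "sad", "relaxing", "stressful"]

    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(features, labels)

    # Feature importances indicate which music features matter most for
    # recognizing the emotion labels.
    print(dict(zip(["tempo", "mode", "mean_pitch"], model.feature_importances_)))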
2. Data Collection Methodology

A user-specific model is built by using supervised machine learning techniques to classify songs using music features. As mentioned earlier, this task requires songs that can elicit emotions from a listener, together with the music features of these songs.

For this research, a 29-year-old female subject was recruited to select and annotate songs. The music collection is a set of MIDI files comprising 121 Japanese and Western songs: 33 Folk, 20 Jazz, 44 Pop, and 24 Rock. By using MIDI files, the music information can be easily extracted to produce high-level features for the classifier. MIDI files also eliminate any additional emotions contributed by lyrics.

2.1 Emotion annotation

Music emotion annotation is performed in three stages. First, the subject listened to all songs and manually annotated each one. The subject was instructed to listen to the entire song and was given full control over which parts of the song she wanted to listen to. After listening to each song, the subject gave a general impression of how joyful, sad, relaxing, and stressful the song was using a five-point Likert scale. Aside from the emotions felt, the subject was also asked to rate how familiar she was with the song using the same scale.

With this feedback, the 10 most relaxing songs and the 10 most stressful songs, with varying levels of familiarity to the subject, were chosen. The manual annotation was done in one session lasting approximately one and a half hours.

Since collecting the emotion annotations demands a lot of time and effort from the subject, it was decided to concentrate time and resources on a certain type of emotion, specifically relaxing music. Relaxing music was chosen because it is the kind of music people normally want to listen to on stressful days. The stressful songs are meant to serve as negative examples for the classifier.

In the second stage, an electroencephalograph (EEG) was used to measure brain wave activity while the subject listened to the selected songs.
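To illustrate how such high-level features can be pulled out of MIDI files, the sketch below uses the pretty_midi library to compute tempo and pitch statistics for one song. The library choice, the file name, and the particular feature set are assumptions made for illustration; the paper does not specify its extraction toolchain.

    # Hedged sketch: extracting a few high-level features from a MIDI file.
    # The library (pretty_midi) and the feature set are illustrative choices.
    import pretty_midi

    def extract_features(path):
        midi = pretty_midi.PrettyMIDI(path)
        # Collect all pitched notes, ignoring percussion tracks.
        notes = [n for inst in midi.instruments if not inst.is_drum
                 for n in inst.notes]
        pitches = [n.pitch for n in notes]
        durations = [n.end - n.start for n in notes]
        return {
            "tempo": midi.estimate_tempo(),               # rough BPM estimate
            "mean_pitch": sum(pitches) / len(pitches),    # overall pitch height
            "pitch_range": max(pitches) - min(pitches),   # pitch spread
            "mean_duration": sum(durations) / len(durations),
            "note_density": len(notes) / midi.get_end_time(),
        }

    # Usage (hypothetical file name):
    # print(extract_features("song_001.mid"))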