Proceedings of the NAACL HLT Workshop on Computational Approaches to Linguistic Creativity, pages 40–46, Boulder, Colorado, June 2009. © 2009 Association for Computational Linguistics

Automatic Generation of Tamil Lyrics for Melodies

Ananth Ramakrishnan A
AU-KBC Research Centre, MIT Campus, Anna University, Chennai, India
ananthrk@au-kbc.org

Sankar Kuppan
AU-KBC Research Centre, MIT Campus, Anna University, Chennai, India
sankar@au-kbc.org

Sobha Lalitha Devi
AU-KBC Research Centre, MIT Campus, Anna University, Chennai, India
sobha@au-kbc.org

Abstract

This paper presents our ongoing work to automatically generate lyrics for a given melody, for phonetic languages such as Tamil. We approach the task of identifying the required syllable pattern for the lyric as a sequence labeling problem and hence use the popular CRF++ toolkit for learning. A corpus comprising 10 melodies was used to train the system on the syllable patterns. The trained model is then used to predict the syllabic pattern for a new melody, producing an optimal sequence of syllables. This sequence is passed to the Sentence Generation module, which uses Dijkstra's shortest path algorithm to come up with a meaningful phrase matching the syllabic pattern.

1 Introduction

In an attempt to define poetry, Manurung (2004) provides three properties for a natural language artifact to be considered a poetic work, viz., Meaningfulness (M), Grammaticality (G) and Poeticness (P). A complete poetry generation system must generate texts that adhere to all three properties. In this work, we attempt to generate meaningful lyrics that match the melody; the poetic aspects of the lyric will be tackled in future work.

According to on-line resources such as How to write lyrics (Demeter, 2001), the generated lyric must have Rhythm, Rhyme and Repetition.

One recent attempt at automatically generating lyrics for a given melody is the Tra-la-Lyrics system (Oliveira et al., 2007).
This system uses the ABC notation (Gonzato, 2003) for representing melody and the corresponding suite of tools for analyzing the melodies. The key aspect of the system is its attempt to detect the strong beats present in the given melody and to place words with stressed syllables in the corresponding positions. It also evaluates three lyric generation strategies (Oliveira et al., 2007): random words+rhymes, sentence templates+rhymes and grammar+rhymes. Of these strategies, the sentence templates+rhymes approach aims for syntactic coherence, and the grammar+rhymes approach uses a grammar to derive Portuguese sentence templates. From the demo runs presented, we see that the system can generate grammatical sentences (when using an appropriate strategy). However, there is no attempt to bring Meaningfulness into the lyrics.

2 Lyric Generation for Tamil

Tamil, our target language for generating lyrics, is a phonetic language: there is a one-to-one relation between graphemes and phonemes. We make use of this property to come up with a generic representation for all words in the language. This representation, based on the phonemic syllables, consists