International Journal of Computer Applications (0975 – 8887), Volume 132 – No. 4, December 2015
Festival and Festvox Framework Tools for Marathi Text-to-Speech Synthesis
Sangramsing Nathusing Kayte, Research Scholar, Department of Computer Science & IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad.
ABSTRACT
We describe in detail a Grapheme-to-Phoneme (G2P) converter required for the development of a good-quality Marathi Text-to-Speech (TTS) system. The Festival and Festvox framework was chosen for developing the Marathi TTS system. Since Festival does not provide complete language processing support specific to various languages, it needs to be augmented to facilitate the development of TTS systems in certain new languages. For this reason, a generic G2P converter has been developed. In the customized Marathi G2P converter, we handle schwa deletion and compound word extraction. In experiments carried out to test the Marathi G2P on a text segment of 2485 words, a word phonetisation accuracy of 91.47% was obtained. This Marathi G2P has been used for phonetising large text corpora, which in turn are used in designing an inventory of phonetically rich sentences. The sentences ensure good coverage of the phonetically valid di-phones using only 1.3% of the complete text corpora.
Keywords
Grapheme-to-Phoneme (G2P), TTS, Festival, Festvox, di-phone, ICT.
1. INTRODUCTION
Marathi is an Indo-Aryan language spoken predominantly by the Marathi people of Maharashtra. It is the official language of Maharashtra and a co-official language of Goa, both states of Western India, and is one of the 23 official languages of India. There were 73 million speakers in 2001; Marathi ranks 19th in the list of most spoken languages in the world. Marathi has the fourth largest number of native speakers in India. Marathi has some of the oldest literature of all modern Indo-Aryan languages, dating from about 900 AD.
The major dialects of Marathi are Standard Marathi and the Varhadi dialect. There are other related languages such as Khandeshi, Dangi, Vadvali and Samavedi. Malvani Konkani has been heavily influenced by Marathi varieties.
Only a very small percentage of Indians use English as a means of communication. This fact, coupled with the prevalent low literacy rates, makes the use of conventional user interfaces difficult in India. Spoken language interfaces enabled with Text-to-Speech synthesis have the potential to make information and other ICT-based services accessible to a large proportion of the population [1][2][3][18]. However, good-quality Marathi TTS systems that can be used for real-time deployment are not available. Though a number of research prototypes of Indian language TTS systems have been developed [2][3], none of these is of a quality comparable to commercial-grade TTS systems in languages like English, German and French. The foremost reason for this is that developing a TTS system in a new language needs inputs for resolving language-specific issues, which requires close collaboration between linguists and technologists. A large amount of annotated data is required for developing language-processing modules like part-of-speech (POS) taggers, syntactic parsers, and intonation and duration models. This is a resource-intensive task, which is prohibitive for academic researchers in emerging economies like India [3][17][18][19].
In this paper, we describe the effort at HP Labs India to develop a Hindi TTS system based on the open source TTS framework, Festival [4][17][18][19]. This effort is a part of the Local Language Speech Technology Initiative (LLSTI) [2], which facilitates collaboration between motivated groups around the world by enabling the sharing of tools, expertise, support and training for TTS development in local languages.
LLSTI aims to develop a TTS framework around Festival that will allow for rapid development of TTS systems in any language. The focus of this paper is on a Marathi G2P converter and on the design of phonetically balanced sentences. Certain linguistic characteristics of the Marathi language, like a large phone set and the schwa deletion issue, pose problems for developing unit selection inventories and Grapheme-to-Phoneme converters for a Marathi TTS system. Section 3 describes the modules and algorithms developed at HP Labs India to overcome these problems. In Section 4, the design of phonetically rich Marathi sentences is presented. Though the effort is focused on Marathi, attention has been paid to making these modules as language-independent as possible and scalable to other languages. The following section outlines the reasons for choosing Festival as the base for developing such a system [2].
2. FESTIVAL FRAMEWORK
The Festival framework has been used extensively by the research community in speech synthesis. It has been chosen for implementing the Marathi TTS system because of its flexible and modular architecture, ease of configuration, and the ability to add new external modules. However, TTS system development for a new language requires a substantial amount of work, especially in the text processing modules. The language processing modules in Festival are not adequate for certain languages, and the reliance on Scheme as a scripting language makes it difficult for linguists to incorporate the necessary language-specific changes within Festival. Thus, new tools or modules need to be plugged into Festival [4].
3. TEXT ANALYSIS MODULES CREATED BY HP LABS INDIA
In a TTS system, the G2P module converts the normalized orthographic text input into the underlying linguistic and phonetic representation. G2P conversion, therefore, is the most