Proceedings of the 9th Workshop on Asian Language Resources, pages 2–9, Chiang Mai, Thailand, November 12 and 13, 2011.

A Grammar Checker for Tagalog using LanguageTool

Nathaniel Oco
Center for Language Technologies
College of Computer Studies
De La Salle University
2401 Taft Avenue
Malate, Manila City
1004 Metro Manila
Philippines
nathanoco@yahoo.com

Allan Borra
Center for Language Technologies
College of Computer Studies
De La Salle University
2401 Taft Avenue
Malate, Manila City
1004 Metro Manila
Philippines
borgz.borra@delasalle.ph

Abstract

This document outlines the use of LanguageTool for a Tagalog grammar checker. LanguageTool is an open-source rule-based engine that offers grammar and style checking functionalities. The linguistic resources that LanguageTool requires for the Tagalog language are outlined and discussed. These are the tagger dictionary and the rule file, which use the notation of LanguageTool. The expressive power of LanguageTool's notation is analyzed to determine whether it captures Tagalog linguistic phenomena. The system was tested on a collection of sentences, with the following results: 91% precision, 51% recall, and 83% accuracy.

1 Credits

LanguageTool was developed by Naber (2003). It can run as a stand-alone program and as an extension for OpenOffice.Org¹ and LibreOffice². LanguageTool is distributed through LanguageTool's website: http://www.languagetool.org/.

¹ OpenOffice.Org is available at http://www.openoffice.org/
² LibreOffice is available at http://www.libreoffice.org/

2 Introduction

LanguageTool is an open-source style and grammar checker that follows a manual-based rule-creation approach. It uses rules stored in an XML file to analyze and check text input. The text input is separated into sentences, each sentence is separated into words, and each word is assigned a part-of-speech tag based on the declarations in the Tagger Dictionary.
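As a rough illustration of the three steps described so far (sentence splitting, word splitting, and dictionary-based tagging), a minimal Python sketch follows. It is not LanguageTool's implementation: the function names, the regular expressions, the toy dictionary entries, and the tag labels (DET, NOUN, VERB, UNKNOWN) are all assumptions made for this example and do not reflect the tagset of the Tagalog tagger dictionary.

```python
import re

# Hypothetical toy tagger dictionary: word -> part-of-speech tag.
# Entries and tag names are illustrative only, not the paper's tagset.
TAGGER_DICTIONARY = {
    "ang": "DET",
    "bata": "NOUN",
    "kumain": "VERB",
    "natulog": "VERB",
}


def split_sentences(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def split_words(sentence):
    """Return lowercase word tokens; hyphenated words stay in one piece."""
    return re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", sentence.lower())


def tag_words(words, dictionary):
    """Pair each word with its dictionary tag, or 'UNKNOWN' if undeclared."""
    return [(word, dictionary.get(word, "UNKNOWN")) for word in words]


def analyze(text, dictionary=TAGGER_DICTIONARY):
    """Run the three steps: sentences, then words, then tags."""
    return [
        tag_words(split_words(sentence), dictionary)
        for sentence in split_sentences(text)
    ]


if __name__ == "__main__":
    for tagged_sentence in analyze("Kumain ang bata. Natulog ang aso."):
        print(tagged_sentence)
```

In this sketch, a word that is not declared in the dictionary (here "aso") receives the placeholder tag UNKNOWN; how undeclared words are actually handled is not specified in the text above.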
The words and their part-of-speech tags are then checked for patterns that match those declared in the rule file. If there is a pattern match, an error message is shown to the user. Currently, LanguageTool supports Belarusian, Catalan, Danish, Dutch, English, Esperanto, French, Galician, Icelandic, Italian, Lithuanian, Malayalam, Polish, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, and Ukrainian to a certain degree.

Tagalog is the basis for the Filipino language, the official language of the Philippines. According to data collected by Cheng et al. (2009), there are 22,000,000 native speakers of Tagalog. This is the highest number in the country, followed by Cebuano with 20,000,000 native speakers. Tagalog is very rich in morphology: Ramos (1971) stated that Tagalog words are normally composed of root words and affixes. Dimalen and Dimalen (2007) described Tagalog as a language with a “high degree of inflection”. Jasa et al. (2007) stated that the number of available Tagalog grammar checkers is limited.

Tagalog is a very rich language and LanguageTool is a flexible tool. The development of Tagalog support for LanguageTool provides a readily available Tagalog grammar checker that can be easily updated.

3 Related Works

Ang et al. (2002) developed a semantic analyzer that can check semantic relationships in a Tagalog sentence. Jasa et al. (2007) and Dimalen and Dimalen (2007) both developed syntax-based Filipino grammar checker extensions for OpenOffice.Org Writer. In syntax-based grammar checkers, error-checking is based on the parser. An input is considered correct if