Is lemon Sufficient for Building Multilingual Ontologies for Bantu Languages?

Catherine Chavula and C. Maria Keet

Department of Computer Science, University of Cape Town, South Africa
{cchavula,mkeet}@cs.uct.ac.za

Abstract. The enormous amount of data currently on the Semantic Web and its increasing uptake raise the question of how this data can be accessed in several languages. OWL provides limited support for multilingualism through the use of an annotation property. However, it is known that more expressive models are required for linguistically demanding applications. Among the possible solutions, the Lexicon Model for Ontologies (lemon) enables associating linguistic information with ontology elements by separating the lexical from the ontological layer. This paper investigates whether lemon is sufficient for specifying multilingual ontologies for Bantu languages. Specifically, the paper: (i) identifies the requirements for building lexica in lemon format for Bantu languages; (ii) describes the results in overcoming some of the challenges, notably concerning noun classes; and (iii) presents some open issues that will have to be addressed to increase the usability of lemon.

1 Introduction

Multilingual ontologies are required to provide access to ontology-based information in the languages of the users. However, most ontologies are available in English, i.e., ontology elements are named with English terms, which, at the very least, brings to the fore the requirement to localise these ontologies to other natural languages. For example, vocabularies for the Semantic Web such as Friend-Of-A-Friend (FOAF) [4] and GoodRelations [20] have annotations in English only but are widely used to annotate resources on the Web. OWL provides support for multilingualism using annotation properties such as rdfs:label and rdfs:comment; e.g., a lexicalisation of the class foaf:Person can be given in English, Chichewa, and isiZulu by adding annotations as shown in Fig. 1.
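The annotation-based labelling described above can be sketched in a few lines of Python that emit Turtle triples. This is only an illustration under stated assumptions: the `Label` class and the `to_turtle` helper are hypothetical names, the Chichewa and isiZulu strings ("Munthu", "Umuntu") and the language tags `ny` and `zu` are assumed rather than copied from Fig. 1, and the `foaf:` and `rdfs:` prefixes are assumed to be declared elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """A plain literal plus a language tag: all that rdfs:label can carry."""
    text: str
    lang: str  # language tag, e.g. "en", "ny" (Chichewa), "zu" (isiZulu)


def to_turtle(subject: str, labels: list[Label]) -> str:
    """Emit one rdfs:label triple per label in Turtle syntax.

    Hypothetical helper: it does not escape quotes inside the label text
    and assumes the foaf: and rdfs: prefixes are declared elsewhere.
    """
    return "\n".join(
        f'{subject} rdfs:label "{label.text}"@{label.lang} .' for label in labels
    )


# Illustrative values only; the translations are assumptions, not taken from Fig. 1.
person_labels = [
    Label("Person", "en"),
    Label("Munthu", "ny"),
    Label("Umuntu", "zu"),
]

turtle = to_turtle("foaf:Person", person_labels)
print(turtle)
```

Each label here is just a string with a language tag, so there is no slot in which to record further linguistic information about the term.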
However, the amount of linguistic annotation that can be included in this labelling system is limited, and most multilingual applications require more data, such as part of speech (POS) and grammatical features, among others. Moreover, ontologies are for representing knowledge, and such linguistic data need not be included in an ontology. Several approaches that separate the ontological layer from the terminological layer have been proposed [6, 9, 24], and the W3C Community Group ontolex-lemon submission is under way (see http://www.w3.org/community/ontolex/wiki/Final_Model_Specification). Notably, the LExicon Model for ONtologies (lemon) [23, 24] separates the ontological layer and linguistic layer, and