AN ONTOLOGY-BASED HIV/AIDS FAQ RETRIEVAL SYSTEM

Yirsaw Ayalew, Gontlafetse Mosweunyane, Barbara Moeng
Department of Computer Science, University of Botswana
Private Bag UB00704, Gaborone, Botswana
{ayalew, mosweuny, motswirib}@mopipi.ub.bw

ABSTRACT
This paper discusses the implementation of an ontology-based HIV/AIDS Frequently Asked Question (FAQ) retrieval system. The main purpose of the system is to answer any question on HIV/AIDS, asked by any person, from an existing HIV/AIDS FAQ repository. Because identifying the best possible answer requires understanding the semantics of both the question and the existing question-answer pairs in the FAQ, the use of an ontology is crucial. Ontologies have been widely used in natural language processing applications, especially in question answering systems. The ontology for the HIV/AIDS FAQ retrieval system was built using the Text2Onto tool, which was experimentally evaluated to be the most appropriate tool, as reported in our earlier work. Once the ontology is constructed, the next challenge is to make sure that the use of the domain ontology improves the performance of the FAQ retrieval system. For this purpose, we explored a number of approaches for computing semantic similarity between a user query and the existing question-answer pairs in the FAQ. Semantic similarity is computed from the inherent relationships between concepts in the ontology. Specifically, we use the semantic similarity metrics proposed by Thiagarajan et al., which are based on spreading activation networks (set-based spreading). The results show an improvement in accuracy compared to traditional information-retrieval-based question answering approaches.

KEY WORDS
FAQ retrieval system, Question answering system, HIV/AIDS ontology, Semantic similarity.

1. Introduction
HIV/AIDS has affected Sub-Saharan Africa more than any other region in the world.
Among the Sub-Saharan African countries, Botswana is one that is highly affected by this pandemic. One strategy to tackle this challenge is to educate the population and increase awareness by providing access to information resources. Currently, people can get information about HIV/AIDS from various sources, including FAQ websites and HIV/AIDS call centres. Both FAQs and call centres provide a question answering service. Such services are becoming popular because they give specific answers to users' questions, whereas search engines return a list of potentially relevant documents. For this reason, organizations provide FAQs to address common user questions about their services, their products, or anything else related to the organization. However, the FAQs provided on organizations' websites require users to read through the question-answer pairs to find an answer to their question. An ideal solution would allow users to simply pose their questions and have a system scan the FAQ and return the answer from the question-answer pair whose question is identical or similar to the user's question.

The purpose of an HIV/AIDS call centre is to provide information appropriate to individual demands. In a call centre setup, people call a toll-free line managed by the call centre and ask any questions they may have about HIV/AIDS. The operator browses the HIV/AIDS FAQ manual and provides the answer to the caller. If the answer is not in the manual, the operator escalates the question to an HIV/AIDS specialist, and the caller is advised to call again later. Once the specialist provides the answer, the question-answer pair is added to the FAQ manual. This setup, though helpful in many respects, still has a number of inconveniences.
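The lookup described above (browse the FAQ, answer if a matching question exists, otherwise escalate) can be sketched in Python. This is a minimal illustration and not the system presented in this paper: it assumes a plain word-overlap (Jaccard) score in place of ontology-based semantic similarity, and the function names, the 0.5 threshold, and the sample FAQ entries with placeholder answers are all hypothetical.

```python
import re
from typing import List, Optional, Set, Tuple

# Hypothetical FAQ entries; the answers are placeholders, not real FAQ content.
SAMPLE_FAQ: List[Tuple[str, str]] = [
    ("How is HIV transmitted?", "placeholder answer 1"),
    ("Where can I get tested for HIV?", "placeholder answer 2"),
]


def tokens(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap score: shared tokens divided by all distinct tokens."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query: str,
             faq: List[Tuple[str, str]],
             threshold: float = 0.5) -> Optional[str]:
    """Return the answer of the most similar FAQ question.

    Returns None when no question reaches the (assumed) threshold,
    which corresponds to escalating the question to a specialist.
    """
    query_tokens = tokens(query)
    best_score, best_answer = 0.0, None
    for question, answer in faq:
        score = jaccard(query_tokens, tokens(question))
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= threshold else None


if __name__ == "__main__":
    print(retrieve("how is hiv transmitted", SAMPLE_FAQ))
    print(retrieve("What is the capital of France?", SAMPLE_FAQ))
```

A purely lexical score such as this one only matches questions that share words with a stored question, so a paraphrase that uses different vocabulary would likely be escalated even when the FAQ already contains its answer. That limitation is what motivates a semantic notion of similarity.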
A more convenient solution would be to obtain the question answering service through mobile phones, simply by sending questions via SMS (Short Message Service). The ultimate goal of our research project is to develop a question answering (QA) system that can answer any question people may have about HIV/AIDS through standard mobile phones. With such a system, people can send questions by SMS from their mobile phones and receive the answer as an SMS.

Proceedings of the IASTED International Conference Health Informatics (AfricaHI 2014), September 1 - 3, 2014, Gaborone, Botswana. DOI: 10.2316/P.2014.815-008

In this paper, we focus on the development of an automated FAQ retrieval system (a special type of question answering service) on HIV/AIDS. One of the major tasks in an FAQ retrieval service is to find questions in the FAQ repository that are semantically similar to a user's question. An automated FAQ retrieval system searches the FAQ repository to see whether the same or a similar question exists. If such a question is found, the corresponding answer can be provided. However, determining the semantic similarity between a user question and the questions in the FAQ repository is a difficult task. The difficulty is