978-1-4244-8551-2/10/$26.00 ©2010 IEEE ICIAfS10
Abstract— Using computers to answer natural language
questions is an interesting and challenging problem.
Generally such problems are handled under two
categories: open-domain problems and closed-domain
problems. This paper presents a system that attempts to
solve closed-domain problems. Typically, in a closed
domain, answers to questions are not available in the
public domain and therefore cannot be found
using a search engine. Hence answers have to be stored
in a database by a domain expert. The challenge is then
to understand the natural language question so that it
can be matched to the corresponding answer in the
database. We use a template matching technique to
perform this matching. In addition, given that our target
is to use this system with non-native English speakers,
we developed a method to overcome the mismatches we
might encounter due to spelling mistakes. The system is
developed such that the questions can be asked using
short messages from a mobile phone and therefore the
system is designed to understand SMS language in
addition to English. One of the main contributions of this
paper is the presentation of results from a deployment of
this system in a real environment.
Keywords—FAQ, Answering System, SMS, Template
Matching
I. INTRODUCTION
DEVELOPING mechanisms for using computers to answer
user questions is becoming an interesting problem with
the increased use of computers. Such mechanisms
allow users to ask questions in a natural language and receive
a concise and accurate answer. Understanding user questions
in natural languages requires Natural Language Processing
(NLP). As an active area of research, NLP plays a major
role in ICT and in Question Answering (QA) systems.
Natural language processing is a computerized
approach to analyzing text that is based on both a set of
theories and a set of technologies. It is becoming important
to be able to pose queries and obtain answers using natural
language (NL) expressions rather than keyword-based
retrieval mechanisms. QA systems can better satisfy the
needs of users because they provide an accurate, quick,
convenient and effective way of answering user
questions.
The approach we have adopted in this project is an
automated FAQ (Frequently Asked Question) answering
system that replies with pre-stored answers to user questions
asked in ordinary English, rather than relying on keyword-
or syntax-based retrieval mechanisms. This is achieved using
a template matching technique combined with supporting
mechanisms such as disemvoweling and synonym matching.
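As an illustration of how disemvoweling and synonym matching can make
SMS-style or misspelled input comparable with ordinary English, a minimal
Python sketch is given here. It is not the system's implementation: the
function names, the rule of keeping the first letter of each word, and the
contents of the synonym table are all assumptions made for this example.

```python
import re
from typing import List

# Hypothetical synonym table; in practice a domain expert would supply it.
SYNONYMS = {"test": "examination", "exam": "examination", "fee": "payment"}

VOWELS = set("aeiou")


def disemvowel(word: str) -> str:
    """Remove vowels after the first letter (keeping the first letter is an
    assumption of this sketch), so that 'please' and 'pls' compare equal."""
    word = word.lower()
    if not word:
        return word
    return word[0] + "".join(c for c in word[1:] if c not in VOWELS)


def normalize(question: str) -> List[str]:
    """Lowercase and tokenize the question, map each token to its canonical
    synonym if one is known, then disemvowel the result."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [disemvowel(SYNONYMS.get(token, token)) for token in tokens]
```

Under these assumptions, `normalize("Pls tell exam fee")` and
`normalize("Please tell test payment")` yield the same token list, and a
vowel transposition such as "recieve" for "receive" no longer causes a
mismatch.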
The natural language processing technique developed for
FAQ retrieval does not analyze user queries. Instead,
analysis is applied to the FAQs in the database. Thus, FAQ
retrieval is reduced to keyword matching, which creates
an illusion of intelligence. The system is both evolving and
portable. It is evolving because its question answering
ability improves as more questions are asked and new FAQ
entries are created. It is portable because the system can be
used for any (closed) problem domain by changing the
knowledge base.
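The idea of attaching the analysis to the stored FAQs, so that retrieval
becomes keyword matching, can be sketched as follows. This is an assumed
design for illustration only: the `FaqEntry` structure, the representation
of a template as a set of required keywords, the "largest matching template
wins" rule, and the placeholder answers are hypothetical and are not taken
from the system described in this paper.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class FaqEntry:
    answer: str
    # Each template is a set of keywords that must all occur in the question.
    templates: List[Set[str]] = field(default_factory=list)


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_answer(question: str, knowledge_base: List[FaqEntry]) -> Optional[str]:
    """Return the answer of the entry whose largest fully matched template
    is contained in the question, or None if no template matches."""
    words = tokenize(question)
    best_answer, best_size = None, 0
    for entry in knowledge_base:
        for template in entry.templates:
            if template <= words and len(template) > best_size:
                best_answer, best_size = entry.answer, len(template)
    return best_answer


# Placeholder knowledge base; swapping it out changes the problem domain.
KNOWLEDGE_BASE = [
    FaqEntry("<stored answer: registration>",
             [{"registration", "close"}, {"registration", "deadline"}]),
    FaqEntry("<stored answer: library hours>", [{"library", "open"}]),
]
```

In this sketch a question that matches no template returns `None`; appending
a new `FaqEntry` for such a question is what makes the knowledge base
"evolve", and replacing `KNOWLEDGE_BASE` as a whole corresponds to porting
the matcher to another closed domain.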
Typically, there are two types of question answering
systems: (1) closed-domain question answering, which deals
with questions within a specific domain. It can be seen as
an easier task on the one hand, as NLP systems can exploit
domain-specific knowledge frequently formalized in
ontologies, but as a harder one on the other, as the
information is not generally available in the public domain;
and (2) open-domain question answering, which deals with
questions about nearly everything and can rely only on
general ontologies and world knowledge. However, as
mentioned earlier, open-domain systems usually have much
more data available in the public domain from which to
extract the answer.
As depicted in Figure 1, there are two methods [1], [2]
for arriving at an appropriate answer to a user
question: the AI method and the FAQ search method.
The AI method [2] focuses on answer generation by
analyzing questions and creating an “understanding” of the
question. This requires complex and advanced linguistic
analysis programs. There are three generic methods by which
an answer can be generated using stored FAQs and answers
[3]: (1) the artificial intelligence approach; (2)
statistical techniques; and (3) template matching.
An Automatic Answering System with Template
Matching for Natural Language Questions
Tilani Gunawardena, Medhavi Lokuhetti, Nishara Pathirana, Roshan Ragel and Sampath Deegalla
Faculty of Engineering, University of Peradeniya, Peradeniya 20400 Sri Lanka
etilani@gmail.com, medhavimpl@gmail.com, nishara.pdn@gmail.com, roshanr@pdn.ac.lk and dsdeegalla@pdn.ac.lk