Tuan Tran Anh, International Journal of Computer Science and Mobile Computing, Vol.3 Issue.3, March- 2014, pg. 623-631
© 2014, IJCSMC All Rights Reserved
Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X
IJCSMC, Vol. 3, Issue. 3, March 2014, pg.623 – 631
RESEARCH ARTICLE
A HYBRID APPROACH OF QUERY EXPANSION FOR VIETNAMESE QUERY
Tuan, Tran Anh
Faculty of Informatics & Foreign Language
DakLak College of Pedagogy, Vietnam
tuanta@dlc.edu.vn
Abstract - The semantics of a user’s query plays an important role in Information Retrieval (IR): the better the system
captures it, the closer the returned results are to what the user intends. However, users’ queries often do not fully
reflect the intended semantics, so it is necessary to add semantics to the query. This paper presents a hybrid method for
adding semantics to a Vietnamese-language query, combining ontology and local analysis techniques to expand the user’s
query. In the hybrid method, the ontology-based query expansion technique analyzes semantic relationships to determine
similar noun phrases, and the local analysis technique retrieves the most relevant documents, which are used to identify
the context of the user’s query.
Keywords - Vietnamese language query expansion; hybrid model of query expansion; query expansion; local
analysis; ontology
I. INTRODUCTION
Nowadays, Information Retrieval (IR) systems and Search Engines (SE) have become some of the most important tools in daily
life. Most users rely on an IR system or SE when they search for new information in a digital library or on the Internet.
However, the results returned by these systems often include documents that are not relevant to the user’s query. One reason
for this low precision is that queries are short. To achieve more accurate results, keywords related to the semantics of the
query are added to it. This is the problem of query expansion.
There are two techniques of query expansion [20]: the search result-based technique and the knowledge structure-based
technique. The search result-based technique uses terms chosen from the retrieved documents in the relevance
feedback process. The knowledge structure-based technique can either depend on the corpus or be independent of it.
The collection-independent knowledge structure expands the user’s query by adding keywords from a knowledge model such as
WordNet or an ontology. The semantics of the user’s query is determined in this technique. In contrast, the
collection-dependent knowledge structure is based on statistical analysis of the corpus, such as global analysis or local
analysis. The user’s query is expanded from returned documents that are related to it. One of the first such techniques is
global analysis, which improves the returned results by analyzing the entire document corpus to determine word relationships.
Local analysis, by contrast, uses only the top-ranked retrieved documents for query expansion in order to identify the context
of the user’s query, and it achieves high efficiency when retrieving in a specific domain such as medicine or computer science.
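The local analysis idea described above can be illustrated with a minimal sketch. It assumes a toy in-memory corpus, a plain whitespace tokenizer (a real Vietnamese system would need word segmentation), and a simple term-overlap ranking. The function names and parameters (`tokenize`, `local_analysis_expand`, `top_k`, `n_terms`) are hypothetical and are not taken from the paper.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (assumption: stands in for real word segmentation)."""
    return text.lower().split()


def local_analysis_expand(query, documents, top_k=2, n_terms=2):
    """Expand a query with frequent terms from the top-ranked retrieved documents."""
    query_terms = tokenize(query)
    query_set = set(query_terms)

    # Rank documents by the number of query-term occurrences they contain.
    scored = []
    for idx, doc in enumerate(documents):
        tokens = tokenize(doc)
        score = sum(1 for t in tokens if t in query_set)
        scored.append((score, idx, tokens))
    # Highest score first; ties keep the original document order.
    scored.sort(key=lambda item: (-item[0], item[1]))

    # Keep only the top-k documents that match the query at all.
    top_docs = [tokens for score, _, tokens in scored[:top_k] if score > 0]

    # Count candidate expansion terms (terms not already in the query).
    counts = Counter(t for tokens in top_docs for t in tokens if t not in query_set)

    # Most frequent first; alphabetical order breaks ties deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return query_terms + [term for term, _ in ranked[:n_terms]]
```

Because only the top-ranked documents contribute terms, the expansion reflects the context in which the query terms occur in the corpus, rather than corpus-wide statistics as in global analysis.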
Accordingly, building on the advantages of the techniques above, a query expansion model for Vietnamese queries is proposed.
This model uses a hybrid method that combines ontology and local analysis techniques to expand the user’s queries.
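As a rough illustration of how the two sources of expansion terms might be combined, the sketch below merges terms looked up in an ontology with terms supplied by local analysis. This section does not specify the paper's actual combination rule, so a plain order-preserving union is assumed here. The toy ontology and the names `TOY_ONTOLOGY`, `ontology_expand`, and `hybrid_expand` are hypothetical.

```python
# Hypothetical toy ontology: maps a term to semantically similar terms.
TOY_ONTOLOGY = {
    "computer": ["pc", "laptop"],
}


def ontology_expand(query_terms, ontology):
    """Collect related terms from the ontology for each query term."""
    expansion = []
    for term in query_terms:
        for related in ontology.get(term, []):
            if related not in query_terms and related not in expansion:
                expansion.append(related)
    return expansion


def hybrid_expand(query_terms, ontology, local_terms):
    """Union of the original query, ontology terms, and local-analysis terms.

    Assumption: a simple order-preserving union; a real system would likely
    weight or filter the candidate terms.
    """
    expanded = list(query_terms)
    for term in ontology_expand(query_terms, ontology) + list(local_terms):
        if term not in expanded:
            expanded.append(term)
    return expanded
```

For example, `hybrid_expand(["computer", "price"], TOY_ONTOLOGY, ["laptop", "discount"])` keeps the original terms first, then appends the ontology terms, then any local-analysis terms not already present.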