Tuan Tran Anh, International Journal of Computer Science and Mobile Computing, Vol. 3, Issue 3, March 2014, pg. 623-631. © 2014, IJCSMC All Rights Reserved.

Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320-088X

RESEARCH ARTICLE

A HYBRID APPROACH OF QUERY EXPANSION FOR VIETNAMESE QUERY

Tuan, Tran Anh
Faculty of Informatics & Foreign Language, DakLak College of Pedagogy, Vietnam
tuanta@dlc.edu.vn

Abstract - The semantics of a user's query play an important role in an Information Retrieval (IR) system, helping it return results closer to what the user intends. However, users' queries often do not fully reflect the intended semantics. Therefore, it is necessary to add semantics to the user's query. This paper presents how to add semantics to a Vietnamese-language query using a hybrid method that combines ontology and the local analysis technique to expand the user's query. In the hybrid method, the ontology-based query expansion technique analyzes semantic relationships to determine similar noun phrases, and the local analysis technique retrieves the most relevant documents, which are used to identify the context of the user's query.

Keywords - Vietnamese language query expansion; hybrid model of query expansion; query expansion; local analysis; ontology

I. INTRODUCTION

Nowadays, Information Retrieval (IR) systems and Search Engines (SE) have become some of the most important tools in daily life. Most users rely on an IR system or SE when they search for new information in a digital library or on the Internet. However, the results returned by these systems typically include some documents that are not relevant to the user's query. One reason for this lack of precision is that the query is short.
To achieve more accurate results, keywords related to the semantics of the query are added to it. This is the problem of query expansion. There are two families of query expansion techniques [20]: search-result-based techniques and knowledge-structure-based techniques. A search-result-based technique uses terms chosen from the documents retrieved during the relevance feedback process. A knowledge-structure-based technique can either depend on the corpus or be independent of it. A collection-independent knowledge structure expands the user's query by adding keywords from a knowledge model such as WordNet or an ontology; this is how the semantics of the user's query are determined. In contrast, a collection-dependent knowledge structure is based on statistical analysis of the corpus, for example global analysis or local analysis. The user's query is expanded with terms from returned documents that are related to it. Global analysis was one of the first such techniques to improve retrieval results; it analyzes the entire document corpus to determine word relationships. Local analysis, by contrast, uses only the top-ranked retrieved documents for query expansion. It identifies the context of the user's query and is highly effective for retrieval in a specific domain such as medicine or computer science.

Accordingly, building on the advantages of the techniques above, a query expansion model for Vietnamese queries is proposed. This model uses a hybrid method that combines ontology and the local analysis technique to expand the user's query.
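
The hybrid idea described above can be illustrated with a minimal Python sketch. It is not the model proposed in this paper. It makes several assumptions: the toy ontology `TOY_ONTOLOGY`, the stopword list, the function names, the order in which the two steps are applied, and the raw term-frequency scoring are all hypothetical choices made for illustration. The sample data is in English, and whitespace tokenization is used only for brevity; Vietnamese text would need a proper word segmenter.

```python
from collections import Counter

# Hypothetical toy ontology: maps a term to semantically similar terms.
TOY_ONTOLOGY = {"computer": ["pc", "laptop"]}

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"the", "a", "of", "and", "is", "in", "for", "to", "at"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


def rank(query_terms, docs):
    """Rank documents by the number of query-term occurrences they contain."""
    scored = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        score = sum(counts[t] for t in query_terms)
        if score > 0:
            scored.append((score, doc))
    # sorted() is stable, so documents with equal scores keep corpus order.
    return [doc for _, doc in sorted(scored, key=lambda p: p[0], reverse=True)]


def expand_with_ontology(query_terms, ontology):
    """Knowledge-structure step: add similar terms from the ontology."""
    expanded = list(query_terms)
    for term in query_terms:
        for similar in ontology.get(term, []):
            if similar not in expanded:
                expanded.append(similar)
    return expanded


def expand_with_local_analysis(query_terms, docs, top_k=2, n_terms=2):
    """Local analysis step: add frequent terms from the top-ranked documents."""
    top_docs = rank(query_terms, docs)[:top_k]
    counts = Counter()
    for doc in top_docs:
        counts.update(t for t in tokenize(doc) if t not in query_terms)
    return list(query_terms) + [t for t, _ in counts.most_common(n_terms)]


def hybrid_expand(query, docs, ontology, top_k=2, n_terms=2):
    """Apply the ontology step first, then local analysis (assumed order)."""
    terms = expand_with_ontology(tokenize(query), ontology)
    return expand_with_local_analysis(terms, docs, top_k, n_terms)


if __name__ == "__main__":
    corpus = [
        "laptop battery life review",
        "computer battery replacement guide",
        "cooking rice at home",
    ]
    print(hybrid_expand("computer", corpus, TOY_ONTOLOGY, top_k=2, n_terms=1))
```

In this sketch the ontology step widens the query before retrieval, so the local analysis step can draw context terms from documents that mention only a similar term. Reversing the order, or weighting the added terms, would be equally plausible design choices.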