Research Article
International Journal of Current Engineering and Technology
ISSN 2277 - 4106
© 2013 INPRESSCO. All Rights Reserved.
Available at http://inpressco.com/category/ijcet
Performance and Effectiveness Examination of the IQE and AQE with
Application on Arabic Content
Abdel Salam Obeidat^a, Sameh Ghwanmeh^a*, Ali Al-Ibrahim^a, Nidhal K. El-Omari^a and Adel Mohammad^a
^a WISE University, Faculty of Information Technology, Amman, Jordan
Accepted 01 May 2013, Available online 01 August 2013, Vol.3, No.3 (August 2013)
Abstract
A literature search shows that information retrieval (IR) systems for Arabic are few compared with those for English.
Additionally, IR systems face many problems when used with the Arabic language, including complexity and ambiguity. The
performance and effectiveness of interactive query expansion (IQE) and automatic query expansion (AQE) are
key to improved IR systems. The performance and effectiveness of IQE compared with AQE have been examined
via a series of search experiments on Arabic content. Compared with no query expansion, the experimental results
showed that AQE provides enhanced performance and effectiveness, with 54% query improvement and average precision
of 42.1. However, the results revealed that IQE provides higher performance and effectiveness than AQE, with 84%
query improvement and average precision of 43.4.
Keywords: Arabic language, Arabic content, evaluation of information retrieval, query expansion, IQE, AQE.
1. Introduction
Information retrieval is defined as the process of
finding the relevant documents in a specific database
based on user requests. Statistical and other common
methods have been used to implement information
retrieval (IR) systems. Usually, such IR systems hold
selected terms (phrases and others) from the searched
documents and an index file that enables easy access to
the documents. The search process is effective if the
results contain the maximum number of relevant documents
and the minimum number of non-relevant documents
(Hani, S. et al., 2006; Chang, Y.C. et al., 1997).
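The effectiveness criterion described above is usually quantified with precision and recall. The following is a minimal sketch of both measures for a single query, not the authors' evaluation code; the function name and the document identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: iterable of document ids returned by the system
    relevant:  iterable of document ids judged relevant for the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant documents that were retrieved

    # Precision: share of retrieved documents that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical ids: 4 documents retrieved, 3 relevant, 2 in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
```

High precision corresponds to few non-relevant documents in the results, and high recall to few relevant documents being missed.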
Adding further terms to an existing query is called
query expansion, and it is used to enhance the
performance and correctness of the search process. This
can be achieved by using relevance information obtained
from the user's assessment of the relevant documents. The
expansion terms can be determined and ranked from those
documents (Strzalkowski, T. et al., 1998). The resulting
expansion terms can be added to the query by IQE or AQE
(Al-Kharashi et al., 1994). With IQE, it is the user's
decision which terms to include in the expanded query
(Robertson, E., 1990; Fowkes, H. et al., 2000).
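The difference between the two expansion modes can be sketched as follows. This is an illustration under stated assumptions rather than the system used in the experiments: the function names are hypothetical, the candidate terms are assumed to arrive already ranked, and `n` is assumed to be the number of terms AQE adds without user involvement.

```python
def expand_automatically(query_terms, ranked_terms, n):
    """AQE sketch: append the n best-ranked candidate terms to the query."""
    candidates = [t for t in ranked_terms if t not in query_terms]
    return list(query_terms) + candidates[:n]


def expand_interactively(query_terms, ranked_terms, choose):
    """IQE sketch: show the ranked candidates and keep only what the user picks.

    choose: callable standing in for the user; it receives the candidate
            list and returns the terms the user selected.
    """
    candidates = [t for t in ranked_terms if t not in query_terms]
    chosen = set(choose(candidates))
    # Preserve the ranking order among the selected terms.
    return list(query_terms) + [t for t in candidates if t in chosen]


# Hypothetical placeholder terms.
query = ["q1", "q2"]
ranked = ["t1", "q1", "t2", "t3", "t4"]

aqe_query = expand_automatically(query, ranked, n=2)
iqe_query = expand_interactively(query, ranked, choose=lambda c: ["t3", "t1"])
```

In the interactive case the `choose` callback stands in for a user interface; in a real system it would present the list and collect the user's selection.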
Arabic is one of the world's widely spoken languages.
IR applications for Arabic are few compared with IR
applications for English. Additionally, there is a
shortage of large test collections with Arabic content.
IR systems face many problems when used with the Arabic
language, including orthographic variations, complexity,
broken plurals, ambiguity and short vowels (Moukdad, H.,
2001; Ruthven, I., 2006; Beaulieu, M., 2004).
*Corresponding author: Sameh Ghwanmeh
In this paper, the performance and effectiveness of IQE
and AQE are examined. A series of experiments was
carried out using 242 Arabic abstracts from the Saudi
Arabian National Computer Conference. The experiments
were conducted to provide a clear comparison between the
AQE and IQE techniques. A performance evaluation was
also carried out to find the value of n in AQE that gives
the best average precision for the whole query set.
2. Research Methodology
The research experiments were carried out on the Arabic
collection, details of which are given in Table 1. For each
query, the top 10 retrieved documents are used to produce a
list of candidate expansion terms. The wpq method
(Al-Kharashi, 1994; Efthimiadis, E., 1999) of ranking terms
for query expansion has been employed in this research; it
has been shown to provide acceptable results for both
AQE and IQE, as explained in (Strzalkowski, T., 1998;
Noaman, A., 2012; Jinxi, X., 2002).
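As an illustration of wpq term ranking, the sketch below scores each candidate term as a relevance weight multiplied by (p - q), where p = r/R is the fraction of relevant documents containing the term and q = (n - r)/(N - R) is the fraction of the remaining documents containing it. It is a sketch under assumptions, not the paper's implementation: the 0.5 smoothing in the relevance weight is a common convention that the text does not specify, and all function names, variable names and sample values are hypothetical.

```python
import math
from collections import Counter


def wpq_scores(relevant_docs, doc_freq, num_docs):
    """Score candidate terms with a wpq-style weight.

    relevant_docs: list of token collections, one per relevant document
    doc_freq:      dict mapping term -> number of collection documents containing it (n)
    num_docs:      total number of documents in the collection (N)
    """
    R = len(relevant_docs)
    # r: number of relevant documents that contain each term.
    r_counts = Counter(t for doc in relevant_docs for t in set(doc))

    scores = {}
    for term, r in r_counts.items():
        n = doc_freq[term]
        # Relevance weight with 0.5 smoothing (assumed, avoids division by zero).
        weight = math.log(
            ((r + 0.5) * (num_docs - n - R + r + 0.5))
            / ((n - r + 0.5) * (R - r + 0.5))
        )
        p = r / R                      # term rate in relevant documents
        q = (n - r) / (num_docs - R)   # term rate in the rest of the collection
        scores[term] = weight * (p - q)
    return scores


def rank_expansion_terms(scores, original_query):
    """Return candidate terms by descending score, skipping original query terms."""
    candidates = [(t, s) for t, s in scores.items() if t not in original_query]
    return [t for t, _ in sorted(candidates, key=lambda ts: (-ts[1], ts[0]))]


# Hypothetical toy data: 3 relevant documents in a 242-document collection.
docs = [{"a", "b"}, {"a", "c"}, {"a", "b"}]
freqs = {"a": 3, "b": 100, "c": 5}
scores = wpq_scores(docs, freqs, num_docs=242)
ranking = rank_expansion_terms(scores, original_query=set())
```

Here the term that occurs in every relevant document and nowhere else ranks first, while a term that is frequent across the whole collection is pushed down even though it appears in two of the three relevant documents.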
As the main objective of this research is to measure the
performance and effectiveness of IQE and AQE on Arabic
documents, the research methodology applies the algorithm
presented in (Al-Kharashi, 1994) to each query. The recall
and precision values are calculated using a
full-freezing method, which is a standard method of