The UPV at QA@CLEF 2006

Davide Buscaldi, José Manuel Gómez, Paolo Rosso and Emilio Sanchis
Dpto. de Sistemas Informáticos y Computación (DSIC), Universidad Politécnica de Valencia, Spain
{dbuscaldi, jogomez, prosso, esanchis}@dsic.upv.es

August 19, 2006

Abstract

This report describes the work done by the RFIA group at the Departamento de Sistemas Informáticos y Computación of the Universidad Politécnica de Valencia for the 2006 edition of the CLEF Question Answering task. We participated in three monolingual tasks: Spanish, Italian and French. The system used is a slightly revised version of the one we developed last year. The most interesting aspect of the work is the comparison between a Passage Retrieval engine (JIRS) specifically aimed at the Question Answering task and a standard, general-purpose search engine such as Lucene. Results show that JIRS is able to return high-quality passages.

Categories and Subject Descriptors

H.3 [Information Storage and Retrieval]: H.3.1 Content Analysis and Indexing; H.3.3 Information Search and Retrieval; H.3.4 Systems and Software

General Terms

Measurement, Algorithms, Performance, Experimentation

Keywords

Question Answering, Passage Retrieval, Answer Extraction and Analysis

1 Introduction

QUASAR is the mono/cross-lingual Question Answering (QA) system we developed for our first participation in last year's edition of the CLEF QA task. It is based on the JIRS Passage Retrieval (PR) system, which is specifically oriented to this task, in contrast to most QA systems, which use classical PR methods [3, 1, 7, 5]. JIRS can be considered a language-independent PR system, because it uses no knowledge about the lexicon or syntax of the language during the question and passage processing phases. One of the objectives of this participation was the comparison of JIRS with a classical PR engine (in this case, Lucene¹).
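The idea of language-independent passage retrieval can be illustrated with a toy ranker that scores passages purely by the word n-grams they share with the question, using no stemming, lexicon, or syntactic analysis. This is a hypothetical sketch for illustration only, not the actual JIRS model; the function names and weighting scheme are our own assumptions.

```python
# Toy language-independent passage ranker: passages are scored only by
# overlap of surface word n-grams with the question, with no linguistic
# resources of any kind. (Illustrative sketch, not the JIRS algorithm.)

def ngrams(tokens, n):
    """All contiguous word n-grams of length n, as a set of tuples."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def score(question, passage, max_n=3):
    """Sum of shared n-gram counts, weighting longer n-grams more."""
    q = question.lower().split()
    p = passage.lower().split()
    total = 0.0
    for n in range(1, max_n + 1):
        shared = ngrams(q, n) & ngrams(p, n)
        total += n * len(shared)  # a shared trigram counts more than a unigram
    return total

def rank(question, passages, max_n=3):
    """Passages sorted by descending n-gram overlap with the question."""
    return sorted(passages, key=lambda p: score(question, p, max_n),
                  reverse=True)

if __name__ == "__main__":
    q = "Who won the Nobel Prize in 1928?"
    docs = [
        "The weather in 1928 was unusually cold.",
        "The Nobel Prize in 1928 was won by Charles Nicolle.",
    ]
    print(rank(q, docs)[0])
```

Because scoring operates only on surface token sequences, the same code applies unchanged to Spanish, Italian or French text, which is the property the paragraph above attributes to JIRS.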
In order to do this, we actually implemented two versions of QUASAR, which differ only in the PR engine used. With regard to the improvements over last year's system, our efforts were focused on the Question Analysis module, which, in contrast to the one used in 2005, does not use Support Vector Machines to classify the questions. Moreover, we moved towards a stronger integration of the modules. The 2006 CLEF QA task introduced some challenges with respect to the previous edition: list questions, the lack of a label distinguishing "definition" questions from other ones, and the

¹ http://lucene.apache.org