GA-based System for Achieving High Recall and Precision in Information Retrieval

Ammar Al-Dallal 1 and Rasha Shaker Abdulwahab 2

1 School of Information Systems Computing and Mathematics, Brunel University, U.K.
2 College of Information Technology, Ahlia University, Bahrain
{Ammar.AlDallal@brunel.ac.uk, rasha_sh_abdul-wahab@ahliauniversity.edu.bh}

Abstract. The main purpose of this paper is to highlight the importance of retrieving relevant documents by developing a new system capable of managing and organizing the retrieved documents. In particular, attention is paid to the Genetic Algorithm (GA) as the basis for such a system. The GA influences the process of information retrieval by contributing directly to the retrieval of relevant documents through its nature-inspired mechanism. In this paper we propose a GA system that shows promising results in terms of recall and precision. These results are achieved by developing a new hybrid crossover operator and a new fitness evaluation function. Results are compared with two well-known techniques applied in the IR domain, namely Okapi-BM25 and the Bayesian inference network model. The comparative study shows that the proposed method outperforms these two techniques in both precision and recall of the retrieved documents.

Keywords: Genetic Algorithm, information retrieval (IR), Terms Proximity, Term Distance, Crossover, Evaluation function

1 Introduction

The rapid growth in the number of Web pages poses a continuing challenge in helping Web users find relevant information on the Internet. Information Retrieval (IR) is an essential and useful technique for Web users, so the study of such systems has increased since the advent of the World Wide Web. In recent years, emphasis on the applicability of Artificial Intelligence (AI) to IR has increased. One area of AI is Evolutionary Computation (EC), which is based on models of natural selection.
A classical and important technique in EC is the Genetic Algorithm (GA). The GA is biologically inspired, and many of its mechanisms are modeled on natural evolution. Because of its parallel search mechanism in high-dimensional spaces, the GA has been used to solve many scientific and engineering problems, which in turn has encouraged researchers to apply it to IR. Moreover, the GA plays an important role in providing information suited to the user’s needs. IR and GA are integrated to spare Web users specific problems that arise when trying to retrieve useful information, such as: