International Journal of Computer Science and Applications, Technomathematics Research Foundation
Vol. 14, No. 2, pp. 31 – 46, 2017

INFORMATION RETRIEVAL FOR QUESTION ANSWERING SYSTEM USING KNOWLEDGE BASED QUERY RECONSTRUCTION BY ADAPTED LESK AND LATENT SEMANTIC ANALYSIS

SARAVANAKUMAR KANDASAMY, ASWANI KUMAR CHERUKURI
School of Information Technology and Engineering, VIT University, Vellore, Tamilnadu, India
ksaravanakumar@vit.ac.in, cherukuri@acm.org

Most users turn to search engines for answers rather than for a set of documents. Identifying the set of documents relevant to a question is the first stage of any question answering system, and the accuracy of such a system depends heavily on the quality of the documents chosen for further investigation. In this paper, a method is proposed to improve the information retrieval part of an open domain question answering system. The authors find the list of documents that may be relevant to the given question and rank them by using a) alternate queries that increase the quality of the context represented in the query without compromising its semantic meaning, b) search engine results and c) the concept of Latent Semantic Analysis (LSA). The results show that the top ranked documents for a given factoid question contain solid answers.

Keywords: Information retrieval; Question answering system; Query disambiguation; Natural language processing; Query expansion; LSA.

1. Introduction

Due to the invention of the World Wide Web and the Internet, we have access to an enormous amount of data available worldwide. The available information is growing at a faster rate every day. This growth rate leads one to ask the question “How easily can one access the data?” Search engines are the simplest answer to this issue. Search engines handle user queries to some extent.
They are able to fetch all the documents available on the Internet to your screen in milliseconds. Most of the time those documents are ranked appropriately for the search query raised by the user. But as the size of the available data grows, so does the number of documents returned as search results. The major questions now are “Are we getting all the websites that are closely related to our query?”, “Are the results optimally ranked?”, “How do we get the answer we are looking for?” etc. If these questions are not properly answered, various problems arise for any user of a search engine. Some of them are as follows: a) one may not find the proper words to frame his/her query, b) one may not be good enough in English to frame the query, c) the suggested results may not contain the result expected by the user, d) a user who cannot get relevant results becomes fatigued, e) the expected result may need to be searched for within a suggested document that is very large,