Extended Boolean Operations in Latent Semantic Indexing Search

Jivko Steftchev Jeliazkov, Preslav Ivanov Nakov

Abstract. The paper presents a method for using Boolean expressions in information retrieval based on Latent Semantic Indexing (LSI). The basic binary Boolean operations OR, AND and NOT (AND-NOT), as well as their combinations, have been implemented. The proposed method adds new functionality to the classic LSI method's capability to process user queries typed in natural language (such as English, Bulgarian or Russian), as used in the “intelligent” search engines. This gives the user the opportunity to combine not only distinct words or phrases but also whole texts (documents) using all kinds of Boolean expressions. The implementations have been evaluated on a collection of religious and sacred texts.

Introduction

The classic search engines let the user search with keywords and/or Boolean expressions containing keywords. The “intelligent” search engines can process queries in natural language but do not permit the use of Boolean expressions. We focus on the design of appropriate functions and mechanisms that let the user combine free-form queries with Boolean operations in order to get better search results. The goal is achieved by combining the classic LSI algorithm with a sophisticated implementation of the appropriate Boolean operations.

Latent Semantic Indexing

Latent Semantic Indexing (LSI) is a powerful statistical technique for information retrieval. It is a two-stage process (see [2,3,4] for details) that consists of off-line construction of the document index and on-line response to user queries. The off-line part is the training stage, in which LSI creates its index. First, a large word-to-document matrix X is constructed, where the cell (i,j) contains the occurrence frequency of the i-th word in the j-th document.
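The construction of the word-to-document matrix X can be sketched as follows. This is only a minimal illustration, not the implementation used in the paper: the function name build_term_document_matrix, the toy documents and the naive lowercase/whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter


def build_term_document_matrix(documents):
    """Return (vocabulary, X), where X[i][j] is the number of times
    the i-th vocabulary word occurs in the j-th document."""
    # Assumption: naive tokenization (lowercase, split on whitespace).
    tokenized = [doc.lower().split() for doc in documents]
    # Rows of X follow the alphabetically sorted vocabulary.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    # One word-frequency table per document.
    counts = [Counter(tokens) for tokens in tokenized]
    # A Counter returns 0 for words that are absent from a document.
    X = [[doc_counts[word] for doc_counts in counts] for word in vocabulary]
    return vocabulary, X


# Hypothetical toy collection of two short documents.
docs = ["in the beginning was the word", "the word was light"]
vocabulary, X = build_term_document_matrix(docs)
```

Here each row of X corresponds to a word and each column to a document, matching the (i,j) convention above. A real collection would typically need proper tokenization and a sparse matrix representation.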
After that, a singular value decomposition (SVD) is performed, which yields three matrices D, T (both orthogonal) and S (diagonal), such that $X = D S T^{t}$. Then all three matrices are truncated in such a way that if we multiply the truncated ones D', S' and T' we get a new matrix X' which is the least-squares