Journal of Uncertain Systems
Vol.1, No.2, pp.109-123, 2007
Online at: www.jus.org.uk

A Smart Query Formulation for an Efficient Web Search

Giuseppe Fenza, Vincenzo Loia ∗, Sabrina Senatore
Dipartimento di Matematica e Informatica, Università degli Studi di Salerno
via Ponte Don Melillo, 84084 Fisciano (SA) - Italy
{gfenza, loia, ssenatore}@unisa.it

Received 3 January 2007; Accepted 30 January 2007

Abstract

Traditional search engines rely on keyword-based matching: they retrieve the documents that contain occurrences of the input keywords, but entirely ignore the meaning of the retrieved documents. Thus, long lists of page links are returned, but only a handful of pages actually contain references to relevant web resources and meet the needs of users. The need for greater awareness in the interpretation of web data has led to new approaches and methodologies for improving web search and retrieval, which take into account the context of the information related to the user query. This work presents an approach for supporting the user in the Web search activity: it interprets the input query and, on the basis of the local knowledge, replies by providing (links to) web pages that are more relevant to the meaning of the input query. The approach combines the intrinsic potential of the agent-based paradigm with the modeling of knowledge through soft computing techniques. The agents encode the semantics of data, by exploiting ontologies, in order to grasp the actual meaning of the query. The information elicited by the query interpretation represents an add-on that augments the system knowledge, which is then exploited in the discovery of web pages that match the user request.

© 2007 World Academic Press, UK. All rights reserved.
Keywords: Web search, clustering, ontologies, multi-agent system

∗ Corresponding author: V. Loia (loia@unisa.it).

1 Introduction

In recent years, the Web has grown rapidly, providing a large amount of information, often unstructured (or semi-structured), spread over different hosts in a distributed environment. The growth of Internet usage and content also makes information access difficult, so the task of Web search becomes highly critical. Current Web search engines are built to provide an answer to all requests, independent of the special needs of any individual user. The same query, submitted to a typical search engine, returns identical results, regardless of the expectations and needs of the different users who submit it. Thus, only a handful of the returned Web sources are related to the user query, because no relationship is established between the requested information and the retrieved information. Search engine quality is mainly measured in terms of retrieval capability; some search techniques are based on the classification of the discovered documents, which are then referenced in a search database by human experts or by artificial agents (softbots) (e.g., Yahoo!, Google) [24]. Interesting research initiatives by well-known search engines try to improve the query results by interpreting the user request and removing noisy data, as well as by clustering information according to the most likely meaning, through widely used ranking methods. Some approaches converge on the definition and building of domain-specific Web search engines, which collect and index the relevant Web pages by exploiting crawlers that gather only domain-specific pages [4] (even though the crawler-based search