Personalization Techniques for Web Search Results Categorization

John Garofalakis²,³, Theofanis Matsoukas², Yannis Panagis¹,², Evangelos Sakkopoulos¹,² and Athanasios Tsakalidis¹,²

{garofala, sakkopul, tsak}@cti.gr, {matsouka, panagis}@ceid.upatras.gr

¹ RA Computer Technology Institute, Internet and Multimedia Technologies RU 5, 61 Riga Feraiou Str., 26110, Greece
² University of Patras, Computer Engineering & Informatics Dpt, 26500 Patras, Greece
³ Hellenic Open University, 84 K Palama Str., 26442 Patras, Greece

Abstract

Generic web search is designed to serve all users, independent of individual needs and without any adaptation to personal requirements. We propose a novel technique¹ that performs post-categorization of the results of popular search engines on the client side. A user profile is built from the user's choices from a category hierarchy (explicitly given requirements) and from the user's search history (implicitly logged choices). Caching is used to improve response times. An experimental prototype has been implemented based on results coming from a popular search engine. The experimental results strongly indicate that the proposed mechanism is both effective and efficient.

1. Introduction

The World Wide Web (WWW) is an enormous source of information that is continuously renewed and expanded. As the amount of information changes and grows rapidly, it creates many new challenges for Web search environments. Users often find searching the web for the information they need time-consuming at best and futile at worst.

In order to improve the web searching experience, we propose an environment that takes the individual interests of each user into consideration and presents the results organized into categories. Explicit category selection is available in the proposed personalized environment. Additionally, implicitly chosen categories are recorded whenever the user selects a web search result.
The user's interests are rendered into an adaptive profile [1][4]. The profile changes over time in order to reflect the obsolescence of interest in categories.

To achieve categorization, a high-performance classifier [3] is utilized. The environment, however, maintains multiple instances of the classifier. Each user may choose to provide data from his or her personal choices in order to give the classifier extra training on the categories of personal interest rather than on others.

In addition, two categorization modes are available in the environment. The user may choose categorization to be performed on either the full text of each result page or only the summary text of the search results. Evaluation of the two modes shows that the former yields better categorization, while the latter yields quicker responses. In order to decrease the response time, a categorization caching technique is implemented.

The rest of the paper is organized as follows. In section 2, related work is presented. In section 3, details are presented concerning the personal profile built for each user. Section 4 then presents the techniques followed in using the classifier. Section 5 covers the operational specifications of the environment; details of the two categorization modes, the user-driven training and implementation issues are also presented there. Section 6 presents the caching strategy followed. Section 7 presents a user-based evaluation of the environment. Finally, section 8 concludes.

¹ Further details concerning the techniques utilized, the environment implementation and its operation may be found at the address below: http://students.ceid.upatras.gr/~sakkopul/mysearch2.htm

Figure 1: MySearch Personalized Login Page
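
As an illustration only, the adaptive profile outlined in the introduction could be sketched as follows. This is a minimal sketch under stated assumptions, not the MySearch implementation: the class name `UserProfile`, the exponential half-life decay used to model interest obsolescence, and the 3:1 weighting of explicit selections over implicit clicks are all hypothetical choices introduced here for the example.

```python
class UserProfile:
    """Hypothetical adaptive profile mapping category -> interest weight."""

    def __init__(self, half_life_days=30.0):
        # Assumed parameter: an untouched interest halves every half_life_days.
        self.half_life_days = half_life_days
        self._weights = {}   # category -> weight at the time of its last update
        self._updated = {}   # category -> day of its last update

    def interest(self, category, now_days):
        """Current weight of a category after exponential decay."""
        if category not in self._weights:
            return 0.0
        elapsed = max(0.0, now_days - self._updated[category])
        return self._weights[category] * 0.5 ** (elapsed / self.half_life_days)

    def record(self, category, now_days, explicit=False):
        """Log an explicit category selection or an implicit result click."""
        # Assumed weighting: an explicit choice counts three times a click.
        boost = 3.0 if explicit else 1.0
        self._weights[category] = self.interest(category, now_days) + boost
        self._updated[category] = now_days

    def ranked(self, now_days):
        """Categories ordered from most to least interesting right now."""
        return sorted(self._weights,
                      key=lambda c: self.interest(c, now_days),
                      reverse=True)


profile = UserProfile(half_life_days=30.0)
profile.record("Sports", now_days=0, explicit=True)  # explicit selection
profile.record("Science", now_days=0)                # implicit click
profile.record("Science", now_days=60)               # renewed interest later
print(profile.ranked(now_days=60))                   # Science now outranks Sports
```

In this sketch an explicitly selected category starts with a higher weight, but a category the user keeps clicking on eventually overtakes it once the older interest has decayed. Any monotonically decreasing decay function would serve the same illustrative purpose.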