Dynamically Generating Context-Relevant Sub-Webs

Art Vandenberg¹, Vijay K. Vaishnavi², Saravanaraj Duraisamy², Tianjie Deng²

¹ Georgia State University, Information Systems & Technology, P.O. Box 3994, Atlanta, Georgia 30302-3994, U.S.A.
² Georgia State University, Computer Information Systems Department, P.O. Box 4015, Atlanta, Georgia 30302-4015, U.S.A.
{avandenberg, vvaishna}@gsu.edu, {sduraisamy1, tdeng1}@student.gsu.edu

Abstract. Web information is growing at an unprecedented rate, but challenges in mining this vast information resource remain. This paper addresses the design of an effective prototype tool that dynamically generates sub-webs of information from a web-based resource (the World Wide Web or a subset of it). Sub-webs present context-relevant results to individuals or groups. Although the prototype tool is built from multiple components, each of which is effective individually, devising an appropriate evaluation of the complete model remains a challenge. Evaluation is difficult when the search scope is the entire World Wide Web and a vast number of result pages yields high Recall but low Precision. This paper describes an iterative approach to finding an effective technical prototype using an evaluation method that can a) reasonably model the search environment of the World Wide Web and b) provide convincing metrics for evaluating the efficacy of solutions.

Keywords: Context; Sub-Web; Web Mining; Evaluation; Metrics.

1 Introduction

Information available on the Web is growing at an unprecedented rate in all fields of human endeavor, but challenges in mining this vast digital information remain. This paper addresses the goal of a research prototype tool that can dynamically generate sub-webs of information from a web-based resource (the entire World Wide Web or a subset thereof, such as the NSF or NIH portal). These sub-webs present context-relevant results to users (individuals or groups).
Researchers in all fields of human endeavor, including science and engineering, recognize the potential and the challenges of exponential growth of information in the World Wide Web [1], [2], [3] and [4]. Taming the Web has spurred considerable research and commercial activity, such as [5], [6], [7] and [8]. The available approaches can be broadly grouped into search engines [5] and [9], directories [10], [11] and [12] or web user adaptation and personalization systems [4], [13], [14], [15], [16], [17] and [18]. Simple keyword searches may return hundreds (even thousands) of individual web pages but often with two deficiencies: (1) keywords may not explicitly reflect the relevant context of a user’s requirements and (2) the results may not provide context-