Using Web-Mining for Academic Measurement and Scholar Recommendation in Expert Finding System

Chi-Jen Wu, Jen-Ming Chung, Cheng-Yu Lu*, Hahn-Ming Lee, and Jan-Ming Ho
Institute of Information Science, Academia Sinica, Taiwan
Dep. of CSIE, National Taiwan University of Science and Technology, Taiwan
*Corresponding Author E-mail: cylu@iis.sinica.edu.tw

Abstract-Scholars usually spend a great deal of time searching for and reading the papers of key researchers. However, objectively determining the key researchers of a topic relies on several measurements, such as publications, citations, and recent academic activities. In this paper, we propose a prototype scholar search and recommendation system based on a web mining approach to expert finding. The system ranks scholars and returns the top-k scholars. A new ranking measure, the p-index, is designed to reveal the ranking of scholars in a certain field. We use a real-world dataset to test robustness; the experimental results show that our approach outperforms other existing approaches and that users are highly interested in using the system again.

Keywords-Academic Measure; Web Mining; Expert Finding System; Performance Indexing

I. INTRODUCTION

In an expert finding system (EFS), it is necessary to recommend the important researchers of a research topic. Generally, researchers are judged by counting their publications instead of considering the quality of their papers. However, the focus should shift to the quality of their publications. An interesting challenge therefore arises: how should the important or famous scholars of a research topic be measured and recommended? In fact, constructing rankings of scholar authorities is a relatively new subfield of information retrieval research. This problem differs from the traditional expert finding problem [1] [2]; in essence, the goal of an EFS is to identify a list of people with relevant expertise in a topic.
However, the scholar searching problem is a deeper expert finding problem: it requires not only identifying the right scholars who possess the required knowledge, but also ranking their level of authority in the research field. Generally, finding key researchers is more complex and difficult than finding experts; in particular, there is no standard specifying the criteria or qualifications necessary for particular levels of scholarly authority.

In this paper, we propose a new design of a scholar searching system prototype using a web mining approach. We first focus on the problems of scholar finding and scholar ranking. Our system ranks scholars with relevant expertise in a research area, such as “Signal Processing” or “Data Mining”, and returns the top-k important scholars. For scholar finding, search engines are employed to analyze documents on a certain topic, and the authors are extracted from the retrieved documents. We then estimate each extracted author’s relevance to the topic on web pages through statistical analysis. We assume that authors with plenty of articles on a certain topic are more likely to be candidate experts, and that highly cited papers are indicative of an author’s authority. For scholar ranking, we design the p-index, a ranking function for positioning scholars. The ranking criteria are based on publications and citations, computed from the query results of scientific literature digital archives such as Google Scholar and MS Libra Academic Search. The p-index is a novel measure for estimating an individual scholar’s impact in a single field: it indicates that the total citations of a scholar’s papers are m% of the total citations of all papers in the research field.

II. RELATED RESEARCH

Expert finding is the task of finding the right scholars, with high relevance, for a certain topic.
Within a research community, such as computer science, there may be many candidates who are relevant to a given topic; the expert finding operation retrieves a list of the expert candidates deemed the most likely scholars for this topic. Second, expert ranking [3] [4] sorts the candidates by level of authority, and involves analysis of reputation, publications, citations, and activities among a list of candidate scholars. Finally, expert profiling [5] [6] extracts the profile information of an individual scholar from the Web, including basic information, contact information, and educational history. In this section, we describe related work on these three components.

Traditional expert finding identifies a list of people with appropriate skills and knowledge related to a given topic [7]. Most previous approaches rely on developing an expert database through manual processes [8], or on text, citation, or document analysis to match the user’s research topic [2] [9] [10] [11].

We are aware that a few systems employ expert ranking techniques, such as Arnetminer, Libra, and CiteSeerX.

2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology 978-0-7695-4513-4/11 $26.00 © 2011 IEEE DOI 10.1109/WI-IAT.2011.137 288
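For comparison with the ranking techniques mentioned above, the p-index outlined in the Introduction can be illustrated with a minimal sketch. It assumes one plausible reading of that description: a scholar's p-index is the percentage m of the field's total citations that is received by the scholar's own papers. The function names `p_index` and `top_k_scholars`, the input layout, and the sample numbers are hypothetical and are not taken from the system described in this paper.

```python
from typing import Dict, Iterable, List, Tuple


def p_index(paper_ids: Iterable[str], citations: Dict[str, int]) -> float:
    """Percentage of the field's citations received by one scholar's papers.

    `citations` maps every paper in the field to its citation count
    (an assumed input format, e.g. aggregated from a digital archive).
    """
    field_total = sum(citations.values())
    if field_total == 0:
        return 0.0  # no citations in the field, so no share to report
    # Use a set so a paper listed twice is only counted once.
    scholar_total = sum(citations.get(pid, 0) for pid in set(paper_ids))
    return 100.0 * scholar_total / field_total


def top_k_scholars(
    papers_by_scholar: Dict[str, List[str]],
    citations: Dict[str, int],
    k: int,
) -> List[Tuple[str, float]]:
    """Rank scholars by p-index (descending) and return the first k."""
    scores = {
        name: p_index(papers, citations)
        for name, papers in papers_by_scholar.items()
    }
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Illustrative (made-up) field with three papers and three scholars.
field_citations = {"p1": 50, "p2": 30, "p3": 20}
authorship = {"A": ["p1", "p2"], "B": ["p2", "p3"], "C": ["p3"]}
ranking = top_k_scholars(authorship, field_citations, k=2)
print(ranking)
```

In this sketch a co-authored paper counts fully toward each of its authors, so the p-index values of different scholars need not sum to 100; whether the original measure splits credit among co-authors is not stated in the text above.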