Using Web-Mining for Academic Measurement and Scholar Recommendation in
Expert Finding System
Chi-Jen Wu, Jen-Ming Chung, Cheng-Yu Lu*, Hahn-Ming Lee‡, and Jan-Ming Ho
Institute of Information Science, Academia Sinica, Taiwan
‡Dep. of CSIE, National Taiwan University of Science and Technology, Taiwan
*Corresponding Author E-mail:cylu@iis.sinica.edu.tw
Abstract—Scholars usually spend a great deal of time
searching for and reading the papers of key researchers. However,
objectively determining the key researchers of a topic relies on
several measurements, such as publications, citations, and recent
academic activities. In this paper, we propose a prototype of a
scholar searching and recommendation system based on a web
mining approach in an expert finding system. The system ranks
the scholars of a given field and returns the top-k scholars. We
design a new ranking measure, the p-index, to reveal the ranking
of scholars in a certain field. We test the robustness of the
approach on a real-world dataset; the experimental results show
that our approach outperforms existing approaches and that
users are highly interested in using the system again.
Keywords-Academic Measure; Web Mining; Expert Finding
System; Performance Indexing
I. INTRODUCTION
In an expert finding system (EFS), it is required to
recommend important researchers of a research topic. Generally,
researchers are judged by counting their publications instead
of considering the quality of their papers. However, the focus
should shift to the quality of their publications. Therefore an
interesting challenge arises: how to measure and recommend
important or famous scholars of a research topic? In fact,
constructing rankings of scholar authorities is a relatively new
subfield of information retrieval research. This problem differs
from the traditional expert finding problem [1] [2]. In essence,
the goal of an EFS is to identify a list of people with relevant
expertise on a topic. The scholar searching problem, however,
is a deeper expert finding problem: it involves not only
identifying the right scholars who possess the required
knowledge, but also ranking their level of authority in the
research field. Generally, finding key researchers is more
complex and difficult than finding experts; in particular, there
is no standard specifying the criteria or popular qualifications
necessary for particular levels of scholarly authority.
In this paper, we propose a new design of a scholar
searching system prototype using a web mining approach.
We focus on the problems of scholar finding and scholar
ranking. Our system ranks the scholars with relevant expertise
in a research area, such as "Signal Processing" or "Data
Mining", and returns the top-k important scholars. For scholar
finding, search engines are employed to analyze documents on
a certain topic, and the authors are extracted from the returned
documents. We then estimate each extracted author’s relevance
to the topic on web pages through statistical analysis. We
assume that authors with plenty of articles about a certain
topic are more likely to be candidate experts, and that highly
cited papers are indicative of their authors’ authority. For
scholar ranking, we design the p-index, a ranking function for
positioning scholars. The ranking criteria are based on
publications and citations, computed from the query results of
scientific literature digital archives such as Google Scholar and
MS Libra Academic Search. The p-index is a novel measure
for estimating an individual scholar’s impact in a single field:
it indicates that the total citation count of a scholar’s papers
is m% of the total citation count of all papers in the research
field.
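The one-sentence description of the p-index above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the definition used by the system: the function name `rank_by_citation_share`, the input format (a list of author-list and citation-count pairs for one field), and the choice to credit every co-author with a paper's full citation count are all hypothetical.

```python
from collections import defaultdict


def rank_by_citation_share(papers, k=3):
    """Return the top-k (author, share_percent, paper_count) tuples.

    papers: list of (authors, citations) pairs retrieved for ONE field.
    share_percent is the author's citation total as a percentage of the
    citation total of all papers in the field (an assumed reading of m%).
    """
    total = sum(citations for _, citations in papers)
    if total == 0:
        return []  # no citations in the field, so no share is defined

    cited = defaultdict(int)  # author -> summed citations of their papers
    count = defaultdict(int)  # author -> number of papers in the field
    for authors, citations in papers:
        for author in set(authors):
            # Assumption: each co-author receives the full citation count.
            cited[author] += citations
            count[author] += 1

    # Sort by citation total (descending), then by name for stable ties.
    ranked = sorted(cited, key=lambda a: (-cited[a], a))
    return [(a, 100.0 * cited[a] / total, count[a]) for a in ranked[:k]]


# Made-up sample data, for illustration only.
sample_papers = [
    (["A. Chen", "B. Lin"], 60),
    (["A. Chen"], 30),
    (["C. Wang"], 10),
]
top = rank_by_citation_share(sample_papers, k=2)
for author, share, n_papers in top:
    print(f"{author}: {share:.1f}% of field citations, {n_papers} paper(s)")
```

Because each co-author is credited in full under this assumption, the shares of different scholars can add up to more than 100%; a fractional credit scheme would avoid that at the cost of penalizing collaboration.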
II. RELATED RESEARCH
First, expert finding is the task of finding the right scholars
with high relevance to a certain topic. Within a research
community, such as computer science, there may be many
possible candidates who are relevant to a given topic; the
expert finding operation retrieves a list of the expert candidates
deemed most likely to be scholars of this topic.
Second, expert ranking [3] [4] sorts the candidates by level of
authority, and it involves analysis of reputation, publications,
citations, and activities among a list of candidate scholars.
Finally, expert profiling [5] [6] extracts the profile information
of an individual scholar from the Web, including basic
information, contact information, and educational history. In
this section, we describe the related work on these three
components.
Traditional expert finding identifies a list of people
with appropriate skills and knowledge related to a
given topic [7]. Most previous approaches rely on
developing an expert database through manual
processes [8], or are based on text, citation, or document
analysis to match the user’s research topic [2] [9] [10] [11].
We are aware that a few systems employ expert ranking
techniques, such as Arnetminer, Libra, and CiteSeerX.
2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology
978-0-7695-4513-4/11 $26.00 © 2011 IEEE
DOI 10.1109/WI-IAT.2011.137