Chapter XXIII
Duplicate Journal Title Detection in References
Ana Kovacevic
University of Belgrade, Serbia
Vladan Devedzic
University of Belgrade, Serbia
Copyright © 2009, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
Abstract
Our research applies text mining techniques to help librarians make better-informed
decisions when selecting learning resources for the library's offering. The proper
selection of these resources is one of the key factors determining the overall
usefulness of the library. Our task was to match abbreviated journal titles from
citations with journals in existing digital libraries. The main problem is that a
single journal often appears under several different abbreviated forms in the
citation report, so the matching depends on detecting duplicate records. We used
character-based and token-based metrics together with a generated thesaurus to
detect these duplicates.
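
As a rough illustration of how character-based and token-based metrics can be combined with a thesaurus, consider the minimal sketch below. It is not the implementation used in this chapter: the choice of Levenshtein distance as the character-based metric, Jaccard overlap as the token-based metric, the tiny THESAURUS and STOP_WORDS tables, the is_duplicate helper, and the 0.8 threshold are all assumptions made for the example.

```python
import re

# Hypothetical thesaurus mapping abbreviations to full words (illustrative only).
THESAURUS = {"j": "journal", "comput": "computer", "sci": "science"}
# Hypothetical stop-word list; short function words are ignored when matching.
STOP_WORDS = {"of", "the", "and", "for"}


def tokens(title):
    """Lowercase, split on non-alphanumerics, drop stop words, expand abbreviations."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return [THESAURUS.get(w, w) for w in words if w not in STOP_WORDS]


def levenshtein(a, b):
    """Character-based metric: edit distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def char_similarity(a, b):
    """Edit distance normalised to the range [0, 1]."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def token_similarity(a, b):
    """Token-based metric: Jaccard overlap of the two token sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def is_duplicate(title1, title2, threshold=0.8):
    """Treat two titles as duplicates if either similarity reaches the threshold."""
    a, b = tokens(title1), tokens(title2)
    score = max(char_similarity(" ".join(a), " ".join(b)),
                token_similarity(a, b))
    return score >= threshold
```

Under these assumptions, is_duplicate("J. Comput. Sci.", "Journal of Computer Science") holds because the thesaurus expands both titles to the same token sequence, whereas an unrelated title such as "Physical Review Letters" falls well below the threshold on both metrics.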
Introduction
Digital libraries need to continuously improve
their collections. Knowing how a digital library
and its collection are used is inextricably tied to
the library’s ability to sustain itself, improve its
services, and meet its users’ needs (McMartin,
Iverson, Manduca, Wolf, & Morgan, 2006).
In Serbia, the major provider of digital learning
resources is KOBSON¹ (Consortium of Serbian
Libraries), which provides Serbian students,
teachers, and researchers with access to foreign
journals and other learning resources (Kosanović,
2002). Since the available funds are rather modest,
the appropriate selection of journals to be made
available through KOBSON is highly important
and poses a challenge for its staff. Accordingly,
our research efforts are aimed at helping librarians
in general and KOBSON staff in particular
to identify the journals that would be of interest