G. Pasi et al. (Eds.): Qual. Issues in the Management of Web Information, ISRL 50, pp. 145–158.
DOI: 10.1007/978-3-642-37688-7_7 © Springer-Verlag Berlin Heidelberg 2013
Chapter 7
Quality-Based Knowledge Discovery
from Medical Text on the Web
Example of Computational Methods in Web
Intelligence
Andreas Holzinger, Pinar Yildirim,
Michael Geier, and Klaus-Martin Simonic
Abstract. The MEDLINE database (Medical Literature Analysis and Retrieval
System Online) contains a rapidly growing volume of biomedical articles.
Consequently, there is a need for techniques that enable the quality-based
discovery, extraction, integration, and use of hidden knowledge in those
articles. Text mining helps to cope with the interpretation of these large volumes
of data. Co-occurrence analysis is a technique applied in text mining. Statistical
models are used to evaluate the significance of the relationship between entities
such as disease names, drug names, and keywords in titles, abstracts or even entire
publications. In this paper we present a selection of quality-oriented Web-based
tools for analyzing biomedical literature, and specifically discuss PolySearch,
FACTA and Kleio. Finally, we discuss Pointwise Mutual Information (PMI),
a measure of the strength of a relationship. PMI provides an
indication of how much more often the query and concept co-occur than expected by
chance. The results reveal hidden knowledge in articles regarding rheumatic
diseases indexed by MEDLINE, thereby exposing relationships that can provide
important additional information for medical experts and researchers, supporting
medical decision-making and quality enhancement.
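As an illustration of the PMI measure mentioned in the abstract, the following minimal sketch computes PMI from document counts. It assumes the standard textbook definition, PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ), with probabilities estimated as relative document frequencies; the function name, the parameter names, and the counts in the usage line are hypothetical and are not taken from the chapter or from any of the tools it discusses.

```python
import math


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information (base 2) estimated from document counts.

    n_xy    -- documents mentioning both the query and the concept
    n_x     -- documents mentioning the query
    n_y     -- documents mentioning the concept
    n_total -- documents in the collection
    (All names are illustrative assumptions.)
    """
    if min(n_xy, n_x, n_y, n_total) <= 0:
        raise ValueError("all counts must be positive")
    p_xy = n_xy / n_total
    p_x = n_x / n_total
    p_y = n_y / n_total
    # A ratio above 1 (PMI > 0) means the pair co-occurs more often than
    # expected if the two terms were statistically independent.
    return math.log2(p_xy / (p_x * p_y))


# Hypothetical counts, for illustration only.
score = pmi(n_xy=50, n_x=200, n_y=1000, n_total=100_000)
```

Under this definition, a score near zero suggests that the query and the concept co-occur about as often as chance would predict, while larger positive scores suggest a stronger association. Estimates based on very small counts are unstable and should be read with caution.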
1 Introduction
MEDLINE (Medical Literature Analysis and Retrieval System Online) is a
bibliographic database for the life sciences and includes bibliographic information
for articles from academic journals covering a broad range of biomedical and health
care topics. Moreover, MEDLINE covers much of the literature in biology and