Citation: Karatzas, E.; Baltoumas, F.A.; Kasionis, I.; Sanoudou, D.; Eliopoulos, A.G.; Theodosiou, T.;
Iliopoulos, I.; Pavlopoulos, G.A. Darling: A Web Application for Detecting Disease-Related
Biomedical Entity Associations with Literature Mining. Biomolecules 2022, 12, 520.
https://doi.org/10.3390/biom12040520
Academic Editor: Lukasz Kurgan
Received: 1 March 2022
Accepted: 28 March 2022
Published: 30 March 2022
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
biomolecules
Article
Darling: A Web Application for Detecting Disease-Related
Biomedical Entity Associations with Literature Mining
Evangelos Karatzas 1,*,†, Fotis A. Baltoumas 1,*,†, Ioannis Kasionis 1,†, Despina Sanoudou 2,3,4,
Aristides G. Eliopoulos 3,4,5, Theodosios Theodosiou 6, Ioannis Iliopoulos 6 and Georgios A. Pavlopoulos 1,3,*
1 Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”,
16672 Vari, Greece; gkasionis2@gmail.com
2 Clinical Genomics and Pharmacogenomics Unit, 4th Department of Internal Medicine, School of Medicine,
National and Kapodistrian University of Athens, 11527 Athens, Greece; dsanoudou@bioacademy.gr
3 Center for New Biotechnologies and Precision Medicine, School of Medicine,
National and Kapodistrian University of Athens, 11527 Athens, Greece; eliopag@med.uoa.gr
4 Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou Street, 11527 Athens, Greece
5 Department of Biology, School of Medicine, National and Kapodistrian University of Athens, Mikras Asias 75,
11527 Athens, Greece
6 Department of Basic Sciences, School of Medicine, University of Crete, 71003 Heraklion, Greece;
theodosios.theodosiou@gmail.com (T.T.); iliopj@med.uoc.gr (I.I.)
* Correspondence: karatzas@fleming.gr (E.K.); baltoumas@fleming.gr (F.A.B.);
pavlopoulos@fleming.gr (G.A.P.)
† These authors contributed equally to this work.
Abstract: Finding, exploring and filtering frequent sentence-based associations between a disease and
a biomedical entity co-mentioned in disease-related PubMed literature becomes more challenging as the
volume of publications increases. Darling is a web application that uses Named Entity Recognition to
identify human-related biomedical terms in PubMed articles mentioned in OMIM, DisGeNET and
Human Phenotype Ontology (HPO) disease records, and generates an interactive biomedical entity
association network. Nodes in this network represent genes, proteins, chemicals, functions, tissues,
diseases, environments and phenotypes. Users can search by identifiers, terms/entities or free text
and explore the relevant abstracts in an annotated format.
Keywords: text-mining; data integration; bioinformatics; named-entity recognition; literature-derived
associations
1. Introduction
As of February 2022, PubMed® hosts more than 33 million biomedical abstracts, whereas the
PubMed Central® Open Access Subset (PMC OA Subset) [1] contains more than 7 million full-text
articles. The ever-increasing amount of literature poses numerous challenges for bioscientists,
as parsing these texts and extracting associations among biomedical entities is far from trivial.
This is particularly true for disease-related research, where a wealth of knowledge on the relations
between bioentities (genes, proteins, chemicals, etc.) and pathological conditions is available,
especially since the rise of high-throughput experimental methods [2]. There is, therefore, a great
need for effective and user-friendly methods for the automated recognition, visualization and
analysis of disease-related bioentity associations.
Towards this end, several text-mining approaches have been implemented [3–7]. BioTextQuest [8],
for example, retrieves PubMed articles and clusters them based on their biomedical terms.
DrugQuest [9] applies text mining to the DrugBank database [10] to explore drug associations.
DISEASES [11] is a system for extracting disease–gene associations from biomedical abstracts.
PREGO [12] uses text mining to link microorganisms with environmental processes and functions.
Reflect [13] and EXTRACT [14]