Computer Methods and Programs in Biomedicine 112 (2013) 640–648
journal homepage: www.intl.elsevierhealth.com/journals/cmpb
TMT-HCC: A tool for text mining the biomedical
literature for hepatocellular carcinoma (HCC)
biomarkers identification
Rania A. Abul Seoud a, Mai S. Mabrouk b,*
a Faculty of Engineering, Department of Electrical Engineering, Communication and Electronics Section, El Fayoum University, Fayoum 63514, Egypt
b Faculty of Engineering, Department of Biomedical Engineering, Misr University for Science and Technology (MUST University), Al Motamyez Distinct Al, Al Mehwar Road, 00202, Egypt
Article info
Article history:
Received 1 June 2013
Received in revised form 4 July 2013
Accepted 22 July 2013
Keywords:
Text mining
Hepatocellular carcinoma (HCC)
Copy number variation
Biomedical literature
Biomarkers
Abstract
Hepatocellular carcinoma (HCC) is the third leading cause of cancer-related mortality
worldwide. New insights into the pathogenesis of this lethal disease are urgently needed.
Chromosomal copy number alterations (CNAs) can lead to activation of oncogenes and
inactivation of tumor suppressors in human cancers. Thus, identification of cancer-specific
CNAs will not only provide new insight into the molecular basis of tumorigenesis
but also facilitate the identification of HCC biomarkers.
This paper presents TMT-HCC, a tool for text mining the biomedical literature
to identify hepatocellular carcinoma (HCC) biomarkers. TMT-HCC provides
researchers with a powerful way to identify and discern molecular biomarkers of HCC to
inform diagnosis, prognosis, and treatment. One way to find driver genes with causal roles
in carcinogenesis is to detect genomic regions that undergo frequent alterations in cancers
(CNAs). TMT-HCC also extracts protein–protein interactions from the full text of scientific
papers. The results indicate that the integration of genomic and transcriptional data offers
powerful potential for identifying novel cancer genes in HCC pathogenesis.
© 2013 Elsevier Ireland Ltd. All rights reserved.
1. Introduction
Hepatocellular carcinoma (HCC) is the fifth most common
cancer worldwide and the third most common cause of
cancer-related death, with an overall 5-year survival rate of
<5% [1]. Long-term survival of HCC patients is poor, partly due
to HCC recurrence, which up to 80% of the patients experience
even after curative resection [2]. In Egypt, HCC is one of the
most prevalent cancer types. It is the second most common
malignancy in males and the fifth in females; as a result,
liver cancer causes more deaths in Egypt than other
types of cancer [3].
* Corresponding author. Tel.: +20 1001662403.
E-mail addresses: r-abulseoud@k-space.org (R.A.A. Seoud), msm eng@yahoo.com (M.S. Mabrouk).
Chronic hepatitis and liver cirrhosis have been recognized
as important risk factors for the development of hepatocellular
carcinoma (HCC). Prognosis and survival of HCC are still
poor, mainly because of diagnosis at a late stage and/or
recurrence of the disease [4,5].
The outcome for HCC patients remains dismal because of
the difficulty in detecting the disease at an early stage;
0169-2607/$ – see front matter © 2013 Elsevier Ireland Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.cmpb.2013.07.014