2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
© IEEE 2021. This article is free to access and download, along with
rights for full text and data mining, re-use and analysis.
Predicting Drugs for COVID-19/SARS-CoV-2 via
Heterogeneous Graph Attention Networks
Yahui Long¹,², Yu Zhang², Min Wu³, Shaoliang Peng¹, Chee Keong Kwoh², Jiawei Luo¹,*, and Xiaoli Li³,*
¹ College of Computer Science and Electronic Engineering, Hunan University, Changsha 410000, China
² School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
³ Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), 138632, Singapore
* Corresponding Authors: luojiawei@hnu.edu.cn and xlli@i2r.a-star.edu.sg
Abstract—Coronavirus Disease-19 (COVID-19) has led to
global epidemics with high morbidity and mortality. However,
there are currently no proven effective drugs targeting
COVID-19. Identifying drug-virus associations can not only
provide insights into drug-virus interaction mechanisms,
but also guide and facilitate the screening of compound
candidates for antiviral drug discovery. In this work, we propose
a novel framework of Heterogeneous Graph Attention Networks
for Drug-Virus Association prediction, named HGATDVA. First,
we incorporate multiple sources of biomedical data to
construct rich features for drugs and viruses. Second, we
construct two drug-virus heterogeneous graphs. For each graph,
we design a self-enhanced graph attention network (SGAT) to
explicitly model the dependency between a node and its local
neighbors and derive graph-specific representations for nodes.
Third, we develop a neural network architecture with a
tri-aggregator that aggregates the graph-specific representations
to generate the final node representations. Experiments on two
datasets demonstrate the effectiveness of the proposed method
in identifying candidate drugs for viruses.
Index Terms—COVID-19, Drug, Heterogeneous graph attention
networks, Association prediction.
I. INTRODUCTION
Coronavirus Disease-19 (COVID-19) is an infectious
disease caused by SARS-CoV-2 (severe acute respiratory
syndrome coronavirus 2) [1]. SARS-CoV-2 is an enveloped,
positive-sense, single-stranded RNA betacoronavirus of the
family Coronaviridae [2], [3]. Coronaviruses (CoVs) typically
affect the respiratory tract of mammals, including humans,
and lead to mild to severe respiratory tract infections [4].
COVID-19 has led to global epidemics with high morbidity
and mortality. However, there are currently no antiviral drugs
with proven clinical efficacy for the treatment of COVID-19.
Identifying drug-virus associations is therefore useful for drug
development, as well as for disease prevention and treatment.
Because conventional experimental methods are time-consuming,
laborious and expensive, computational methods
provide a low-cost complement and can guide the screening
of candidate compounds for drug discovery.
Recently, several computational methods have been
proposed for drug-microbe association prediction. For example,
Zhu et al. [5] presented a KATZ-based method named
HMDAKATZ for drug-microbe prediction using drug chemical
similarity and Gaussian kernel similarity. Long et al. [6]
proposed a computational model named HNERMDA
to predict drug-microbe associations based on a heterogeneous
network. Following that, Long et al. [7] developed
another prediction model called GCNMDA to infer latent
drug-microbe associations by combining microbial protein
interactions with drug chemical information. However, none of
the above methods fully considers the biological
knowledge associated with viruses. Very recently, Andersen
et al. [8] released a comprehensive database called DrugVirus
that records experimentally and clinically validated drug-virus
associations. In addition, many other knowledge databases
are available for drugs and viruses, such as DrugBank
[9], UniProt [10] and VirHostNet [11]. The availability of these
rich data provides a golden opportunity to develop deep
learning methods for drug-virus association prediction. In
particular, graph neural networks, e.g., the graph attention network
(GAT) proposed by Veličković et al. [12], are a promising deep
learning technique because of their powerful capability for modeling
graph-structured data.
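To make the attention idea concrete, the following is a minimal sketch of one single-head layer of the standard GAT of Veličković et al. [12], not of the SGAT variant proposed in this paper. Each node i scores a neighbor j with LeakyReLU(a · [W h_i || W h_j]), normalizes the scores with a softmax over its neighborhood (including itself), and outputs the attention-weighted sum of the transformed neighbor features. The function name `gat_layer`, the toy drug and virus nodes, and all numeric values are illustrative assumptions, and the final nonlinearity is omitted for brevity.

```python
import math

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_layer(features, neighbors, W, a):
    """One single-head GAT layer (hypothetical helper, standard GAT only).

    features : dict node -> list[float], input node features h_i
    neighbors: dict node -> list of neighbor nodes (a self-loop is added here)
    W        : weight matrix, out_dim rows by in_dim columns
    a        : attention vector of length 2 * out_dim
    Returns (new_features, attention), where attention[i][j] is alpha_ij.
    """
    z = {v: matvec(W, h) for v, h in features.items()}  # z_i = W h_i
    new_features, attention = {}, {}
    for i in features:
        nbrs = [i] + [j for j in neighbors.get(i, []) if j != i]
        # Unnormalized scores e_ij = LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(w * x for w, x in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood of i
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[i] = dict(zip(nbrs, alphas))
        # Attention-weighted sum of transformed neighbor features
        new_features[i] = [
            sum(alpha * z[j][k] for alpha, j in zip(alphas, nbrs))
            for k in range(len(z[i]))
        ]
    return new_features, attention

# Toy bipartite graph (assumed data): two drugs linked to one virus.
toy_features = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "v1": [1.0, 1.0]}
toy_neighbors = {"d1": ["v1"], "d2": ["v1"], "v1": ["d1", "d2"]}
toy_W = [[1.0, 0.0], [0.0, 1.0]]   # identity transform for readability
toy_a = [0.5, -0.5, 1.0, 0.0]      # arbitrary attention vector

if __name__ == "__main__":
    out, att = gat_layer(toy_features, toy_neighbors, toy_W, toy_a)
    for node in out:
        print(node, out[node], att[node])
```

In practice W and a are learned by gradient descent and several attention heads are combined; the sketch fixes them by hand only to show how the neighborhood weights are formed.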
However, several challenges arise when using GAT for
drug-virus prediction. First, biological data related to drugs
and viruses are often heterogeneous, and different data sources
carry distinct biological meanings. It is thus challenging to
integrate them as effective input features in a GAT framework
for drug-virus prediction. Second, the observed/known
drug-virus associations are limited and sparse, which makes it
difficult for GAT to model drug-virus associations. To
address these issues, we developed a novel Heterogeneous
Graph Attention Network (HGAT) based framework for
Drug-Virus Association prediction (HGATDVA). In particular, we
first built two networks/graphs for drug-virus prediction, i.e.,
a drug-virus heterogeneous network with known drug-virus
associations and a drug-host-virus heterogeneous network built by
integrating drug-target interactions with virus-host (human)
protein interactions. Then we exploited multiple biomedical
data, e.g., virus genome sequences, drug chemical structure
information, viral protein sequences, drug-drug interactions,
etc., to derive input features for drugs, viruses and proteins.
For each graph, we designed a self-enhanced attention
mechanism to learn a graph-specific representation for each node. We
further developed a multilayer perceptron (MLP) based
tri-aggregator to combine graph-specific representations and thus
2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 978-1-7281-6215-7/20/$31.00 ©2020 IEEE DOI: 10.1109/BIBM49941.2020.9313472