A Predictive Model for Drug-Drug Interaction
Using a Similarity Measure
Abirami Ariyur Mahadevan Anagha Vishnuvajjala Naman Dosi Shrisha Rao
Abstract—A drug-drug interaction can affect a patient when a
second drug is administered during the duration of action of
the first. It may delay or decrease the rate of absorption of
the drugs, or enhance their absorption. This in turn may alter
the action of the drugs or induce adverse effects in
patients. There exists a need to study the
drug-drug interactions, and their potential effects on the human
system, including for drugs not yet approved. This paper proposes
using eight features (substructure, targets, transporters, enzymes,
pathways, indications, side-effects, and off-side-effects), obtained
from five different databases (PubChem, DrugBank, KEGG,
SIDER, and Offsides), and a similarity-based ensemble prediction
model to identify potential drug-drug interactions. The
proposed ensemble model uses Jaccard’s coefficient to measure
the similarity between drugs. These similarity indices are
given to a neighbor recommender method and a random walk
method for the base prediction of drug-drug interactions.
This base prediction is then improved by an ensemble model that
uses a genetic algorithm for weight calculation and logistic
regression for classification. The empirical results show that the
ensemble model yields >90% accuracy in predicting drug-drug
interactions.
Index Terms—Drug-drug interactions, machine learning, sim-
ilarity measures, Jaccard’s coefficient, ensemble model, random
walk method, neighbor recommender method
I. INTRODUCTION
Drugs are constantly being sought to fight diseases. How-
ever, drugs have serious downsides, in terms of side-effects as
well as interactions with other drugs. Drug-drug interactions
(DDIs) are very common. Some are beneficial to patients,
like antidotes administered after overdoses, and drugs used
to combat undesirable side-effects of others used in treatment
of serious diseases. However, some are potentially harmful
and have to be identified at an early stage. Some interactions
may have low risk and may be of little clinical significance.
It can take years to clinically check DDIs for every pair of
drugs. Sometimes DDIs may not get detected in clinical trials.
Moreover, it may take many years to check DDIs of all known
drugs with a newly discovered one before it is introduced to
the market. Hence, there exists the need for efficient DDI-
checking with fewer time-consuming, expensive, and risky
clinical trials.
A drug-drug interaction occurs when two co-administered
drugs interact and cause an adverse reaction or unexpected
side effects. It can arise from prescribed medicines, from
overdose, and/or from prolonged use of medicines.
Between 2009 and 2012, 38.1% of U.S. adults aged 18–
44 used three or more prescription drugs during a 30-day
time period [1]. This percentage increases
substantially with age, to 67.2% for ages 45–64 and
89.8% for ages 65 and older. The number of
incidents of adverse drug reactions increases exponentially if
a patient takes four or more drugs [2]. However, identifying
all possible interactions between all drugs is computationally
intractable.
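To give a rough sense of scale, the number of candidate drug combinations can be counted directly. The sketch below is illustrative only: the catalogue size of 2000 drugs is a hypothetical figure, not one taken from this paper, and the function name interaction_candidates is introduced here for the example.

```python
from math import comb


def interaction_candidates(n_drugs: int, k: int = 2) -> int:
    """Count the unordered combinations of k drugs drawn from n_drugs."""
    return comb(n_drugs, k)


if __name__ == "__main__":
    n = 2000  # hypothetical catalogue size, assumed for illustration
    for k in (2, 3, 4):
        # Each extra co-administered drug multiplies the search space.
        print(f"{k}-drug combinations among {n} drugs:",
              interaction_candidates(n, k))
```

Even for pairs alone, the count grows quadratically with the number of drugs, and it grows much faster once combinations of three or more drugs are considered.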
DDI detection and remediation require domain knowledge
and the competence to act without undue mental stress to
patients and caregivers. Investigations to clinically observe
drug interactions are undertaken before marketing, and may
assist pharmaceutical companies as well as physicians in
gaining confidence about drugs.
Labor-intensive techniques such as in-silico methods,
in-vitro methods, in-vivo experiments, and clinical trials can
identify DDIs, but they are time-consuming [3]. Statistical
and machine learning methods have been developed to
detect adverse drug reactions and drug-drug interactions
by analyzing health reports and records. Researchers
have also compiled drug data from the literature and health
reports into public databases in order to facilitate the
development of classification and prediction methods [3].
Testing all drugs under all possible conditions is impractical
and also unethical; hence machine learning is used instead.
Machine learning methods for predicting DDIs follow one of
two approaches: similarity-based methods and
classification-based methods. In either case, a model is
created to analyze how drugs interact with other drugs, and is
then used to predict how a new drug would interact with a
known one. Similarity-based models assume that similar
drugs interact, leading to DDIs. Classification-based models
treat DDI prediction as a binary classification task that
uses two kinds of data: drug pairs that cause DDIs and
drug pairs that do not cause DDIs. In the binary classification,
positive labels are given to known interactions between
two drugs; the interactions between other pairs of drugs are to
be detected using the prediction model.
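As a minimal sketch of the similarity-based idea, the following computes Jaccard's coefficient between the feature sets of two drugs and uses it in a simple similarity-weighted vote. The vote follows one common reading of neighbor-style scoring (a new drug is scored against a target by how strongly drugs similar to it are known to interact with that target); this is an assumption made for illustration and is not necessarily the neighbor recommender method used in this paper. All drug names, features, and known interactions below are invented toy data.

```python
from typing import Dict, FrozenSet, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard's coefficient |A & B| / |A | B|; 0.0 if both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def neighbor_score(new_drug: str,
                   target: str,
                   features: Dict[str, Set[str]],
                   known_ddis: Set[FrozenSet[str]]) -> float:
    """Similarity-weighted share of other drugs known to interact with target.

    Assumed scoring rule (illustrative): each other drug votes with a weight
    equal to its Jaccard similarity to new_drug; the vote counts as positive
    only if that drug has a known interaction with target.
    """
    weighted_hits = 0.0
    total_weight = 0.0
    for other, feats in features.items():
        if other in (new_drug, target):
            continue
        sim = jaccard(features[new_drug], feats)
        total_weight += sim
        if frozenset((other, target)) in known_ddis:
            weighted_hits += sim
    return weighted_hits / total_weight if total_weight > 0 else 0.0


# Toy data, invented for this example (e.g. sets of target identifiers).
FEATURES: Dict[str, Set[str]] = {
    "drug_a": {"t1", "t2", "t3"},
    "drug_b": {"t2", "t3", "t4"},
    "drug_c": {"t7", "t8"},
    "drug_x": {"t5"},
    "drug_new": {"t1", "t2", "t3", "t4"},
}
KNOWN_DDIS: Set[FrozenSet[str]] = {frozenset(("drug_a", "drug_x"))}

if __name__ == "__main__":
    print(jaccard(FEATURES["drug_new"], FEATURES["drug_a"]))
    print(neighbor_score("drug_new", "drug_x", FEATURES, KNOWN_DDIS))
```

In a classification-based setting, the same pairwise similarities could instead serve as input features, with the known pairs in KNOWN_DDIS supplying the positive labels.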
In this paper we choose the similarity-based approach because,
in many cases, the consequences (side effects) of two drugs add
up and lead to a DDI. Sometimes similar drugs work in a
similar way, leading to a DDI because the body cannot sustain
both drugs at the same time.
The Anatomical Therapeutic Chemical (ATC) classification
system has been used to characterize adverse drug-drug
interactions and to predict potential interactions [4].
978-1-7281-1462-0/19/$31.00 ©2019 IEEE