Development of a simple, interpretable and easily
transferable QSAR model for quick screening antiviral
databases in search of novel 3C-like protease (3CLpro)
enzyme inhibitors against SARS-CoV diseases
V. Kumar and K. Roy
Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur
University, Kolkata, India
ABSTRACT
In the context of the recently emerged COVID-19 pandemic, we have
performed two-dimensional quantitative structure-activity relationship
(2D-QSAR) modelling using SARS-CoV-3CLpro enzyme inhibitors for the
development of a multiple linear regression (MLR) based model. We have
used 2D descriptors with the aim of developing an easily interpretable,
transferable and reproducible model which may be used for quick
prediction of SARS-CoV-3CLpro inhibitory activity for query compounds
in the screening process. Based on the insights obtained from the
developed 2D-QSAR model, we have identified the structural features
responsible for the enhancement of the inhibitory activity against the
3CLpro enzyme. Moreover, we have performed molecular docking analysis
using the most and least active molecules from the dataset to
understand the molecular interactions involved in binding, and the
results were then correlated with the essential structural features
obtained from the 2D-QSAR model. Additionally, we have performed in
silico predictions of the SARS-CoV 3CLpro enzyme inhibitory activity of
a total of 50,437 compounds obtained from two antiviral drug databases
(the CAS COVID-19 antiviral candidate compound database and another
recently reported list of prioritized compounds from the ZINC15
database) using the developed model, and have prioritized compounds for
experimental evaluation of their SARS-CoV 3CLpro enzyme inhibition.
ARTICLE HISTORY
Received 30 April 2020
Accepted 27 May 2020
KEYWORDS
2D-QSAR; MLR; 3CLpro enzyme; SARS-CoV-2; Covid-19; docking and virtual
screening
Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a positive-sense
single-stranded RNA virus, informally known as the coronavirus [1,2]. The virus was first
identified in Wuhan, China by the Centres for Disease Control and Prevention (CDC) in
2019 [2,3]. Since then, the virus has spread to other countries through infected
people, and it is now a threat to global health [3,4]. The virus spreads from person to
person through close contact with someone who has the infection [5]. The disease is most
infectious when a person is symptomatic [6]. However, it is possible for someone without
CONTACT K. Roy kunal.roy@jadavpuruniversity.in
Supplemental data for this article can be accessed at: https://doi.org/10.1080/1062936X.2020.1776388.
SAR AND QSAR IN ENVIRONMENTAL RESEARCH
2020, VOL. 31, NO. 7, 511–526
https://doi.org/10.1080/1062936X.2020.1776388
© 2020 Informa UK Limited, trading as Taylor & Francis Group