Please cite this article in press as: Xia X, et al. Graph-based generative models for de Novo drug design, Drug Discov Today: Technol (2020), https://doi.org/10.1016/j.ddtec.2020.11.004
Drug Discovery Today: Technologies
Graph-based generative models for de Novo drug design
Xiaolin Xia, Jianxing Hu, Yanxing Wang, Liangren Zhang, Zhenming Liu*
State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Xueyuan Road 38, Haidian
District, 100191 Beijing, China
The discovery of new chemical entities is a crucial part of drug discovery,
which requires lead compounds to have the desired properties to be
pharmaceutically active. De novo drug design aims to generate and optimize
novel ligands for macromolecular targets from scratch. The development of
graph-based deep generative neural networks has provided a new method for
this task. In this review, we give a brief introduction to graph
representation and graph-based generative models for de novo drug design,
group the models into four architectures, and summarize the characteristics
of each. We also discuss generative models for scaffold- and fragment-based
design and future directions for graph-based generative models.
Section editors: Johannes Kirchmair – University of Vienna,
Department of Pharmaceutical Chemistry, Althanstrasse
14, 1090 Vienna, Austria.
Introduction
New technologies have always had a profound impact on the evolution of drug
discovery [1]. Classical pharmacology [2], also known as forward
pharmacology, relies on in vitro or in vivo screening to identify substances
with desirable therapeutic effects and then to identify and validate their
targets. With the development of bioinformatics, especially after the
sequencing of the human genome, reverse pharmacology [2], which usually
identifies a protein target first and evaluates in vivo efficacy last, has
become popular.
As a reverse pharmacology method, de novo drug design is the design of
bioactive compounds by incremental construction of a ligand model within a
model of the receptor or enzyme active site, the structure of which is known
from X-ray or nuclear magnetic resonance data (receptor-based design), or on
the basis of known ligands (ligand-based design) [3]. It has been estimated
that the synthesizable chemical space might be as large as
$10^{60}$–$10^{100}$ molecules, of which $10^{23}$–$10^{60}$ [4] could be
drug-like compounds, but only $10^{8}$–$10^{10}$ have been synthesized.
High-throughput screening [5] and high-throughput virtual screening [6] can
only search the part of chemical space that is already covered by compound
databases, whereas de novo drug design has the potential to discover new
bioactive compounds. Generative modeling, which learns from chemical
databases and generates hypotheses for exploring the part of chemical space
hidden below the tip of the iceberg, can be viewed as a variant of de novo
drug design.
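To make these orders of magnitude concrete, the following back-of-the-envelope calculation compares the number of synthesized compounds with the estimated size of drug-like space. It is only a sketch: it takes the ranges quoted above at face value and assumes, purely for illustration, that every synthesized compound counts as drug-like, so the result is an upper bound on coverage. The symbols $N_{\mathrm{syn}}$, $N_{\mathrm{drug}}$ and $f$ are introduced here and do not come from the cited sources.

```latex
% f: fraction of the estimated drug-like space that has been synthesized
% (assumption: all synthesized compounds are counted as drug-like)
\[
f = \frac{N_{\mathrm{syn}}}{N_{\mathrm{drug}}}, \qquad
\frac{10^{8}}{10^{60}} \le f \le \frac{10^{10}}{10^{23}}
\quad\Longrightarrow\quad
10^{-52} \le f \le 10^{-13}.
\]
```

Even under the most favorable combination of these estimates, at most about one drug-like molecule in $10^{13}$ has been made, which is why methods that propose structures outside existing databases are of interest.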
Recent successes have shown that deep learning can reduce the time and cost
of drug discovery [7]. Based on the molecular graph representation, which
bridges real molecules and the data format that computers use for deep
learning
Editors-in-Chief
Kelvin Lam – Simplex Pharma Advisors, Inc., Boston, MA, USA
Henk Timmerman – Vrije Universiteit, The Netherlands
*Corresponding authors: L. Zhang (liangren@bjmu.edu.cn), Z. Liu (zmliu@bjmu.edu.cn)
1740-6749/$ © 2020 Elsevier Ltd. All rights reserved. https://doi.org/10.1016/j.ddtec.2020.11.004