Computers and Chemical Engineering 28 (2004) 425–434
Pharmaceutical product design using combinatorial optimization
S. Siddhayeᵃ, K. Camardaᵃ,∗, M. Southardᵃ, E. Toppᵇ
ᵃ Department of Chemical and Petroleum Engineering, University of Kansas, Lawrence, KS 66045, USA
ᵇ Department of Pharmaceutical Chemistry, University of Kansas, Lawrence, KS 66045, USA
Received 17 June 2002; received in revised form 11 August 2003; accepted 11 August 2003
Abstract
A two-step computational method for designing new molecules in medicinal chemistry is described. In the first step, topological indices are
used to develop structure-based correlations for properties of interest. Zeroth- and first-order connectivity indices are employed to develop linear
correlations for three physical properties of interest in pharmaceutical chemistry: octanol–water partition coefficient (OWPC), melting point
and water solubility. These correlations are then used within an optimization framework to design molecules having the desired properties.
This step involves formulating a mixed integer linear program (MILP) which includes the property correlations, structural constraints which
ensure that a stable, connected molecule is formed, and an objective function which minimizes the deviation from a set of property targets.
A new data structure, known as a partitioned adjacency matrix, is employed to allow the connectivity index definitions to be written linearly,
such that they can be included in an MILP and solved using a standard branch-and-bound method. The connectivity of the molecule is ensured
by the inclusion of network flow constraints within the formulation. Three examples show the efficacy of this approach.
© 2003 Elsevier Ltd. All rights reserved.
Keywords: Molecular design; Optimization
1. Introduction
The current experimental trial-and-error approach to the discovery of new drug molecules starts
with the identification of a large number of potential candidate molecules using heuristic rules
which generate variants on an initial structure, often from a natural product. These candidate
molecules are then synthesized and tested to determine if they possess the desired biological
effect. However, the majority of these lead compounds are typically ineffective, often because
physical properties such as solubility are not in the required range. Since this synthesis and
testing approach is clearly time-consuming and expensive, a computational method for the discovery
and screening of pharmaceutical compounds is desirable. The challenges involved in the development
of such a method are significant: physical and biochemical properties must be estimated from only
a molecular structure, and a large combinatorial optimization problem must be solved in order to
find the best molecular structure for a given pharmaceutical application. This work applies
computer-aided molecular design (CAMD) techniques to the pharmaceutical design problem. While
previous researchers have employed a rule-based approach along with computational property
predictions (Kier & Hall, 1976) or a brute-force analysis of a tremendous number of alternatives
(Hairston, 1998), this research applies a new formulation to solve the drug design problem as a
mixed integer linear program (MILP), which greatly improves the efficiency of the method while
still finding globally optimal solutions. The property estimations are achieved using four
topological indices, which are numerical values that accurately describe a molecular structure and
can thus serve as descriptors for correlations. Structural feasibility and connectivity constraints
are added to the formulation to ensure that a stable, connected molecule is formed. To solve the
optimization problem formulated to find a novel molecular structure, a new data structure known as
a partitioned adjacency matrix has been used to convert the problem into an MILP, which can then
be solved by standard techniques.
∗ Corresponding author.
E-mail address: camarda@ku.edu (K. Camarda).
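
To make the notion of a topological index concrete, the following minimal sketch computes the simple zeroth- and first-order connectivity indices of a hydrogen-suppressed molecular graph from an ordinary adjacency matrix. It assumes the standard Randić/Kier-Hall definitions (a sum over atoms of the inverse square root of the atom's degree, and a sum over bonds of the inverse square root of the product of the two bonded atoms' degrees). The function name `connectivity_indices` and the example molecules are illustrative only. Valence-corrected indices and the partitioned adjacency matrix used in the MILP formulation are not shown here.

```python
import math


def connectivity_indices(adjacency):
    """Return (chi0, chi1) for a hydrogen-suppressed molecular graph.

    adjacency is a symmetric 0/1 matrix given as a list of lists.
    Assumption: the graph is connected and has at least two atoms,
    so every atom has degree >= 1 and no division by zero occurs.
    """
    n = len(adjacency)
    # Degree of each atom = number of non-hydrogen neighbours.
    degrees = [sum(row) for row in adjacency]

    # Zeroth-order index: one term per atom.
    chi0 = sum(1.0 / math.sqrt(d) for d in degrees)

    # First-order index: one term per bond (upper triangle only,
    # so that each bond is counted once).
    chi1 = sum(
        1.0 / math.sqrt(degrees[i] * degrees[j])
        for i in range(n)
        for j in range(i + 1, n)
        if adjacency[i][j]
    )
    return chi0, chi1


# Carbon skeleton of n-butane: C-C-C-C (atom degrees 1, 2, 2, 1).
butane = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Carbon skeleton of isobutane: a central carbon bonded to three others.
isobutane = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

chi0_butane, chi1_butane = connectivity_indices(butane)
chi0_iso, chi1_iso = connectivity_indices(isobutane)
print(f"n-butane:  chi0 = {chi0_butane:.4f}, chi1 = {chi1_butane:.4f}")
print(f"isobutane: chi0 = {chi0_iso:.4f}, chi1 = {chi1_iso:.4f}")
```

The two isomers contain the same atoms but give different index values, which is what allows such indices to act as structure-sensitive descriptors in property correlations. Note that the square-root terms are nonlinear in the bonding pattern, which is why a direct formulation on a plain adjacency matrix would not be linear.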
CAMD methods have been used by many researchers for the design of a wide variety of molecules.
Venkatasubramanian, Chan, and Caruthers (1994) used a group contribution method to predict
properties, and employed a genetic algorithm to solve the mixed integer nonlinear programs
(MINLP) which resulted. Maranas
0098-1354/$ – see front matter © 2003 Elsevier Ltd. All rights reserved.
doi:10.1016/j.compchemeng.2003.08.011