Prediction of Ultraviolet Spectral Absorbance Using Quantitative Structure-Property
Relationships
William L. Fitch*
Roche Bioscience, 3401 Hillview Avenue, Palo Alto, California 94304
Malcolm McGregor
Affymax Inc., 4001 Miranda Avenue, Palo Alto, California 94304
Alan R. Katritzky,* Andre Lomaka, and Ruslan Petrukhin
Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, P.O. Box 117200,
Gainesville, Florida 32611-7200
Mati Karelson*
Department of Chemistry, University of Tartu, 2 Jakobi Str., Tartu, Estonia
Received November 6, 2001
High performance liquid chromatography (HPLC) with ultraviolet (UV) spectrophotometric detection is a
common method for analyzing reaction products in organic chemistry. This procedure would benefit from
a computational model for predicting the relative response of organic molecules. Models are now reported
for the prediction of the integrated UV absorbance for a diverse set of organic compounds using a quantitative
structure-property relationship (QSPR) approach. A seven-descriptor linear correlation with a squared
correlation coefficient (R²) of 0.815 is reported for a data set of 521 compounds. Using the sum of ZINDO
oscillator strengths in the integration range as an additional descriptor allowed a reduction in the number of
descriptors, producing a robust model for 460 compounds with five descriptors and a squared correlation
coefficient of 0.857. The descriptors used in the models are discussed with respect to the physical nature of
the UV absorption process.
INTRODUCTION
In organic synthesis there have been two traditional
methods for performing quantitative analysis. For novel
molecules, the method has been to purify the new structure
to homogeneity and then weigh the sample. For known
samples, a pure reference standard could be obtained, which
allowed chromatographic quantitation of the new batch by
comparison. In the modern drug discovery laboratory,
analysts are asked to quantify the amount of target compound
in hundreds of novel samples each day. These molecules
are made in submilligram amounts and have never been
synthesized before. The only information available is the
structure of the molecule and properties which can reliably
be calculated from the structure. Because the new tools of
combinatorial chemistry allow so many compounds to be
synthesized in a short time, the old strategy of purify and
weigh is no longer satisfactory.
The analytical chemistry community has only begun to
address this need.¹⁻⁴ Approaches to quantitation of unknowns
include (i) NMR;⁵,⁶ (ii) HPLC with evaporative light scattering
detection (ELSD);⁷,⁸ and (iii) probably the most successful solution
to this problem, the recently developed and popularized
combustion-based chemiluminescent nitrogen detector (CLND)
for HPLC.⁹,¹⁰ Response in this
latter detector can be predicted solely from structure because
all nitrogens burn to the same analyte. The application of
this detector to assessing high throughput parallel synthesis
is rapidly increasing in popularity.¹¹
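As a minimal sketch of why CLND response follows from structure alone, the calculation below assumes an equimolar response per nitrogen atom, as stated above. The function name, the calibration numbers, and the peak areas are hypothetical and chosen only for illustration; they are not data from this work.

```python
def clnd_concentration(peak_area, n_nitrogens, area_per_mm_nitrogen):
    """Estimate analyte concentration (mM) from a CLND peak area.

    Assumes every nitrogen atom contributes equally to the signal, so the
    only structural input needed is the nitrogen count of the analyte.
    """
    if n_nitrogens <= 0:
        raise ValueError("CLND needs at least one nitrogen in the analyte")
    # Convert area to mM of nitrogen, then divide by N atoms per molecule.
    nitrogen_mm = peak_area / area_per_mm_nitrogen
    return nitrogen_mm / n_nitrogens


# Hypothetical calibration: a 1.0 mM standard with 4 nitrogens gives area 4000.
area_per_mm_nitrogen = 4000.0 / (1.0 * 4)

# Hypothetical unknown with 3 nitrogens and a peak area of 1500.
unknown_mm = clnd_concentration(1500.0, 3, area_per_mm_nitrogen)
print(f"Estimated concentration: {unknown_mm:.2f} mM")
```

A single nitrogen-containing calibrant is enough here because the calibration is expressed per nitrogen rather than per compound.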
A specific application of interest to Affymax is the need
to quantify compounds synthesized in encoded split-pool
libraries for high throughput screening. These experiments
are done for the quality control of a split-pool encoded
library.¹²,¹³ In this technique, subnanomole amounts of
compound are synthesized. Their structures are confirmed
by LC/MS, while the LC/UV signal is used to assess purity.
It is difficult to determine the amount of compound in these samples because
none of the standard quantitation techniques (weighing,
NMR, ELSD, CLND) has sufficient sensitivity. Better
knowledge of released concentration could improve our
understanding of hit rates and overall success in bead-based
high throughput screening.¹⁴
A second application for generic quantitation is in the
impurity profiling of drug substances for regulatory approval.
In this process a relatively pure substance is tested by HPLC/UV/MS.
All impurities above 0.1% should be identified
and quantified. The MS data and knowledge of the process
are often sufficient to identify the impurities, but quantitation
requires the laborious synthesis of a pure standard
sample.
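To illustrate why relative response matters for impurity quantitation, the sketch below corrects raw UV peak-area percentages with relative response factors. It assumes such factors are available (for example from a predictive model) and are defined relative to the main component; the function name, peak areas, and factors are hypothetical and not taken from this work.

```python
def corrected_percentages(areas, relative_response):
    """Convert raw UV peak areas to response-corrected percentages.

    Each area is divided by the component's response factor (relative to
    the main component), then the corrected values are normalized to 100%.
    The result is on the same basis (molar or mass) as the factors.
    """
    corrected = {name: areas[name] / relative_response[name] for name in areas}
    total = sum(corrected.values())
    return {name: 100.0 * value / total for name, value in corrected.items()}


# Hypothetical chromatogram: the impurity absorbs half as strongly as the drug.
areas = {"drug": 9980.0, "impurity_A": 20.0}
relative_response = {"drug": 1.0, "impurity_A": 0.5}

raw_percent = 100.0 * areas["impurity_A"] / sum(areas.values())
result = corrected_percentages(areas, relative_response)
print(f"impurity_A: raw {raw_percent:.2f}%, corrected {result['impurity_A']:.2f}%")
```

In this made-up case the weakly absorbing impurity is underestimated by uncorrected area percent, which is the kind of error a predicted response factor is meant to reduce.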
* Corresponding author phone: (352)392-0554; e-mail: katritzky@chem.ufl.edu.
830 J. Chem. Inf. Comput. Sci. 2002, 42, 830-840
10.1021/ci010116u CCC: $22.00 © 2002 American Chemical Society
Published on Web 06/05/2002