Prediction of Ultraviolet Spectral Absorbance Using Quantitative Structure-Property Relationships

William L. Fitch*
Roche Bioscience, 3401 Hillview Avenue, Palo Alto, California 94304

Malcolm McGregor
Affymax Inc., 4001 Miranda Avenue, Palo Alto, California 94304

Alan R. Katritzky,* Andre Lomaka, and Ruslan Petrukhin
Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, P.O. Box 117200, Gainesville, Florida 32611-7200

Mati Karelson*
Department of Chemistry, University of Tartu, 2 Jakobi Str., Tartu, Estonia

Received November 6, 2001

High performance liquid chromatography (HPLC) with ultraviolet (UV) spectrophotometric detection is a common method for analyzing reaction products in organic chemistry. This procedure would benefit from a computational model for predicting the relative response of organic molecules. Models are now reported for the prediction of the integrated UV absorbance for a diverse set of organic compounds using a quantitative structure-property relationship (QSPR) approach. A seven-descriptor linear correlation with a squared correlation coefficient (R²) of 0.815 is reported for a data set of 521 compounds. Adding the sum of ZINDO oscillator strengths in the integration range as a descriptor allowed the number of descriptors to be reduced, producing a robust five-descriptor model for 460 compounds with a squared correlation coefficient of 0.857. The descriptors used in the models are discussed with respect to the physical nature of the UV absorption process.

INTRODUCTION

In organic synthesis there have been two traditional methods for performing quantitative analysis. For novel molecules, the method has been to purify the new compound to homogeneity and then weigh the sample. For known compounds, a pure reference standard could be obtained, allowing the new batch to be quantified chromatographically by comparison with the standard.
In the modern drug discovery laboratory, analysts are asked to quantify the amount of target compound in hundreds of novel samples each day. These molecules are made in submilligram amounts and have never been synthesized before. The only information available is the structure of the molecule and the properties that can reliably be calculated from the structure. Because the new tools of combinatorial chemistry allow so many compounds to be synthesized in a short time, the old strategy of purify and weigh is no longer satisfactory. The analytical chemistry community has only begun to address this need. 1-4 Approaches to quantitation of unknowns include (i) NMR; 5,6 (ii) HPLC with evaporative light scattering detection (ELSD); 7,8 and (iii) probably the most successful solution to this problem, the recently developed and popularized combustion-based chemiluminescent nitrogen detector (CLND) for HPLC. 9,10 Response in this latter detector can be predicted solely from structure because every nitrogen atom is combusted to the same detected species. The application of this detector to assessing high throughput parallel synthesis is rapidly increasing in popularity. 11

A specific application of interest to Affymax is the quantitation of compounds synthesized in encoded split-pool libraries for high throughput screening, as part of library quality control. 12,13 In this technique, subnanomole amounts of compound are synthesized. Their structures are confirmed by LC/MS, while the LC/UV signal is used to assess purity. It is difficult to determine the amount of compound in these samples because none of the standard quantitation techniques (weighing, NMR, ELSD, CLND) has sufficient sensitivity. Better knowledge of the released concentration could improve our understanding of hit rates and overall success in bead-based high throughput screening. 14

A second application for generic quantitation is in the impurity profiles of drug substance for regulatory approval.
In this process a relatively pure substance is tested by HPLC/UV/MS. All impurities above 0.1% should be identified and quantified. The MS data and knowledge of the process are often sufficient to identify the impurities, but quantitation requires the laborious synthesis of a standard pure sample.

* Corresponding author phone: (352)392-0554; e-mail: katritzky@chem.ufl.edu.

J. Chem. Inf. Comput. Sci. 2002, 42, 830-840. 10.1021/ci010116u. © 2002 American Chemical Society. Published on Web 06/05/2002.
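As a rough illustration of why the impurity quantitation described above depends on knowing each compound's UV response, the following minimal Python sketch compares raw area percent with response-corrected percent. The peak areas, the relative response factor of 0.5, and the function names are hypothetical values chosen for illustration only; they are not taken from this paper.

```python
# Illustrative sketch only: peak areas and response factors are hypothetical.

def area_percent(areas):
    """Raw area percent: assumes every compound has the same UV response."""
    total = sum(areas.values())
    return {name: 100.0 * area / total for name, area in areas.items()}


def corrected_percent(areas, rel_response):
    """Percent after dividing each area by its relative response factor.

    rel_response[name] is the UV response per unit amount of the compound,
    relative to the main component (assumed known or predicted).
    """
    amounts = {name: areas[name] / rel_response[name] for name in areas}
    total = sum(amounts.values())
    return {name: 100.0 * amount / total for name, amount in amounts.items()}


# Hypothetical chromatogram: one main peak and one impurity peak.
areas = {"drug": 9990.0, "impurity": 10.0}
# Hypothetical assumption: the impurity absorbs half as strongly as the drug.
rel_response = {"drug": 1.0, "impurity": 0.5}

raw = area_percent(areas)
corrected = corrected_percent(areas, rel_response)

print(f"raw impurity level:       {raw['impurity']:.3f}%")
print(f"corrected impurity level: {corrected['impurity']:.3f}%")
```

In this made-up case the impurity sits exactly at 0.1% by raw area but at roughly twice that level once its weaker absorbance is taken into account, which is the kind of error a predicted relative response would be intended to reduce.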