Fast Determination of ¹³C NMR Chemical Shifts Using Artificial Neural Networks

J. Meiler,*,† R. Meusinger,‡ and M. Will§

Institute of Organic Chemistry, Marie-Curie-Strasse 11, University of Frankfurt, D-60439 Frankfurt, Germany, Institute of Organic Chemistry, University of Mainz, D-55099 Mainz, Germany, and BASF AG Ludwigshafen, D-67056 Ludwigshafen, Germany

Received March 15, 2000

Nine different artificial neural networks were trained with the spherically encoded chemical environments of more than 500 000 carbon atoms to predict their ¹³C NMR chemical shifts. Based on these results the PC program “C_shift” was developed, which allows the calculation of the ¹³C NMR spectrum of any proposed molecular structure consisting of the covalently bonded elements C, H, N, O, P, S, and the halogens. Results were obtained with a mean deviation as low as 1.8 ppm. This accuracy is equivalent to that of a determination based on a large database, but it is reached in a time as short as that known from increment calculations, as demonstrated using the natural product epothilone A as an example. The artificial neural networks allow a prediction of a large number of ¹³C NMR spectra that is both precise and fast, as needed for high-throughput NMR and for the screening of substance or spectra libraries.

INTRODUCTION

NMR spectroscopy is undoubtedly one of the most important methods used for structure determination of chemical compounds. In recent years the power of NMR methods and the sophistication of spectrometers have increased markedly. This was achieved by a number of new techniques,¹ only a few of which are named here. The measurement time was decreased drastically by pulsed field gradients, double or single quantum coherence methods, and finally by so-called “tubeless NMR”, that is, the fitting of conventional high-resolution NMR spectrometers with flow probes or special micro sample probes.
Shorter NMR measuring times are required above all by the high-throughput methods developed in combinatorial chemistry. With increasing amounts of spectral data available, a new bottleneck has emerged: data analysis. Precise and fast computer programs are necessary to enhance productivity here.

Munk recently gave a vivid account of the evolution of computer-enhanced structure elucidation, using the structure determination of the antibiotic actinobolin as an example.² In the 1960s the computer-assisted elucidation of unknown structures required several man-years using the structure generator ASSEMBLE. Forty years later, with both more sophisticated NMR spectroscopic methods and more sophisticated computer software, the time required to determine the structure has been reduced to several days (time for data collection included). The program SESAMI now generated four candidate structures in 5 min of CPU time using only the available 1D and 2D NMR data. Lindel et al. also use both the connectivities between nuclei that are detectable by NMR spectroscopy and their chemical shifts³ in their program COCON (constitutions from connectivities), which was developed for the generation of all possible constitutions of complex natural products. The effort spent on the development of efficient structure elucidation programs is illustrated here by two other current examples. CISOC-SES⁴ is a computer-assisted expert system that utilizes 1D and 2D NMR data. Recently the NMR assignment of a biologically active triterpenoid was shown by Peng et al.⁵ With the program LSD (Logic for Structure Determination), Nuzillard impressively demonstrated the potential of systematic structure elucidation of small molecules, combining modern NMR spectroscopy with artificial intelligence, using gibberellic acid as an example.⁶

However, in most practical cases the elucidation of a completely unknown structure is not required. The more common type of structure determination is structure verification.
In this case, enough information is available, perhaps on the basis of well-known synthetic reaction paths, to propose a probable structure. The structural information obtained from the chemical shift is usually sufficient here.

NMR Chemical Shift Prediction. Atomic nuclei of one isotope located in different chemical environments within one molecule are shielded differently by their electron clouds. As a result, different resonance frequencies are observed in an NMR experiment that excites these nuclei. If these frequencies are measured as differences from the resonance frequency of an internal standard, they are designated “chemical shifts”. The chemical shift value combines two advantages for structural analysis: it is an easily obtainable spectral parameter, and its dependence on chemical structure is well-known.⁷ The chemical shift of a carbon is, in addition to its state of hybridization, mainly influenced by the kind and number of the bonded atoms and by their distances to the observed carbon. The chemical shift of a carbon atom can be influenced by another atom in two different ways: electron interaction through covalent bonds or through space. In solution the second effect may appear as a “solvent effect”. However, electron interaction through space is only important if the distance between the observed atom and the influencing atom is small. It has to be considered specially during

* Corresponding author phone: ++49 69 798 29 798; fax: ++49 69 798 29 128; e-mail: mj@org.chemie.uni-frankfurt.de.
† University of Frankfurt.
‡ University of Mainz.
§ BASF AG Ludwigshafen.

J. Chem. Inf. Comput. Sci. 2000, 40, 1169-1176. DOI: 10.1021/ci000021c. © 2000 American Chemical Society. Published on Web 08/25/2000.