“In Silico” Design of New Uranyl Extractants Based on Phosphoryl-Containing Podands: QSPR Studies, Generation and Screening of Virtual Combinatorial Library, and Experimental Tests

A. Varnek* and D. Fourches
Laboratoire d’Infochimie, UMR 7551 CNRS, Université Louis Pasteur, 4, rue B. Pascal, Strasbourg 67000, France

V. P. Solov’ev and V. E. Baulin
Institute of Physiologically Active Compounds, Russian Academy of Sciences, 142432 Chernogolovka, Moscow Region, Russia

A. N. Turanov
Institute of Solid State Physics, Russian Academy of Sciences, 142432 Chernogolovka, Moscow Region, Russia

V. K. Karandashev
Institute of Microelectronics Technology and High Purity Materials, 142432 Chernogolovka, Moscow Region, Russia

D. Fara and A. R. Katritzky
Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611

Received January 7, 2004

This paper is devoted to the computer-aided design of new extractants of the uranyl cation, involving three main steps: (i) a QSPR study, (ii) generation and screening of a virtual combinatorial library, and (iii) synthesis of several predicted compounds and their experimental extraction studies. First, we performed QSPR modeling of the distribution coefficient (logD) of uranyl extracted by phosphoryl-containing podands from water to 1,2-dichloroethane. Two different approaches were used: one based on classical structural and physicochemical descriptors (implemented in the CODESSA PRO program) and another based on fragment descriptors (implemented in the TRAIL program). Three statistically significant models obtained with TRAIL involve as descriptors either sequences of atoms and bonds or atoms with their close environment (augmented atoms). The best CODESSA PRO models include its own molecular descriptors as well as fragment descriptors obtained with TRAIL.
At the second step, a virtual combinatorial library of 2024 podands was generated with the CombiLib program, followed by the assessment of logD values using the developed QSPR models. At the third step, eight of these hypothetical compounds were synthesized and tested experimentally. Comparison with experiment shows that the developed QSPR models successfully predict logD values for 7 of the 8 compounds from that “blind test” set.

1. INTRODUCTION

Solvent extraction is a widely used technique for selective separation and concentration of metals in biphasic water/organic solvent systems. It involves cation-ligand complexation in one of the liquid phases or at the liquid/liquid interface, accompanied by transfer of the complexes into the bulk organic phase. Development of new extraction systems with desirable properties generally proceeds empirically because of the complexity of the processes involved. Indeed, the thermodynamic parameters of extraction depend on many variables (the nature of the metal(s), counterion(s), ligand(s), pH, organic solvent, and background compounds), and, therefore, their theoretical modeling is a very difficult task.

In fact, in silico design of new extraction systems with desired characteristics could be based on an informational system involving (i) a comprehensive database, (ii) an expert system which models quantitative structure-property relationships (QSPR), and (iii) a generator of combinatorial libraries. Figure 1 illustrates the links between these modules: experimental data collected in the database are treated by the expert system, which establishes relationships between the structure of compounds and their extraction properties. Then, structure-property models are applied to screen a virtual combinatorial library leading to potential

* Corresponding author phone: 33-390-241549; fax: 33-390-241545; e-mail: varnek@chimie.u-strasbg.fr.

J. Chem. Inf. Comput. Sci.
2004, 44, 1365-1382. DOI: 10.1021/ci049976b. CCC: $27.50. © 2004 American Chemical Society. Published on Web 07/26/2004.
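The workflow outlined in the abstract and introduction (fragment-based QSPR model, enumeration of a virtual combinatorial library, and screening by predicted logD) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the building blocks, the fragment labels, the linear form of the model, and the coefficients are hypothetical and are not taken from TRAIL, CODESSA PRO, or CombiLib, nor fitted to any experimental data.

```python
from collections import Counter
from itertools import product

# Hypothetical linear fragment model: logD = INTERCEPT + sum(coeff * count).
# The coefficients are placeholders, NOT fitted values from the paper.
INTERCEPT = 0.5
COEFFS = {"P=O": 0.8, "C-O-C": -0.2, "C6H5": 0.3, "CH2": 0.05}

# Hypothetical building blocks, each described by its fragment counts.
END_GROUPS = {
    "Ph2P(O)": Counter({"P=O": 1, "C6H5": 2}),
    "Bu2P(O)": Counter({"P=O": 1, "CH2": 6}),
}
SPACERS = {
    "CH2OCH2": Counter({"C-O-C": 1, "CH2": 2}),
    "(CH2)3": Counter({"CH2": 3}),
}


def enumerate_library():
    """Yield (name, fragment_counts) for every end/spacer/end combination.

    Mirror-image pairs (A-spacer-B and B-spacer-A) are both generated;
    a real library generator would normally remove such duplicates.
    """
    for left, spacer, right in product(END_GROUPS, SPACERS, END_GROUPS):
        counts = END_GROUPS[left] + SPACERS[spacer] + END_GROUPS[right]
        yield f"{left}-{spacer}-{right}", counts


def predict_logd(counts):
    """Apply the hypothetical linear fragment model to one descriptor set."""
    return INTERCEPT + sum(COEFFS.get(frag, 0.0) * n for frag, n in counts.items())


def screen(top_n=3):
    """Rank the virtual library by predicted logD, highest first."""
    scored = [(predict_logd(counts), name) for name, counts in enumerate_library()]
    scored.sort(reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    for logd, name in screen():
        print(f"{name}: predicted logD = {logd:.2f}")
```

In this toy setting the descriptor of a molecule is simply the sum of the fragment counts of its building blocks, so the library can be screened without constructing explicit molecular graphs. A realistic implementation would derive fragment counts from the full structure and use models validated against experimental logD values.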