Regulatory Perspectives in the Use and Validation of QSAR. A Case Study: DEMETRA Model for Daphnia Toxicity

CHIARA PORCELLI, ELENA BORIANI, ALESSANDRA RONCAGLIONI, ANTONIO CHANA, AND EMILIO BENFENATI*

Istituto di Ricerche Farmacologiche Mario Negri, via La Masa 19, 20156 Milan, Italy

Received June 14, 2007. Revised manuscript received October 29, 2007. Accepted October 30, 2007.

The DEMETRA acute toxicity model toward the water flea (Daphnia magna) was used as a case study to outline a validation method compatible with regulatory use. Reliability, predictive power, uncertainty, and applicability were verified with an external test set of pesticides. Predictions for this external set using the DEMETRA model, developed ad hoc for pesticides, were compared with the results using ECOSAR and TOPKAT as benchmarks. The evaluation considered statistical parameters and the presence of errors, with their size and sign. DEMETRA gave good statistical predictions, and the maximum error of the outliers was lower than that with the other two models. DEMETRA gave a limited number of false negatives, and the use of defined rules indicated the level of uncertainty was acceptable.

Introduction

Quantitative structure–activity relationship (QSAR) studies are increasingly challenged to evaluate huge numbers of compounds. The European legislation REACH specifies the use of QSAR models to predict properties of industrial chemicals (1). QSAR studies are based on the premise that biological activities may be related to certain chemical properties. A large number of relationships have been reported, where the biological effect, mathematically determined as the output of the model, was defined in relation to some chemical parameters, identified as the model inputs (2). The partition coefficient between octanol and water, given as a logarithm called logP or logKow, has been used in many equations, sometimes in combination with other properties such as electronic factors (2, 3).
For example, logP is the key factor in programs commonly used to predict aquatic toxicity, such as the U.S. EPA software ECOSAR (4). Another program giving toxicity predictions is TOPKAT (5), which evaluates the so-called “optimal prediction space” (OPS). The identification of boundaries characterizing the model’s validity has been studied and debated (6). It is widely accepted that, in the case of aquatic toxicity, models mainly based on logP are quite reliable for compounds acting through a narcotic effect; however, they fail in particular for more reactive compounds, for which more varied mechanisms need to be detected. This is a serious drawback if models are required to play a role in protecting the environment.

The EC-funded project DEMETRA, Development of Environmental Modules for Evaluation of Toxicity of pesticide Residues in Agriculture, was aimed at developing predictive models for ecotoxicity of pesticides (7). Compared to industrial chemicals, pesticides are quite complex because they typically contain several functional groups and are intended to have a toxic effect on specific targets through a variety of biological mechanisms, not all fully identified. The DEMETRA models are publicly available through the Internet (8).

Here we report a further validation test of the model for Daphnia magna acute toxicity, on a large set of more than 100 compounds. We also report the values predicted with ECOSAR and TOPKAT. These two programs were chosen because, like DEMETRA, they give an automatic prediction of toxicity using QSAR and are therefore usable by regulators. We examine the results in terms of practical utility, unambiguity, and applicability domains of the models.

Materials and Methods

Biological Data and Structure Availability. The new test set was organized by collecting data from the HAIR ecotoxicity database (9).
This contains data for about 242 pesticides extracted from the German database of the Federal Biological Research Centre for Agriculture and Forestry (BBA). The main sources of BBA data are EU reports on active substances used in plant protection products, published before 2006. However, some data come from various other protocols. An initial screening was done in order to exclude polymers, inorganic compounds, mixtures of molecules, and mismatches between CAS number and name. From the remaining compounds we collected only data on the water flea (Daphnia magna) 48 h LC50, which is the concentration that kills 50% of the organisms after 48 h of exposure. Finally, we divided the subset into a new test set (135 compounds) and a set of compounds already used for DEMETRA modeling (74 chemicals). Acute toxicity values were converted to the negative of the logarithm of LC50. For each new compound the chemical structure was checked and downloaded from the ChemIDplus Web site (10), then saved as an MDL mol file.

A first analysis compared data in the new database with those already used for the DEMETRA modeling. Only 16 of the 74 common compounds had identical LC50 values in the two databases (probably the same experiment), while the other 58 showed a squared correlation coefficient, R², of 0.89 between the log scale values of the two series. Although this indicates a good correlation between the two databases, suggesting agreement between values and protocols, 15 of these values differed by more than a factor of 4, and 6 differed by more than 1 order of magnitude. This is a major limitation in building predictive QSAR models: the experimental values, i.e., the model input, are an intrinsic source of uncertainty.

Molecular Descriptors. The three models require descriptors calculated on the basis of the two-dimensional structure. OpenBabel v 1.0.0.1 was used to create the MDL mol files (12). OpenBabel was also used to generate the SMILES codes needed by the TOPKAT and ECOSAR models.
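The comparison between the two databases described above (conversion to the negative logarithm of LC50, R² on the log scale, and counts of values differing by more than a factor of 4 or by more than 1 order of magnitude) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the sample LC50 pairs are hypothetical, a base-10 logarithm is assumed, and the two values of each pair are assumed to share the same concentration unit.

```python
import math


def to_plc50(lc50):
    """Return -log10(LC50). The base-10 logarithm is an assumption."""
    if lc50 <= 0:
        raise ValueError("LC50 must be a positive concentration")
    return -math.log10(lc50)


def r_squared(xs, ys):
    """Squared Pearson correlation coefficient between two series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def compare_databases(pairs, factors=(4.0, 10.0)):
    """Compare paired LC50 values for the same compounds in two databases.

    Returns R^2 on the -log10 scale and, for each factor, the number of
    compounds whose two LC50 values differ by more than that factor.
    """
    a = [to_plc50(first) for first, _ in pairs]
    b = [to_plc50(second) for _, second in pairs]
    gaps = [abs(x - y) for x, y in zip(a, b)]
    # A ratio larger than f is a log-scale gap larger than log10(f).
    counts = {f: sum(g > math.log10(f) for g in gaps) for f in factors}
    return r_squared(a, b), counts


# Hypothetical LC50 pairs (same unit in both databases), for illustration only.
pairs = [(0.5, 0.6), (12.0, 2.0), (0.01, 0.2), (3.0, 3.5), (40.0, 35.0)]
r2, counts = compare_databases(pairs)
print(f"R2 = {r2:.2f}; differing by more than 4x: {counts[4.0]}, "
      f"more than 10x: {counts[10.0]}")
```

Working on the log scale makes the comparison symmetric: a fourfold disagreement counts the same whichever database reports the higher value.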
The 16 chemical descriptors needed for the DEMETRA model (8) were calculated with DRAGON free version 3.0 (11). All logP values reported in the HAIR database for the new test set are within the range of logP for DEMETRA training compounds.

* Corresponding author e-mail: benfenati@marionegri.it; tel: +390239014420; fax: +390239014735.
Environ. Sci. Technol. 2008, 42, 491–496. DOI: 10.1021/es071430t. © 2008 American Chemical Society. Published on Web 11/30/2007.

DEMETRA Model. The DEMETRA model for Daphnia is based on a training set of 220 compounds. The software was built through a hybrid model approach: the final model is composed of three individual models (one based on partial