3D-MEDNEs: An Alternative “in Silico” Technique for Chemical Research in Toxicology. 2. Quantitative Proteome-Toxicity Relationships (QPTR) Based on Mass Spectrum Spiral Entropy

Maykel Cruz-Monteagudo,†,‡ Humberto González-Díaz,*,⊥ Fernanda Borges,† Elena Rosa Dominguez,‡ and M. Natália D.S. Cordeiro§

Physico-Chemical Molecular Research Unit, Department of Organic Chemistry, Faculty of Pharmacy, University of Porto, 4150-047 Porto, Portugal, Applied Chemistry Research Center, Faculty of Chemistry and Pharmacy, Central University of Las Villas (UCLV), Santa Clara 54830, Cuba, Unit of Bioinformatics & Connectivity Analysis (UBICA), Institute of Industrial Pharmacy, and Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain, and REQUIMTE, Department of Chemistry, Faculty of Sciences, University of Porto, 4169-007, Porto, Portugal

Received August 20, 2007

Low range mass spectra (MS) characterization of serum proteome offers the best chance of discovering proteome-(early drug-induced cardiac toxicity) relationships, called here Pro-EDICToRs. However, due to the thousands of proteins involved, finding the single disease-related protein could be a hard task. The search for a model based on general MS patterns becomes a more realistic choice. In our previous work (González-Díaz, H., et al. Chem. Res. Toxicol. 2003, 16, 1318–1327), we introduced the molecular structure information indices called 3D-Markovian electronic delocalization entropies (3D-MEDNEs). In this previous work, quantitative structure-toxicity relationship (QSTR) techniques allowed us to link 3D-MEDNEs with blood toxicological properties of drugs. In this second part, we extend 3D-MEDNEs to numerically encode biologically relevant information present in MS of the serum proteome for the first time. Using the same idea behind QSTR techniques, we can seek now by analogy a quantitative proteome-toxicity relationship (QPTR).
The new QPTR models link MS 3D-MEDNEs with drug-induced toxicological properties from blood proteome information. We first generalized Randic’s spiral graph and lattice networks of protein sequences to represent the MS of 62 serum proteome samples with more than 370 100 intensity (I_i) signals with m/z bandwidth above 700–12000 each. Next, we calculated the 3D-MEDNEs for each MS using the software MARCH-INSIDE. We then developed several QPTR models, using different machine learning and MS representation algorithms, to classify samples as control or positive Pro-EDICToRs samples. The best QPTR proposed showed accuracy values ranging from 83.8% to 87.1% and leave-one-out (LOO) predictive ability of 77.4–85.5%. This work demonstrated that the idea behind classic drug QSTR models may be extended to construct QPTRs with proteome MS data.

Introduction

The ability to predict the toxic effects of potential new drugs is crucial to prioritizing compound pipelines and eliminating costly failures in drug development. The inability to accurately predict toxicity early in drug development cost the pharmaceutical industry $8 billion in 2003, approximately one-third of the cost of all drug failures. Indeed, predictive toxicology and “omics” technologies are of growing interest to government regulators, who have called for more predictive toxicology and toxicogenomics or toxicoproteomics approaches to be used in assessing drug safety. Predictive toxicology is still in its early stages, characterized by the use of gene or protein expression profiles to gain a basic understanding of whether a compound has a “clean” or “messy” profile. The tremendous advantages of these approaches, as well as pressure from the FDA to improve toxicology testing in drug development, indicate that advancements in predictive toxicology will play an increasing and accelerating role in drug development (1).
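The workflow summarized in the abstract (encode each spectrum as a numerical entropy descriptor, then check a control/positive classifier by leave-one-out) can be illustrated with a minimal sketch. This is not the 3D-MEDNE calculation and not the MARCH-INSIDE software: as a stand-in assumption it uses the plain Shannon entropy of the normalized intensities as the descriptor and a simple nearest-centroid rule as the classifier, and the spectra, labels, and function names are all hypothetical.

```python
import math


def shannon_entropy(intensities):
    """Shannon entropy (in nats) of a spectrum's normalized intensities.

    Stand-in descriptor only; it is NOT the 3D-MEDNE index of the paper.
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum must have positive total intensity")
    probs = [i / total for i in intensities if i > 0]
    return -sum(p * math.log(p) for p in probs)


def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose mean descriptor value is closest."""
    centroids = {}
    for label in set(train_y):
        values = [v for v, y in zip(train_x, train_y) if y == label]
        centroids[label] = sum(values) / len(values)
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def loo_accuracy(xs, ys):
    """Leave-one-out: hold out each sample once and train on the rest."""
    hits = 0
    for k in range(len(xs)):
        train_x = xs[:k] + xs[k + 1:]
        train_y = ys[:k] + ys[k + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[k]) == ys[k]:
            hits += 1
    return hits / len(xs)


# Made-up toy spectra (four intensity signals each), for illustration only.
spectra = [
    [10, 10, 10, 10],  # hypothetical control samples: flat profiles
    [9, 11, 10, 10],
    [12, 9, 10, 9],
    [37, 1, 1, 1],     # hypothetical positive samples: peaked profiles
    [30, 5, 3, 2],
    [35, 2, 2, 1],
]
labels = ["control"] * 3 + ["positive"] * 3

descriptors = [shannon_entropy(s) for s in spectra]
accuracy = loo_accuracy(descriptors, labels)
print(f"LOO accuracy on toy data: {accuracy:.3f}")
```

The point of the leave-one-out loop is that each sample is predicted by a model that never saw it, which is why the LOO figures reported in the abstract are lower than the training accuracies.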
* To whom correspondence should be addressed. Tel: +34-981-563100. Fax: +34-981-594912. E-mail: gonzalezdiazh@yahoo.es or qohumbe@usc.es.
† Physico-Chemical Molecular Research Unit, University of Porto.
‡ UCLV.
⊥ University of Santiago de Compostela.
§ REQUIMTE, University of Porto.

Chem. Res. Toxicol. 2008, 21, 619–632. 10.1021/tx700296t CCC: $40.75 © 2008 American Chemical Society. Published on Web 02/08/2008

Specifically, cardiotoxicity is a serious adverse effect of chemotherapy ranging from relatively benign arrhythmias to potentially lethal conditions (2, 3), where the extent and severity of the necrosis can be monitored by the levels of bioactive markers (4). However, the number of new biomarkers reaching routine clinical use remains unacceptably low (5, 6). At the same time, body fluids are a protein-rich information reservoir that contains the traces of what the blood has encountered on its constant perfusion and percolation throughout the body (7). In this sense, the blood proteome is changing constantly as a consequence of the perfusion of the organ undergoing drug-induced damage, and this process then adds to, subtracts from, or modifies the circulating proteome (8, 9). So, a blood proteome represents a potential target for the detection of proteome-(early drug-induced cardiac toxicity) relationships called here Pro-EDICToRs (7). Thus, due to the optimal performance in the low mass range exhibited by mass spectra (MS), the use of this method applied to proteomics may offer the best chance for the study of Pro-EDICToRs type phenomena. However, due to the thousands of intact and cleaved proteins in the human serum proteome, finding the single