Analytical Reproducibility in ¹H NMR-Based Metabonomic Urinalysis

Hector C. Keun,*,† Timothy M. D. Ebbels, Henrik Antti, Mary E. Bollard, Olaf Beckonert, Götz Schlotterbeck, Hans Senn, Urs Niederhauser, Elaine Holmes, John C. Lindon, and Jeremy K. Nicholson

Biological Chemistry, Biomedical Sciences, Faculty of Medicine, Imperial College of Science, Technology and Medicine, London, SW7 2AZ, U.K., and Pharma Preclinical Research Basel, F. Hoffmann-La Roche AG, CH-4070-Basel, Switzerland

Received July 10, 2002

Metabonomic analysis of biofluids and tissues utilizing high-resolution NMR spectroscopy and chemometric techniques has proven valuable in characterizing the biochemical response to toxicity for many xenobiotics. To assess the analytical reproducibility of metabonomic protocols, sample preparation and NMR data acquisition were performed at two sites (one using a 500 MHz and the other using a 600 MHz system) using two identical (split) sets of urine samples from an 8-day acute study of hydrazine toxicity in the rat. Despite the difference in spectrometer operating frequency, both datasets were extremely similar when analyzed using principal components analysis (PCA) and gave near-identical descriptions of the metabolic responses to hydrazine treatment. The main consistent difference between the datasets was related to the efficiency of water resonance suppression in the spectra. In a 4-PC model of both datasets combined, describing all systematic dose- and time-related variation (88% of the total variation), differences between the two datasets accounted for only 3% of the total modeled variance compared to ca. 15% for normal physiological (pre-dose) variation. Furthermore, <3% of spectra displayed distinct inter-site differences, and these were clearly identified as outliers in their respective dose-group PCA models.
No samples produced clear outliers in both datasets, suggesting that the outliers observed did not reflect an unusual sample composition, but rather sporadic differences in sample preparation leading to, for example, very dilute samples. Estimations of the relative concentrations of citrate, hippurate, and taurine were in >95% correlation (r²) between sites, with an analytical error comparable to normal physiological variation in concentration (4-8%). The excellent analytical reproducibility and robustness of metabonomic techniques demonstrated here are highly competitive compared to the best proteomic analyses and are in significant contrast to genomic microarray platforms, both of which are complementary techniques for predictive and mechanistic toxicology. These results have implications for the quantitative interpretation of metabonomic data, and for the establishment of quality control criteria both for regulatory agencies and for integrating data obtained at different sites.

Introduction

In recent years, the science of toxicology has begun to explore the potential of novel “-omics” technologies, namely, genomics, proteomics, and metabonomics, which respectively can characterize in a highly parallel fashion the response of living systems to chemical exposure in terms of gene expression, protein expression, or metabolic regulation (1-3). These technologies offer rapid, mechanistic information, are often noninvasive or minimally invasive, and are to some degree quantitative. Thus, they facilitate incorporation of toxicological data at earlier stages of drug development, with potential savings of many millions of dollars. While these approaches utilize different analytical techniques and generate varying biochemical data, they provide complementary information and face common challenges that must be addressed for their successful application to toxicity assessment (2).
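The inter-site agreement statistics quoted in the abstract (an r² between sites and an analytical error for relative metabolite concentrations) can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the duplicate-measurement error formula, and the synthetic data are all assumptions chosen for illustration, and the numbers generated here are not results from the study.

```python
import numpy as np


def intersite_r2(site_a, site_b):
    """Squared Pearson correlation between paired estimates from two sites."""
    a = np.asarray(site_a, dtype=float)
    b = np.asarray(site_b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape or a.size < 3:
        raise ValueError("expected two equal-length 1-D arrays with >= 3 samples")
    r = np.corrcoef(a, b)[0, 1]
    return float(r ** 2)


def relative_analytical_error(site_a, site_b):
    """Relative error of a single measurement, estimated from paired values.

    Assumption: each sample is treated as a duplicate measurement, so the
    single-measurement variance is half the mean squared relative difference.
    """
    a = np.asarray(site_a, dtype=float)
    b = np.asarray(site_b, dtype=float)
    pair_mean = (a + b) / 2.0
    rel_diff = (a - b) / pair_mean
    return float(np.sqrt(np.mean(rel_diff ** 2) / 2.0))


if __name__ == "__main__":
    # Hypothetical data: one "true" relative concentration per split sample,
    # measured at two sites with independent 5% multiplicative noise.
    rng = np.random.default_rng(0)
    true_conc = rng.uniform(0.5, 2.0, size=50)
    site_a = true_conc * (1.0 + rng.normal(0.0, 0.05, size=50))
    site_b = true_conc * (1.0 + rng.normal(0.0, 0.05, size=50))

    print("r^2 between sites:", round(intersite_r2(site_a, site_b), 3))
    print("relative analytical error:",
          round(relative_analytical_error(site_a, site_b), 3))
```

Under these assumptions, the relative analytical error can be compared directly with the spread of pre-dose concentrations to judge whether inter-site error is small relative to normal physiological variation.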
As the use of -omics technologies evolves from essentially qualitative measurements, it becomes ever more crucial to assess the reliability of data generated from these new technologies. Reproducibility (4) and robustness are clearly important for any ‘real world’ implementation, but will also influence answers to fundamental questions, such as whether or not signature profiles of chemicals and other stressors can be confidently defined. For any analytical technique, high reproducibility can increase quantitative accuracy and sensitivity and, by decreasing the number of replicates necessary for a given task, can also increase sample throughput. Ultimately, in systems biology approaches, this translates into the use of fewer experimental animals. The generation of databases for pooling of such data from different studies and their interpretation, particularly for regulatory agencies in the case of toxicological applications, will require

* To whom correspondence should be addressed. Tel: 44-(0)20-7594-3142. Fax: 44-(0)20-7594-3226. Email: h.keun@ic.ac.uk.
Imperial College of Science, Technology and Medicine.
F. Hoffmann-La Roche AG.

Chem. Res. Toxicol. 2002, 15, 1380-1386. 10.1021/tx0255774 CCC: $22.00 © 2002 American Chemical Society. Published on Web 10/17/2002