ADMET Evaluation in Drug Discovery. 11. PharmacoKinetics Knowledge Base (PKKB): A Comprehensive Database of Pharmacokinetic and Toxic Properties for Drugs
Dongyue Cao, Junmei Wang,§ Rui Zhou, Youyong Li, Huidong Yu, and Tingjun Hou*
Institute of Functional Nano & Soft Materials (FUNSOM) and Jiangsu Key Laboratory for Carbon-Based Functional Materials & Devices, Soochow University, Suzhou, Jiangsu 215123, China
College of Pharmaceutical Science, Soochow University, Suzhou, Jiangsu 215123, China
§ Department of Biochemistry, The University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, Texas 75390, United States
ABSTRACT: Good and extensive experimental ADMET (absorption, distribution, metabolism, excretion, and toxicity) data are critical for developing reliable in silico ADMET models. Here we develop a PharmacoKinetics Knowledge Base (PKKB) to compile comprehensive information about ADMET properties into a single electronic repository. We incorporate more than 10 000 experimental ADMET measurements of 1685 drugs into the PKKB. The ADMET properties in the PKKB include octanol/water partition coefficient, solubility, dissociation constant, intestinal absorption, Caco-2 permeability, human bioavailability, plasma protein binding, blood-plasma partitioning ratio, volume of distribution, metabolism, half-life, excretion, urinary excretion, clearance, toxicity, and median lethal dose (LD50) in rat or mouse, among others. The PKKB provides the most extensive collection of freely available data for ADMET properties to date. All these ADMET properties, as well as the pharmacological information and the calculated physicochemical properties, are integrated into a web-based information system.
Eleven separate data sets for octanol/water partition coefficient, solubility, blood-brain partitioning, intestinal absorption, Caco-2 permeability, human oral bioavailability, and P-glycoprotein inhibitors are provided for free download and can be used directly for ADMET modeling. The PKKB is available online at http://cadd.suda.edu.cn/admet.
INTRODUCTION
Drug discovery and development is a time-consuming and expensive process. It has been estimated that 40–60% of new chemical entity (NCE) failures can be attributed to poor ADMET (absorption, distribution, metabolism, excretion, and toxicity) profiles. 1,2 ADMET properties can be predicted from chemical structures, so that large numbers of compounds can be evaluated before they are synthesized and assayed. 3–5 Theoretical predictions of ADMET properties have proven to be efficient in recent years. 3–7 The lack of sufficient high-quality experimental data for training reliable models has been the major hurdle in modeling ADMET properties. 6,8,9 When the sample size used in training is limited, in silico models cannot give robust and accurate predictions, especially for ADMET properties that involve complex processes, such as bioavailability, metabolism, and toxicity. 6 Traditionally, the experimental data sets available for ADMET modeling in the public domain have been limited in both quantity and quality. This is particularly true for in vivo properties measured directly in humans, where data are typically available only for compounds in clinical development. 8 Encouragingly, the number of large data sets available has been growing in recent years. For example, three extensive data sets for intestinal absorption, oral bioavailability in humans, and P-gp inhibitors were reported by our group. 10–13 Nevertheless, further improvement in the public availability of ADMET data is still necessary.
As more ADMET data become available, it will be helpful to integrate data on a variety of ADMET properties from different sources into a single information system. The PK/DB reported by Moda and co-workers is one information system providing such a service. 14 The PK/DB incorporates 1389 compounds and 4141 pharmacokinetic measurements for 8 ADME properties. However, the data in the PK/DB were taken directly from previously reported, publicly available ADME data sets without careful curation. For instance, two core data sets in the PK/DB, the intestinal absorption data set with 687 molecules and the oral bioavailability data set with 660 molecules, were taken directly from the data sets reported by us. 11,12 Thus, the PK/DB provides only limited information. Here, we develop the PKKB (PharmacoKinetics Knowledge Base) to house 2D and 3D chemical structures, pharmacological information, experimental or calculated physicochemical properties, and, in particular, high-quality ADMET data of drugs.
Received: February 29, 2012
Article | pubs.acs.org/jcim | © XXXX American Chemical Society | dx.doi.org/10.1021/ci300112j | J. Chem. Inf. Model. XXXX, XXX, XXXXXX
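To make the idea of a single repository concrete, the following is a minimal sketch of how one drug entry with its structure, pharmacological information, and ADMET measurements might be represented, and how a per-property data set could be pulled out for modeling. It is not the PKKB's actual schema or implementation: the class names (`Measurement`, `DrugRecord`), the helper functions, the choice to average replicate values, and all values in the usage example are hypothetical placeholders introduced only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Measurement:
    """One experimental value for one ADMET property (hypothetical schema)."""
    prop: str     # property name, e.g. "Caco-2 permeability"
    value: float  # numeric value as reported
    unit: str     # unit string; "" for dimensionless quantities
    source: str   # where the value was taken from


@dataclass
class DrugRecord:
    """One drug with its structure and all of its measurements."""
    name: str
    smiles: str  # 2D structure encoded as a SMILES string
    pharmacology: Dict[str, str] = field(default_factory=dict)
    measurements: List[Measurement] = field(default_factory=list)

    def values_for(self, prop: str) -> List[float]:
        """Return every recorded value of one property for this drug."""
        return [m.value for m in self.measurements if m.prop == prop]


def count_by_property(records: List[DrugRecord]) -> Counter:
    """Count how many measurements exist for each property."""
    counts: Counter = Counter()
    for record in records:
        counts.update(m.prop for m in record.measurements)
    return counts


def extract_data_set(
    records: List[DrugRecord], prop: str
) -> List[Tuple[str, str, float]]:
    """Build (name, SMILES, value) rows for one property.

    Assumption: replicate values for the same drug are averaged. A curated
    database could equally keep a single preferred value instead.
    """
    rows = []
    for record in records:
        values = record.values_for(prop)
        if values:
            rows.append((record.name, record.smiles, sum(values) / len(values)))
    return rows


# Placeholder entries only; these are not real drugs or measured values.
records = [
    DrugRecord("placeholder-A", "CCO", measurements=[
        Measurement("logP", 1.0, "", "placeholder source"),
        Measurement("logP", 2.0, "", "placeholder source"),
        Measurement("half-life", 4.0, "h", "placeholder source"),
    ]),
    DrugRecord("placeholder-B", "CCN", measurements=[
        Measurement("half-life", 6.0, "h", "placeholder source"),
    ]),
]

print(count_by_property(records))
print(extract_data_set(records, "logP"))
```

Keeping each measurement as its own object with a `source` field means values from different publications can sit side by side and be traced back during curation, while `extract_data_set` illustrates how a separate per-property data set could be derived from the integrated records.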