Some Case Studies on Application of "$r_m^2$" Metrics for Judging Quality of Quantitative Structure–Activity Relationship Predictions: Emphasis on Scaling of Response Data

Kunal Roy,*[a] Pratim Chakraborty,[b] Indrani Mitra,[a] Probir Kumar Ojha,[a] Supratik Kar,[a] and Rudra Narayan Das[a]

Quantitative structure–activity relationship (QSAR) techniques have found wide application in drug design, property modeling, and toxicity prediction of untested chemicals. Rigorous validation of the developed models is key to their successful application in predictions for new compounds. The $r_m^2$ metrics introduced by Roy et al. have been used extensively by different research groups for validation of regression-based QSAR models. This concept is advanced further here by introducing scaling of the response data prior to computation of $r_m^2$. In addition, a web application (accessible from http://aptsoftware.co.in/rmsquare/ and http://203.200.173.43:8080/rmsquare/) for calculation of the $r_m^2$ metrics is introduced. The present study reports that the web application can be easily used for computation of the $r_m^2$ metrics, provided that observed and QSAR-predicted data for a set of compounds are available. Further, scaling of the response data is recommended prior to $r_m^2$ calculation. © 2013 Wiley Periodicals, Inc.

DOI: 10.1002/jcc.23231

Introduction

A small modification in molecular structure within a congeneric series of molecules may result in an extensive variation in biological activity, as seen with the stereoisomers quinine and quinidine: the levorotatory quinine shows antimalarial activity, whereas the dextrorotatory quinidine shows antiarrhythmic activity. A similar observation is seen with the R and S forms of warfarin, the S isomer being more active than the R form.
[1] This kind of relationship between molecular structure and changes in biological activity is the central focus of the field of quantitative structure–activity relationships (QSAR). The underlying premise is that molecules with similar structures possess similar bioactivities toward similar proteins, receptors, or enzymes, and that changes in structure are reflected in changes in bioactivity. [2]

There are several QSAR methods to assist the design of compounds for medical or agricultural use. The relationship between the property of a compound and its structural features was first noticed in the 1930s by Hammett [3] and was studied further by Hansch et al. [4] in the mid-1960s. Later, the concept of molecular shape analysis [5] made it possible to incorporate the three-dimensional (3D) aspects of a compound into the QSAR model, describing 3D structure–activity relationships in a quantitative manner. The popularity of commercial programs such as comparative molecular field analysis (CoMFA) [6] and CATALYST [7] has limited both the evaluation and the use of other 3D-QSAR methodologies. Well-known issues associated with CoMFA and CATALYST have often come to be viewed as shortcomings that are simply accepted as working limitations of a 3D-QSAR analysis.

Over the years, many methods, algorithms, and techniques have been developed and applied in QSAR studies. The construction of a QSAR model follows five prime steps: [2] (i) selection of a dataset with a series of known response data, (ii) calculation of descriptors, (iii) splitting of the dataset into training and test sets for model development and subsequent validation, (iv) construction of models using different chemometric tools, and (v) validation of the developed model based on internal and external validation statistics.
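The five steps above can be sketched in a few lines of code. What follows is only an illustrative sketch under stated assumptions, not the authors' implementation or their web application. The dataset is synthetic, a single hypothetical descriptor stands in for step (ii), and ordinary least squares stands in for the chemometric tool of step (iv). The external statistic uses the commonly quoted form $r_m^2 = r^2 (1 - \sqrt{|r^2 - r_0^2|})$, with $r_0^2$ taken from a regression through the origin. The min-max scaling to the observed training-set range is an assumed form of the response scaling, since this section does not state the formula. All function and variable names are hypothetical.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r2_fit(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def min_max_scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1 (assumed scaling)."""
    return [(v - lo) / (hi - lo) for v in values]


def rm2(obs, pred):
    """Assumed form: r^2 * (1 - sqrt(|r^2 - r0^2|)), observed on the y-axis."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    soo = sum((o - mean_o) ** 2 for o in obs)
    spp = sum((p - mean_p) ** 2 for p in pred)
    sop = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    r2 = sop ** 2 / (soo * spp)  # squared Pearson correlation
    # Slope of the regression of observed on predicted through the origin.
    k = sum(o * p for o, p in zip(obs, pred)) / sum(p * p for p in pred)
    r02 = 1.0 - sum((o - k * p) ** 2 for o, p in zip(obs, pred)) / soo
    return r2 * (1.0 - math.sqrt(abs(r2 - r02)))


random.seed(0)
n_compounds = 30

# (ii) Descriptor values; in practice these are computed from structures.
descriptor = [random.uniform(0.0, 5.0) for _ in range(n_compounds)]
# (i) Known responses, simulated here as a noisy linear function.
response = [4.0 + 0.8 * d + random.gauss(0.0, 0.2) for d in descriptor]

# (iii) Random split into a training set (22 compounds) and a test set (8).
indices = list(range(n_compounds))
random.shuffle(indices)
train_idx, test_idx = indices[:22], indices[22:]
x_train = [descriptor[i] for i in train_idx]
y_train = [response[i] for i in train_idx]
x_test = [descriptor[i] for i in test_idx]
y_test = [response[i] for i in test_idx]

# (iv) Model construction on the training set only.
a, b = fit_line(x_train, y_train)
pred_train = [a + b * x for x in x_train]
pred_test = [a + b * x for x in x_test]

# (v) Validation: an internal fit statistic, then r_m^2 on the test set
# after scaling observed and predicted values to the training-set range.
r2_train = r2_fit(y_train, pred_train)
lo, hi = min(y_train), max(y_train)
rm2_test_scaled = rm2(min_max_scale(y_test, lo, hi),
                      min_max_scale(pred_test, lo, hi))

print("training R^2:", round(r2_train, 3))
print("test r_m^2 (scaled):", round(rm2_test_scaled, 3))
```

One reason scaling can matter in such a sketch: the Pearson $r^2$ is unchanged by shifting or rescaling the response, whereas a regression through the origin depends on where the origin lies, so $r_0^2$ and hence $r_m^2$ change with the units and offset of the data.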
Additionally, the development of 3D-QSAR models includes two more steps for successful execution: conformational analysis of the molecules and their alignment with respect to the most active compound. [8]

In general, the question is: given a QSAR model and a query compound, can the model be relied on to provide an accurate prediction? The answer lies in the predictive quality of the developed model, which is determined from different validation statistics. Thus, validation of QSAR models plays the most crucial role in defining the applicability of a QSAR model for the prediction of designed molecules. [9] Initially, verification of the correlation between the chemical features of the molecules and their biological activity was of prime interest during the development of a QSAR model. Later, the focus gradually shifted toward the predictive power of the model rather than simply unveiling the quantitative relationships. Subsequently, a series of statistical tools has been introduced over the last few years to determine the predictive ability of a model before its application in the prediction of untested molecules. [10]

[a] K. Roy, I. Mitra, P. K. Ojha, S. Kar, R. N. Das
Drug Theoretics and Cheminformatics Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India
E-mails: kunalroy_in@yahoo.com; kroy@pharma.jdvu.ac.in
[b] P. Chakraborty
Apt Software Avenues Pvt Ltd, Unit G301, Block DC, City Centre, Sector 1, Salt Lake, Kolkata 700 064, India

Journal of Computational Chemistry 2013, 34, 1071–1082