Quantitative Structure-Property Relationship Study to Predict Speed of Sound in Diverse Organic Solvents from Solvent Structural Information

Bahram Hemmateenejad* and Poorandokht Ilani-kashkouli

Chemistry Department, Shiraz University, Shiraz, Iran

Supporting Information

ABSTRACT: The interaction of solvents with ultrasonic waves is of great importance and has been the subject of many studies in recent years. In this study, the effect of solvent structural parameters on the speed of sound in chemical solvents was investigated through a quantitative structure-property relationship (QSPR). Genetic algorithm-multiple linear regression (GA-MLR) analysis was employed to select the most relevant subset of descriptors and then to develop the model. The validity of the resulting 10-parameter model was assessed by the most widely used validation techniques. The predictive power of the model was evaluated by use of an external data set. The high accuracy of the results supported the validity of the model. According to the model, solvents that have stronger solvent-solvent interactions provide a more suitable medium for the propagation of sound waves and therefore exhibit higher speeds of sound.

1. INTRODUCTION

Ultrasound is a type of energy that can help analytical chemists in almost all their laboratory tasks, from cleaning to detection. Ultrasonic techniques have been employed to investigate the properties of substances and to understand the nature of molecular interactions in pure liquids, 1 liquid mixtures, 2 and ionic interactions in electrolytic solutions. 3 The speed of sound is one of the physical properties that help one to understand the nature of the liquid state. Moreover, its measurement in the liquid state gives information about the physicochemical behavior of liquid mixtures, such as molecular association and dissociation. 4,5 There are also some literature reports concerning the prediction of the speed of sound in solvents using physically derived models.
6-10 The major drawback of most previous models is that they were proposed for limited solvent systems. In some previous models, the procedure for calculating the parameters is complicated. Thus, there is a need for simple models that are able to predict the speed of sound in a broad range of solvents.

Quantitative structure-property relationship (QSPR) analysis is one of the most widely used methods for developing molecular-based models of various physicochemical properties of materials. 11-19 QSPR analysis is now a well-established and highly respected technique to correlate diverse simple and complex physicochemical properties of a compound with its molecular structure, through a variety of descriptors. The basic strategy of QSPR analysis is to find optimum quantitative relationships derived from molecular structures that can be used to predict the properties. 20-22

Multiple linear regression (MLR) and partial least-squares (PLS) regression are two commonly used linear regression methods in QSPR studies. 23-25 While MLR produces models that are easier to interpret, the models' performance is strongly affected by the presence of collinear predictor variables, especially when the number of samples is not large compared to the number of predictors. This leads to chance-correlated or overfitted models. 26,27 On the other hand, PLS regression, because of its capability to optimize the model's complexity (by projecting the data into a reduced-dimension space spanned by the PLS components), can model data sets in the presence of collinear variables or even when the number of variables is higher than the number of samples. Nevertheless, when an appropriate subset of descriptors is selected (a number of descriptors lower than 1/5 of the number of molecules, with a low degree of collinearity), MLR is preferred over PLS for its computational simplicity and the easier interpretability of its models.
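The two conditions stated above for preferring MLR (fewer descriptors than 1/5 of the number of molecules, and low collinearity among them) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `subset_is_acceptable` and `fit_mlr`, the pairwise-correlation cutoff of 0.9, and the synthetic data are all assumptions made for illustration, since the text does not quantify "low degree of collinearity". The genetic-algorithm search itself is not shown here.

```python
import numpy as np


def subset_is_acceptable(X, max_ratio=0.2, max_abs_corr=0.9):
    """Check a candidate descriptor subset against two rules of thumb.

    X is an (n_molecules, n_descriptors) array. The 1/5 ratio comes from
    the text; the 0.9 correlation cutoff is an assumed, illustrative value.
    """
    n_samples, n_desc = X.shape
    # Rule 1: number of descriptors must be lower than 1/5 of the molecules.
    if n_desc >= max_ratio * n_samples:
        return False
    # Rule 2: no pair of descriptors may be strongly correlated.
    if n_desc > 1:
        corr = np.corrcoef(X, rowvar=False)
        off_diagonal = np.abs(corr[~np.eye(n_desc, dtype=bool)])
        if off_diagonal.max() > max_abs_corr:
            return False
    return True


def fit_mlr(X, y):
    """Ordinary least-squares MLR with an intercept; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])  # first column models the intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    r2 = 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)
    return coef, r2


# Synthetic stand-in data (not the solvent data set of this study).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + 0.01 * rng.normal(size=60)

# Appending a near-duplicate of the first descriptor creates collinearity.
X_collinear = np.column_stack([X, 1.001 * X[:, 0]])

if subset_is_acceptable(X):
    coef, r2 = fit_mlr(X, y)
```

In a GA-MLR workflow, a check of this kind would typically be applied to each candidate subset proposed by the genetic algorithm before the regression is fitted and scored.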
In this work, a quantitative structure-property relationship (QSPR) is developed to predict the speed of sound of pure solvents at ambient temperature and pressure. 20,28 For this purpose, multiple linear regression (MLR) 29 and a genetic algorithm (GA) 29,30 are implemented to study the effect of the solvents' structural invariants on the speed of sound.

Received: June 20, 2012. Revised: September 20, 2012. Accepted: October 12, 2012. Published: October 12, 2012. Ind. Eng. Chem. Res. 2012, 51, 14884-14891. dx.doi.org/10.1021/ie3016297. pubs.acs.org/IECR. © 2012 American Chemical Society.

2. COMPUTATIONS AND DATA ANALYSIS

2.1. Data Set. A database of experimental speeds of sound for 201 pure solvents at ambient temperature and pressure (298.15 K and 1 atm) was collected from 86 different references. The experimental uncertainties were also included in the database. The solvents' names, their corresponding experimental uncertainties, and the original reference for each data point are presented as Supporting Information (Table S1).

2.2. Descriptor Generation. Molecular descriptors are parameters that are calculated directly from the chemical structure of compounds by use of particular mathematical algorithms. A wide variety of descriptors have been reported for use in quantitative structure-activity relationship (QSAR)