Exhaustive QSPR Studies of a Large Diverse Set of Ionic Liquids: How Accurately Can
We Predict Melting Points?
Alexandre Varnek* and Natalia Kireeva
Laboratoire d’Infochimie, UMR 7551 CNRS, Université Louis Pasteur, 4, rue B. Pascal,
Strasbourg 67000, France
Igor V. Tetko
GSF- Institute for Bioinformatics, Neuherberg D-85764, Germany, and Institute of Bioorganic &
Petrochemistry, Kiev, Ukraine
Igor I. Baskin
Department of Chemistry, Moscow State University, Moscow 119992, Russia
Vitaly P. Solov’ev
Institute of Physical Chemistry, Russian Academy of Sciences, Leninskiy prospect 31a,
Moscow 119992, Russia
Received November 4, 2006
Several popular machine learning methods, namely Associative Neural Networks (ASNN), Support Vector Machines
(SVM), k Nearest Neighbors (kNN), a modified version of the partial least-squares analysis (PLSM),
backpropagation neural network (BPNN), and Multiple Linear Regression Analysis (MLR), implemented
in the ISIDA, NASAWIN, and VCCLAB software, have been used to perform QSPR modeling of the melting points
of a structurally diverse data set of 717 bromides of nitrogen-containing organic cations (FULL) including
126 pyridinium bromides (PYR), 384 imidazolium and benzoimidazolium bromides (IMZ), and 207 quaternary
ammonium bromides (QUAT). Several types of descriptors were tested: E-state indices, counts of atoms
determined for E-state atom types, molecular descriptors generated by the DRAGON program, and different
types of substructural molecular fragments. The predictive ability of the models was analyzed using a 5-fold
external cross-validation procedure in which every compound in the parent set was included in one of five
test sets. Among the 16 types of developed structure - melting point models, the nonlinear SVM, ASNN, and
BPNN techniques perform slightly better than the other methods. For the full set, the accuracy
of predictions does not significantly change as a function of the type of descriptors. For other sets, the
performance of descriptors varies as a function of method and data set used. The root-mean squared error
(RMSE) of prediction calculated on independent test sets is in the range of 37.5-46.4 °C (FULL),
26.2-34.8 °C (PYR), 38.8-45.9 °C (IMZ), and 34.2-49.3 °C (QUAT). The moderate accuracy of predictions can
be related to the quality of the experimental data used for obtaining the models as well as to the difficulty of
taking into account the structural features of ionic liquids in the solid state (polymorphic effects, eutectics,
glass formation).
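The 5-fold external cross-validation and RMSE described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the random fold assignment, the pooled RMSE, and the mean-value placeholder model are all assumptions made for illustration, and the source does not state how compounds were assigned to folds or whether the RMSE was pooled over folds.

```python
import math
import random


def five_fold_indices(n_items, n_folds=5, seed=0):
    """Shuffle item indices and deal them into n_folds disjoint test sets.

    Every index lands in exactly one test set, as in the procedure
    described in the abstract. The random assignment is an assumption.
    """
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    return [order[k::n_folds] for k in range(n_folds)]


def rmse(observed, predicted):
    """Root-mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def external_cv(x, y, fit, predict, n_folds=5, seed=0):
    """Predict each item from a model trained without its fold.

    `fit(x_train, y_train)` returns a model and `predict(model, x_item)`
    returns one prediction; both are hypothetical callables supplied by
    the caller. Returns one prediction per item, in the original order.
    """
    predictions = [None] * len(y)
    for test_idx in five_fold_indices(len(y), n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            predictions[i] = predict(model, x[i])
    return predictions


# Placeholder model (assumption): always predicts the training-set mean.
def fit_mean(x_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x_item):
    return model


if __name__ == "__main__":
    # Made-up descriptor vectors and melting points, not data from the paper.
    x = [[float(i)] for i in range(20)]
    y = [50.0 + 3.0 * i for i in range(20)]
    preds = external_cv(x, y, fit_mean, predict_mean)
    print("pooled external RMSE:", rmse(y, preds))
```

Any of the learning methods named above could be substituted for the placeholder by passing different `fit` and `predict` callables; the fold logic stays the same.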
1. INTRODUCTION
Ionic liquids (IL) have received great attention due to
their green and tuneable properties. Their negligible vapor
pressures allow for their potential use as an alternative to
volatile organic solvents.[1,2] Careful choice of the cation/anion
combination permits fabrication of IL with physical and
chemical properties well fitted to a specific problem. One
of the most important physical properties of IL, the melting point
(mp), has been the subject of numerous studies (see the book [3] and
references therein). The melting point, which characterizes the passage
from the solid to the liquid state, has a very complex relationship
with the structure of the constituent ions because of many
different factors.[4]
Thus, both in solid and liquid phases,
various types of interactions between ions should be taken
into account: electrostatic and van der Waals interactions,
hydrogen bonds, and aromatic π-π-stacking. The symmetry
and conformational flexibility of individual species play an
important role because they affect the crystal packing and,
hence, melting points. Another problem is related to the phase
content of the solids. Unlike high-melting salts, certain types
of IL (e.g., halides of imidazolium cations [5]) melt from eutectic
mixtures of several crystalline polymorphs. Usually, the
eutectic temperature is considerably lower than the melting
points of the individual polymorphs. One should also not exclude
the formation of glasses instead of crystalline phases, which is
quite typical of low-melting IL.[6] In this case, mp represents the
glass transition temperature, which is rather different from
the melting point of the corresponding crystalline state.
* Corresponding author e-mail: varnek@chimie.u-strasbg.fr; http://infochem.u-strasbg.fr.
J. Chem. Inf. Model. 2007, 47, 1111-1122
10.1021/ci600493x CCC: $37.00 © 2007 American Chemical Society
Published on Web 03/24/2007