Contents lists available at ScienceDirect

Journal of Controlled Release

journal homepage: www.elsevier.com/locate/jconrel

Predicting physical stability of solid dispersions by machine learning techniques

Run Han a,1, Hui Xiong b,1, Zhuyifan Ye a,1, Yilong Yang a, Tianhe Huang a, Qiufang Jing b, Jiahong Lu a, Hao Pan c, Fuzheng Ren b, Defang Ouyang a

a State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
b Engineering Research Centre of Pharmaceutical Process Chemistry, Ministry of Education; Shanghai Key Laboratory of New Drug Design; School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
c School of Pharmaceutical Science, Liaoning University, No.66 Chongshanzhong Road, Shenyang, Liaoning 110036, China

ARTICLE INFO

Keywords: Solid dispersion; Physical stability; Machine learning; Molecular modeling

ABSTRACT

Amorphous solid dispersion (SD) is an effective solubilization technique for water-insoluble drugs. However, the physical stability of solid dispersions still heavily hinders the development of this technique. Traditional stability experiments take at least three to six months, which is time-consuming and unpredictable. In this research, a novel prediction model for the physical stability of solid dispersion formulations was developed by machine learning techniques. A total of 646 stability data points were collected and described by over 20 molecular descriptors. All data were divided into a training set (60%), a validation set (20%), and a testing set (20%) by the improved maximum dissimilarity algorithm (MD-FIS). Eight machine learning approaches were compared, and the random forest (RF) model achieved the best prediction accuracy (82.5%). Moreover, the RF model revealed the contribution of each input parameter, which provided theoretical guidance for solid dispersion formulations.
Furthermore, the prediction model was confirmed by physical stability experiments on 17β-estradiol (ED)-PVP solid dispersions, and the molecular mechanism was investigated by molecular modeling. In conclusion, an intelligent model was developed for the prediction of the physical stability of solid dispersions, which benefits the rational formulation design of this technique. The integrated experimental, theoretical, modeling and data-driven AI methodology can also be used for the future formulation development of other dosage forms.

1. Introduction

Over 40% of drugs are water-insoluble, which is one of the most important issues in pharmaceutical research [1]. Oral drugs with low solubility tend to have a low dissolution rate and, in turn, low bioavailability [2,3]. Currently, many pharmaceutical techniques are used to improve drug solubility and dissolution rate and thereby increase the bioavailability of insoluble drugs, including micronization, nano-crystallization, solvent deposition, complexation, micellar solubilization, pH adjustment, self-emulsifying drug delivery systems, cocrystals, and solid dispersions [4–8]. In recent years, amorphous solid dispersion (SD) has attracted increasing attention due to its convenient preparation and superior solubilization effect. An SD can be described as drug molecules dispersed in a polymer matrix with a disordered structure, forming an amorphous supersaturated solution [9,10]. Amorphous systems are thermodynamically unstable, with a high tendency to undergo phase separation or recrystallization during storage [11,12]. Thus, the physical stability of solid dispersions has become the key issue hindering the commercialization of this technique [13]. Currently, evaluating the physical stability of SDs takes at least three to six months of trial-and-error experiments, which is time-consuming and unpredictable. If a formulation fails, the whole long cycle has to be repeated.
Moreover, the mechanism of physical stability of solid dispersions is still poorly understood [14]. In recent years, several theories of SD stability have been discussed, such as solubility parameters and the Tg prediction model [15]. However, these theoretical models require a large amount of physicochemical information on each component and considerable professional knowledge. In addition, the prediction capability of these models is quite limited, with uncontrolled error arising from their mathematical assumptions.

https://doi.org/10.1016/j.jconrel.2019.08.030
Received 22 March 2019; Received in revised form 19 August 2019; Accepted 26 August 2019
Corresponding authors. E-mail addresses: fzren@ecust.edu.cn (F. Ren), defangouyang@um.edu.mo (D. Ouyang).
1 These authors contributed equally to the manuscript.
Journal of Controlled Release 311–312 (2019) 16–25
Available online 26 August 2019
0168-3659/ © 2019 Elsevier B.V. All rights reserved.

Machine learning (ML) is a branch of artificial intelligence. ML algorithms are capable of “learning” and predicting the complex systems