Received: 18 February 2009; Revised: 8 June 2009; Accepted: 10 June 2009; Published online in Wiley InterScience: 28 July 2009

Bayesian regularization: application to calibration in NIR spectroscopy

C. E. Alciaturi a,b* and G. Quevedo b

The use of a Bayesian regularization algorithm is proposed for calibration in near-infrared (NIR) spectroscopy with linear models. The algorithm used in this work is based upon the concepts developed by MacKay for inference and model comparison in artificial neural networks. It is demonstrated that this algorithm is fast, easy to use, and shows good generalization properties without prior dimensionality reduction. Examples are shown for NIR spectroscopy calibration and for synthetic data. Copyright © 2009 John Wiley & Sons, Ltd.

Supporting information may be found in the online version of this paper.

Keywords: Bayesian regularization; linear models; calibration; near-infrared spectroscopy

1. INTRODUCTION

A problem frequently encountered in instrumental analysis is, given a matrix of independent variables X and a dependent variable y, to create a model that not only explains the relationship between them but also has good generalization properties (i.e., gives good responses for new cases). One of the most important areas of application where this problem arises is near-infrared (NIR) spectroscopy. An experiment typically produces more independent variables (absorbances at given wavelengths) than calibration measurements (samples). This is usually handled by variable selection or by dimensionality reduction, the latter being done by projection to a lower dimension in principal components regression (PCR) [1] or partial least squares (PLS) [1–12]. The latter algorithm is also known as Projection to Latent Structures. The number of latent variables is usually determined by cross-validation [1,2,10].
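As a concrete illustration of the conventional dimensionality-reduction approach described above (this example is not from the paper), the following NumPy sketch implements a bare-bones principal components regression on synthetic "large p, small n" data. The low-rank data-generating model and the choice of k = 5 components are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 200                      # "large p, small n": 200 wavelengths, 30 samples
T_lat = rng.normal(size=(n, 5))     # hidden latent structure
P = rng.normal(size=(5, p))
X = T_lat @ P + 0.01 * rng.normal(size=(n, p))           # spectra
y = T_lat @ np.array([1.0, -2.0, 0.5, 1.5, -1.0]) \
    + 0.05 * rng.normal(size=n)                          # property to calibrate

def pcr_fit(X, y, k):
    """Principal components regression: project centered X onto its first k
    principal directions, regress y on the scores, map coefficients back."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :k] * s[:k]                       # scores, shape (n, k)
    b = np.linalg.lstsq(T, yc, rcond=None)[0]  # least squares in reduced space
    coef = Vt[:k].T @ b                        # back to wavelength space, shape (p,)
    return coef, X.mean(axis=0), y.mean()

coef, x_mean, y_mean = pcr_fit(X, y, k=5)
pred = (X - x_mean) @ coef + y_mean
rmse = np.sqrt(np.mean((pred - y) ** 2))
```

Because the synthetic spectra are (nearly) rank 5, the first 5 components capture the signal and the calibration fit is close to the noise level, despite p being much larger than n.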
However, Bayesian statistics, which makes use of additional (prior) information, can improve the quality of models and help in model selection without appealing to cross-validation, so there is currently strong interest in its application in spectroscopy [13–15]. Bayesian statistics is a quantitative expression of the principle known as "Ockham's razor" ("Entities should not be multiplied without necessity", or "among competing hypotheses, favor the simplest one"). Complicated models may fit a given set of data exactly, but are not good predictors of new data [16]. Different approaches have been used in regression: one of the most popular is known as Markov chain Monte Carlo (MCMC) [13–15,17]; another is MacKay's "Bayesian backprop" [18–24]. Dimensionality reduction has usually been done by principal component analysis (PCA) [13–15,17]. Some software is available for Bayesian statistics [15]; however, user-friendly software still seems to be lacking for applications [13]. Bayesian backprop is the algorithm applied here. In this work it is shown that Bayesian regularization is applicable to "large p, small n" regression problems without the variable selection or dimensionality reduction used in previous work [15].

2. MACKAY'S BAYESIAN APPROACH FOR MODEL SELECTION

A Bayesian framework for model comparison and regularization was demonstrated by MacKay for neural networks [21]. The methods described are applicable not only to neural networks, but also to other linear or nonlinear models in regression, classification, or density estimation problems. Models that are "too complex" and/or "under-regularized" are penalized and inferred to be less probable. This approach (called "Bayesian backprop") selects regularized models that have been shown empirically to have good generalization properties. Different priors and basis sets are objectively compared by quantitatively evaluating the "evidence" for them.
The evidence measures how probable the model is, given the data. An effective algorithm, known as Gauss–Newton Bayesian Regularization (GNBR), was described by Foresee and Hagan and applied to the training of feed-forward neural networks [19]. That algorithm is the one implemented here, for the first time, for applications in spectroscopy with linear systems having many parameters (wavelengths); in this paper it is called "linear-GNBR".

3. THE GNBR (GAUSS–NEWTON BAYESIAN REGULARIZATION) ALGORITHM

The GNBR algorithm [19,20] is based upon MacKay's work on Bayesian interpolation [21,22]. This method constrains the magnitude of the network weights and improves generalization, as has been shown in applications [18]. An implementation of Bayesian regularization is found in the Neural Network Toolbox of Matlab [20]. In least-squares calculations, regression finds the parameters that give a minimum of the sum of squared errors. With regularization, the objective function becomes

F = αE_D + βE_W    (1)

* Correspondence to: C. E. Alciaturi, Instituto Zuliano de Investigaciones Tecnológicas (INZIT), Apartado Postal 331, Maracaibo, Venezuela. E-mail: alciaturi@gmail.com
a C. E. Alciaturi, Instituto Zuliano de Investigaciones Tecnológicas (INZIT), Apartado Postal 331, Maracaibo, Venezuela
b C. E. Alciaturi, G. Quevedo, Postgrado de Ingeniería, Universidad del Zulia (LUZ), Edificio Fobeca, Avenida Universidad, Maracaibo, Venezuela

Research Article, DOI: 10.1002/cem.1253 (www.interscience.wiley.com). J. Chemometrics 2009; 23: 562–568.
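For a linear model, the objective of Equation (1) has a closed-form minimizer. The sketch below (not from the paper) assumes MacKay's definitions, E_D = sum of squared errors and E_W = sum of squared weights, under which setting the gradient of F to zero gives ridge-like normal equations. Note that the full GNBR algorithm also re-estimates α and β from the data, which this fixed-hyperparameter sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 25, 100                      # more parameters (wavelengths) than samples
X = rng.normal(size=(n, p))
w_true = np.zeros(p)
w_true[:3] = [2.0, -1.0, 0.5]
y = X @ w_true + 0.05 * rng.normal(size=n)

def regularized_fit(X, y, alpha, beta):
    """Minimize F = alpha*E_D + beta*E_W with E_D = ||y - Xw||^2 and
    E_W = ||w||^2.  Setting dF/dw = 0 gives the normal equations
    (alpha*X^T X + beta*I) w = alpha*X^T y, solvable even when p > n
    because the beta*I term makes the system nonsingular."""
    p = X.shape[1]
    return np.linalg.solve(alpha * (X.T @ X) + beta * np.eye(p),
                           alpha * (X.T @ y))

w = regularized_fit(X, y, alpha=1.0, beta=0.1)
```

Because F is strictly convex for β > 0, the returned w is the unique global minimizer; the regularization term is what allows a stable solution without variable selection or projection to latent variables.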