Wheat flour characterization using NIR and spectral filter based on Ant Colony Optimization

Cassiano Ranzan a,b,*, Axel Strohm b, Lucas Ranzan a, Luciane F. Trierweiler a, Bernd Hitzmann b, Jorge O. Trierweiler a

a Intensification Modeling Simulation Control and Optimization of Process Group, GIMSCOP, Universidade Federal do Rio Grande do Sul, Chemical Engineering Department, 90040-040 Porto Alegre, RS, Brazil
b Universität Hohenheim, FG Prozessanalytik und Getreidetechnologie, Institut für Lebensmittelwissenschaft und Biotechnologie, Stuttgart, Germany

Article info

Article history:
Received 25 November 2013
Received in revised form 20 January 2014
Accepted 22 January 2014
Available online 31 January 2014

Keywords:
Chemometric modeling
On-line process monitoring
Near infrared reflectance
Flour characterization

Abstract

The key objective of process optimization is to obtain higher productivity and profit in chemical or bio-chemical processes. To achieve this, we must apply control techniques whose performance depends closely on our ability to characterize the process. Within this context, optical sensors associated with chemometric modeling are considered a natural choice due to their low response time as well as their non-intrusive nature and high sensitivity. Usually, chemometric modeling is based on PCR (Principal Component Regression) and PLS (Partial Least Squares). However, since optical techniques are highly sensitive and bio-chemical media are highly complex, these methodologies can be replaced by chemometric modeling based on Pure Spectra Components (PSCM). Our study applies PCR, PLS and PSCM to protein prediction in flour samples measured with near infrared reflectance (NIR), comparing the three methodologies for on-line sensor design. We also outline the development of a spectral filter based on PSCM associated with Ant Colony Optimization.
The results lead us to conclude that optical techniques work best when PSCM analysis is applied, as it allows the development of a spectral sensor for protein quantification in flour samples that evaluates fewer than twenty NIR wavelengths, selected from a total of 1150. The filtering tool showed favorable results in condensing relevant information from NIR spectral data, increasing the R² of sample prediction by almost 60% for PCR models and 40% for PLS models, using 10% and 20% of the full spectral data, confirming the viability of filtering methods.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

The ability to develop advanced control and optimization tools is intimately linked to the ability to measure the state variables [1,2]. Optical sensors are noninvasive and continuous, and they offer low response time and cost together with high sensitivity and resolution. More specifically, spectroscopic measurements such as fluorescence spectroscopy, near infrared reflectance (NIR), multivariate FT-IR spectroscopy, Raman spectroscopy, and others [1,3–6] allow us to detect several analytes simultaneously [7]. All these features make optical sensors one of the most promising tools to be applied in chemical and biochemical processes [1,8].

Spectral methods provide a very large amount of data that must be pre-processed to provide practical information for the user [9–11]. Therefore, mathematical modeling is required in order to effectively measure analyte concentrations and/or material properties. As defined by Varmuza and Filzmoser [12], chemometrics concerns the extraction of relevant information from chemical data with mathematical and statistical tools.
Successful methods to handle such data have been developed in the field of chemometrics: linear multivariate statistics such as multiple linear regression with factor analysis (FA-MLR), Stepwise Multiple Linear Regression (Stepwise MLR), Partial Least Squares (PLS), Genetic Function Algorithm (GFA), Genetic PLS (G/PLS), Principal Component Analysis (PCA) or Principal Component Regression (PCR), as well as non-linear tools, such as Artificial Neural Networks (ANN) [3–6,13,14]. The most applicable methods are PCA, PCR and PLS, which are useful for quantitative analysis of spectroscopic data [15,16]. These techniques are meant to provide a synthetic description of large data sets, allowing evaluations across the spectrum [17].

PCA is a powerful tool for data analysis, able to identify patterns in a data set and to express the data in a manner that highlights similarities and differences. Once patterns are found, the data set can be compressed without losing the main information. Several kinds of analyses use it to extract information related to physical and chemical properties from fluorescence matrices or for dimensionality reduction of fluorescence spectra in several systems [6,10,18–20].

* Corresponding author. E-mail addresses: cassiano@enq.ufrgs.br (C. Ranzan), Jorge@enq.ufrgs.br (J.O. Trierweiler).
Chemometrics and Intelligent Laboratory Systems 132 (2014) 133–140. Journal homepage: www.elsevier.com/locate/chemolab
0169-7439/$ – see front matter © 2014 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.chemolab.2014.01.012

PCR and PLS are commonly used with spectral data. After identifying the Principal Components, which account for most of the variance, these components can be used in regression. This method can transform