Empirically Minimax Affine Mineralogy Estimates from Fourier Transform Infrared Spectrometry Using a Decimated Wavelet Basis

PHILIP B. STARK,* MICHAEL M. HERRON, and ABIGAIL MATTESON
Department of Statistics, University of California, Berkeley, California 94720 (P.B.S.); and Schlumberger-Doll Research, Old Quarry Road, Ridgefield, Connecticut 06877 (M.M.H., A.M.)

The Fourier transform infrared (FT-IR) spectrum of a rock contains information about its constituent minerals. Using the wavelet transform, we roughly separate the mineralogical information in the FT-IR spectrum from the noise, using an extensive set of training data for which the true mineralogy is known. We ignore wavelet coefficients that vary too much among repeated measurements on rocks with the same mineralogy, since these are likely to reflect analytical noise. We also ignore those that vary too little across the entire training set, since they do not help to discriminate among minerals. We use the remaining wavelet coefficients as the data for the problem of estimating mineralogy from FT-IR data. For each mineral of interest, we construct an affine estimator $\hat{x}$ of the mass fraction $x$ of the mineral of the form $\hat{x} = a \cdot w + b$, where $a$ is a vector, $w$ is the vector of retained wavelet coefficients, and $b$ is a scalar. We find $a$ and $b$ by minimizing the maximum error of the estimator over the training set. When applying the estimator, we "truncate" to keep the estimated mineralogy between 0 and 1. The estimators typically perform better than weighted nonnegative least-squares.

Index Headings: Fourier transform infrared spectroscopy; Wavelets; Minimax estimation; Inverse problems.

INTRODUCTION

Infrared radiation can be used to excite characteristic vibrational modes of chemical compounds. The Fourier transform infrared (FT-IR) absorbance spectrum of a rock thus contains information about the mineralogy of the rock.
¹ Figure 1 shows the FT-IR absorbance spectra of three minerals (quartz, kaolinite, and calcite) representing three major classes of sedimentary minerals: silicates, clays, and carbonates, respectively.

In theory, the spectrum of a mixture of minerals is a linear combination of the spectra of the individual minerals (Beer's law). Thus, in principle, finding the mineralogical composition of a rock from its FT-IR spectrum and the FT-IR spectra of a set of mineral standards should be straightforward. The large differences among the spectra in Fig. 1 support this possibility.

Received 12 April 1993.
* Author to whom correspondence should be sent.

In practice, three complications arise. The first of these is "noise" from variability in the analytical procedure and measurement errors. The second is that the samples are contaminated by unpredictable quantities of water adsorbed from the atmosphere and of organic solvents used to prepare the KBr pellet carrier. The effect of this variability on mineralogy estimates is amplified by the third complication: many minerals have nearly identical FT-IR spectra. Figure 2 shows the spectra of calcite, dolomite, and siderite. The spectra are quite similar; the dominant feature in all three is the peak at about 1435 cm⁻¹, which is the C-O stretch common to carbonate minerals. The features that allow one to distinguish among these three spectra are the smaller peaks near 875 and 711 cm⁻¹ and below 500 cm⁻¹, where the absorbance is less than 30% of its value in the larger peak. The near collinearity of the spectra of different minerals impairs our ability to estimate their mass fractions.² In the presence of observational noise, one can trade off the fraction of one of these minerals against that of another without significantly affecting the fit to the spectral data.
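The trade-off described above can be illustrated numerically. The following is a minimal sketch, not the authors' data or procedure: the two "carbonate" spectra are synthetic Gaussian-peak stand-ins (the names `carbonate_a` and `carbonate_b`, the peak widths, heights, and shifted positions are all assumptions chosen for illustration), sharing a dominant peak at 1435 cm⁻¹ and differing only in small secondary peaks. Mixtures are formed by Beer's law as linear combinations of the standards.

```python
import math

# Wavenumber grid: 3601 points from 4000 down to 400 cm^-1, as in the text.
wavenumbers = [4000 - i for i in range(3601)]

def gaussian_peak(center, width, height):
    """Synthetic absorbance peak (an illustrative stand-in, not measured data)."""
    return [height * math.exp(-0.5 * ((v - center) / width) ** 2)
            for v in wavenumbers]

def add(*spectra):
    """Pointwise sum of several spectra."""
    return [sum(values) for values in zip(*spectra)]

def mix(fractions, standards):
    """Beer's law: the mixture spectrum is a linear combination of standards."""
    return [sum(f * s[i] for f, s in zip(fractions, standards))
            for i in range(len(wavenumbers))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

# Two hypothetical carbonate-like standards: same dominant peak near 1435,
# small secondary peaks (under 30% of the main peak) at slightly different places.
carbonate_a = add(gaussian_peak(1435, 40, 1.0),
                  gaussian_peak(875, 10, 0.25),
                  gaussian_peak(711, 10, 0.10))
carbonate_b = add(gaussian_peak(1435, 40, 1.0),
                  gaussian_peak(880, 10, 0.25),
                  gaussian_peak(728, 10, 0.10))
standards = [carbonate_a, carbonate_b]

# Near collinearity: the cosine of the angle between the two standards.
cosine = dot(carbonate_a, carbonate_b) / (norm(carbonate_a) * norm(carbonate_b))

# Trade one mineral against the other while keeping the total fixed.
mixture_1 = mix([0.30, 0.10], standards)
mixture_2 = mix([0.20, 0.20], standards)
difference = [p - q for p, q in zip(mixture_1, mixture_2)]
relative_change = norm(difference) / norm(mixture_1)

print(f"cosine between standards: {cosine:.4f}")
print(f"relative change in mixture spectrum: {relative_change:.4f}")
```

With these assumed shapes, moving a tenth of the sample mass from one mineral to the other changes the mixture spectrum by only a few percent of its norm, so modest observational noise could mask the difference.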
Furthermore, partly because of these trade-offs, least-squares fitting of mineralogy to spectral data often produces mineralogy estimates that are negative; nonnegative least-squares remedies this. Both methods rely on the linearity of the relation between the spectrum and the amount of each mineral present, and finding a weight matrix to account for the noise covariance across the spectrum is nontrivial.

Fortunately, in FT-IR mineralogy estimation we can construct training data sets in which the true mineralogy mixtures are known. The FT-IR spectrum has 3601 points at wavenumbers of 4000 to 400 cm⁻¹; from these measurements we are interested in estimating the mass fractions of 14 nominally different minerals (some chemically distinct minerals are nominally equivalent; see below). Even if we restrict attention to linear estimators of mineralogy, i.e., estimators that take the vector dot product of the spectrum with a fixed vector to estimate the quantity of a given mineral (the least-squares estimate has this form), we would need more than 3601 training mixtures to determine the vector. It would be desirable to have even more, so that the elements would be (formally) overdetermined. Constructing thousands of training mixtures is currently prohibitively expensive, and still would not necessarily give reliable estimates.

It is an empirical fact that many real-world signals have parsimonious wavelet expressions. FT-IR spectra look like superpositions of wavelets, and some of the visual diagnostics of FT-IR spectra are small, narrow "blips" on top of larger, broader peaks. Similarly, some sources of "noise" (e.g., contamination by organics or water) have broad, smooth signatures in the FT-IR spectrum. Wavelet decompositions are well suited to extracting such features. The approach we adopt here is to separate signal from noise in a crude way using wavelets, discarding wavelet coefficients that seem to be dominated by noise.
This procedure reduces the number of "data" so that no linear or affine estimator can perfectly predict the mineralogies

1820 Volume 47, Number 11, 1993 0003-7028/93/4711-1820$2.00/0 APPLIED SPECTROSCOPY © 1993 Society for Applied Spectroscopy