Facies recognition using a smoothing process through Fast Independent Component Analysis and Discrete Cosine Transform

Alexandre Cruz Sanchetta a,*, Emilson Pereira Leite b, Bruno César Zanardo Honório b

a Rua Mendeleiev, s/n, Cidade Universitária “Zeferino Vaz”, Barão Geraldo, Campinas, São Paulo 13083-970, Brasil
b Rua João Pandiá Calógeras, 51, Cidade Universitária “Zeferino Vaz”, Barão Geraldo, Campinas, São Paulo 13983-970, Brasil

Article info

Article history:
Received 6 June 2012
Received in revised form 14 March 2013
Accepted 25 March 2013
Available online 3 April 2013

Keywords:
Discrete Cosine Transform
Independent Component Analysis
Automatic classification
Reservoir characterization

Abstract

We propose a preprocessing methodology for well-log geophysical data based on Fast Independent Component Analysis (FastICA) and the Discrete Cosine Transform (DCT), in order to improve the success rate of the K-NN automatic classifier. K-NN has been commonly applied to facies recognition in well-log geophysical data for hydrocarbon reservoir modeling and characterization. The preprocessing was performed at two levels. At the first level, a FastICA-based dimension reduction was applied, retaining much of the information, and its results were classified. At the second level, FastICA and DCT were applied for smoothing: the data points are modified so that the distances between individual points are reduced, keeping only the essential information. The results were compared to identify the best classification cases. We applied the proposed methodology to well-log data from a petroleum field in the Campos Basin, Brazil. Sonic, gamma-ray, density, neutron porosity and deep induction logs were preprocessed with FastICA and DCT, and the product was classified with K-NN. The success rates in recognition were calculated by applying the method to log intervals where core data were available.
The results were compared to those of automatic recognition of the original well-log data set with and without the removal of high-frequency noise. We conclude that the application of the proposed methodology significantly improves the success rate of facies recognition by K-NN.

© 2013 Elsevier Ltd. All rights reserved.

1. Introduction

Well-log data have been used in many areas of geological and geophysical data analysis, such as reservoir characterization, where models of subsurface properties are constructed that take into account details of the rock physics and of the fluids contained in the rocks (Avseth et al., 2005; Coconi-Morales et al., 2010; Doyen, 2007; Dubrule, 1994). Another example is the use of well-log data to predict seismic parameters related to Amplitude vs. Offset (AVO) data, such as Vp, Vs and Poisson's ratio (σ), among others, that aid in the comprehension of reservoirs (Rutherford and Williams, 1989). Well-log data have also been widely used in structural and stratigraphic mapping.

In order to connect well-log data with other geological or geophysical information, it is important to correlate them with lithofacies described from core samples. However, such direct information is often not available for the entire length of the wells, mainly because of financial restrictions. Therefore, pattern recognition methods must be applied for prediction or classification.

To mention a few examples of the application of pattern recognition methods to well-log and other geoscientific data, Grana et al. (2012) constructed a complete statistical workflow for obtaining petrophysical properties at the well location and the corresponding facies classification; Messina and Langer (2011) implemented unsupervised algorithms based on self-organizing maps and cluster analysis to analyze and interpret volcanic tremor data; Turlapaty et al.
(2010) proposed a method combining wavelet-based feature extraction and one-class support vector machines to analyze satellite remote sensing data applied to soil moisture and vegetation mapping; and Rosati and Cardarelli (1997) applied texture features based on gray-tone spatial dependence matrices to classify patterns observed on magnetic anomaly maps.

In fact, all types of data can be separated into several subsets, where each data element in any given subset shares some information with the other elements of that subset. In other words, it is possible to classify all elements based on common characteristics identified in the data set. This is the basic principle of automatic classification (MacQueen, 1967). The several approaches to automatic classification can be divided into two major groups: supervised methods and unsupervised methods (Duda and Hart, 1973; Mitchell, 1997; Schuerman, 1996).

Computers & Geosciences 57 (2013) 175–182
http://dx.doi.org/10.1016/j.cageo.2013.03.021

* Correspondence to: Rua Doutor Geraldo de Campos Freire, 567, Barão Geraldo, Campinas, São Paulo 13083-480, Brasil. Tel.: +55 19 3287 6618 (residence), mobile: +55 19 8811 9176.
E-mail addresses: alexandr@dep.fem.unicamp.br (A.C. Sanchetta), emilson@ige.unicamp.br (E.P. Leite), brunohonorio@gmail.com (B.C.Z. Honório).
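
As a rough illustration of two ingredients named in the abstract, DCT-based smoothing and K-NN classification, the following is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that smoothing means zeroing the high-order DCT coefficients, which is one common choice, and it leaves out the FastICA step entirely. All function names, the synthetic trace and the facies labels are hypothetical.

```python
import math
from collections import Counter


def dct2(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def dct_smooth(x, keep):
    """Keep the first `keep` DCT coefficients, zero the rest, transform back.

    Assumption: a low-pass truncation stands in for the smoothing step.
    """
    c = dct2(x)
    c = [v if k < keep else 0.0 for k, v in enumerate(c)]
    return idct2(c)


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda j: math.dist(train_x[j], query))
    votes = Counter(train_y[j] for j in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical single-log trace: two levels plus alternating noise.
    raw = [(1.0 if i < 8 else 3.0) + 0.4 * (-1) ** i for i in range(16)]
    smooth = dct_smooth(raw, keep=4)
    # One-feature training set with made-up facies labels.
    train_x = [(v,) for v in smooth]
    train_y = ["facies A"] * 8 + ["facies B"] * 8
    print(knn_predict(train_x, train_y, (2.8,), k=3))
```

In this sketch, `keep` controls how aggressively the trace is smoothed. Smaller values discard more high-frequency content, which pulls samples from the same interval closer together before the distance-based vote in `knn_predict`.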