Computers & Geosciences
journal homepage: www.elsevier.com/locate/cageo

Case study

Quantitative thickness prediction of tectonically deformed coal using Extreme Learning Machine and Principal Component Analysis: a case study

Xin Wang a, Yan Li b, Tongjun Chen c,e,*, Qiuyan Yan a, Li Ma d

a School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu, China
b School of Agricultural, Computational and Environmental Sciences, University of Southern Queensland, Toowoomba, Queensland, Australia
c School of Resource and Earth Science, China University of Mining and Technology, Xuzhou, Jiangsu, China
d Key Laboratory of Coal Resources Exploration and Comprehensive Utilization, Ministry of Land and Resources, Xi'an, Shaanxi, China
e Key Laboratory of CBM Resource & Reservoir Formation Process, Ministry of Education, Xuzhou, Jiangsu, China

ARTICLE INFO

Keywords: Thickness prediction; Tectonically deformed coal; Extreme learning machine; Seismic attribute; Principal component analysis; Cross validation

ABSTRACT

The thickness of tectonically deformed coal (TDC) is positively correlated with gas outbursts. To predict the TDC thickness of coal beds, we propose a new quantitative prediction method using an extreme learning machine (ELM) algorithm, a principal component analysis (PCA) algorithm, and seismic attributes. First, we build an ELM prediction model using the PCA attributes of a synthetic seismic section. The results suggest that the ELM model can produce a reliable and accurate prediction of the TDC thickness for synthetic data, with a Sigmoid activation function and 20 hidden nodes as the preferred configuration. Then, we analyze the applicability of the ELM model to TDC thickness prediction with real application data. Through the cross validation of near-well traces, the results suggest that the ELM model can produce a reliable and accurate prediction of the TDC.
After that, we use 250 near-well traces from 10 wells to build an ELM prediction model and use it to forecast the TDC thickness of the No. 15 coal in the study area, with the PCA attributes as the inputs. A comparison of the predicted results shows that the trained ELM model with two selected PCA attributes yields better prediction results than the other attribute combinations. Finally, the ELM model trained with real seismic data has a different number of hidden nodes (10) than the ELM model trained with synthetic seismic data. In summary, it is feasible to use an ELM model to predict the TDC thickness using the calculated PCA attributes as the inputs. However, the input attributes, the activation function, and the number of hidden nodes in the ELM model should be selected and tested carefully for each individual application.

1. Introduction

Tectonically deformed coal (TDC) is coal whose composition has been physically and chemically deformed by tectonic stress during earlier geological periods (Cao et al., 2003; Frodsham and Gayer, 1999). Current research shows that the occurrence of gas outbursts is directly associated with the TDC. The thicker the TDC, the higher the probability of gas outbursts (Cao et al., 2003; Xue et al., 2012). Mining areas of unpredicted thick TDC exposes miners to very high risk (Hackley and Martinez, 2007; Ju and Li, 2009; Li et al., 2003; Pan et al., 2012, 2015). If the TDC thickness can be predicted quantitatively and accurately, safe coal mining would be easier to achieve. Currently, most published research is qualitative and focuses on predicting the distribution of the TDC and characterizing it seismically.

The Extreme Learning Machine (ELM) method, proposed by Huang et al. (2006), is an improvement on single-hidden-layer feed-forward neural networks (SLFNs).
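As a rough illustration of the idea, a basic ELM regressor can be sketched in a few lines of NumPy: the input weights and biases of the single hidden layer are drawn at random and left untrained, and only the output weights are computed, as a least-squares solution via the Moore-Penrose pseudoinverse. This is a minimal sketch and not the authors' implementation. The function names (`elm_fit`, `elm_predict`), the uniform range of the random weights, and the toy data are assumptions made for illustration; the Sigmoid activation, 20 hidden nodes, and 250 samples simply echo the figures quoted in the abstract.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def elm_fit(X, y, n_hidden=20, seed=0):
    """Fit a basic ELM regressor and return (W, b, beta)."""
    rng = np.random.default_rng(seed)
    # Input weights and biases are random and never trained.
    # The [-1, 1] range is an assumption of this sketch.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = sigmoid(X @ W + b)  # hidden-layer output matrix, shape (n_samples, n_hidden)
    # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta


def elm_predict(X, W, b, beta):
    return sigmoid(X @ W + b) @ beta


# Toy data only: two made-up input attributes and a smooth made-up target.
rng = np.random.default_rng(42)
X_train = rng.uniform(-1.0, 1.0, size=(250, 2))
y_train = np.sin(X_train[:, 0]) + 0.5 * X_train[:, 1] ** 2

W, b, beta = elm_fit(X_train, y_train, n_hidden=20)
rmse = np.sqrt(np.mean((elm_predict(X_train, W, b, beta) - y_train) ** 2))
print(f"training RMSE: {rmse:.4f}")
```

Because the only fitted quantity is `beta`, training reduces to a single linear solve, with no iterative weight updates.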
The learning speed of the ELM can be thousands of times faster than that of traditional learning algorithms, such as artificial neural networks (ANNs), while obtaining better generalization performance (Huang, 2014). In addition, the ELM has many other advantages, such as ease of implementation, fast convergence to the smallest training error, small weight norms, and good generalization performance (Huang et al., 2006). Therefore, it has been widely used in regression, multiclass classification, data analysis of non-linear time series, environmental data analysis, streamflow water level forecasting, and pattern recognition (Benoît et al., 2013; Butcher et al., 2013; De Lima et al., 2016; Deo and Sahin, 2016; Leuenberger and Kanevski, 2015; Yang and Zhang, 2016).

http://dx.doi.org/10.1016/j.cageo.2017.02.001
Received 26 July 2016; Received in revised form 28 January 2017; Accepted 1 February 2017; Available online 03 February 2017
* Corresponding author at: School of Resource and Earth Science, China University of Mining and Technology, Xuzhou, Jiangsu, China.
E-mail addresses: wxgrin@163.com (X. Wang), Yan.Li@usq.edu.au (Y. Li), tjchen@cumt.edu.cn (T. Chen), mary248@163.com (L. Ma).
Computers & Geosciences 101 (2017) 38–47
0098-3004/ © 2017 Elsevier Ltd. All rights reserved.

Seismic surveying is a main, reliable method for forecasting the characteristics of coal beds. The seismic data most used in coal bed characterization are seismic attributes, which are values derived mathematically or geometrically from coal bed reflections (Chopra and Marfurt, 2007; Ge et al.,