Article

Spatial extreme learning machines: An application on prediction of disease counts

Marcos O Prates

Abstract
Extreme learning machines have gained considerable attention from the machine learning community because of their interesting properties and computational advantages. With the growing volume of data collected nowadays, many data sources have missing information, making statistical analysis harder or unfeasible. In this paper, we present a new model, coined the spatial extreme learning machine, that combines spatial modeling with extreme learning machines, keeping the desirable properties of both methodologies and making the model flexible and robust. As explained throughout the text, spatial extreme learning machines have many advantages over traditional extreme learning machines. Through a simulation study and a real data analysis, we show how the spatial extreme learning machine can be used to improve imputation of missing data and estimation of prediction uncertainty.

Keywords
Bayesian method, extreme learning machines, integrated nested Laplace approximation, missing data, spatial modeling

1 Introduction

Artificial neural networks (ANNs) are nonlinear structures inspired by the functioning of the human brain: they receive stimuli, compile these stimuli, and transmit a response based on learning. The method is represented by neurons, layers, and synapses and involves several techniques for the treatment and definition of parameters. When the relationship between observations and covariates is complex, ANNs are an appropriate tool to learn the underlying information by creating a system that captures the patterns available in the data. For more details, Prieto et al.1 provide a comprehensive overview of ANN applications and capabilities.
Specifically, feedforward neural networks (FNNs) have been shown to be efficient at finding solutions to problems with complex nonlinear mappings between inputs and response and also provide alternative models for phenomena that are hard to handle with parametric techniques. Multi-layer networks have been used to model complex data; however, it has been shown in theory that single-layer feedforward neural networks (SLFNNs) can approximate any continuous function.2 Despite the success and applicability of SLFNNs, it is well known that (1) the traditional backpropagation algorithm can stop at local minima, providing undesired results; (2) the training algorithm can overfit the network to the data; and (3) gradient-based learning is computationally costly in most applications.3 To overcome some of these limitations, extreme learning machines (ELMs)3 were proposed as a much more computationally efficient alternative for training SLFNNs, providing results as good as those of traditional SLFNNs. Huang et al.3 showed that it is not necessary to estimate all parameters of a SLFNN; instead, the hidden-layer linear coefficients can be randomly chosen without losing predictive capacity (generalization performance). This property is essential in making ELMs extremely efficient to fit, and it guarantees that ELMs overcome some of the drawbacks of traditional SLFNNs. Comparisons between ELMs and a variety of machine learning methods have been performed to check their generalization capabilities.4-6 Recently, Lin et al.7 showed that ELMs still suffer from generalization problems and that an l2 regularization can improve the generalization capability of the method. For a detailed review of ELMs, see Huang et al.8

Department of Statistics, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil

Corresponding author:
Marcos O Prates, Department of Statistics, Universidade Federal de Minas Gerais, Av.
Antônio Carlos 6627, Belo Horizonte, Minas Gerais 31310-240, Brazil. Email: marcosop@est.ufmg.br

Statistical Methods in Medical Research 0(0) 1-12. © The Author(s) 2018. DOI: 10.1177/0962280218767985. journals.sagepub.com/home/smm
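To make concrete the ELM idea described in the introduction (random, untrained hidden-layer weights with output weights obtained by least squares), the following is a minimal sketch in Python/NumPy. It is an illustration only, not the authors' implementation: the tanh activation, the Gaussian draws for the random weights, and the toy sine-regression data are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

def elm_fit(X, y, n_hidden=50, rng=rng):
    """Fit an ELM sketch: random hidden layer, least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights (never trained)
    b = rng.normal(size=n_hidden)                # random hidden-unit biases
    H = np.tanh(X @ W + b)                       # hidden-layer design matrix
    beta = np.linalg.pinv(H) @ y                 # output weights via Moore-Penrose pseudoinverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Predict responses for new inputs using the fitted ELM."""
    return np.tanh(X @ W + b) @ beta

# Hypothetical toy example: learn y = sin(x) on [0, 2*pi]
X = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
y = np.sin(X).ravel()
W, b, beta = elm_fit(X, y)
rmse = np.sqrt(np.mean((elm_predict(X, W, b, beta) - y) ** 2))
```

Because only `beta` is estimated, and it solves a linear least-squares problem in closed form, fitting avoids iterative gradient-based training entirely, which is the source of the computational efficiency claimed for ELMs.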