Training regression ensembles by sequential target correction and resampling

Ricardo Ñanculef a, Carlos Valle a,*, Héctor Allende a, Claudio Moraga b,c

a Department of Computer Science, Universidad Técnica Federico Santa María, CP 110-V Valparaíso, Chile
b European Centre for Soft Computing, 33600 Mieres, Spain
c Faculty of Computer Science, Dortmund University of Technology, 44221 Dortmund, Germany

Article history: Received 20 August 2009; Received in revised form 7 March 2011; Accepted 24 January 2012; Available online 2 February 2012.

Keywords: Ensemble learning; Diversity; Negative Correlation; Bagging; AdaBoost; Regression estimation

Abstract. Ensemble methods learn models from examples by generating a set of hypotheses, which are then combined to make a single decision. We propose an algorithm to construct an ensemble for regression estimation. Our proposal generates the hypotheses sequentially using a simple procedure whereby the target map to be learned by the base learner at each step is modified as a function of the previous step's error. We state a theorem that relates an upper bound on the error of the composite hypothesis obtained within this procedure to the training errors of the individual hypotheses. We also demonstrate that the proposed procedure results in a learning functional that enforces a weighted form of Negative Correlation with respect to previous hypotheses. Additionally, we incorporate resampling to allow the ensemble to control the impact of highly influential data points, showing that this component significantly improves its ability to generalize from the known examples. We describe experiments performed to evaluate our technique on real and synthetic datasets using neural networks as base learners.
These results show that our technique achieves considerably lower prediction errors than the Negative Correlation (NC) method and that its performance is very competitive with that of the Bagging and AdaBoost algorithms for regression estimation.

© 2012 Elsevier Inc. All rights reserved.

1. Introduction

Ensemble methods [31,14,27,5] have been demonstrated to be a powerful and flexible way to improve the performance of a base learning algorithm in a variety of machine learning scenarios, including classification [45,41,19], regression [7,8], novelty detection [32], time series forecasting [46] and clustering [10,37]. The basic idea consists of extracting a model from data by combining a set of simple models that are constructed and organized to achieve a desired goal. The Bagging [4], AdaBoost [34], Mixture of Experts [17], Stacking [44] and Negative Correlation [22] algorithms, as well as their many variations, are well-known examples of this class of methods.

The concept of diversity is commonly used by the ensemble community to denote the differences among the individual components to be combined. Because the replication of multiple exact copies of the same model does not provide an advantage over the use of a single instance of such a model, much research has been directed to the definition of strategies to measure and generate diversity in a useful way [6,20]. The most commonly investigated method to promote diversity among the models in an ensemble is probably the manipulation of the training data used to build the individual models. The Bagging [4]

doi:10.1016/j.ins.2012.01.035
* Corresponding author.
E-mail addresses: jnancu@inf.utfsm.cl (R. Ñanculef), cvalle@inf.utfsm.cl (C. Valle), hallende@inf.utfsm.cl (H. Allende), mail@claudio-moraga.eu (C. Moraga).
Information Sciences 195 (2012) 154–174
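The sequential target-correction idea summarized in the abstract can be illustrated with a minimal sketch. The snippet below is not the paper's algorithm: it assumes a plain least-squares linear model as the base learner, a simple averaging combination, and an illustrative correction rule in which the next learner's target is the original target shifted by the current ensemble error, scaled by a hypothetical parameter alpha. All function names here are our own.

```python
import numpy as np

def fit_linear(X, y):
    """Fit a least-squares linear base learner (weights include a bias term)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict_linear(w, X):
    """Predict with a fitted linear base learner."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return Xb @ w

def sequential_target_correction(X, y, n_learners=5, alpha=1.0):
    """Train base learners one at a time. After each step, the target for the
    next learner is shifted by the current ensemble's error on the training
    set (an illustrative rule, not the paper's exact update)."""
    learners = []
    target = y.copy()
    for _ in range(n_learners):
        learners.append(fit_linear(X, target))
        # Ensemble prediction: simple average of the learners trained so far.
        F = np.mean([predict_linear(w, X) for w in learners], axis=0)
        # Corrected target: original target pushed toward the residual y - F.
        target = y + alpha * (y - F)
    return learners

def ensemble_predict(learners, X):
    """Average the predictions of all trained base learners."""
    return np.mean([predict_linear(w, X) for w in learners], axis=0)
```

On data the base learner can fit exactly, the correction vanishes (the residual y - F is zero) and each subsequent learner receives the unchanged target; the interesting behavior arises when each learner leaves a residual for its successors to compensate, which is what couples the hypotheses and induces the negative-correlation effect the paper analyzes.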