IEEE TRANSACTIONS ON SUSTAINABLE ENERGY, VOL. 4, NO. 3, JULY 2013, p. 671

Using Data-Mining Approaches for Wind Turbine Power Curve Monitoring: A Comparative Study

Meik Schlechtingen, Ilmar Ferreira Santos, and Sofiane Achiche

Abstract: Four data-mining approaches for wind turbine power curve monitoring are compared. Power curve monitoring can be applied to evaluate the turbine power output and to detect deviations that cause financial loss. In this research, cluster center fuzzy logic, neural network, and k-nearest neighbor models are built and their performance is compared against the literature. Recently developed adaptive neuro-fuzzy inference system (ANFIS) models are set up and their performance is compared with the other models, using the same data. Literature models often neglect the influence of the ambient temperature and the wind direction. The ambient temperature can influence the power output by up to 20%. Nearby obstacles can lower the power output for certain wind directions. The approaches proposed in the literature and the ANFIS models are compared using wind speed only and using wind speed plus two additional inputs. The comparison is based on the mean absolute error, root mean squared error, mean absolute percentage error, and standard deviation, using data from three pitch-regulated turbines rated at 2 MW each. The ability to highlight performance deviations is investigated using real measurements. The comparison shows that the error rates of the ANFIS models decrease when the two additional inputs are taken into account, and that faults can be detected earlier.

Index Terms: Condition monitoring, data mining, fuzzy neural networks, machine learning, neural networks, power generation, power system faults, signal analysis, wind energy.

NOMENCLATURE

ANFIS   Adaptive neuro-fuzzy inference system.
CCFL    Cluster center fuzzy logic.
k-NN    k-nearest neighbor.
M5P     Quinlan's M5 algorithm for inducing trees.
MAE     Mean absolute error.
MAPE    Mean absolute percentage error.
MF      Membership function.
MLP     Multilayer perceptron.
NN      Neural network.
NR      Normal range.
REP     Representative.
RMS     Root mean squared error.
SCADA   Supervisory control and data acquisition.
SD      Standard deviation.
WEC     Wind energy converter.

Manuscript received October 02, 2012; revised December 06, 2012; accepted January 05, 2013. Date of publication February 14, 2013; date of current version June 17, 2013.
M. Schlechtingen is with the Department of Technical Operation Wind Offshore, EnBW Erneuerbare Energien GmbH, 20459 Hamburg, Germany (e-mail: m.schlechtingen@enbw.com).
I. F. Santos is with the Department of Mechanical Engineering, Section of Solid Mechanics, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark.
S. Achiche is with the Department of Mechanical Engineering, Machines Design Section, Ecole Polytechnique de Montréal, Montréal, QC, H3C 3A7, Canada.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSTE.2013.2241797

I. INTRODUCTION

In the past decades, the cumulative worldwide installed capacity of wind energy converters (WECs) has grown exponentially. One of the reasons is the reduction achieved in the cost of energy, which has today reached a level where it is almost comparable to that of conventionally generated power from coal- and gas-fired power plants. More and more wind turbine operators decide to trade their energy directly on the electricity market. For this purpose, and to keep the cost of energy down and increase profit margins, operators need to be able to predict the performance of their turbines more accurately. In case of decreased turbine performance, operators may be unable to deliver their traded amount of energy and consequently have to pay fines. Furthermore, financial loss is incurred because the power output of the turbine is lower than expected and the corresponding revenue is missing from the balance sheet.
Here, power curve monitoring can serve as an effective method to evaluate the performance, as power curves for WECs describe the essential relation between wind speed and electrical power output [1]. A detected decrease allows the operator to take action to identify the root cause and improve performance.

Different models have been proposed in the past to estimate wind turbine power curves for performance evaluation. The basic idea of all model approaches in this context is to identify closely related signals (e.g., the wind speed) and to use them to build a model of the power output. After model training (in which the model learns the input-output relation, e.g., between wind speed and power output), the model is kept fixed and is subsequently applied to new inputs to obtain an expectation of the output. The prediction error can then be an indicator of an anomaly. The prediction error is defined here as the difference between the model's output (expectation) and the real measurement.

In 1997, Li et al. [2] presented a method using multilayer perceptron (MLP) neural networks (NNs) to predict the wind power generation of stall-regulated wind turbines. NNs can learn nonlinear relationships between input and output data sets by use of activation functions within the hidden neurons. However, an NN is a black-box approach that globally fits a single function to the data, thereby losing insight into the problem [3].

1949-3029/$31.00 © 2013 IEEE
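The train-then-monitor idea described in the Introduction, together with the error measures named in the Abstract, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the binned-mean power curve is a simple stand-in and is not one of the four compared approaches, the data are synthetic (a logistic curve saturating at 2000 kW with Gaussian noise), the function names are hypothetical, and the MAPE formula is one common definition. The sign convention follows the text: prediction error equals model expectation minus measurement.

```python
import numpy as np

def fit_binned_curve(wind_speed, power, bin_width=0.5):
    """Training step: mean power per wind-speed bin (illustrative stand-in model)."""
    bins = np.floor(wind_speed / bin_width).astype(int)
    ids = np.unique(bins)                       # sorted, so bin centers increase
    centers = (ids + 0.5) * bin_width
    means = np.array([power[bins == b].mean() for b in ids])
    return centers, means

def expected_power(model, wind_speed):
    """Apply the fixed model: interpolate between bin centers."""
    centers, means = model
    return np.interp(wind_speed, centers, means)

def error_metrics(expected, measured):
    """Prediction error = expectation - measurement, as defined in the text."""
    err = expected - measured
    return {
        "mean_error": float(np.mean(err)),
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        # Assumed MAPE definition; requires nonzero measured power.
        "mape": float(100.0 * np.mean(np.abs(err / measured))),
        "sd": float(np.std(err)),
    }

def synthetic_turbine(rng, n, derating=1.0):
    """Hypothetical data: logistic power curve in kW plus measurement noise."""
    v = rng.uniform(6.0, 20.0, n)               # wind speed in m/s
    p = derating * 2000.0 / (1.0 + np.exp(-(v - 9.0)))
    return v, p + rng.normal(0.0, 20.0, n)

rng = np.random.default_rng(0)

# Train once on data assumed to represent normal operation, then keep the model fixed.
v_train, p_train = synthetic_turbine(rng, 5000)
model = fit_binned_curve(v_train, p_train)

# Monitor: compare the expectation with new measurements.
v_ok, p_ok = synthetic_turbine(rng, 2000)
v_bad, p_bad = synthetic_turbine(rng, 2000, derating=0.9)   # 10% underperformance
healthy = error_metrics(expected_power(model, v_ok), p_ok)
degraded = error_metrics(expected_power(model, v_bad), p_bad)

print("healthy :", healthy)
print("degraded:", degraded)
```

In this sketch an underperforming turbine shows up as a positive shift in the mean prediction error, since the fixed model expects more power than is measured. A real monitoring scheme would additionally need a threshold or normal range for the error, which is not specified in this section.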