Abstract: The sun is the most powerful source of energy. Sunlight, or solar energy, can be used for heating, lighting, and cooling buildings, generating electricity, heating water, and a variety of industrial processes. It is one of the most important clean, renewable energy resources, and it comes directly from the sun in the form of radiation. The development of successful solar systems depends heavily on solar radiation prediction. In recent years, data mining and related applications have been used extensively in predicting solar radiation and in solar energy system design. The aim of this work is to give an overview of such predictive data mining techniques. This paper also highlights the importance of solar energy systems for a clean environment.

Keywords: Artificial neural networks (ANN), Fossil fuels, Renewable energy, Solar systems.

I. INTRODUCTION

RENEWABLE energy resources, such as wind, solar, and hydropower, offer clean alternatives to fossil fuels. Most renewable energy resources come either directly or indirectly from the sun. For example, the sun's heat drives the winds from which wind energy is produced, contributes to the growth of plants that are used for biomass energy, and plays a key role in the cycle of evaporation and precipitation. Sunlight is a freely available renewable energy resource that reduces operating costs, greenhouse gas (GHG) emissions, and other pollutants. The marginal economic and environmental benefits associated with additional solar generation thus depend on the operating characteristics and emission intensities of the units displaced on either the operating or build margin [1]. Solar electricity generation is non-dispatchable: it cannot be turned on and off when needed but operates only in the presence of sunlight, and it therefore requires solar radiation prediction. Because of the strong increase in solar power generation, predictions of incoming solar energy are becoming more important.
It is necessary to predict the amount of energy that will be produced up to 72 h in advance, and deviations in energy production are strongly penalized [2]. Recent studies of solar radiation have shown that instantaneous solar radiation exhibits a distinct bimodal character associated with clear and cloudy states. This suggests that many solar systems may simply be modeled as operating in an on/off fashion corresponding to clear/cloudy time intervals. It has been shown that the average daily solar system performance may be calculated as the product of the clear-sky solar performance and the average time fraction of clear sky [3]. It is therefore important to predict solar radiation over different time intervals using techniques that give the highest prediction accuracy.

Atika Qazi is with the Faculty of Computer Science and Information Technology, University of Malaya, Lembah Pantai, 50603 Kuala Lumpur, Malaysia (corresponding author e-mail: atika@siswa.um.edu.my). Fayaz Hussain is with the UM Power Energy Dedicated Advanced Centre (UMPEDAC), Level 4, Wisma R&D, University of Malaya, Jalan Pantai Baharu, 59990 Kuala Lumpur. Ram Gopal Raj is with the Faculty of Computer Science and Information Technology, University of Malaya, Lembah Pantai, 50603 Kuala Lumpur, Malaysia.

II. BACKGROUND

Solar energy facilities reduce the environmental impacts of the combustion used in fossil fuel power generation, such as greenhouse gas and other air pollution emissions. Unlike fossil fuel power generating facilities, solar facilities emit very low levels of air pollutants such as sulfur dioxide, nitrogen oxides, carbon monoxide, and volatile organic compounds, and of the greenhouse gas carbon dioxide, during operation. To get the maximum benefit from solar systems, accurate prediction of solar irradiance is required. Many applications have been introduced for the prediction of solar radiation; among them, data mining applications are widely used.
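The on/off approximation attributed to [3] in the Introduction (average daily performance as the product of clear-sky performance and the average clear-sky time fraction) can be illustrated with a minimal Python sketch. The function names, the hourly clear/cloudy flags, and the clear-sky figure of 5.0 kWh/day are hypothetical values chosen for illustration only; they are not taken from [3] or from this paper.

```python
from typing import Sequence


def clear_sky_fraction(is_clear: Sequence[bool]) -> float:
    """Fraction of equally long daytime intervals that are flagged as clear."""
    if not is_clear:
        raise ValueError("need at least one interval")
    # True counts as 1 and False as 0, so the sum is the number of clear intervals.
    return sum(is_clear) / len(is_clear)


def average_daily_performance(clear_sky_performance: float,
                              is_clear: Sequence[bool]) -> float:
    """On/off approximation: clear-sky performance times clear-sky time fraction."""
    return clear_sky_performance * clear_sky_fraction(is_clear)


# Hypothetical day: eight hourly intervals, six of them clear.
flags = [True, True, False, True, True, False, True, True]

# Assumed clear-sky output of 5.0 kWh/day (illustrative number only).
estimate = average_daily_performance(5.0, flags)
print(f"Estimated average daily output: {estimate} kWh")
```

Under this simplification, the prediction task reduces to estimating the clear-sky time fraction for the forecast horizon, which is one place where the predictive techniques surveyed here could be applied.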
Data mining is the process of discovering hidden patterns, analyzing data from diverse perspectives, and summarizing it into practical knowledge. It can be used to increase revenue, decrease costs, or both. The use of data mining in prediction and manufacturing began in the 1990s [4]. Irani et al. [5] generalized the ID3 algorithm to predict the outcome of future experiments under diverse and general conditions. Data mining is a database process used to discover and reveal previously unknown, concealed, significant, and useful patterns [6], [7]. Prediction is the ultimate aim of predictive data mining. Predictive data mining is commonly used in business applications to forecast a response of interest using a statistical model, an artificial neural network (ANN) model, or a set of such models. These predictive techniques include Bagging (Voting, Averaging), Boosting, Stacking (Stacked Generalizations), and Meta-Learning. Artificial neural networks (ANNs) have emerged as advanced data mining tools in cases where other techniques may not produce acceptable predictive results [8]. As the term implies, neural networks

Discourse on Data Mining Applications to Design Renewable Energy Systems
Atika Qazi, H. Fayaz, and Ram Gopal Raj
International Conference on Advances in Engineering and Technology (ICAET'2014), March 29-30, 2014, Singapore
http://dx.doi.org/10.15242/IIE.E0314204