J. Parallel Distrib. Comput. 104 (2017) 114–129
journal homepage: www.elsevier.com/locate/jpdc

A novel approach to accelerate calibration process of a k-nearest neighbours classifier using GPU

Amreek Singh a,b,*, Kusum Deep a, Pallavi Grover b
a Indian Institute of Technology Roorkee, Roorkee-247667, India
b Snow & Avalanche Study Establishment, Chandigarh-160036, India

highlights
• The SIMD model of parallel computing is fitted to the calibration sub-processes.
• A methodology is formulated around the GPU hardware architecture and memory hierarchy.
• It combines primitives of parallel implementations of the ABC algorithm and the k-NN algorithm.
• An NVIDIA Tesla C2050 GPU is used with the CUDA programming framework.
• Over 10× acceleration is achieved in the calibration process.

article info
Article history:
Received 11 December 2015
Received in revised form 30 December 2016
Accepted 5 January 2017
Available online 16 January 2017

Keywords: NVIDIA CUDA; GPU; SIMD; ABC algorithm; k-nearest neighbours; Avalanche forecasting

abstract
General-purpose data-parallel computing on graphics processing units (GPUs) is well structured today, thanks to NVIDIA® CUDA and other parallel programming frameworks. Exploiting the CUDA programming framework, the present work proposes a novel methodology, formulated around the GPU hardware architecture and memory hierarchy, to accelerate the calibration process of a classification model named eNN10. Primarily developed for avalanche forecasting, eNN10 is based on a brute-force k-nearest neighbours (k-NN) approach and employs snow-meteorological variables to search for past days with similar conditions. The events associated with these similar past days are then analysed to generate a forecast. The model must be calibrated regularly to ensure a high degree of forecast accuracy in terms of the Heidke skill score (HSS).
The calibration of eNN10 is carried out by the Artificial Bee Colony (ABC) algorithm, a swarm-intelligence-driven, population-based metaheuristic, and requires thousands of HSS evaluations over the complete calibration process. A sequential MATLAB code for calibration runs for over 400 minutes, and the proposed methodology delivered about 10× acceleration of the calibration process. The methodology combines primitives of parallel implementations of the brute-force k-NN algorithm with those of population-based metaheuristic algorithms and is scalable to other similar real-world problems. The major objective of this paper is to highlight the methodology and associated future research areas.
© 2017 Elsevier Inc. All rights reserved.

* Corresponding author at: Snow & Avalanche Study Establishment, Chandigarh-160036, India.
E-mail address: amreek@sase.drdo.in (A. Singh).

1. Introduction

k-nearest neighbours (k-NN) is a popular non-parametric method for data exploration and classification. The Snow & Avalanche Study Establishment (SASE) of India developed a k-NN based classification model, named eNN10, for forecasting snow avalanches in the Indian Himalaya [49–51]. An avalanche is a rapid flow of snow mass down the slope of a mountain. It can occur for several reasons, ranging from increased load on the snowpack due to new snowfall and metamorphic changes in the snowpack to natural causes such as rain, earthquakes, and rockfall. Because an avalanche poses a serious threat to the lives of people living or venturing in the area, its forecast is of extreme importance. The information about past avalanche occurrences associated with the nearest neighbours identified by the eNN10 model leads to a forecast of the future event (occurrence or non-occurrence of an avalanche). However, to achieve high forecast accuracy from k-NN based models, certain model parameters must be determined in advance, as emphasized by many researchers [8,17,46].
Mathematically, this is an optimization problem, and the act of determining the optimal values of these model parameters is termed model calibration [13].

http://dx.doi.org/10.1016/j.jpdc.2017.01.003
0743-7315/© 2017 Elsevier Inc. All rights reserved.
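To make the optimization view of calibration concrete, the following is a minimal Python sketch of the objective that a metaheuristic such as ABC would evaluate repeatedly. It is an illustration under stated assumptions, not the eNN10 implementation: this excerpt does not list the model parameters, so the per-variable distance weights, the neighbour count `k`, the vote `threshold`, the leave-one-out scheme, and all function names and toy data are hypothetical. The HSS formula is the standard one for a 2×2 contingency table.

```python
def heidke_skill_score(hits, false_alarms, misses, correct_negatives):
    """HSS for a 2x2 contingency table: 1 is perfect, 0 is no skill over chance."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    denom = (a + c) * (c + d) + (a + b) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table, e.g. only non-events observed and forecast
    return 2.0 * (a * d - b * c) / denom


def knn_forecast(query, past_days, past_events, weights, k, threshold):
    """Brute-force k-NN: scan every past day, keep the k closest, then vote."""
    scored = []
    for features, event in zip(past_days, past_events):
        # Weighted squared Euclidean distance (the weights are assumed parameters).
        d2 = sum(w * (q - f) ** 2 for w, q, f in zip(weights, query, features))
        scored.append((d2, event))
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]
    avalanche_fraction = sum(event for _, event in nearest) / len(nearest)
    return 1 if avalanche_fraction >= threshold else 0


def calibration_objective(params, days, events):
    """Leave-one-out HSS for one candidate parameter set (to be maximized)."""
    weights, k, threshold = params
    a = b = c = d = 0
    for i, (query, observed) in enumerate(zip(days, events)):
        other_days = days[:i] + days[i + 1:]
        other_events = events[:i] + events[i + 1:]
        forecast = knn_forecast(query, other_days, other_events,
                                weights, k, threshold)
        if forecast and observed:
            a += 1  # hit
        elif forecast:
            b += 1  # false alarm
        elif observed:
            c += 1  # miss
        else:
            d += 1  # correct negative
    return heidke_skill_score(a, b, c, d)


# Hypothetical toy data: two variables per day; only the first separates the classes.
TOY_DAYS = [(0.0, 1.0), (0.1, 0.9), (0.2, 1.1),
            (5.0, 1.0), (5.1, 0.9), (5.2, 1.1)]
TOY_EVENTS = [0, 0, 0, 1, 1, 1]  # 1 = avalanche day, 0 = non-avalanche day

good = calibration_objective(((1.0, 1.0), 1, 0.5), TOY_DAYS, TOY_EVENTS)
bad = calibration_objective(((0.0, 1.0), 1, 0.5), TOY_DAYS, TOY_EVENTS)
print(good, bad)
```

On this toy data, the candidate that weights both variables scores higher than the one that ignores the informative variable. A population-based optimizer would call `calibration_objective` once per candidate solution per iteration, and each call performs a full brute-force scan for every query day, which suggests why thousands of HSS evaluations dominate the sequential run time and why both levels are natural targets for data-parallel execution.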