Abstract—This work designs an accelerated SVM (Support Vector Machine) suitable for the Android operating system. SVM is widely used in health-related applications. It is a promising classification technique based on pattern recognition and statistical learning theory. This paper proposes a parallel SVM algorithm based on a GPU accelerator. Matrix multiplication is the main bottleneck of conventional SVM execution, and a GPU can speed it up through parallelization. The cross-validation function is redesigned and improved on the personal computer, and the SVM training function is additionally improved on mobile devices. This approach reduces, to a certain extent, the influence of matrix calculation on the whole system. In an image classification experiment, the proposed approach achieves a 3.3x speedup on the PC and a 1.5x speedup on mobile devices compared to the serial SVM. However, the accuracy rate is not greatly improved on either platform. Since the experiment mainly focuses on improving execution time, no optimization of the prediction process is considered.

Index Terms—Support vector machine algorithm, parallel computing, GPU and OpenCL based SVM, image classification, matrix multiplication.

I. INTRODUCTION

Recently, more and more wearable-device manufacturers have focused on health-related products and provide supporting software clients and applications. At present, wearable devices are used as accessories to mobile phones, so data rendering and processing rely on the smartphone. In general, Android phones offer a variety of sensors, measuring direction, gravity, distance, acceleration, and so on. With these sensor data, it is possible to provide health care services to users. SVM (Support Vector Machine) is a promising classification technique based on pattern recognition and statistical learning theory.
Manuscript received November 5, 2016; revised December 30, 2016. This work was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the Global IT Talent support program (IITP-2016-H0905-15-1003) supervised by the IITP (Institute for Information and Communication Technology Promotion). The authors are with Yonsei University, Republic of Korea (e-mail: nanyiyan@yonsei.ac.kr, liquanzhe@yonsei.ac.kr, kumcun@yonsei.ac.kr, sdkim@yonsei.ac.kr).

In the field of health care applications, the SVM algorithm is widely used to analyze human behavior. By analyzing and modeling the training data, it can predict unknown data. This process includes target detection, feature extraction, modeling, prediction, and other steps. However, SVM consumes a large amount of memory because of the large-scale data, and training on such data usually takes a long time. This becomes a major problem when SVM is applied to mobile devices.

There are many studies on optimizing the SVM algorithm in the OpenCL (Open Computing Language) framework. Features of OpenCL such as shared virtual memory, dynamic parallelism, and the generic address space greatly improve programming flexibility and help avoid redundant data transfers. Sparse linear algebra, which incurs a heavy computational load, dominates SVM computation, because the SVM finds the support vectors by solving a quadratic programming problem.

In this paper, we use a GPU accelerator to improve performance with the proposed optimization method. We propose an accelerated SVM built by modifying the original LIBSVM (A Library for Support Vector Machines). We parallelize the processing of the raw dataset passed to the cross-validation function in order to reduce the computational cost. Furthermore, the proposed optimization method is applied to the RBF (radial basis function) kernel function on the mobile device.
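To illustrate why matrix multiplication dominates RBF kernel evaluation, the following minimal sketch builds the kernel matrix from the Gram matrix of the samples, using the identity ||x_i - x_j||^2 = x_i·x_i + x_j·x_j - 2 x_i·x_j. It is a serial, pure-Python illustration and not the OpenCL implementation described in this paper. The function names `gram_matrix`, `rbf_kernel_matrix`, and `rbf_direct` are hypothetical; the parameter name `gamma` follows the LIBSVM convention K(u, v) = exp(-gamma * ||u - v||^2).

```python
import math

def gram_matrix(X):
    """Return X * X^T as nested lists. This matrix product is the
    dominant cost and the part a GPU would compute in parallel."""
    n = len(X)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dot = sum(a * b for a, b in zip(X[i], X[j]))
            G[i][j] = G[j][i] = dot  # the Gram matrix is symmetric
    return G

def rbf_kernel_matrix(X, gamma):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2), derived from the Gram matrix."""
    G = gram_matrix(X)
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared distance from inner products; clamp tiny negative
            # values caused by floating-point rounding.
            sq_dist = max(G[i][i] + G[j][j] - 2.0 * G[i][j], 0.0)
            K[i][j] = math.exp(-gamma * sq_dist)
    return K

def rbf_direct(x, y, gamma):
    """Reference value computed straight from the definition."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

# Three toy 2-D samples (illustrative values only).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = rbf_kernel_matrix(X, gamma=0.5)
```

Each entry of the Gram matrix is an independent dot product, so in a parallel version every (i, j) pair could be assigned to its own work-item; the element-wise exponential that follows is comparatively cheap.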
In the experiment, the CIFAR-10 dataset is used for image classification. The performance of the accelerated SVM is evaluated on both the PC and mobile devices. The proposed parallel approach is 3.3 times faster than serial computing on the PC and 1.5 times faster on the mobile device. As a result, the total image classification time is reduced significantly without reducing the accuracy rate.

Section II covers related work. Section III describes the proposed approach. Section IV presents the experiments and results. Finally, Section V concludes the paper.

II. RELATED WORK

Many studies have proposed ways to accelerate SVM computation, since training on large-scale datasets is computationally expensive. This section introduces several approaches for speeding up SVM computation. Cagnini et al. [1] presented a technique that parallelizes the SVM method on a GPU with the OpenCL framework in order to improve the efficiency of binary classification tasks and SVM computations. The authors first identified the most computationally expensive functions and then parallelized them. [2] Their approach achieved a significant speedup compared to sequential and CUDA-based approaches.

GPU-Accelerated SVM Training Algorithm Based on PC and Mobile Device
Yi-Yan Nan, Quan-Zhe Li, Jin-Chun Piao, and Shin-Dug Kim
International Journal of Knowledge Engineering, Vol. 2, No. 4, December 2016. doi: 10.18178/ijke.2016.2.4.076