OpenCL-based Remote Offloading Framework for Trusted Mobile Cloud Computing

Heungsik Eom, Pierre St Juste, Renato Figueiredo
Advanced Computing and Information Systems Laboratory
Electrical and Computer Engineering
University of Florida, Gainesville, Florida, USA
{hseom, pstjuste, renato}@acis.ufl.edu

Omesh Tickoo, Ramesh Illikkal, Ravishankar Iyer
Intel Corporation
2111 N.E. 25th Avenue
Hillsboro, Oregon, USA
{omesh.tickoo, ramesh.g.illikkal, ravishankar.iyer}@intel.com

Abstract—OpenCL has emerged as the open standard for parallel programming on heterogeneous platforms, enabling a uniform framework to discover, program, and distribute parallel workloads to the diverse set of compute units in the hardware. For that reason, there have been efforts to exploit the parallelism exposed by the OpenCL framework by offloading GPGPU workloads within an HPC cluster environment. In this paper, we present an OpenCL-based remote offloading framework designed for mobile platforms. It carries the motivation and advantages of OpenCL offloading over from the HPC cluster environment to mobile cloud computing, where OpenCL workloads can be exported from a mobile node to the cloud. Furthermore, our offloading framework handles service discovery, access control, and data privacy by building on top of a social peer-to-peer virtual private network, SocialVPN. We developed a prototype implementation and deployed it in local- and wide-area environments to evaluate the performance improvement and energy implications of the proposed offloading framework. Our results show that, depending on the complexity of the workload and the amount of data transferred, the proposed architecture can achieve more energy-efficient performance by offloading than by executing locally.

Keywords-Mobile device, OpenCL, heterogeneity, parallelism, virtual private networks, energy consumption

I. INTRODUCTION

Heterogeneity is now the norm in commodity computing systems, where platforms possess a mix of computing units such as CPUs, GPUs, and other specialized accelerators. OpenCL has therefore emerged as the open standard for parallel programming on these heterogeneous platforms. By providing a common standard along with the necessary toolchain, OpenCL enables a uniform framework to discover, program, and distribute parallel workloads to the diverse set of compute units in the hardware.

Graphics processing units (GPUs), in particular, have gained wide adoption owing to their rapidly expanding use in general-purpose computing (GPGPU) through parallel programming with the OpenCL standard. For that reason, there have been efforts to exploit the parallelism exposed by the OpenCL framework by offloading GPGPU workloads within an HPC cluster environment [1], [2]. The primary motivation for offloading within a cluster is more efficient utilization of resources, achieved by allowing multiple compute nodes to share the same GPU for general-purpose computing. These studies demonstrate that OpenCL (and CUDA)-based remote offloading is a viable option that saves power through more efficient sharing of heterogeneous compute units over the network, despite the communication overheads.

In our work, we bring a different perspective to this expanding body of research by adapting the OpenCL offloading approach from the HPC cluster environment to a mobile cloud computing scenario. Because previous work focused mainly on offloading OpenCL workloads in HPC cluster environments with high bandwidth and low latency between the nodes, the advantages were easy to realize and assess. However, the advantages are not as clear in the mobile cloud computing scenario, where OpenCL workloads are sent over wide-area network links with much lower bandwidth and higher latency than in cluster environments.
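To make this trade-off concrete, the following is a minimal sketch of a break-even check, assuming a simple cost model in which offloading pays off only when remote execution time plus communication time is below local execution time. This is an illustrative model, not the evaluation method used in this paper; the names `Link`, `transfer_time`, and `offload_is_faster`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Network path between the mobile node and a remote compute node."""
    rtt_s: float          # round-trip latency in seconds (hypothetical value)
    bandwidth_bps: float  # usable bandwidth in bits per second (hypothetical value)


def transfer_time(link: Link, payload_bytes: int, round_trips: int = 1) -> float:
    """Time to move the payload, plus one RTT per request/response exchange."""
    return round_trips * link.rtt_s + (payload_bytes * 8) / link.bandwidth_bps


def offload_is_faster(local_s: float, remote_s: float, link: Link,
                      payload_bytes: int, round_trips: int = 1) -> bool:
    """True when remote execution plus communication beats local execution."""
    return remote_s + transfer_time(link, payload_bytes, round_trips) < local_s


# Hypothetical links, chosen only to illustrate the trade-off.
cluster = Link(rtt_s=0.0002, bandwidth_bps=10e9)  # cluster-like interconnect
wide_area = Link(rtt_s=0.08, bandwidth_bps=5e6)   # wide-area mobile link

payload = 4 * 1024 * 1024  # 4 MiB of kernel input and output data

# Light workload: 2.0 s locally, 0.2 s on the remote accelerator.
print(offload_is_faster(2.0, 0.2, cluster, payload))     # True
print(offload_is_faster(2.0, 0.2, wide_area, payload))   # False

# Heavier workload with the same data: 20 s locally, 1 s remotely.
print(offload_is_faster(20.0, 1.0, wide_area, payload))  # True
```

Under these assumed numbers, the same 4 MiB workload benefits from offloading on the cluster-like link but not on the wide-area link, unless the computation is heavy enough to amortize the transfer. An energy-based variant would replace the times with the energy spent computing locally versus the energy spent transmitting and waiting.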
Moreover, since workloads traverse untrusted wide-area networks, a layer of network encryption is necessary to ensure privacy and some level of trust in the results returned by the remote compute node.

This paper presents an OpenCL-based remote offloading framework designed specifically for mobile cloud computing, where OpenCL workloads can be exported from a mobile node (e.g., an Android device) to the cloud (e.g., an Amazon EC2 instance with GPU access). This remote offloading framework consists of the following components: 1) a customized RPC system with optimizations for network tasking and data marshalling, 2) a service discovery mechanism which selects the compute node with the lowest latency, and 3) a virtual private networking layer which provides transparent network encryption without any modification at the application layer. Our system is implemented as a wrapper library around the OpenCL API, thus allowing transparent integration of the OpenCL API with our framework without any code modification. The offloading framework also makes it possible for the developer to dynamically discover accelerators located on remote computing nodes

2013 19th IEEE International Conference on Parallel and Distributed Systems 1521-9097/13 $26.00 © 2013 IEEE DOI 10.1109/.42 240
2013 International Conference on Parallel and Distributed Systems 1521-9097/13 $31.00 © 2013 IEEE DOI 10.1109/ICPADS.2013.43 240