GPU Consolidation for Cloud Games: Are We There Yet?

Hua-Jun Hong¹, Tao-Ya Fan-Chiang¹, Che-Run Lee¹, Kuan-Ta Chen², Chun-Ying Huang³, and Cheng-Hsin Hsu¹
¹ Department of Computer Science, National Tsing Hua University
² Institute of Information Science, Academia Sinica
³ Department of Computer Science and Engineering, National Taiwan Ocean University

Abstract—Because operating expense is crucial to the increasingly popular cloud gaming services, server consolidation is one of the key technologies for the success of these services. In this paper, we conduct extensive experiments using real GPUs and a complete cloud gaming platform to answer the following question: Are modern GPUs ready for cloud gaming? Our experiment results show that modern GPUs have low consolidation overhead and are powerful enough to concurrently support multiple GPU-intensive cloud games. For example, when using our cloud gaming platform, the recent NVIDIA K2 GPU outperforms the NVIDIA Quadro 6000 by up to 3.46 times in FPS (frames per second). Moreover, our experiments lead to two new findings that run counter to common beliefs. First, with the latest GPU virtualization technique, shared GPUs may run faster than dedicated GPUs. Second, more context switches do not necessarily lead to lower FPS. Finally, our experiments shed some light on further enhancements of cloud gaming platforms. For example, offloading the software video codec to the GPUs will result in a better gaming experience, which is one of our future tasks.

I. INTRODUCTION

In 2011, the computer game industry generated five times higher revenue than the music industry, 15% more revenue than consumer book sales, and roughly the same revenue as the movie industry [10]. Moreover, cloud gaming has been recognized as the killer application of cloud computing [14], and has attracted serious attention from both industry [6], [13] and academia [11], [17].
Cloud gaming moves games from potentially weak clients to powerful cloud servers, where the game scenes are captured, encoded, and streamed to the clients in real time. The game scenes are decoded and rendered at the clients for gamers, whose inputs are intercepted, encoded, and sent back to the cloud servers over the reverse channels. Cloud gaming enables gamers to play computer games anywhere, anytime, on any device, and is expected to be very popular in the near future [3].

Cloud gaming providers seeking higher revenue face a challenge: they want to minimize operating expense while still delivering a high-quality gaming experience [8], [21]. One way to reduce operating expense is to consolidate multiple virtual machines (VMs) onto one physical machine, virtualizing resources such as CPUs, networks, storage, and GPUs so that they can be shared. Server consolidation, however, has to be performed carefully, or it may degrade the gaming experience and drive gamers away from the service. While virtualization of CPUs, network interfaces, and storage is rather mature, GPU virtualization is still considered experimental. In fact, several papers warn of potentially poor performance, namely low frame rates, high response times, and low video quality, when GPUs are shared among multiple VMs [4], [16]. For example, Shea and Liu [16] show that the frame rate of Doom 3 is lower than 40 FPS (frames per second) even if the hypervisors (Xen and KVM) are configured with one-to-one GPU pass-through, which indicates that sharing the GPU among multiple VMs is virtually impossible.

Nonetheless, in the past couple of years, GPU virtualization technology has improved dramatically, which may have solved the performance issues of GPU consolidation. In this paper, we conduct detailed experiments using modern GPUs and a real cloud gaming platform called GamingAnywhere (GA) [9] to answer the following question: Are modern GPUs ready for cloud gaming?
In particular, we perform two types of experiments: (i) end-to-end experiments using the complete cloud gaming platform to quantify the overall gaming performance, and (ii) GPU-only experiments using only GPUs to zoom into their detailed performance.

Our end-to-end experiments demonstrate that modern GPUs, such as the NVIDIA K2, significantly outperform earlier-generation GPUs, such as the Quadro 6000, by up to 3.46 times in FPS. This can be attributed to both Moore's law and more advanced GPU virtualization technologies. Moreover, the cloud gaming platform is stable under different network conditions (varying bandwidth, delay, and packet loss rate) as well as over the Internet. On the other hand, we also find that a cloud gaming server with modern GPUs may become CPU-bound. Hence, offloading the software video codec to the GPUs will further improve the gaming experience.

Our GPU-only experiments reveal several insights that have never been reported in the literature. For example, we observe that: (i) virtualized GPUs may outperform pass-through GPUs, (ii) more context switches do not necessarily result in lower FPS, and (iii) the hypervisor is not a bottleneck for managing the virtual GPUs. Some of these observations differ from those of previous studies [16], and can be attributed to recent advances in GPU virtualization. The findings of the GPU-only experiments show the merits of modern virtualized GPUs and shed some light on how to optimize the configurations of cloud gaming platforms.

The rest of this paper is organized as follows. We survey the literature in Sec. II. Sec. III presents our testbed setup and measurement methodology. This is followed by the experiment results and discussions in Sec. IV. Sec. V concludes this paper.

978-1-4799-6882-4/14/$31.00 © 2014 IEEE