Pytomo: A Tool for Analyzing Playback Quality of YouTube Videos

Parikshit Juluri∗†, Louis Plissonneau, and Deep Medhi
{pjuluri,dmedhi}@umkc.edu – CSEE – University of Missouri – Kansas City, USA
louis.plissonneau@orange-ftgroup.com – Orange Labs, France

Abstract—Online video services account for a major part of broadband traffic, with streaming videos being one of the most popular video services. We focus on the user-perceived quality of YouTube videos, as it can serve as a general index of customer satisfaction. Our tool, Pytomo [1], is a tomography tool designed to measure the playback quality of videos as if they were being viewed by a user. We model the YouTube video player to estimate the playback interruptions experienced by a user watching a YouTube video. We also examine topology and download statistics such as delay towards the server, download rates, and buffering duration. We aim to analyze the use of different DNS resolvers to obtain the IP address of the video server. We study how DNS resolution impacts the performance of the video download, and thus the video playback quality. Since the tool is intended to run on multiple ISPs, we have discovered some interesting results about YouTube distribution policies. These results can be applied to any content delivery network (CDN) architecture and should help users better understand the key performance factors of video streaming.

Index Terms—HTTP Streaming, Performance, QoE, DNS.

I. INTRODUCTION

Currently, web-driven content represents about half of Internet traffic, owing to the decline of P2P and the surge of video sharing sites [2], [3], [4], with YouTube being the most popular. Among the different online video services, streaming videos and Flash videos are the most popular. Services such as blogs and social networks also enable users to embed personal videos, and thus expand video sharing circles.
In this paper, we present our tool, Pytomo, which analyzes the user experience of watching a YouTube video by using active download analysis. Previous work has mostly studied either the characteristics of YouTube videos or the YouTube CDN architecture.

Analyses of YouTube video characteristics mainly focus on metadata. Each crawler fetches the properties of the video (duration, category, ...) to draw interesting results on caching and distribution policy evaluations [5], on comparison with "classical" web workloads [6], or on graph relations between videos [7]. Some authors complement their study with passive packet captures to analyze the streaming video sessions [8], or the behavior of users in terms of switches and jumps inside videos [9]. These studies provide information about the video characteristics; however, they do not analyze the video playback quality.

The YouTube CDN architecture has also been studied in [10] with NetFlow records in order to determine traffic dynamics outside the ISP network. The YouTube server selection policy can be explained either using active measurements (on PlanetLab nodes) [11], [12], or using passive captures [13]. These measurements allow us to better understand the distribution choices of YouTube videos, but they mainly focus on delay and geographical distribution of servers. In [13], the server selection strategy is also evaluated. The impact of DNS resolvers has been compared in terms of latency and caching [14].

Our work differs from others in that we are interested not only in the delay to access the YouTube video streaming servers, but also in the perceived video playback quality. Moreover, the impact of the DNS resolver on the video playback quality has not been studied yet.

We explain the methodology and the evaluation of our tool, Pytomo, in Section II. We briefly present preliminary results in Section III, and state the next steps of our work in Section IV.

II. METHODOLOGY

In this section, we present the methodology behind the tool. Pytomo [1] is a platform-independent, open-source, automated analysis tool written in Python. Its aim is to measure the download and to analyze the playback of YouTube videos, and thus emulate the user's watching experience. Interruptions and buffering of online streaming videos occur when the download throughput is lower than the encoding rate of the video. Thus, we choose the number of interruptions during the playback and the total buffering duration as the main playback quality indicators; see Section II-B on how we infer such events.

A. Tool Description

Pytomo performs an analysis of YouTube video downloads and helps us evaluate user experience. Pytomo emulates the user behavior by downloading a YouTube video and then selecting a number of random related links for downloading. For each video, the download statistics are collected, calculated, and stored in a database. In order to perform the download, we first resolve the IP address of the content server and then use this IP address to perform the analysis. By doing so, we ensure that the analysis and the video download are done on the same server. We also handle HTTP Redirect messages obtained from video servers.

a) DNS Resolution: It is possible to use a number of DNS resolvers. In our case, we use three DNS resolvers: a default ISP resolver, Google's Public DNS (code.google.
Proc. of 2011 23rd International Teletraffic Congress (ITC 2011), Poster Paper, San Francisco, September 2011.