Box-Cox transformation as an alternative method for
modeling video-on-demand popularity
María Teresa González Aparicio, R. García, Xabiel Garcia Pañeda, D. Melendi, S. Cabrero
Computer Science Department
University of Oviedo
Gijón, Asturias, Spain
{maytega, garciaroberto, xabiel, melendi, cabrerosergio}@uniovi.es
Abstract— This paper studies the popularity of videos covering a
wide range of news, published by three Spanish local on-line
newspapers. The statistical distribution underlying this
popularity is unknown. In fact, throughout the literature,
popularity has been modeled with different distributions, such as
Mandelbrot, Stretched, Zipf-like and so on. In this paper, the
Box-Cox transformation is proposed as a unified approach that
would cover all of these distributions. Its main advantage is its
non-parametric nature, as a consequence of which model selection
might be avoided.
Keywords: Box-Cox, Mandelbrot, Stretched, Video-on-demand,
Zipf-like.
I. INTRODUCTION
Streaming media is increasingly present on the Internet,
especially on web sites dedicated to news, sports, entertainment
and education, and even in the business world for marketing
purposes. As a result, system designers have to cope with the
particular demands of streaming media content, such as more
computing power, higher bandwidth and storage requirements and
its long-lived nature, in order to supply good Web services [8].
Many technologies have emerged to manage this type of content and
to reduce its impact on the different resources, among them
multicast/unicast delivery, encoding formats and complex cache
replacement policies, some of which are being steadily improved.
However, more multimedia workloads have to be analyzed to achieve
a sound understanding of user access.
In [5][10] an analysis of the video-on-demand service “La
Nueva España”, one of the services analyzed in this paper, was
presented. These studies highlight that content type, subjects,
content update policy and even content success make popularity a
very difficult parameter to model. A Zipf-like distribution was
applied in stable periods of time and an average θ was
calculated. However, when the conditions of the service change
due to the arrival of new content, a new value for θ is needed.
An algorithm was defined but a popularity pattern was not
established. Indeed, modeling user access is not an easy task,
because so many variables are involved. Accordingly, it may be
better to remove some of these variables and to start with a
simpler service. For instance, the range of content types offered
to the user could be reduced and focused on a specific topic and
area. In this paper, we analyze session logs from three news
video-on-demand streaming services, namely "La Opinión A Coruña"
(www.laopinioncoruna.es), "Faro de Vigo" (www.farodevigo.es) and
"La Nueva España" (www.lne.com). Each of them serves a different
area of Spain.
We believe that our study provides relevant results for the
design of news video-on-demand services. Specifically, it focuses
on the popularity distribution.
The rest of the paper is organized as follows. Section II
reviews previous work. Section III presents a case study of
three on-line news video-on-demand services from Spain.
Section IV analyzes popularity in the three services. Finally,
Section V presents conclusions and future work.
II. RELATED WORK
The video access pattern has been analyzed in a wide range
of media services (Web, file sharing, media broadcast,
video-on-demand streaming). One of the first distributions
applied to model the access pattern was Zipf-like. In [4] a
one-week workload was analyzed, comprising streaming-media
sessions from 4,786 clients to 866 servers on the Internet, which
accessed 23,738 different streaming-media objects. 78% of the
objects were accessed only once, 1% were accessed ten or more
times, and the 12 most popular objects were accessed more than
100 times each. The popularity distribution was modeled as
Zipf-like with θ equal to 0.47. The conclusion was that accesses
to streaming-media objects were less concentrated on the popular
objects. Moreover, in [3] the behavior of the video access
pattern was studied at different time scales (one month, six
months and more than one year). When the period was below seven
months a Zipf-like approximation was possible, with θ between
1.4 and 1.6, but not for longer periods.
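To illustrate how θ controls the concentration of accesses, the
following is a minimal sketch, assuming the common Zipf-like form
in which the probability of accessing the object of rank i is
proportional to 1/i^θ. This parameterization is an assumption (the
cited works may define θ differently), and the function names
zipf_like_pmf and top_k_share are hypothetical, introduced only
for this example.

```python
def zipf_like_pmf(n_objects, theta):
    """Access probabilities for ranks 1..n_objects.

    Assumes p(i) proportional to 1 / i**theta; this convention is an
    assumption of the sketch, not taken from the cited papers.
    """
    weights = [rank ** -theta for rank in range(1, n_objects + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def top_k_share(pmf, k):
    """Fraction of all accesses that go to the k most popular objects."""
    return sum(pmf[:k])


if __name__ == "__main__":
    n = 23738  # number of distinct objects reported for the workload in [4]
    for theta in (0.47, 1.0, 1.5):
        share = top_k_share(zipf_like_pmf(n, theta), 12)
        print(f"theta={theta}: share of accesses to top 12 = {share:.4f}")
```

Under this assumed form, a smaller θ spreads accesses more evenly
over the catalogue, which is consistent with the observation that
accesses were less concentrated on the popular objects when θ was
low.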
In [7] sixteen workloads were analyzed, covering different
delivery methods (streaming, pseudo streaming, overlay
multicast, P2P, etc.), media file sizes, workload durations
(from 5 days to more than two years) and types of content. The
video access pattern was fitted with a Stretched Exponential
distribution despite the effects of extraneous traffic, the
introduction of new content and recommendations [13], or
“fetch-at-most-once” behavior [2].
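As a rough illustration of what fitting a Stretched Exponential
model can involve, the sketch below assumes the rank form
y_i^c = -a ln(i) + b, where y_i is the number of accesses to the
object of rank i and c is the stretch factor. This formulation, the
choice of ordinary least squares and the function name
fit_stretched_exponential are all assumptions of the example; the
text above does not state how the fit in [7] was performed.

```python
import math


def fit_stretched_exponential(counts, c):
    """Least-squares fit of y_i**c = -a*ln(i) + b over ranked counts.

    counts: access counts per object (any order, at least two values).
    c: assumed stretch factor, fixed by the caller.
    Returns (a, b, r_squared).
    """
    ranked = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [y ** c for y in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 of a simple linear regression; defined as 1.0 for constant y.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return -slope, intercept, r_squared
```

In practice c would be chosen by trying several values and keeping
the one whose transformed plot is closest to a straight line, for
example the one with the highest r_squared.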
IEEE Globecom 2010 Workshop on Ubiquitous Computing and Networks
978-1-4244-8864-3/10/$26.00 ©2010 IEEE 1798