Split & Dual Screen Comparison of Classic vs Object-based Video
Maarten Wijnants
Hasselt University – tUL
Expertise Centre for Digital Media
Diepenbeek, Belgium
maarten.wijnants@uhasselt.be
Sven Coppers
Hasselt University – tUL
Expertise Centre for Digital Media
Diepenbeek, Belgium
sven.coppers@uhasselt.be
Gustavo Rovelo Ruiz
Hasselt University – tUL
Expertise Centre for Digital Media
Diepenbeek, Belgium
gustavo.roveloruiz@uhasselt.be
Peter Quax
Hasselt University
– tUL – Flanders Make – EDM
Diepenbeek, Belgium
peter.quax@uhasselt.be
Wim Lamotte
Hasselt University – tUL
Expertise Centre for Digital Media
Diepenbeek, Belgium
wim.lamotte@uhasselt.be
ABSTRACT
Over-the-top (OTT) streaming services like YouTube and Netflix
generate massive amounts of video data, thereby putting substantial
pressure on network infrastructure. This paper describes a demon-
stration of the object-based video (OBV) methodology that allows
for the quality-variant MPEG-DASH streaming of respectively the
background and foreground object(s) of a video scene. The OBV
methodology is inspired by research into human visual attention
and foveated compression, in that it allows bitrate to be assigned adaptively
and dynamically to those portions of the visual scene that have
the highest utility in terms of perceptual quality. Using a content
corpus of interview-like video footage, the described demonstration
shows the OBV methodology’s potential to reduce video bitrate
requirements while incurring at most marginal perceptual impact
(i.e., in terms of subjective video quality). Thanks to its standards-
compliant Web implementation, the OBV methodology is directly
and broadly deployable without requiring capital expenditure.
CCS CONCEPTS
· Information systems → Multimedia streaming; Web applica-
tions; · Networks → Public Internet; Application layer protocols;
· Human-centered computing → User studies;
KEYWORDS
video coding, H.264, HTTP Adaptive Streaming, MPEG-DASH, sub-
jective evaluation, Web
ACM Reference Format:
Maarten Wijnants, Sven Coppers, Gustavo Rovelo Ruiz, Peter Quax, and Wim
Lamotte. 2019. Split & Dual Screen Comparison of Classic vs Object-based
Video. In Proceedings of the 27th ACM International Conference on Multimedia
(MM ’19), October 21–25, 2019, Nice, France. ACM, New York, NY, USA, 3 pages.
https://doi.org/10.1145/3343031.3350582
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).
MM ’19, October 21–25, 2019, Nice, France
© 2019 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-6889-6/19/10.
https://doi.org/10.1145/3343031.3350582
1 INTRODUCTION AND MOTIVATION
Massive amounts of video traffic are being delivered over the contemporary
Internet [14] and these traffic volumes are poised to increase
even further in the near future [5]. However, when watching
videos, not all spatial regions of these videos are equally impor-
tant from a perceptual quality perspective. Evidence abounds in the
literature that human visual focus during video consumption is sub-
consciously drawn to a limited quantity of salient objects or regions
that stand out from the video background [2, 7, 15]. For instance, hu-
man faces are known to be natural attractors of visual attention [3, 8].
At the same time, complementary research shows that the sampling
density and sensitivity of the Human Visual System (HVS) gradually
decay as the distance to our eyes’ fixation area increases [3, 9]. Stated
differently, only those spatial portions of the video on which viewers
fixate are processed by the HVS in maximal fidelity, while more
distant regions are assigned ever lower processing bandwidth.
Driven by these observations, we previously introduced
the object-based video (OBV) methodology [18], which allows
for the adaptive delivery of a visual scene by decomposing it into a
background and one or more foreground objects, each of which can be
independently streamed in a quality-variant manner. As such, the OBV
methodology supports the introduction of intra-scene quality dif-
ferences during network delivery. In contrast, a classic frame-based
video encoder does not grant such versatility, since it will enact a
rather uniform bitrate and hence quality distribution over the inte-
gral visual scene. The OBV methodology can be regarded as a video-
only specialization of the more general object-based media (OBM)
paradigm [16] that is currently under active investigation by both
academia [13] and industry [10]. This demonstration illustrates the
viability of leveraging video-only OBM mechanisms to achieve signif-
icant cost reductions (i.e., in terms of network load) that entail no or
at most limited repercussions with respect to perceived video quality.
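To make the idea of intra-scene quality differences concrete, the following is a minimal sketch of how a fixed bitrate budget could be divided over a background and its foreground objects. It is an illustration only and is not taken from the OBV implementation: the `Layer` type, the `priority` weight, the `ladderKbps` bitrate ladders, and the greedy upgrade strategy in `allocate` are all assumptions made for this example.

```typescript
// Hypothetical description of one scene layer (the background or one
// foreground object). All names here are illustrative assumptions.
interface Layer {
  id: string;
  priority: number;     // assumed weight: higher = perceptually more important
  ladderKbps: number[]; // available representation bitrates, ascending
}

// Greedy sketch: start every layer at its lowest representation, then spend
// the remaining budget on upgrades, visiting layers in descending priority.
// If even the lowest representations exceed the budget, they are still used.
function allocate(layers: Layer[], budgetKbps: number): Record<string, number> {
  const level: Record<string, number> = {};
  let spent = 0;
  for (const layer of layers) {
    level[layer.id] = 0;
    spent += layer.ladderKbps[0];
  }

  const byPriority = layers.slice().sort((a, b) => b.priority - a.priority);
  for (const layer of byPriority) {
    let i = level[layer.id];
    // Step up one representation at a time while the upgrade still fits.
    while (
      i + 1 < layer.ladderKbps.length &&
      spent - layer.ladderKbps[i] + layer.ladderKbps[i + 1] <= budgetKbps
    ) {
      spent += layer.ladderKbps[i + 1] - layer.ladderKbps[i];
      i++;
    }
    level[layer.id] = i;
  }

  // Translate the chosen ladder indices back into bitrates.
  const chosen: Record<string, number> = {};
  for (const layer of layers) {
    chosen[layer.id] = layer.ladderKbps[level[layer.id]];
  }
  return chosen;
}
```

With a high-priority face object and a low-priority background, such a policy would raise the face to its best representation before spending anything on the background, whereas a frame-based encoder would spread the same budget roughly uniformly over the frame.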
2 OBV IMPLEMENTATION
The OBV methodology is exclusively implemented using standard-
ized Web technologies (i.e., HTML5, CSS, JavaScript, WebGL) to
yield a portable, multi-platform solution that is accessible via an
off-the-shelf Web browser [18]. The methodology takes as input a
video that has been disassembled into multiple visual signals, one
per background and individual foreground object. In each such signal,
pixels that have been segmented away are replaced by a fixed