Image and Vision Computing 65 (2017) 49–57
journal homepage: www.elsevier.com/locate/imavis
Behavioral cues help predict impact of advertising on future sales
Gábor Szirtes a,*, Javier Orozco a, István Petrás a, Dániel Szolgay a, Ákos Utasi a, Jeffrey F. Cohn b
a Realeyes OÜ, Tölgyfa utca 24, Budapest 1027, Hungary
b University of Pittsburgh, 4322 Sennott Square, Pittsburgh, PA 15260, USA
ARTICLE INFO
Article history:
Received 1 May 2016
Received in revised form 7 March 2017
Accepted 13 March 2017
Available online 22 March 2017
Keywords:
Market research
Behavioral cue
Predictive modeling
Facial expression analysis
ABSTRACT
Advertising aims to influence consumer preferences, appraisals, action tendencies, and behavior in order
to increase sales. These are all components of emotion. In the past, they have been measured through self-
report or panel discussions. While informative, these approaches are difficult to scale to large numbers of
consumers, fail to capture moment-to-moment changes in appraisals that may be predictive of sales, and
depend on verbal mediation. We used web-cam technology to sample non-verbal responses to television
commercials from four product categories in six different countries. For each participant, head pose, head
motion, and more frequent facial expressions like smiling, surprise and disgust were automatically mea-
sured at each video frame and aggregated across subjects. Dynamic features from the aggregated series
were input to a simple linear ensemble classifier with 10-fold cross-validation to predict product sales. Sales
were predicted with ROC AUC = 0.75, 95% CI [0.727, 0.773], and predictions for unseen categories were
consistent for all but one product group (ROC AUC varied between 0.74 and 0.83, except for Confections with
0.61). Predictions for unseen countries showed a similar pattern: ROC AUC varied between 0.71 and 0.89, with
the exception of Russia with ROC AUC 0.53. In comparison with previous attempts, our approach yielded
higher overall performance and greater generalization over unmodeled factors such as country or category.
These findings support the feasibility, efficiency, and predictive validity of sales predictions from large-scale
sampling of viewers’ moment-to-moment responses to commercial media.
© 2017 Elsevier B.V. All rights reserved.
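The evaluation summarized in the abstract (per-frame signals aggregated across subjects, performance reported as ROC AUC, and testing on categories or countries held out of training) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of a per-frame mean as the aggregate, and the leave-one-group-out splitting scheme are assumptions made for illustration, and the classifier itself is omitted because the abstract does not specify it beyond "simple linear ensemble".

```python
from collections import defaultdict


def aggregate_across_subjects(series_per_subject):
    """Average per-frame values over subjects (assumed aggregate: the mean).

    series_per_subject: list of equal-length lists, one per viewer.
    zip() truncates to the shortest series if lengths differ.
    """
    return [sum(frame) / len(frame) for frame in zip(*series_per_subject)]


def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) so that each
    group (e.g. a product category or a country) is unseen in training once."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g, test_idx in by_group.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(groups)) if i not in held_out]
        yield g, train_idx, test_idx
```

In such a setup, a model would be fit on `train_idx` and scored with `roc_auc` on `test_idx` for each held-out group, which is one way to obtain per-category or per-country figures of the kind reported above.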
1. Introduction
Advertising is about influencing consumer preferences, appraisals,
action tendencies, and purchases. Television and, increasingly,
online video commercials are a key component. Over 80 billion
dollars is spent annually on television commercials in the US
alone [1]. For the companies that produce commercials and for their
clients, there is great interest in evaluating the effectiveness of com-
mercials they produce and distribute. One approach is to correlate
television advertisements with product sales (online shopping in a
short time window around the time of the TV ad) [2]. This approach
enables a gross estimate of the direct influence of advertising on sales but
is blind to consumer reactions to individual commercials. For that, it
is necessary to assess consumer responses to specific commercials in
relation to product sales.
This paper has been recommended for acceptance by Mohammad Soleymani.
* Corresponding author.
E-mail address: gabor.szirtes@realeyesit.com (G. Szirtes).
One solution is to ask viewers to report on their responses to commercials.
Focus groups, personal interviews, random-digit phone surveys, and online
surveys have been used for this purpose. While
providing useful information, these methods have notable limita-
tions. They pull for rational thinking rather than emotional responses
that may be more predictive of purchase behavior; respondents must
verbally represent what are often non-verbal, frequently unconscious
cognitive-emotional reactions; and the dynamics of their responses
may be compromised by recency effects. Demand characteristics and
social desirability effects may bias reports as well. Focus groups,
surveys, and related methods further assume that verbal reports
are necessarily the best indices of purchasing influences. Evidence
suggests otherwise. People’s preferences often are outside of their
awareness and strongly influenced by emotion [3,4].
Emotions consist of multiple components that include subjective
feelings, action tendencies and physiological arousal. All are prime
candidates for influencing likelihood of purchase decisions. During
emotion episodes, these components become correlated [5].
Automated facial expression analysis using web-cam video acqui-
sition is a promising alternative. Using computer vision and machine
learning, facial expressions of emotion in response to television advertisements
can be measured on a moment-to-moment basis. This approach
avoids the necessity for viewers to verbally report their experience,
captures fine-grained information about the timing of behavior, and
http://dx.doi.org/10.1016/j.imavis.2017.03.002
0262-8856/© 2017 Elsevier B.V. All rights reserved.