STYLISTIC ANALYSIS OF PAINTINGS USING WAVELETS AND MACHINE
LEARNING
Sina Jafarpour, Güngör Polatkan, Eugene Brevdo, Shannon Hughes, Andrei Brasoveanu, Ingrid Daubechies
Princeton University
Departments of Electrical Engineering, Computer Science, and Mathematics
and the Program in Applied and Computational Mathematics
Princeton, NJ 08544
ABSTRACT
Wavelet transforms and machine learning tools can be used
to assist art experts in the stylistic analysis of paintings. A
dual-tree complex wavelet transform, Hidden Markov Tree
modeling and Random Forest classifiers are used here for a
stylistic analysis of Vincent van Gogh’s paintings, with results
on two stylometry challenges: dating paintings and extracting
distinguishing features.
1. INTRODUCTION
Stylometry, i.e., determining a painter’s style, is a challenging
problem for art historians. Many factors play a role.
Technical analyses of the painting, including the pigments
present, the materials used and the method of their preparation,
the artist’s process as documented in the underlayers
of the painting (observed through X-ray and infrared imaging),
etc., provide one type of information. Visual inspection
of the painting is of course very important as well, to evalu-
ate and help characterize the visual appearance and style of
the work. However, even the sum of all these analyses may
prove inconclusive for some works.
A new movement in image processing seeks to use computational
tools from image analysis and machine learning
to provide an additional source of analysis for such challenging
paintings. It rests on the assumption that an artist’s
brushwork can be characterized, at least in part, by signature
features (e.g., those arising from the artist’s habitual
physical movements), and that such distinguishing,
quantitatively measurable characteristics might be found by
machine learning methods and used as an additional piece of
evidence in stylometry tasks. Indeed, early attempts in this
area have already found considerable success [1, 2, 3].
Recent attempts to characterize an artist’s style via
features discernible by image processing and machine
learning algorithms have often focused on characterizing
the statistics of the wavelet coefficients of digital scans
of paintings by that artist [1, 4, 5].
This paper uses an approach of this type on a dataset provided
by the Van Gogh Museum and the Kröller-Müller Museum
in the Netherlands, consisting of high-resolution scans
of paintings by Vincent van Gogh.
We combine recent image processing and machine learning
techniques to tackle two stylometry problems
proposed by the two museums: extracting distinguishing
features, and a dating challenge. We show how modeling style
as a hidden variable that controls the behavior of the image
observables (brushstrokes, color patterns, etc.) can
significantly improve the accuracy of the style analyzer.
We use a dual-tree complex wavelet transform [6],
which is (almost) shift invariant, to capture quantitatively the
effects observable in the image. Next, using Hidden Markov
Trees [7], an extension of Hidden Markov Variables, combined
with the expectation-maximization algorithm [8], we
extract the style parameters from the noisy observables.
Finally, we feed the extracted features to appropriate
classifiers using standard machine learning techniques, and
use the resulting prediction rule for style analysis.
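As a rough illustration of the feature-extraction step, the sketch below fits a two-state, zero-mean Gaussian mixture to wavelet detail coefficients with expectation maximization. It is a simplification under stated assumptions, not the method used in this paper: a single-level Haar transform stands in for the dual-tree complex wavelet transform, the parent-child dependencies of a Hidden Markov Tree are ignored, and the names haar_diagonal_detail and fit_two_state_mixture are hypothetical.

```python
import numpy as np


def haar_diagonal_detail(image):
    """Diagonal detail band of a one-level orthonormal 2-D Haar transform.

    Stand-in (assumption) for the dual-tree complex wavelet transform.
    """
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # crop to even dimensions
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return (a - b - c + d) / 2.0


def fit_two_state_mixture(coeffs, n_iter=50):
    """EM for a mixture of two zero-mean Gaussians ("small" and "large" states).

    Returns (prior, var): state probabilities and state variances.
    Unlike a Hidden Markov Tree, the coefficients are treated as independent.
    """
    w = np.asarray(coeffs, dtype=float).ravel()
    total_var = np.mean(w ** 2) + 1e-12
    var = np.array([0.1 * total_var, 2.0 * total_var])  # initial guess
    prior = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability of each hidden state per coefficient
        log_lik = -0.5 * (np.log(2.0 * np.pi * var) + w[:, None] ** 2 / var)
        log_post = np.log(prior) + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate state probabilities and variances
        nk = resp.sum(axis=0) + 1e-9
        prior = nk / nk.sum()
        var = (resp * w[:, None] ** 2).sum(axis=0) / nk + 1e-12
    return prior, var
```

In this simplified setting, the fitted state probabilities and variances of each subband could be collected into a feature vector and passed to a classifier.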
This paper is a sibling of [10], in which our team used
similar techniques for authentication purposes
rather than stylistic analysis.
2. APPLICATIONS
2.1 Dating Challenge
In the absence of convincing documentation, the dating of
a painting is based on where it fits in the chronology of the
artist’s style, concerning, for example, subject matter,
materials used, color palette, compositional style, and brushwork.
Some undocumented paintings have a mixture of features
that seemingly correspond to different periods of their
creator’s artistic development. Such feature mixes pose difficult
dating challenges.
When dating relies on categorizing style and technique,
computer-based image processing tools that magnify
the differences in style should prove useful. Furthermore,
artificial intelligence and machine learning techniques
can provide the right tools for the final decision task.
The dating challenge concerns the dating of paintings by
Vincent van Gogh that stem from either his Paris phase (ending
early in 1888) or the late Arles period that followed. The
task is to ascertain which features distinguish the two
test sets (taking as benchmark the paintings that are
unquestionably from the Paris or Arles period), and then to use
those features to attempt to associate each of the dating
candidates with one group or the other.
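The two-step procedure described above (find features that separate the benchmark groups, then assign each candidate to a group) can be sketched as follows. This is a minimal illustration under stated assumptions: a nearest-centroid rule stands in for the Random Forest classifier named in the abstract, the feature vectors are assumed to have been computed already, and the function names are hypothetical.

```python
import numpy as np


def nearest_centroid_predict(train_x, train_y, query):
    """Assign `query` to the label whose mean feature vector is closest."""
    centroids = {
        label: train_x[train_y == label].mean(axis=0)
        for label in np.unique(train_y)
    }
    return min(centroids, key=lambda lab: np.linalg.norm(query - centroids[lab]))


def leave_one_out_accuracy(x, y):
    """Hold out each benchmark painting in turn and predict its period."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i  # every painting except the i-th
        hits += int(nearest_centroid_predict(x[keep], y[keep], x[i]) == y[i])
    return hits / len(y)
```

Leave-one-out accuracy on the benchmark paintings gives a rough sense of how well the chosen features separate the two periods before the rule is applied to a dating candidate.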
In distinguishing Van Gogh paintings from these two pe-
riods, art historians rely on several general observations re-
garding shifts in his practice. For instance, small strokes are
more prominent in Paris, while brush handling is broader in
Arles; colors appear more saturated in Arles due to the filling
in of larger areas.
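One way such an observation might be quantified is to compare how much detail energy an image patch carries at fine versus coarse scales. The sketch below is an assumption-laden illustration rather than the features used in this paper: it uses a plain multi-level Haar transform instead of the dual-tree complex wavelet transform, and the names haar_step and scale_energy_profile are hypothetical.

```python
import numpy as np


def haar_step(img):
    """One level of an orthonormal 2-D Haar transform.

    Returns the approximation band and the total energy of the three
    detail bands at this level.
    """
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    approx = (a + b + c + d) / 2.0
    details = ((a - b + c - d) / 2.0, (a + b - c - d) / 2.0, (a - b - c + d) / 2.0)
    energy = sum(float(np.sum(band ** 2)) for band in details)
    return approx, energy


def scale_energy_profile(image, levels=3):
    """Fraction of detail energy at each scale, finest scale first."""
    img = np.asarray(image, dtype=float)
    energies = []
    for _ in range(levels):
        h, w = img.shape
        img = img[: h - h % 2, : w - w % 2]  # crop to even dimensions
        img, energy = haar_step(img)
        energies.append(energy)
    energies = np.array(energies)
    return energies / (energies.sum() + 1e-12)
```

Under this measure, a patch dominated by small strokes would be expected to put a larger share of its energy in the first entry of the profile than a patch painted with broad strokes.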
At the initial stage of the challenge, the training set
included 33 images from each of the Paris and
Arles periods.
At the final stage, three test paintings were provided.
Each test painting exhibits some general features associated
with Arles, as well as others associated with Paris. The final
goal of this challenge was to come up with a high-confidence
17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009
© EURASIP, 2009