CUDA implementation of deformable pattern recognition
and its application to MNIST handwritten digit database
Yoshiki Mizukami∗, Katsumi Tadamura∗, Jonathan Warrell†, Peng Li‡ and Simon Prince‡
∗Graduate School of Science and Engineering, Yamaguchi University, Ube, Japan
Email: {mizu,tadamura}@yamaguchi-u.ac.jp
†Department of Computing, Oxford Brookes University, Oxford, UK
Email: jwarrell@brookes.ac.uk
‡Department of Computer Science, University College London, London, UK
Email: lileopold@gmail.com, s.prince@cs.ucl.ac.uk
Abstract—In this study we propose a deformable pattern
recognition method with a CUDA implementation. To establish
a proper correspondence between the foreground pixels of the
input and prototype images, a pair of distance maps is
generated from them, in which each pixel value is based on
the distance to the nearest foreground pixel. A regularization
technique then computes the horizontal and vertical
displacements from these distance maps.
The dissimilarity is measured based on the eight-directional
derivative of input and prototype images in order to leverage
characteristic information on the curvature of line segments
that might be lost after the deformation. The prototype-
parallel displacement computation on CUDA and the gradual
prototype elimination technique are employed for reducing
the computational time without sacrificing accuracy. A
simulation shows that the proposed method with a k-nearest
neighbor classifier achieves an error rate of 0.57% on the MNIST
handwritten digit database.
Keywords-handwritten character recognition; displacement
computation; graphics processing unit; compute unified device
architecture;
I. INTRODUCTION
Deformable approaches are a challenging topic in the
field of computer vision and pattern recognition. One of the
earliest studies is the rubber-mask technique proposed by Widrow [1].
Deformable approaches have been applied to various problems
such as face, object and character recognition.
Many researchers study character recognition using the
modified NIST handwritten digit database (MNIST) [2]. Their
methods fall mainly into three categories: statistical,
multilayer neural network, and deformable approaches. Support
vector machines (SVMs) are very promising in the statistical
approach, and DeCoste and Schölkopf proposed an SVM-based
method with artificially generated prototypes [3]. Ranzato et al.
proposed a convolutional network with an unsupervised learning
method for sparse and overcomplete features [4]. As a deformable
approach for establishing sub-pixel correspondence between input
and prototype images, Belongie et al. proposed a three-step
displacement computation [5]: first, many reference points on
the contour are selected based on their shape contexts, which
are obtained from the geometric relationship with the remaining
points; second, an optimization problem is solved to pair the
reference points on one image with points on the other image;
and finally, regularized thin-plate splines provide a sub-pixel
correspondence. Several methods that give a pixel-wise
correspondence have also been proposed [6], [7]. Keysers et al.
studied a non-linear deformation model with a local context
represented as an 18-dimensional vector based on vertical and
horizontal Sobel filters [6].
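As an illustration of how an 18-dimensional local context can arise from two Sobel filters, the following minimal sketch stacks a 3×3 window of the horizontal and vertical gradient images at every pixel (2 × 9 = 18 values). The 3×3 window size, the zero padding, and the names `filter3x3` and `local_context` are assumptions made for this example; the exact construction used in [6] is not given in this section.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T  # responds to changes along the row (vertical) axis

def filter3x3(img, kernel):
    """Cross-correlate img with a 3x3 kernel using zero padding."""
    h, w = img.shape
    p = np.pad(np.asarray(img, dtype=float), 1, mode="constant")
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * p[dy:dy + h, dx:dx + w]
    return out

def local_context(img):
    """Return an (H, W, 18) array: 3x3 windows of both Sobel responses.

    Assumed layout: channels 0-8 hold the horizontal-gradient window,
    channels 9-17 the vertical-gradient window (row-major within each).
    """
    h, w = img.shape
    feats = []
    for g in (filter3x3(img, SOBEL_X), filter3x3(img, SOBEL_Y)):
        p = np.pad(g, 1, mode="constant")
        for dy in range(3):
            for dx in range(3):
                feats.append(p[dy:dy + h, dx:dx + w])
    return np.stack(feats, axis=-1)
```

For a 28×28 MNIST-sized image, `local_context(img)` has shape (28, 28, 18), so each pixel is compared through its gradient neighborhood rather than through a single intensity value.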
A regularization-based deformable recognition approach
has been studied since 1994 for computing sub-pixel
correspondence between input and prototype images [8], [9].
It yields very simple iterative equations derived from the
calculus of variations, but its total computational cost is
proportional to the image size. This problem becomes more
serious when dealing with characters of complicated shape. One
solution is to reduce the dimensionality efficiently by
extracting features from the image. In 1998 this deformable
approach was applied to Chinese characters with complicated
shapes [10], where the dimensionality of the image was
reduced by employing a directional feature. In addition, it
was shown that combining a statistical classifier with
the deformable approach could improve the performance.
Nevertheless, reducing the computational cost of this
deformable approach is still very desirable.
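To make the cost argument concrete, the following is a minimal sketch of a regularization-based displacement iteration. It uses a Horn-Schunck-style Jacobi update as an assumed stand-in, since the iterative equations of [8], [9] are not reproduced in this section. The names `neighbor_mean` and `estimate_displacement` and the parameters `lam` and `n_iter` are hypothetical. Every iteration updates each pixel once, which is why the total cost grows in proportion to the image size.

```python
import numpy as np

def neighbor_mean(a):
    """Mean of the four axis-aligned neighbors (edges replicated)."""
    p = np.pad(a, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def estimate_displacement(inp, proto, lam=0.01, n_iter=200):
    """Estimate (u, v) such that inp(x + u, y + v) ~= proto(x, y).

    Assumed Horn-Schunck-style update, not the equations of [8], [9].
    u is the horizontal (column) and v the vertical (row) displacement;
    lam weights the smoothness term against the data term.
    """
    inp = np.asarray(inp, dtype=float)
    proto = np.asarray(proto, dtype=float)
    fy, fx = np.gradient(0.5 * (inp + proto))  # row and column derivatives
    ft = inp - proto                           # difference between the images
    u = np.zeros_like(inp)
    v = np.zeros_like(inp)
    denom = lam + fx ** 2 + fy ** 2
    for _ in range(n_iter):                    # each pass visits every pixel once
        u_bar, v_bar = neighbor_mean(u), neighbor_mean(v)
        resid = (fx * u_bar + fy * v_bar + ft) / denom
        u = u_bar - fx * resid
        v = v_bar - fy * resid
    return u, v
```

Because the update at each pixel depends only on the previous iterate of its neighbors, all pixels (and all prototypes) can in principle be updated independently within one iteration, which is the property a parallel implementation would exploit.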
Recently, graphics processing units (GPUs) have
gained attention in research and development due
to their fast parallel computing performance [11]. It was
formerly difficult for ordinary programmers to implement their
algorithms on GPUs, since they had to learn the concepts and
languages of graphics programming. However, the compute
unified device architecture (CUDA) succeeded in providing
a simple and powerful platform [12].
In this study we propose a novel deformable pattern
recognition method based on the regularization framework.
Distance maps are generated from the input and prototype
images, then the correspondence between them is computed
in an iterative manner. The computation time is dras-
2010 International Conference on Pattern Recognition
1051-4651/10 $26.00 © 2010 IEEE
DOI 10.1109/ICPR.2010.493