CUDA implementation of deformable pattern recognition and its application to MNIST handwritten digit database

Yoshiki Mizukami∗, Katsumi Tadamura∗, Jonathan Warrell†, Peng Li‡ and Simon Prince‡
∗Graduate School of Science and Engineering, Yamaguchi University, Ube, Japan. Email: {mizu,tadamura}@yamaguchi-u.ac.jp
†Department of Computing, Oxford Brookes University, Oxford, UK. Email: jwarrell@brookes.ac.uk
‡Department of Computer Science, University College London, London, UK. Email: lileopold@gmail.com, s.prince@cs.ucl.ac.uk

Abstract—In this study we propose a deformable pattern recognition method with a CUDA implementation. To establish a proper correspondence between the foreground pixels of the input and prototype images, a pair of distance maps is generated from the two images; each pixel value is given by the distance to the nearest foreground pixel. A regularization technique then computes the horizontal and vertical displacements from these distance maps. The dissimilarity is measured on the eight-directional derivatives of the input and prototype images in order to exploit characteristic information on the curvature of line segments that might be lost after the deformation. Prototype-parallel displacement computation on CUDA and a gradual prototype elimination technique are employed to reduce the computation time without sacrificing accuracy. A simulation shows that the proposed method with a k-nearest neighbor classifier gives an error rate of 0.57% on the MNIST handwritten digit database.

Keywords-handwritten character recognition; displacement computation; graphics processing unit; compute unified device architecture

I. INTRODUCTION

Deformable approaches are a challenging topic in the field of computer vision and pattern recognition. One of the earliest studies is the rubber mask proposed by Widrow [1].
Deformable approaches have been applied to various problems such as face, object and character recognition.

Many researchers are studying character recognition based on the modified-NIST handwritten digit database (MNIST) [2]. Their methods fall mainly into three categories: statistical, multilayer neural network, and deformable approaches. Support vector machines (SVMs) are very promising in the statistical approach, and DeCoste and Schölkopf proposed an SVM-based method with artificially generated prototypes [3]. Ranzato et al. proposed a convolutional network with an unsupervised learning method for sparse and overcomplete features [4]. As a deformable approach for establishing sub-pixel correspondence between input and prototype images, Belongie et al. proposed a three-step displacement computation [5]. First, many reference points on the contour are selected based on their shape contexts, which are obtained from the geometric relationship with the remaining points. Second, an optimization problem is solved to pair the reference points on one image with points on the other image. Finally, regularized thin-plate splines provide a sub-pixel correspondence. Several methods for giving a pixel-wise correspondence have also been proposed [6], [7]. Keysers et al. studied a non-linear deformation model with the local context of an 18-dimensional vector based on vertical and horizontal Sobel filters [6].

A regularization-based deformable recognition approach has been studied since 1994 for computing sub-pixel correspondence between input and prototype images [8], [9]. In this approach, very simple iterative equations are derived from the calculus of variations, but it should be noted that the overall computational cost is proportional to the size of the image. This problem becomes more serious when dealing with characters of complicated shape. One solution is to reduce the dimensionality efficiently by extracting features from the image.
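
As a rough illustration of why this cost matters, the following Python sketch counts the per-pixel displacement updates needed to match one input image against a set of prototypes. The 28×28 image size and the 60,000 training images are properties of MNIST. Treating every training image as a prototype, the iteration count of 100, and the function name `displacement_update_count` are assumptions made only for this example and are not taken from the method described here.

```python
def displacement_update_count(height, width, n_prototypes, n_iterations):
    """Count per-pixel updates for matching one input against all prototypes.

    Assumes (hypothetically) that every iteration visits every pixel once
    for every prototype, so the cost grows linearly with the image size.
    """
    return height * width * n_prototypes * n_iterations


MNIST_SIDE = 28            # MNIST images are 28x28 pixels
MNIST_TRAIN = 60_000       # MNIST training set size, used here as prototypes
ASSUMED_ITERATIONS = 100   # hypothetical iteration count, for illustration only

full = displacement_update_count(MNIST_SIDE, MNIST_SIDE,
                                 MNIST_TRAIN, ASSUMED_ITERATIONS)
# Halving each image side, e.g. by feature extraction on a coarser grid
reduced = displacement_update_count(MNIST_SIDE // 2, MNIST_SIDE // 2,
                                    MNIST_TRAIN, ASSUMED_ITERATIONS)

print(full)             # 4704000000
print(full // reduced)  # 4
```

Under these assumptions, halving each image side cuts the update count by a factor of four, which is the kind of saving that dimensionality reduction by feature extraction aims for.
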
In 1998 this deformable approach was applied to Chinese characters with complicated shapes [10], where the dimensionality of the image was reduced by employing a directional feature. In addition, it was shown that combining a statistical classifier with the deformable approach could improve the performance. Nevertheless, reducing the computational cost of this deformable approach is still very desirable.

In recent years, graphics processing units (GPUs) have gained attention in research and development due to their fast parallel computing performance [11]. Formerly it was not easy for ordinary programmers to implement their algorithms on GPUs, since they had to learn graphics programming concepts and languages. However, the compute unified device architecture (CUDA) has succeeded in providing a simple and powerful platform [12].

In this study we propose a novel deformable pattern recognition method based on the regularization framework. Distance maps are generated from the input and prototype images, and then the correspondence between them is computed iteratively. The computation time is dras-

2010 International Conference on Pattern Recognition 1051-4651/10 $26.00 © 2010 IEEE DOI 10.1109/ICPR.2010.493 2001