An Object Recognition Model Based on Visual Grammars and Bayesian Networks

Elias Ruiz and Luis Enrique Sucar

Instituto Nacional de Astrofísica, Óptica y Electrónica
Departamento de Ciencias Computacionales
Luis Enrique Erro 1, Tonantzintla, Puebla, México
{elias_ruiz,esucar}@inaoep.mx
http://www.inaoep.mx

Abstract. A novel proposal for a general model for object recognition is presented. The proposed method is based on symbol-relational grammars and Bayesian networks. An object is modeled as a hierarchy of features and spatial relationships using a symbol-relational grammar. This grammar is learned automatically from examples, incorporating a simple segmentation algorithm in order to generate the lexicon. The grammar is created with the elements of the lexicon as terminal elements. This representation is automatically transformed into a Bayesian network structure whose parameters are learned from examples. Thus, recognition is based on probabilistic inference in the Bayesian network representation. Preliminary results in modeling natural objects are presented. The main contribution of this work is a general methodology for building object recognition systems which combines the expressivity of a grammar with the robustness of probabilistic inference.

Keywords: Visual Grammars, Bayesian Networks, Object Recognition

1 Introduction

Most current object recognition systems focus on recognizing certain types of objects and do not consider their structure. This implies several limitations: (i) the systems are difficult to generalize to arbitrary types of objects, (ii) they are not robust to noise and occlusions, and (iii) their models are difficult to interpret. This paper proposes a model that achieves a hierarchical representation of a visual object in order to perform object recognition tasks, based on a visual grammar [3] and Bayesian networks (BNs) [8].
Thus, we propose incorporating a visual grammar to develop an understandable hierarchical model: starting from basic elements (obtained by a simple image segmentation algorithm), the model constructs more complex forms through composition rules defined in the grammar, in order to achieve object recognition in a limited context (e.g., images of natural objects).

R. Klette, M. Rivera, and S. Satoh (Eds.): PSIVT 2013, LNCS 8333, pp. 349–359, 2014. © Springer-Verlag Berlin Heidelberg 2014

Addressing this problem with a hierarchical approach is important because it yields a more robust model that can represent variability in a class of