A Directional Texture Descriptor via 2D Walking Ant Histogram

Serkan Kiranyaz, Miguel Ferreira * and Moncef Gabbouj 1
Tampere University of Technology, Tampere, Finland
* University of Twente, Netherlands
{serkan.kiranyaz, moncef.gabbouj}@tut.fi, * mpaf98@yahoo.com

1 This work was supported by the Academy of Finland, project No. 213462 (Finnish Centre of Excellence Program (2006 - 2011)).
1 The research leading to this work was partially supported by the COST 292 Action on Semantic Multimodal Analysis of Digital Media.

Abstract

A novel texture descriptor, which can be extracted automatically from the major object edges and used for content-based retrieval in multimedia databases, is presented. The proposed method is adapted from the 2D Walking Ant Histogram (2D WAH), a generic shape descriptor recently developed for general-purpose multimedia databases. The 2D WAH shape descriptor is motivated by the imaginary scenario of an ant walking over the boundary of a particular object with a limited line of sight; each sub-segment is eventually traversed, and along the way the current line of sight, whether it is a continuous branch or a corner, is described using individual 2D histograms. In this paper we tune this approach into an efficient texture descriptor, which achieves superior performance especially for directional textures. Integrating the whole process as a feature extraction module into the MUVIS framework allows us to test the overall performance of the proposed texture descriptor in the context of multimedia indexing and retrieval.

1. Introduction

In the area of content-based multimedia indexing and retrieval, a generic and robust shape descriptor is still lacking, and the existing methods are applicable only to databases in which object shapes are extracted manually (e.g. binary shape databases). Alternatively, efforts have mainly focused on edge-based approaches, since the edge field of an image usually represents both object boundaries and texture.
The MPEG-7 Edge Histogram Descriptor (EHD) [7] generates a histogram of the main edge directions (vertical, horizontal and two diagonals) within fixed-size blocks. It is an efficient texture descriptor for images with a heavy textural presence. It can also work as a shape descriptor as long as the edge field contains the true object boundaries and is not saturated by the background texture. In this case the method is particularly efficient at describing geometric objects, because its block-based edge representation uses only four directions. A similar but pixel-based method, applied directly over the Canny edge field [2], is called Histogram of Edge Directions (HED) [1]. Another approach, the so-called Angular Radial Partitioning (ARP), is presented in [4]. ARP works over radial blocks (angular slices taken at quantized radial steps from the center of mass of a re-scaled image). Although rotation invariance can be obtained with this method, the shape outlines are degraded by the loss of aspect ratio when the image is re-scaled into square dimensions to fit a surrounding circle. A promising method, Edge Pixel Neighborhood Histogram (EPNH) [3], creates a 240-bin histogram from the directions of the neighboring edge pixels. Although it can describe only a one-pixel neighborhood over the entire edge field, it exhibits performance comparable to that of MPEG-7 EHD. Nevertheless, all these methods turn out to be texture descriptors, since they cannot discriminate the true object boundaries, which are usually suppressed by the surrounding texture edges.

The 2D Walking Ant Histogram (2D WAH) [6] was initially developed to address this problem as a generic shape descriptor, which works as long as the majority of the object edges are available, whether or not full object (boundary) extraction is possible. Its main advantage is therefore that it can still describe a shape from a rough sketch with some missing parts.
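To make the block-based edge-direction idea concrete, the following is a minimal Python sketch in the spirit of EHD, not the normative MPEG-7 extraction. It assumes a grayscale image given as a list of rows, uses non-overlapping 2x2 pixel blocks instead of the standard's sub-image and sub-block layout, and omits the non-directional edge class and bin quantization. The function name `block_edge_histogram`, the `threshold` value, and the bin labels are hypothetical choices made for illustration; the 2x2 filter coefficients are the ones commonly quoted for EHD.

```python
import math

# 2x2 directional filters, listed as (top-left, top-right, bottom-left, bottom-right).
# Coefficients follow the ones commonly quoted for MPEG-7 EHD; treat them as an assumption.
FILTERS = {
    "vertical":   (1.0, -1.0, 1.0, -1.0),
    "horizontal": (1.0, 1.0, -1.0, -1.0),
    "diag_45":    (math.sqrt(2), 0.0, 0.0, -math.sqrt(2)),
    "diag_135":   (0.0, math.sqrt(2), -math.sqrt(2), 0.0),
}


def block_edge_histogram(image, threshold=10.0):
    """Count the dominant edge direction of every non-overlapping 2x2 block.

    `image` is a list of equally long rows of gray levels.
    Blocks whose strongest filter response is below `threshold`
    are treated as edge-free and are not counted.
    """
    hist = {name: 0 for name in FILTERS}
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - 1, 2):
        for c in range(0, cols - 1, 2):
            block = (image[r][c], image[r][c + 1],
                     image[r + 1][c], image[r + 1][c + 1])
            # Magnitude of each directional filter response for this block.
            responses = {
                name: abs(sum(w * p for w, p in zip(coeffs, block)))
                for name, coeffs in FILTERS.items()
            }
            best = max(responses, key=responses.get)
            if responses[best] >= threshold:
                hist[best] += 1
    return hist
```

Calling `block_edge_histogram` on an image made of vertical stripes puts all counts into the `"vertical"` bin, while a flat image yields an empty histogram. Note that such a histogram records only local directions; it carries no information about how edge pixels connect into longer boundaries, which is the limitation discussed above.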
2D WAH works over the edge field of the image; however, ordinary images are usually too “detailed” to achieve an accurate shape extraction over the edge field. Therefore, as proposed in [5], the relevant sub-segments, which are characterized by long, connected series of relatively strong edge pixels, are extracted from the scale-map as the first step, and then a novel shape description, referred to as the 2D Walking Ant Histogram (2D WAH), is applied over them. It is motivated by the following imaginary scenario: suppose an ant is walking over a solid object and every once in a while, say every few steps, it “describes” its “Line of Sight (LoS)” in a convenient way. It can eventually produce a detailed (high-resolution) description, since it is quite small compared to the object. Accumulating all the intermediate LoS descriptions in a (2D) histogram, with a particular focus on continuous branches and major corners, therefore yields an efficient cue about the shape. Such a description is still feasible if some portion of the object boundary is missing, and this is the major advantage of the method. The description frequency (i.e. how often the ant makes a new intermediate description) and the length of the LoS are the two major parameters of this scheme. The third is the number of relevant sub-segments that are taken into consideration for description. Keeping this number sufficiently low makes the method describe only the boundaries of the major object(s) whilst discarding the texture edges.

In this paper, we reverse this process and configure 2D WAH as a texture descriptor by making the necessary changes to the generic scheme while keeping the primary 2D WAH structure intact, that is, extracting the necessary number of sub-segments from the edges of the texture and describing them via the (branch) 2D WAH histogram. The proposed method is fully automatic (i.e. no supervision, feedback or training is involved).
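The walking-ant scenario can be illustrated with a short Python sketch. It is an illustrative assumption, not the 2D WAH definition from [6]: the exact histogram layout is not given in this section, so the sketch simply bins the pixel offsets seen within each LoS, relative to the ant's current position. The function name `walking_ant_histogram` and the parameters `los_length` (length of the LoS in pixels) and `step` (description frequency) are hypothetical, and each sub-segment is assumed to be already extracted as an ordered list of `(x, y)` edge pixels.

```python
def walking_ant_histogram(segments, los_length=4, step=2):
    """Accumulate line-of-sight (LoS) descriptions into one normalized 2D histogram.

    `segments` is a list of sub-segments, each an ordered list of (x, y) pixels.
    Every `step` pixels the ant looks `los_length` pixels ahead along the
    sub-segment and records each visible pixel's offset (dx, dy) from its
    own position. This offset binning is an assumption made for illustration.
    """
    size = 2 * los_length + 1
    hist = [[0.0] * size for _ in range(size)]
    total = 0
    for pixels in segments:
        # The ant only describes positions that have a full LoS ahead of them.
        for start in range(0, len(pixels) - los_length, step):
            x0, y0 = pixels[start]
            for x, y in pixels[start + 1:start + 1 + los_length]:
                dx, dy = x - x0, y - y0
                # Guard against gaps in the chain that would fall outside the histogram.
                if abs(dx) <= los_length and abs(dy) <= los_length:
                    hist[dy + los_length][dx + los_length] += 1
                    total += 1
    if total:
        hist = [[value / total for value in row] for row in hist]
    return hist
```

Under these assumptions, sub-segments that run mostly in one direction concentrate their mass in a narrow band of the histogram, whereas sub-segments with many corners spread it out. Limiting how many sub-segments are passed in plays the role of the third parameter described above.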
Integrating the whole process as a feature extraction (FeX) module into the MUVIS framework [8] allows us to test the overall performance in the context of multimedia indexing and retrieval. Accordingly, we will make comparative evaluations against the existing edge-based texture descriptors mentioned earlier (e.g. MPEG-7 EHD, EPNH and ARP), as well as generic and powerful texture descriptors such as Gabor [9], Gray Level Co-occurrence