Computational Intelligence, Volume 22, Number 3/4, 2006

CONCEPTUAL CLUSTERING AND CASE GENERALIZATION OF TWO-DIMENSIONAL FORMS

SILKE JÄNICHEN AND PETRA PERNER
Institute of Computer Vision and Applied Computer Sciences, IBaI, Leipzig, Germany

Case-based object recognition requires a general case of the object that should be detected. Real-world applications such as the recognition of biological objects in images cannot be solved by one general case. A case base is necessary to handle the great natural variations in the appearance of these objects. In this paper, we will present how to learn a hierarchical case base of general cases. We present our conceptual clustering algorithm to learn groups of similar cases from a set of acquired structural cases of fungal spores. Due to its concept description, it explicitly supplies for each cluster a generalized case and a measure for the degree of its generalization. The resulting hierarchical case base is used for applications in the field of case-based object recognition. We present results based on our application for health monitoring of biologically hazardous material.

Key words: conceptual clustering, hierarchical clustering, case mining, case-based object recognition, fungal spore recognition, health monitoring.

1. INTRODUCTION

Model-based object recognition methods are used to detect objects of interest in images where thresholding-based image segmentation methods fail. To determine if a new unseen image contains an object, the model is matched against this image. The model can be either an object model or a contour model. We consider in our work the contour model that consists of a set of pixel positions which describe the contour of the objects. The matching involves transforming the model with respect to the image and, for each transformation, calculating the similarity between the model and the image. A positive match is found if the similarity exceeds a predefined threshold.
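The matching loop described above can be illustrated with a minimal sketch. It is not the similarity measure used in this work: the function names (`transform`, `similarity`, `match`), the binary edge image, the hit-fraction score, and the default threshold of 0.8 are all assumptions chosen only to show the control flow of transforming a contour model and thresholding the resulting similarity.

```python
import numpy as np

def transform(points, angle, scale, shift):
    """Rotate, scale and translate an (N, 2) array of (row, col) contour points."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return scale * points @ rot.T + shift

def similarity(points, edge_image):
    """Hypothetical score: fraction of model points that land on an edge pixel."""
    idx = np.rint(points).astype(int)
    h, w = edge_image.shape
    # Points transformed outside the image count as misses.
    inside = (idx[:, 0] >= 0) & (idx[:, 0] < h) & (idx[:, 1] >= 0) & (idx[:, 1] < w)
    hits = edge_image[idx[inside, 0], idx[inside, 1]]
    return hits.sum() / len(points)

def match(model, edge_image, angles, scales, shifts, threshold=0.8):
    """Try every transformation; report a positive match if the best score exceeds the threshold."""
    best_score, best_pose = 0.0, None
    for angle in angles:
        for scale in scales:
            for shift in shifts:
                moved = transform(model, angle, scale, np.asarray(shift))
                score = similarity(moved, edge_image)
                if score > best_score:
                    best_score, best_pose = score, (angle, scale, tuple(shift))
    return best_score > threshold, best_score, best_pose
```

The exhaustive search over angles, scales, and shifts is the simplest possible strategy and is used here only for clarity; its cost grows with the number of transformations and with the number of models that have to be tried.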
In case-based object recognition a class of objects is represented by a generalized case for the purpose of efficient matching (Perner and Bühring 2004). If this representative case is not known a priori, it must be learned from real examples. Special problems arise if the objects of interest have a great variation; thus, one cannot generalize from one single case. A case base is necessary which describes the different appearances of the objects. However, even then it is not known in advance how many cases are necessary to detect all objects with a sufficiently high accuracy.

Clustering techniques can be used to mine for groups of similar cases in a set of acquired real examples. For each group it is possible to determine a generalized case to represent this group. Because we do not know the number of cases in advance, we will use hierarchical cluster analysis methods to learn a hierarchy of decreasingly generalized cases. If this hierarchy is applied for case-based object recognition, initially, a few strongly generalized cases from the top are matched with an unknown object. In the positive case, the object matches the case with some similarity. Then this object can be matched against the other increasingly specialized cases along the hierarchy until finally the object class with the highest similarity value can be determined. Thus, the organization of the case base in a hierarchical instead of a flat fashion might speed up the recognition process especially in case-based reasoning (CBR) applications with a large number of cases.

When learning a representative case of a cluster, this case should be averaged over all cases in this cluster by generalizing common properties of the instances. We offer two different approaches to calculate such a representative. While the first one is to learn an artificial case that is positioned in the centroid, the second one selects that case out of a cluster which has the minimum distance to all other cases in this cluster.
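The difference between the two kinds of representative can be made concrete with a short sketch. It rests on simplifying assumptions that are not taken from this paper: every case in a cluster is stored as an (N, 2) array of contour points, the cases are already aligned with point-to-point correspondence, and the distance between two cases is the mean Euclidean distance between corresponding points. The function names are hypothetical.

```python
import numpy as np

def centroid_case(cases):
    """Artificial representative: point-wise mean over a (K, N, 2) stack of aligned cases."""
    return cases.mean(axis=0)

def pairwise_distances(cases):
    """Assumed distance: mean Euclidean distance between corresponding contour points."""
    diff = cases[:, None, :, :] - cases[None, :, :, :]    # shape (K, K, N, 2)
    return np.linalg.norm(diff, axis=-1).mean(axis=-1)    # shape (K, K)

def medoid_case(cases):
    """Existing case with the minimum summed distance to all other cases in the cluster."""
    totals = pairwise_distances(cases).sum(axis=1)
    index = int(np.argmin(totals))
    return index, cases[index]
```

The centroid is in general a shape that was never observed, whereas the medoid is always one of the acquired cases; which of the two is preferable depends on whether an artificial shape is acceptable as a model in the application.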
© 2006 Blackwell Publishing, 350 Main Street, Malden, MA 02148, USA, and 9600 Garsington Road, Oxford OX4 2DQ, UK.