Proceedings of the 8th International Symposium on Mathematical Morphology, Rio de Janeiro, Brazil, Oct. 10–13, 2007, MCT/INPE, v. 1, p. 337–348.
http://urlib.net/dpi.inpe.br/ismm@80/2007/04.13.23.19

Design of robust pattern classifiers based on optimum-path forests

João P. Papa 1, Alexandre X. Falcão 1, Paulo A. V. Miranda 1, Celso T. N. Suzuki 1 and Nelson D. A. Mascarenhas 2

1 Instituto de Computação (IC), Universidade Estadual de Campinas (Unicamp), SP, Brazil
{jpaulo,afalcao}@ic.unicamp.br
2 Departamento de Computação, Universidade Federal de São Carlos (UFSCar), SP, Brazil
nelson@dc.ufscar.br

Abstract

We present a supervised pattern classifier based on optimum path forests. The samples in a training set are nodes of a complete graph, whose arcs are weighted by the distances between sample feature vectors. Training builds a classifier from key samples (prototypes) of all classes, where each prototype defines an optimum path tree whose nodes are the samples most strongly connected to it. The optimum paths are also used to label unseen test samples with the classes of their most strongly connected prototypes. We show how to find prototypes that yield no classification errors in the training set and propose a learning algorithm to improve accuracy over an evaluation set. The method is robust to outliers, handles non-separable classes, and can outperform support vector machines.

Keywords: supervised classifiers, image foresting transform, image analysis, morphological pattern recognition.

1. Introduction

Pattern classification methods are generally divided into supervised and unsupervised according to their learning algorithms [9]. Unsupervised techniques assume no knowledge about the classes (labels) of the samples in the training set, whereas supervised techniques exploit these labels. We propose a method to design supervised pattern classifiers based on optimum path forests (OPF).
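As a minimal sketch of the graph described in the abstract, the following Python fragment builds the complete weighted graph over a few training samples. The Euclidean distance, the function name `complete_graph`, and the toy feature vectors are assumptions made for illustration only; the text states just that arcs are weighted by the distances between sample feature vectors.

```python
import math
from itertools import combinations


def complete_graph(features):
    """Return {(i, j): weight} for every unordered pair i < j of samples.

    Assumption: the arc weight is the Euclidean distance between the
    feature vectors of the two samples (any distance could be used).
    """
    arcs = {}
    for i, j in combinations(range(len(features)), 2):
        arcs[(i, j)] = math.dist(features[i], features[j])
    return arcs


# Hypothetical 2-D feature vectors of three training samples.
samples = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
arcs = complete_graph(samples)

for (i, j), weight in sorted(arcs.items()):
    print(f"arc ({i}, {j}): weight {weight:.1f}")
```

A complete graph over n samples has n(n-1)/2 arcs, so in practice the weights may be computed on demand rather than stored, which is a design choice this sketch does not address.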
* pavmbr@yahoo.com.br, celso.suzuki@gmail.com

The design of an OPF classifier is based on labeled samples from training and evaluation sets. A test set with unseen samples is used to assess the performance of the classifier. The training samples are nodes of a complete graph in the sample feature space (all pairs of nodes are connected by one arc). See Figure 1(a). The arcs