Proceedings of the 8th International Symposium on Mathematical Morphology, Rio de Janeiro, Brazil, Oct. 10–13, 2007, MCT/INPE, v. 1, p. 337–348.
http://urlib.net/dpi.inpe.br/ismm@80/2007/04.13.23.19

Design of robust pattern classifiers based on optimum-path forests

João P. Papa 1, Alexandre X. Falcão 1, Paulo A. V. Miranda *,1, Celso T. N. Suzuki †,1 and Nelson D. A. Mascarenhas 2

1 Instituto de Computação (IC), Universidade Estadual de Campinas (Unicamp), SP, Brazil
{jpaulo,afalcao}@ic.unicamp.br
2 Departamento de Computação, Universidade Federal de São Carlos (UFSCar), SP, Brazil
nelson@dc.ufscar.br

Abstract

We present a supervised pattern classifier based on optimum-path forests. The samples in a training set are nodes of a complete graph, whose arcs are weighted by the distances between sample feature vectors. Training builds a classifier from key samples (prototypes) of all classes, where each prototype defines an optimum-path tree whose nodes are the samples most strongly connected to it. The optimum paths are also used to label unseen test samples with the classes of their most strongly connected prototypes. We show how to find prototypes that yield no classification errors in the training set and propose a learning algorithm to improve accuracy over an evaluation set. The method is robust to outliers, handles non-separable classes, and can outperform support vector machines.

Keywords: supervised classifiers, image foresting transform, image analysis, morphological pattern recognition.

1. Introduction

Pattern classification methods are generally divided into supervised and unsupervised according to their learning algorithms [9]. Unsupervised techniques assume no knowledge about the classes (labels) of the samples in the training set, while supervised techniques exploit these labels. We propose a method to design supervised pattern classifiers based on optimum-path forests (OPF).
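As a minimal sketch of the graph representation described in the abstract, the following Python fragment builds the arc weights of a complete graph over a small training set. The function name complete_graph_weights, the toy samples, and the choice of Euclidean distance are assumptions made for illustration only; the text above says just that arcs are weighted by the distances between feature vectors.

```python
import math


def complete_graph_weights(samples):
    """Return an n x n matrix w where w[i][j] is the weight of arc (i, j).

    Every pair of distinct samples is connected (complete graph).
    Euclidean distance is an assumed metric, not one fixed by the paper text.
    """
    n = len(samples)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(samples[i], samples[j])  # distance between feature vectors
            w[i][j] = w[j][i] = d  # the graph is undirected, so weights are symmetric
    return w


# Hypothetical toy training set: three 2-D feature vectors.
training = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
weights = complete_graph_weights(training)
```

Storing the full matrix costs quadratic memory in the number of training samples, which is acceptable for a sketch; computing distances on demand is an equally valid design.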
* pavmbr@yahoo.com.br
† celso.suzuki@gmail.com

The design of an OPF classifier is based on labeled samples from training and evaluation sets. A test set with unseen samples is used to assess the performance of the classifier. The training samples are nodes of a complete graph in the sample feature space, in which every pair of nodes is connected by one arc (see Figure 1(a)). The arcs