Experimental Evaluation of Multiplicative Kernel SVM Classifiers for Multi-Class Detection

Valentina Zadrija
Mireo d.d., Zagreb, Croatia
Email: valentina.zadrija@mireo.hr

Siniša Šegvić
Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
Email: sinisa.segvic@fer.hr

Abstract—We consider the multi-class object detection approach based on a non-parametric multiplicative kernel, which provides both separation against backgrounds and feature sharing among foreground classes. The training is carried out through the SVM framework. According to the obtained support vectors, a set of linear detectors is constructed by plugging the foreground training samples into the multiplicative kernel. However, evaluating the complete set would be inefficient at runtime, so the number of detectors has to be reduced. We propose to reduce that number in a novel way, by an appropriate detector selection procedure. The proposed detection approach has been evaluated on the Belgian traffic sign dataset. The experiments show that detector selection succeeds in reducing the number of detectors to half the number of object classes. We compare the obtained performance to the results of other detection approaches and discuss the properties of our approach.

I. INTRODUCTION

A long-standing goal of computer vision has been to design a system capable of detecting various classes of objects in cluttered scenes. Traditionally, this task has been solved by building a dedicated detector for each class. This approach requires a significant number of examples per class, which may not be available. In order to overcome the problem, additional partitioning into subclasses can be performed. However, domain-based partitioning may not be optimal for the task. In the case of multi-view object detection, it can also be time consuming and error prone.
Therefore, it would be desirable to omit the manual partitioning stage and embed the process into the classifier itself.

Another interesting idea is to train a single classification function for all classes jointly. This approach may exploit feature sharing among classes in order to improve classification against backgrounds. Feature sharing offers great potential for: (i) improving the detection rate for classes with a low number of examples, and (ii) reducing the runtime computational complexity.

In this paper, we train a joint classification function with the multiplicative kernel as presented in [1]. However, in contrast to [1], where the authors aim to solve the detection and recognition problems at the same time, we focus on the task of detection. Once the object locations are known, the object class can be determined for those locations only, thus reducing the runtime complexity.

Multi-class detection and feature sharing are achieved by means of a non-parametric multiplicative kernel. The approach avoids the partitioning into subclasses by using the foreground training samples as class membership labels. More details are given in section III. After the training, we construct a set of linear detectors as described in section III-A. According to [1], each detector corresponds to a single foreground training sample, which makes the detector set extremely large and inefficient for the detection task.

The contributions of our work are as follows: (i) we propose an efficient detector selection in order to identify a representative set of detectors out of the large initial pool, as described in section III-B, and (ii) we show the properties of the multiplicative kernel method and compare the results with other methods on the Belgian traffic sign dataset (BTSD) [2], as described in section IV.

II. RELATED WORK

In recent years, a considerable body of work has been published in the area of multi-class object detection.
Among these, the work presented in [3] achieved a significant breakthrough in the area. The authors consider multiple overlapping subsets of classes. The experiments have shown that a jointly trained classification function requires a significantly smaller number of features in order to achieve the same performance as independent detectors. More specifically, the number of features grows logarithmically with respect to the number of subclasses.

The approach presented in [4] employs a tree-based classifier structure called Cluster Boosted Tree (CBT) in order to solve the multi-view detection problem. The tree structure is constructed automatically according to the weak learners selected by the boosting algorithm. The node splits are achieved by means of unsupervised clustering. Therefore, in contrast to [3], this approach does not require manual partitioning into classes, but it implies hierarchical feature sharing.

The authors in [5] consider a classifier structure composed of several boosted classifiers. This approach also avoids manual partitioning into classes, but the classifiers do not share weak learners. Initially, the training data is partitioned randomly into subsets. At each round of training, each sample is (re)assigned to the subset corresponding to the classifier that has the highest probability of classifying that sample. The resulting classifiers are further transformed into decision trees, which reduce the average number of weak learner evaluations during classification.

The concept of feature sharing is also explored through shape-based hierarchical compositional models [6], [7]. Different object categories can share parts or an appearance.

Proceedings of the Croatian Computer Vision Workshop, Year 2, September 16, 2014, Zagreb, Croatia. CCVW 2014, Image and Video Analysis.

The