Fisheries Research 111 (2011) 170–176

Journal homepage: www.elsevier.com/locate/fishres

A comparison of multi-class support vector machine and classification tree methods for hydroacoustic classification of fish-schools in Chile

H. Robotham a,*, J. Castillo b,1, P. Bosch a,2, J. Perez-Kallens a,3

a Facultad de Ingeniería, Instituto de Ciencias Básicas, Universidad Diego Portales, Avenida Ejército 441, Santiago, Chile
b Depto de Evaluaciones Directas, Instituto de Fomento Pesquero, Blanco 839, Valparaíso, Chile

Article info

Article history:
Received 4 May 2011
Received in revised form 28 July 2011
Accepted 31 July 2011

Keywords: Classification tree; Support vector machines; Species identification; Hydroacoustics; Fish

Abstract

The purpose of this study was to compare the results of the multi-class support vector machines (SVM) classification method to those of the classification tree (CART) method for automatic classification of fish schools. The discrimination study was done using descriptors of morphology, bathymetry, energy, and space positions for schools of three species: anchovy (Engraulis ringens), common sardine (Strangomera bentincki), and jack mackerel (Trachurus murphyi), from acoustic data in southern-central Chile. The average classification rates were 86.8% with classification trees and 89.5% with SVM. The levels of importance of the descriptors presented by the two methods are not fully concordant (Kendall’s rank coefficient of concordance is 0.77). However, the two methods agree on the groups of descriptors considered as effective for classification. The bottom depth descriptor was the most important for classification trees, while the school-altitude index was the most important for SVM. This highlights the importance of the bathymetric and positional descriptors in the classification of species compared to energetic and morphometric descriptors.
Advantages and disadvantages of the methods are presented. Classification trees have the advantage over SVM of being easier to implement and interpret, but they perform less well. One major problem with trees is their high degree of variance. Because each classification method has its own performance, limitations and advantages, a good practice is to use two or more classifiers.

© 2011 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +56 2 6762416; fax: +56 2 6762402.
E-mail addresses: hugo.robotham@udp.cl (H. Robotham), jorge.castillo@ifop.cl (J. Castillo), paul.bosch@udp.cl (P. Bosch), jaime.perez@udp.cl (J. Perez-Kallens).
1 Tel.: +56 32 2151474; fax: +56 32 2151465.
2 Tel.: +56 2 6762409; fax: +56 2 6762402.
3 Tel.: +56 2 6762420; fax: +56 2 6762402.

1. Introduction

Acoustic techniques are widely used around the world to study the behaviour of fish and estimate their abundance and distribution. These techniques have made significant progress in recent decades with the development of faster computers, new transducers and microelectronics. Despite the technological advances in acoustic devices, with improvements in detection capacity and computer processing, identifying species directly by acoustics remains a challenge (MacLennan and Holliday, 1996; Horne, 2000; Fernandes et al., 2006; Trenkel et al., 2008). Echograms provide information about the size, location and echo intensity of fish schools; however, the species composition is not directly known (Fernandes, 2009). One approach to this problem is to apply algorithms that use different parameters in the post-processing of acoustic signals to identify species. The possibility of identifying the species of individual fish schools acoustically rests on the assumption that the schooling process reflects differences in behaviour among species, allowing an inference about the species.
Acoustic species identification will probably not be successful in discriminating between two different species forming schools.

Species are usually classified by scrutinizing the echograms, using expert criteria and additional information from trawl sampling (Simmonds and MacLennan, 2005). Because this procedure incorporates some degree of subjectivity in interpretation, other approaches have been developed based on the information provided by echosounders. Species identification based on school descriptors of morphology, bathymetry, energy and geographical position extracted from single-frequency, single-beam acoustic data represents one approach (Scalabrin et al., 1996). A second approach uses multi-frequency acoustic data (Korneliussen et al., 2009), combined with information about the morphological and geographical distribution of fish species.

doi:10.1016/j.fishres.2011.07.010

A wide range of classification models has been used to classify fish schools based on acoustic/school descriptors: principal component analysis and discriminant-function analysis (Nero and Magnuson, 1989; Vray et al., 1990; Scalabrin et al., 1996; Lawson et al., 2001); artificial neural networks (Haralabous and Georgakarakos, 1996; Simmonds et al., 1996; Cabreira et al., 2009);