Sensors Letter

Mapping Utility Poles in Aerial Orthoimages Using ATSS Deep Learning Method

Matheus Gomes 1, Jonathan Silva 2, Diogo Gonçalves 2, Pedro Zamboni 1, Jader Perez 1, Edson Batista 1, Ana Ramos 3, Lucas Osco 3, Edson Matsubara 2, Jonathan Li 4, José Marcato Junior 1,* and Wesley Gonçalves 1,2

1 Faculty of Engineering, Architecture and Urbanism and Geography, Federal University of Mato Grosso do Sul, Campo Grande 79070900, Brazil; matheusmbg.eng@gmail.com (M.G.); mail.pedrozamboni@gmail.com (P.Z.); jaderluc@gmail.com (J.P.); edson.ufms@gmail.com (E.B.); wesley.goncalves@ufms.br (W.G.)
2 Faculty of Computer Science, Federal University of Mato Grosso do Sul, Campo Grande 79070900, Brazil; jonathan.andrade@ufms.br (J.S.); diogo.goncalves@ufms.br (D.G.); edsontm@facom.ufms.br (E.M.)
3 Post-Graduate Program of Environment and Regional Development, University of Western São Paulo, Presidente Prudente 18067175, Brazil; anaramos@unoeste.br (A.R.); lucasosco@unoeste.br (L.O.)
4 Department of Geography and Environmental Management and Department of Systems Design Engineering, University of Waterloo (UW), Waterloo, ON N2L3G1, Canada; junli@uwaterloo.ca
* Correspondence: jose.marcato@ufms.br

Received: 21 September 2020; Accepted: 10 October 2020; Published: 26 October 2020

Abstract: Mapping utility poles using side-view images acquired with car-mounted cameras is a time-consuming task, mainly in larger areas, due to the need for street-by-street surveying. Aerial images cover larger areas and can be a feasible alternative, although the detection and mapping of utility poles in urban environments using top-view images is challenging. Thus, we propose the use of Adaptive Training Sample Selection (ATSS) for detecting utility poles in urban areas, since it is a novel method that has not yet been investigated in remote sensing applications.
Here, we compared ATSS with Faster Region-based Convolutional Neural Networks (Faster R-CNN) and Focal Loss for Dense Object Detection (RetinaNet), methods currently used in remote sensing applications, to assess the performance of the proposed methodology. We used 99,473 patches of 256 × 256 pixels with a ground sample distance (GSD) of 10 cm. The patches were divided into training, validation and test datasets in approximate proportions of 60%, 20% and 20%, respectively. As the utility pole labels are point coordinates and the object detection methods require a bounding box, we assessed the influence of the bounding box size on the ATSS method by varying the dimensions from 30 × 30 to 70 × 70 pixels. For the proposed task, our findings show that ATSS is, on average, 5% more accurate than Faster R-CNN and RetinaNet. For a bounding box size of 40 × 40, we achieved an Average Precision at an intersection over union of 50% (AP50) of 0.913 for ATSS, 0.875 for Faster R-CNN and 0.874 for RetinaNet. Regarding the influence of the bounding box size on ATSS, our results indicate that the AP50 is about 6.5% higher for 60 × 60 compared to 30 × 30. For AP75, this margin reaches 23.1% in favor of the 60 × 60 bounding box size. In terms of computational cost, all the methods tested remain at the same level, with an average processing time of around 0.048 s per patch. Our findings show that ATSS outperforms the other methodologies and is suitable for developing operational tools that can automatically detect and map utility poles.

Keywords: object detection; convolutional neural network; utility pole detection

Sensors 2020, 20, 6070; doi:10.3390/s20216070 www.mdpi.com/journal/sensors
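The abstract describes converting each point-coordinate pole label into a square bounding box (30 × 30 to 70 × 70 pixels) within a 256 × 256 patch. The following is an illustrative sketch of that conversion, not the authors' code; the function name `point_to_bbox` and the clipping-to-patch behavior are assumptions made for the example.

```python
# Illustrative sketch (not the authors' implementation): turn a pole's
# point label (x, y) into a square bounding box of side `size` pixels,
# clipped to the patch boundaries, as the abstract describes.

def point_to_bbox(x, y, size=40, patch=256):
    """Return (xmin, ymin, xmax, ymax) for a square box centered on (x, y),
    clipped to a patch of `patch` x `patch` pixels."""
    half = size // 2
    xmin = max(0, x - half)
    ymin = max(0, y - half)
    xmax = min(patch, x + half)
    ymax = min(patch, y + half)
    return xmin, ymin, xmax, ymax

# A pole near a patch border yields a box truncated at the edge,
# e.g. point_to_bbox(10, 10, size=40) gives (0, 0, 30, 30).
```

Boxes near the patch border become smaller than the nominal size; how such truncated boxes are handled during training is one of the factors the bounding-box-size experiment (30 × 30 up to 70 × 70) probes.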