IEEE Transactions on Intelligent Transportation Systems

Abstract— Automatic crack detection on pavement surfaces is an important research field in the development of intelligent transportation infrastructure systems. In this paper, a novel method based on a conditional Wasserstein generative adversarial network (cWGAN) is proposed for road crack detection. A 121-layer densely connected neural network with deconvolution layers for multi-level feature fusion is used as the generator, and a 5-layer fully convolutional network is used as the discriminator. To overcome the scattered output issue related to deconvolution layers, connectivity maps are introduced to represent the crack information within the proposed cWGAN. The proposed method is tested on a dataset collected from a moving vehicle equipped with a commercial-grade high-speed camera. This dataset is challenging because the images containing cracks also include disturbances from other objects. The results show that the proposed method achieves state-of-the-art performance compared with existing methods in terms of precision, recall and F1 score.

Index Terms— Crack detection; deep learning; conditional Wasserstein generative adversarial network; connectivity map

I. INTRODUCTION

Cracks on road surfaces are early signs of potential damage in the pavements and in the supporting structures. They serve as a good indicator of the current condition of the transportation infrastructure. Defects in road surfaces may delay traffic and, if severe, even cause safety issues. In addition, our road infrastructure must be improved significantly to support the autonomous vehicles of the future in the context of smart cities. The current common practice in road surface surveys is mainly based on visual inspection, which has limitations such as high cost and low efficiency. Defects such as cracks or potholes may remain for a considerable amount of time before they are repaired.
In this context, automating crack and defect detection on pavement surfaces is invaluable, and a vast amount of research has been conducted in this field [1, 2]. Image-based methods using cameras are among the most promising approaches for automated crack/defect detection because cameras are inexpensive and widely accessible [3]. However, distinguishing cracks from the background in images is a challenging task. It is difficult to find a general approach that works for most pavement surfaces because cracks usually have irregular shapes, illumination conditions vary between images, and noise such as stains or shadows from other objects can interfere with the analysis.

In recent years, deep learning based methods have attracted much attention due to their superior performance in object detection. Girshick et al. [4] introduced a deep neural network called the region-based convolutional neural network (R-CNN). They introduced the concept of region proposals to avoid having to evaluate a very large number of candidate regions. Their method performs detection in two steps: it first generates a series of candidate regions and then conducts classification and regression on these proposed regions. A number of other algorithms were inspired by the idea of R-CNN [5-7]. Fast R-CNN [5] feeds the whole image to a convolutional neural network (CNN) once and identifies the proposed regions on the resulting feature map, which improves performance and reduces computation time. Liu et al. [8] developed the single shot multibox detector (SSD) algorithm for real-time object detection. Instead of performing two-step detection, SSD speeds up the detection process by eliminating the region proposal network. It achieved performance similar to that of R-CNN at significantly higher speed. Another widely used object detection algorithm is you only look once (YOLO) [9, 10], which has evolved to its third generation with many improvements.
YOLO is also a real-time object detection algorithm.

Researchers have attempted to apply various deep learning algorithms to crack detection [11-13]. However, since cracks have no fixed shape and usually have extremely large aspect ratios, crack detection is very different from other object detection tasks. Also, publicly available datasets specifically designed to evaluate crack detection algorithms are limited. Furthermore, most of these datasets are simplified compared with the conditions encountered in real life. For example, some datasets were collected under controlled lighting conditions [14], some manually exclude any disturbance and focus only on pavement surfaces using static images [15-17], and some were created for other algorithms and simply do not have enough images for deep learning [1, 16].

In this paper, a novel deep learning algorithm based on a conditional Wasserstein generative adversarial network (cWGAN) is proposed to detect cracks at the pixel level. The algorithm is pretrained on a large general dataset called ImageNet [18] and on a small crack dataset called CFD [15], and is then trained and tested on a new dataset, EdmCrack600.

A Conditional Wasserstein Generative Adversarial Network for Pixel-level Crack Detection using Video Extracted Images
Qipei Mei, Mustafa Gül

Qipei Mei and Mustafa Gül are with the Department of Civil and Environmental Engineering, University of Alberta, Edmonton, AB T6G 2R3 Canada (email: qipei@ualberta.ca; mustafa.gul@ualberta.ca).

The EdmCrack600 dataset includes 600 images extracted from