Lane Discovery in Traffic Video

Nicholas Miller, David M. Swart, Akshaya Mishra, Andrew Achkar
Miovision, Kitchener, ON, Canada

Abstract

Video sensing has become very important in Intelligent Transportation Systems (ITS) due to its relatively low cost and non-invasive deployment. An effective ITS requires detailed traffic information, including vehicle volume counts for each lane in surveillance video of a highway or an intersection. The multiple-target vehicle-tracking and counting problem is most reliably solved in a reduced space defined by the constraints of vehicles driving within lanes. This requires lanes to be pre-specified. An off-line pre-processing method is presented which automatically discovers traffic lanes from vehicle motion in uncalibrated video from a stationary camera. A moving-vehicle density map is constructed, then multiple lane curves are fitted. Traffic lanes are found without relying on possibly noisy tracked vehicle trajectories.

1 Introduction

In computer vision, the heaviest focus on traffic lane detection is in the autonomous driving application. Only the driver's own lane boundaries and adjacent lanes need to be detected, and lanes must be localized in real-world space from video image space. Hence the camera for this set-up is typically a calibrated, front-facing camera, relatively close to the road. The standard approaches often rely on line fitting to edges from road markings which are clearly visible to the driver from this vantage point [1, 2]. Ego motion in the image is present, and initial lane detections are tracked and updated.

Another area where lane detection is required is vehicle counting from a statically mounted traffic surveillance camera. Often, in the computer vision literature, traffic lanes are specified manually as a region of interest [4, 5, 6, 7, 8, 9].
This may be sufficient in some situations where only total volumes are required, but many ITS applications require a precise breakdown of counts for each lane. Yet manual lane annotation can prove to be error prone, especially when lane markings are unclear and only a small segment of video or a still frame is presented to the user, which does not fully demonstrate vehicle paths for every possible lane.

For the vehicle counting application, it is sufficient to fit the traffic lanes in image space only. This avoids the need for calibration and extrinsic camera parameter estimation. Furthermore, if lane detection can be performed off-line as a pre-processing step, lanes may be discovered and traced from the motion of detected vehicles. This results in more reliable lane placement than relying on possibly noisy, degraded, or invisible edge markings. Lane detection from moving vehicles was performed in [3] by tracking trajectories and simple clustering of fitted lines. This assumes that lanes are straight on a planar road and does not address noise due to vehicle tracking errors. Most other automatic lane detection work assumes straight planar lanes and employs manual annotation, camera calibration, or homographies [9].

2 Methodology

The objective of this article is to use the positions of vehicles as they move through the image in order to discover all of the traffic lanes. The general approach is to tightly localize vehicles in every frame and register their positions over time using a density image. A random search is used to fit multiple curve models to the density image and thereby identify each separate lane as an individual curve. See Fig. 1.

Since traffic cameras are mounted upright, vehicles are vertically aligned, with their tops appearing above and their bottoms below in the image. There may be a variety of vehicles of different heights, so the locations of vehicle tops and centroids may vary in the image.
Nonetheless, the defining characteristic of all vehicles traveling inside a particular lane is the curve traced out by the bottom points of their apparent boundaries as they move through the image; see Fig. 2.

Fig. 1: Example traffic video. Three lanes (blue) are discovered from moving vehicles. Vehicle density is also overlaid (red).

Fig. 2: A demonstration of the appearance of various sized vehicles (car, van, transport truck) in the image. The fronts of several vehicles are shown, all from the same lane. The vehicle tops have various image projections, but the bottoms all converge near one image point. This may be used to define the location of the lane in the image.

A greedy random search is employed to fit multiple quadratic spline models to the density image. It is similar to sequentially applying Random Sample Consensus (RANSAC) to fit multiple models by removing fitted samples, with a few key differences to address the well-known pitfalls of the greedy sequential RANSAC strategy [11, 12].

2.1 Vehicle Density Image

Vehicle trajectories provide the locations of the lanes in which they travel. Unfortunately, solving the multiple-target tracking and data association problem reliably enough to discover lanes from vehicle trajectories proves difficult in practice. Furthermore, especially for vehicle counting, reliable identification and tracking are best done in a reduced space where the traffic lanes are already identified [6, 7, 10].

In order to discover lanes from vehicle motion without explicitly tracking them, each vehicle is independently localized in every frame of video. Vehicles are detected using a deep-learning vehicle contour detector which is fused with background subtraction [8]. This provides a precise closed boundary contour around each moving vehicle, from which the bottom point can be located in each frame. These points are added as samples to a vehicle motion density image.
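As a rough illustration of the pipeline described above (not the authors' implementation), the density accumulation and a greedy, RANSAC-style search for quadratic lane curves can be sketched as follows. The Gaussian smoothing width, the lateral band used for scoring and suppression, the trial count, and the curve parameterization x = ay² + by + c are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def build_density(points, shape, sigma=2.0):
    """Accumulate vehicle-bottom points (x, y) into a density image of the
    given (height, width), then blur with an assumed Gaussian width so
    nearby detections reinforce one another."""
    h, w = shape
    density = np.zeros((h, w), dtype=np.float64)
    for x, y in points:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= xi < w and 0 <= yi < h:
            density[yi, xi] += 1.0
    # Separable Gaussian blur via 1-D convolutions along each axis.
    r = int(3 * sigma)
    k = np.exp(-0.5 * (np.arange(-r, r + 1) / sigma) ** 2)
    k /= k.sum()
    density = np.apply_along_axis(np.convolve, 0, density, k, mode="same")
    density = np.apply_along_axis(np.convolve, 1, density, k, mode="same")
    return density

def fit_lanes(density, n_lanes, trials=500, band=4):
    """Greedy sequential search: fit one quadratic x = a*y^2 + b*y + c at a
    time by random density-weighted sampling, score candidates by the density
    mass inside a lateral band, then suppress the winner's support before
    searching for the next lane."""
    density = density.copy()
    h, w = density.shape
    rows = np.arange(h)
    ys, xs = np.nonzero(density)
    offs = np.arange(-band, band + 1)
    lanes = []
    for _ in range(n_lanes):
        weights = density[ys, xs]
        p = weights / weights.sum()
        best_score, best_coef = -np.inf, None
        for _ in range(trials):
            # Three density-weighted samples determine one quadratic in y.
            idx = rng.choice(len(xs), size=3, replace=False, p=p)
            if len(set(ys[idx])) < 3:  # need distinct rows for a unique fit
                continue
            coef = np.polyfit(ys[idx].astype(float), xs[idx].astype(float), 2)
            cols = np.clip(np.round(np.polyval(coef, rows)).astype(int), 0, w - 1)
            grid = np.clip(cols[:, None] + offs[None, :], 0, w - 1)
            score = density[rows[:, None], grid].sum()
            if score > best_score:
                best_score, best_coef = score, coef
        lanes.append(best_coef)
        # Zero the fitted lane's band so the next search finds a new lane.
        cols = np.clip(np.round(np.polyval(best_coef, rows)).astype(int), 0, w - 1)
        grid = np.clip(cols[:, None] + offs[None, :], 0, w - 1)
        density[rows[:, None], grid] = 0.0
    return lanes
```

A single quadratic per lane stands in here for the quadratic splines of the paper, and the fixed suppression band stands in for the paper's unspecified safeguards against greedy sequential RANSAC's pitfalls; both are simplifications.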
The resulting density image has