Joint optical flow estimation, segmentation, and 3D interpretation with level sets

H. Sekkati, A. Mitiche*

Institut National de la recherche Scientifique, INRS-EMT, Place Bonaventure, 800, de la Gauchetière Ouest, Suite 6900, Montréal, Que., Canada H5A1K6

Received 8 July 2005; accepted 16 November 2005
Available online 5 January 2006

Abstract

This paper describes a variational method with active curve evolution and level sets for the estimation, segmentation, and 3D interpretation of optical flow generated by independently moving rigid objects in space. Estimation, segmentation, and 3D interpretation are performed jointly. Segmentation is based on an estimate of optical flow consistent with a single rigid motion in each segmentation region. The method, which allows both viewing system and viewed objects to move, results in three steps iterated until convergence: (a) evolution of closed curves via level sets and, in each region of the segmentation, (b) linear least squares computation of the essential parameters of rigid motion, (c) estimation of optical flow consistent with a single rigid motion. The translational and rotational components of rigid motion and regularized relative depth are recovered analytically for each region of the segmentation from the estimated essential parameters and optical flow. Several examples with real image sequences are provided which verify the validity of the method.

© 2005 Elsevier Inc. All rights reserved.

Keywords: 3D structure; 3D motion; Segmentation; Optical flow; Variational method; Level sets

A major branch of picture processing deals with image analysis or scene analysis; ... the desired output is a description of the given picture or scene. Segmentation is basically a process of pixel classification; the picture is segmented into subsets by assigning the individual pixels to classes ... The approach discussed ... minimizes the expected classification error, ..., more generally, (an) expected cost ...
It should be pointed out that if range or velocity information is available for each pixel, obtained by special sensors or derived from stereopairs or image sequences, we can use this information as a basis for segmenting the picture. (At) an edge, the gray level is relatively consistent in each of two adjacent, extensive regions, and changes abruptly as the border between the regions is crossed ... this is a special case of pixel classification.

Azriel Rosenfeld in A. Rosenfeld and A. C. Kak, Digital picture processing, Academic Press, Second edition, 1982.

Foreword

Azriel Rosenfeld founded the field of image processing four decades ago. He identified and investigated fundamental subjects which are still the focus of intense research. The book Digital picture processing by A. Rosenfeld and A. C. Kak, published in 1976, was structured based on these subjects and described research in a textbook style. This made its contents informative, accessible, and attractive so as to bestir extensive and sustained research. This paper deals with image segmentation/edge-detection and description, subjects that the quotes from the book highlight. The basis for segmentation/edge-detection is the velocity field derived from an image sequence, and the sought description is the three-dimensional (3D) structure and motion of the objects in the observed scene. The velocity field and its 3D interpretation are derived from cost function

1077-3142/$ - see front matter © 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.cviu.2005.11.002
* Corresponding author. E-mail addresses: sekkati@emt.inrs.ca (H. Sekkati), mitiche@emt.inrs.ca (A. Mitiche).
www.elsevier.com/locate/cviu
Computer Vision and Image Understanding 103 (2006) 89–100