Large Occlusion Completion Using Normal Maps

Engin Tola, Andrea Fossati, Christoph Strecha, Pascal Fua

Computer Vision Laboratory
École Polytechnique Fédérale de Lausanne
Switzerland

Abstract. Dense disparity maps can be computed from wide-baseline stereo pairs but will inevitably contain large areas where the depth cannot be estimated accurately because the pixels are seen in one view only. A traditional approach to this problem is to introduce a global optimization scheme to fill in the missing information by enforcing spatial consistency, which usually means introducing a geometric regularization term that promotes smoothness. The world, however, is not necessarily smooth and we argue that a better approach is to monocularly estimate the surface normals and to use them to supply the required constraints. We will show that, even though the estimates are very rough, we nevertheless obtain more accurate depth maps than by enforcing smoothness. Furthermore, this can be done effectively by solving large but sparse linear systems.

1 Introduction

Though dense short-baseline stereo matching is well understood [1, 2], its wide-baseline counterpart is much more challenging due to large perspective distortions and increased occluded areas. It is nevertheless worth addressing because it can yield more accurate depth estimates while requiring fewer images to reconstruct a complete scene.

It has been shown that replacing traditional correlation windows by SIFT-like descriptors such as DAISY [3], which can be efficiently computed at every point in the image, yields more effective dense matching and therefore better depth maps for widely separated images. This, however, does not address the occlusion issue, that is, of pixels that are seen in one image only and to which it is difficult to assign a reliable depth value.

A traditional approach to solving this problem is to rely on a global optimization scheme to fill the resulting holes in the depth map by enforcing spatial consistency, which usually means introducing a geometric regularization term that promotes smoothness. The world, however, is not always smooth, especially in urban environments, and this approach often results in over-smoothing. In this paper, we show that we can improve upon this situation by estimating the normals in the occluded areas and using these estimates to fill in the holes in our depth maps more accurately. More specifically, even though single-view estimation of surface normals is known to be difficult, as evidenced by the fact that shape-from-texture remains