Towards Complete Free-Form Reconstruction of Complex 3D Scenes from an Unordered Set of Uncalibrated Images

H. Cornelius¹, R. Šára², D. Martinec², T. Pajdla², O. Chum², and J. Matas²

¹ Royal Institute of Technology (KTH), Department of Numerical Analysis and Computing Science, 100 44 Stockholm, Sweden
hugoc@nada.kth.se, http://www.nada.kth.se

² Center for Machine Perception, Czech Technical University, 166 27 Prague, Czech Republic
{sara,martid1,pajdla,chum,matas}@cmp.felk.cvut.cz, http://cmp.felk.cvut.cz

Abstract. This paper describes a method for accurate dense reconstruction of a complex scene from a small set of high-resolution unorganized still images taken by a hand-held digital camera. A fully automatic data processing pipeline is proposed. Highly discriminative features are first detected in all images. Correspondences are then found in all image pairs by wide-baseline stereo matching and used in a scene structure and camera reconstruction step that can cope with occlusion and outliers. Image pairs suitable for dense matching are automatically selected, rectified and used in dense binocular matching. The dense point cloud obtained as the union of all pairwise reconstructions is fused by local approximation using oriented geometric primitives. For texturing, every primitive is mapped onto the image with the best resolution. The global structure reconstruction in the first step allows us to work with an unorganized set of images and to avoid error accumulation. By using object-centered geometric primitives we preserve the flexibility of the method to describe complex free-form structures, the possibility to build the dense model incrementally, and the possibility to refine the cameras and the dense model by bundle adjustment. Results are demonstrated on partial models of a circular church and a sculpture by Henri de Miller.
We observed spatial resolution in the range of centimeters on objects of about 20 m in size.

Presented at the 2nd Workshop on Statistical Methods in Video Processing, SMVP 2004, Prague, May 2004. This is a pre-print of the paper that will appear in Springer LNCS proceedings.

1 Introduction

Building a geometric representation of a complex scene from a set of views is one of the classical Computer Vision problems. The task is to obtain a model that either (1) can be used to generate a novel view for a moving observer or (2) contains an explicit representation of the structure (3D topology and geometry) of the scene. The focus of this paper is on the latter. We present a method that obtains the 3D model from a small unordered set of uncalibrated images. This means that the