Mathematical Flaws in the Essential Matrix Theory

TAYEB BASTA
College of Computing
Al Ghurair University
PO Box 37374, Dubai Academic City, Dubai
UNITED ARAB EMIRATES
tayebasta@gmail.com

Abstract: - Extracting 3D structure from two views is a flourishing subject in the computer vision literature. In 1981 Longuet-Higgins introduced what seemed to be a mathematically founded theory that relates the corresponding points of the two images independently of the extrinsic camera parameters. Since then, a number of contributions based on this theory have emerged. Longuet-Higgins defined the world point in two different reference frames and derived a formula relating the image points defined in the two frames through a matrix. Trucco presented Longuet-Higgins’ solution by formulating the problem as the product of three planar vectors. He then derived an algebraic formula relating the two image points through an algebraic entity called the essential matrix. This matrix is independent of the position of the cameras used to capture the two views. In this paper I show that the reasoning of Longuet-Higgins in its first form is based on an undefined vector operation. His reasoning, as presented in Trucco, was misled by the assumption that the world reference frame is fixed onto the left camera frame. He did not take into account that (1) dividing the coordinates of a world point by its Z-coordinate yields a point belonging to a plane parallel to the XY-plane, and (2) fixing the world reference frame onto the left camera implies that the coordinates of any projection belong simultaneously to the world reference frame and the left camera frame. The contribution of this paper is to unveil a misconception in the theory of Longuet-Higgins’ algorithm that has remained hidden to date.

Key-Words: - Essential matrix, fundamental matrix, two-view, epipolar geometry, 3D extraction

1 Introduction

Extracting 3D structure from two views is a significant step in recovering 3D structure from a sequence of camera images. 
This theme in computer vision became prominent in 1981 when Longuet-Higgins published his famous algorithm in “A computer algorithm for reconstructing a scene from two projections” [3]. This algorithm was supplemented in 1992 by Faugeras’ article “What can be seen in three dimensions with an uncalibrated stereo rig?” [2]. Since then, a large number of related publications have appeared in the computer vision discipline. Most of that literature is concerned only with estimating the essential and fundamental matrices. The contribution of this paper is to unveil a misconception in the theory of Longuet-Higgins’ algorithm that has remained hidden to date.

The rest of the paper is organized as follows: Section 2 presents the basic theory of the essential and fundamental matrices. Section 3 introduces Longuet-Higgins’ algorithm. Section 4 identifies the misconception in Longuet-Higgins’ theory. Finally, Section 5 concludes the paper.

2 Essential and Fundamental Matrices

Correspondence methods for extracting 3D structure from two views of a given scene are based on epipolar geometry. This geometry is represented by a 3×3 singular matrix. The image points of a given world point are related by this matrix.

Recent Advances in Signals and Systems ISSN: 1790-5109 215 ISBN: 978-960-474-114-4

3D