A Comparison of Viewing Geometries for Augmented Reality

Dana Cobzas and Martin Jagersand

Computing Science, University of Alberta, Canada
WWW home page: www.cs.ualberta/~dana

Abstract. Recently, modern non-Euclidean structure and motion estimation methods have been incorporated into augmented reality scene tracking and virtual object registration. We present a study of how the choice of projective, affine or Euclidean scene viewing geometry, and of similarity, affine or homography based object registration, affects how accurately a virtual object can be overlaid in scene video from varying viewpoints. We found that projective and affine methods gave overlays accurate to within a few pixels, while Euclidean geometry obtained by auto-calibrating the camera was less accurate and gave an overlay error of about 15 pixels.

1 Introduction

In Augmented Reality (AR) a virtual object is registered with and visually overlaid into a video stream from a real scene[1]. In classical AR systems this is commonly achieved by a-priori geometric modeling for the registration and by using external devices (e.g. magnetic) to track camera pose. Using visual tracking through the real scene camera offers several advantages. Since the real and virtual cameras should ideally be the same, it avoids calibration to an unrelated external sensor. It also allows error minimization using image measurements directly between the real scene and the virtual object. Recent progress in geometric vision furthermore offers a variety of methods for auto-calibration and alignment of objects without needing any a-priori information.

These new methods introduce a variety of choices in building an AR system. First, under varying camera models, the scene-camera pose tracking can be done in a Euclidean[11], affine[7] or projective[10] formulation. Second, the VR object is normally given as an a-priori (Euclidean) graphics model, but in recent work captured image-based objects have also been inserted[2, 9].
Third, the transform which aligns the object can be a similarity[3], an affine transform or a homography[12].

An important consideration in designing a system is choosing the geometric representation for the above three parts so that the accuracy constraints of the task at hand are satisfied. This is perhaps particularly important in industrial applications, where AR can be used e.g. to overlay geometric guides for machining and assembly. In AR a relevant way to characterize accuracy is by the pixel reprojection error. Note that this is different from e.g. absolute errors in computed camera pose and scene structure, since some errors will cancel out when projected. However, AR is also different from pure re-projection. In the alignment