The Visual Computer manuscript No. (will be inserted by the editor)

Oleg Polonsky · Giuseppe Patané · Silvia Biasotti · Craig Gotsman · Michela Spagnuolo

What's in an Image? Towards the Computation of the "Best" View of an Object

This work is supported by the EU Network of Excellence AIM@SHAPE IST NoE No 506766.

O. Polonsky
Technion - Israel Institute of Technology
E-mail: olegp@cs.technion.ac.il

G. Patané, S. Biasotti, M. Spagnuolo
IMATI/CNR Genova
E-mail: {patane,biasotti,spagnuolo}@ge.imati.cnr.it

C. Gotsman
Harvard University
E-mail: gotsman@eecs.harvard.edu

Abstract There are many possible 2D views of a given 3D object, and most people would agree that some views are more aesthetic and/or more "informative" than others. It would therefore be very useful, in many applications, to be able to compute these "best" views automatically. Although all measures of the quality of a view will ultimately be subjective, and hence difficult to quantify, we propose some general principles which may be used to address this challenge. In particular, we describe a number of different ways to measure the goodness of a view, and show how to optimize these measures by reducing the size of the search space.

Keywords Visualization · View entropy · Scene composition

1 Introduction

The real world consists of three-dimensional objects. The human visual system, however, is limited by optics to viewing only their two-dimensional images. Stereo vision and perspective only partially overcome this limitation. Thus, a significant component of the geometric information about a 3D object is lost during the viewing transformation. This unfortunate fact is also reflected in traditional computer graphics applications, where we commonly see rendered 2D images. Although all the information about the 3D shape is known a priori (i.e., before the image is rendered), much is lost when the shape is projected onto the image plane, and the amount of preserved information depends on the eye (camera) position relative to the shape in that particular view.

In this paper, we focus on the quantification and measurement of the visual information present in an image of a 3D object, with the aim of finding optimal, or nearly optimal, views. It should be emphasized that the notion of the goodness of a view may depend on the particular visual task or application. For example, in an illustrated manual of work tools, people may prefer views where the tool is drawn in its typical position, as used by the machine operator. Object recognition tasks performed by a robot may require a totally different view to achieve the best performance. Nonetheless, we believe that there exists some common basis for all these visual problems.

Addressing these problems presents a significant challenge in the fields of visualization and shape understanding. A solution would be useful in several applications, such as automatic camera positioning in CAD, thumbnail generation for large 3D databases, automatic scene composition, technical illustration, and object recognition.

In this paper we propose the following methodology. First, define a view descriptor which attaches a score to a view of the object, taking into account its visible geometry (Section 3). Then, compute the value of this descriptor for a small number of candidate views (Section 4). We consider the view with the highest score to be the most informative. We describe a number of such descriptors, and show how to optimize them efficiently over the viewing sphere. We compare the views generated by these descriptors and discuss their performance (Sections 5 and 6).

2 Previous work

The question "What is a good view of an object?" dates back to the Greeks and Romans, who proposed some simple rules of thumb, e.g. the golden ratio, the rule of thirds, the rule of fifths, etc. [13].