Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
NPAR 2013, July 19 – 21, 2013, Anaheim, California. Copyright © ACM 978-1-4503-2198-3/13/07 $15.00

Towards effective evaluation of geometric texture synthesis algorithms

Zainab AlMeraj *, University of Waterloo, Kuwait University
Craig S. Kaplan, University of Waterloo
Paul Asente, Adobe

Abstract

In recent years, an increasing number of example-based Geometric Texture Synthesis (GTS) algorithms have been proposed. However, there have been few attempts to evaluate these algorithms rigorously. We are driven by this lack of validation and the simplicity of the GTS problem to look closer at perceptual similarity between geometric arrangements. Using samples from a geological database, our research first establishes a dataset of geometric arrangements gathered from multiple synthesis sources. We then employ the dataset in two evaluation studies. Collectively these empirical methods provide formal foundations for perceptual studies in GTS, insight into the robustness of GTS algorithms and a better understanding of similarity in the context of geometric texture arrangements.
CR Categories: I.3 [Computer Graphics]; I.5 [Pattern Recognition]: Design Methodology, Pattern Analysis

Keywords: non-photorealistic rendering, texture synthesis, 2D vector graphics, 2D visual perception, user studies, qualitative and quantitative evaluation methods

* e-mail: z.almeraj@gmail.com

1 Introduction

Example-based Geometric Texture Synthesis (GTS) refers to a class of algorithms that generate a large arrangement of vector elements from a small input arrangement called an exemplar. Roughly speaking, the goal is the same as it is with raster-based texture synthesis: the output arrangement should be judged by a human viewer to be “similar” to the exemplar. The challenge is to define similarity in a way that is rigorous enough to be formalized as an algorithm, while still conforming to human perceptual judgments.

We have seen a positive trend of applying formal evaluation methods in the validation of new algorithms in non-photorealistic rendering (NPR), but this trend has not caught on in the field of GTS. Many GTS algorithms have been proposed, all of which seem to produce reasonable results across a range of inputs. But at best, authors run their algorithm on an exemplar from a previous paper by others, and show the old and new outputs side by side. We believe that there is a need for effective evaluation strategies in GTS, which can be applied to compare existing algorithms and validate new ones. Hence our high-level goal in this paper is to establish a practical evaluation methodology for GTS algorithms.

AlMeraj et al. [2011] conducted the first study that probed the nature of similarity in the perception of geometric textures. Their investigation resulted in a descriptive list of visual features that people use to explain the similarity between synthesized arrangements and exemplars. Building on their work, this paper attempts to push our understanding of texture similarity even further.
We gather a comprehensive dataset of geometric textures (Section 3) from several different synthesis sources: expert human designers, state-of-the-art synthesis algorithms, and simple randomly generated textures. We then conduct two user studies based on this dataset (Sections 5–6), in order to see whether human judgments of similarity between synthesized textures and exemplars can be used to assess the performance of different synthesis sources. Using results from the studies, we attempt a small evaluation (Section 7). We believe that the dataset and the evaluation methodologies will be useful to others in the GTS field, and will suggest analogous studies that could be applied in other areas of NPR.

2 Related work

2.1 Geometric texture synthesis

Current GTS algorithms use various combinations of procedural growth, statistics and perceptual foundations to gather layout information about individual motifs from exemplars and use it to synthesize larger, similar arrangements.

Barla et al. [2006] were the first to contribute a 2D geometric texture synthesis algorithm. Their method applies a non-parametric statistical method to an exemplar to capture its spatial distribution. Hurtut et al. [2009] devised a statistical, appearance-based approach to GTS that models concepts from Gestalt grouping theory.

Alves dos Passos et al. [2010] and Ijiri et al. [2008] use similar procedural growth approaches to enhance the appearance of results for a variety of texture styles. The method by Jenny et al. [2010] synthesizes regular and irregular arrangements while simultaneously resolving overlaps and appearance issues.

The algorithm by Ma et al. [2011] is able to synthesize 2D and 3D results using a complex energy-based optimization process designed to mimic both appearance and distribution properties found in exemplars. A subsequent geometric synthesis algorithm by AlMeraj et al.
[2013] uses a patch-based method to achieve global and local distributions similar to those in the exemplar.

A recent statistical approach by Öztireli and Gross [2012] uses a second-order statistic called the Pair Correlation Function (PCF) as a guide to achieve global similarity. Given one or more exemplar inputs, they are able to synthesize 2D and 3D arrangements either by using a generalized dart throwing routine, or by fitting an arrangement to the PCF by gradient descent.

Öztireli and Gross offer quantitative evidence for their claims of similarity by including charts showing PCF curves and irregularity measures for synthesized and target arrangements. These quantitative measures reduce subjectivity in comparing synthesized arrangements, and move us a step closer towards understanding similarity in GTS. However, proving whether or not these statistical measures give an effective account of how humans judge similarity is difficult. In this paper, we address the subjectivity involved in similarity judgements and hope that our insights help researchers develop an appropriate definition of similarity for GTS in the future.
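
To make the notion of a second-order statistic such as the PCF more concrete, the following Python sketch estimates a pair correlation curve for a 2D point arrangement. It is only an illustration and is not taken from Öztireli and Gross [2012]: it assumes points in a periodic unit square, uses a plain histogram of pairwise distances, and uses basic rejection-based dart throwing rather than their generalized routine. The names `toroidal_dist`, `pair_correlation` and `dart_throwing`, and all parameter values, are hypothetical.

```python
import math
import random


def toroidal_dist(p, q):
    """Distance between two points in the unit square with wrap-around edges."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    return math.hypot(min(dx, 1.0 - dx), min(dy, 1.0 - dy))


def pair_correlation(points, r_max=0.25, n_bins=25):
    """Histogram estimate of the PCF (assumption: periodic unit square, r_max <= 0.5).

    Each bin count is divided by the count expected for uniformly random
    points, so values near 1 indicate no spatial structure at that distance.
    """
    n = len(points)
    dr = r_max / n_bins
    counts = [0] * n_bins
    for i in range(n):
        for j in range(i + 1, n):
            d = toroidal_dist(points[i], points[j])
            if d < r_max:
                counts[min(int(d / dr), n_bins - 1)] += 1
    n_pairs = n * (n - 1) / 2
    radii, g = [], []
    for k, count in enumerate(counts):
        r_lo, r_hi = k * dr, (k + 1) * dr
        annulus_area = math.pi * (r_hi ** 2 - r_lo ** 2)  # domain area is 1
        radii.append(0.5 * (r_lo + r_hi))
        g.append(count / (n_pairs * annulus_area))
    return radii, g


def dart_throwing(n, min_dist, rng, max_tries=100000):
    """Basic dart throwing: keep a random candidate only if it is at least
    min_dist away from every point accepted so far."""
    points = []
    tries = 0
    while len(points) < n and tries < max_tries:
        tries += 1
        candidate = (rng.random(), rng.random())
        if all(toroidal_dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
    return points


if __name__ == "__main__":
    rng = random.Random(0)
    uniform = [(rng.random(), rng.random()) for _ in range(400)]
    darts = dart_throwing(200, 0.04, rng)
    for name, pts in (("uniform", uniform), ("dart throwing", darts)):
        radii, g = pair_correlation(pts)
        print(name, [round(v, 2) for v in g[:6]])
```

For uniformly random points the estimated curve fluctuates around 1, while for the dart-thrown arrangement it is zero for distances below `min_dist`. Two arrangements with similar curves share a similar distribution of inter-element distances, although, as noted above, this need not coincide with how a human viewer judges similarity.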