PPTLens: Create Digital Objects with Sketch Images

Changcheng Xiao*
Shanghai Jiao Tong University, Shanghai, China
xchangcheng@gmail.com

Changhu Wang†
Microsoft Research, Beijing, China
chw@microsoft.com

Liqing Zhang
Shanghai Jiao Tong University, Shanghai, China
zhang-lq@cs.sjtu.edu.cn

* Changcheng Xiao performed this work while being an intern at Microsoft Research Asia.
† Corresponding author.

ABSTRACT
In this work, we introduce the PPTLens system, which converts sketch images captured by smart phones into digital flowcharts in PowerPoint. Unlike existing sketch recognition systems, which operate on hand-drawn strokes, PPTLens lets users supply sketch images directly as inputs. This is more challenging, since strokes extracted from sketch images may be very messy and also lack the temporal information of the drawing process. To implement the 'Image to Object' (I2O) scenario, we propose a novel sketch image recognition framework, including an effective stroke extraction strategy and a novel offline sketch parsing algorithm. By accepting sketch images as inputs, our system makes flowchart/diagram production much more convenient.

Categories and Subject Descriptors
H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing; H.5.2 [User Interfaces]: User-centered Design

Keywords
Offline Sketch Recognition; Sketched Flowchart Recognition

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author(s). Copyright is held by the owner/author(s).
MM'15, October 26–30, 2015, Brisbane, Australia.
ACM 978-1-4503-3459-4/15/10.
DOI: http://dx.doi.org/10.1145/2733373.2807974

1. INTRODUCTION
Drawing flowcharts and diagrams on a whiteboard or tablet is an effective way for users to convey and express information. Although the development of touch-screen devices has prompted increasing research on hand-drawn flowchart and diagram recognition, most work targets the interpretation of online drawings with temporal information (online sketch recognition), with little effort devoted to understanding sketch images captured by smart phones (offline sketch parsing). However, flowchart/diagram production would be much more convenient if sketch images could be used as inputs, especially given the popularity of smart phones.

Figure 1: Illustration of the PPTLens system (diagram components: User UI, Recognition Engine, Sketch Image, Digital Output). Users can take a photo of the sketched flowchart, which can be automatically converted to a digital flowchart in PowerPoint with shape symbols and arrows replaced by digital objects.

In spite of the increasing interest in building systems that can automatically recognize sketch images, there is still a huge gap between sketch images and digital objects. This gap mainly comes from the following two aspects:

1) The challenges of extracting strokes from sketch images. Unlike online sketch recognition [5], in which strokes can be obtained directly from the user interface, extracting strokes from sketch images is very challenging. On the one hand, sketch images may have uneven drawing quality and large variations in luminous intensity, resulting in low stroke extraction recall. On the other hand, the absence of drawing order results in a high error rate in stroke construction.

2) The challenges of offline sketch parsing. To recognize the symbols of a sketch, a typical approach in the literature is to first generate a number of candidate stroke groups and then recognize each group [3, 4]. Theoretically, for N strokes, there will be $2^N$ different stroke groups to recognize.
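As a rough illustration of this growth, the following Python sketch enumerates every non-empty subset of N strokes (2^N - 1 of them, i.e. on the order of 2^N) and counts how many survive a simple proximity filter. The filter, the function names, the use of stroke center points, and the `max_gap` threshold are all assumptions made for illustration; this is not the grouping strategy used by PPTLens or by the cited work.

```python
from itertools import combinations
from math import dist


def all_groups(n):
    """Yield every non-empty subset of stroke indices 0..n-1."""
    for k in range(1, n + 1):
        yield from combinations(range(n), k)


def is_spatially_connected(group, centers, max_gap):
    """Return True if the strokes in `group` form a single cluster when
    any two strokes whose centers are within `max_gap` are linked.
    (Hypothetical proximity test, used only for illustration.)"""
    group = list(group)
    seen = {group[0]}
    frontier = [group[0]]
    while frontier:
        i = frontier.pop()
        for j in group:
            if j not in seen and dist(centers[i], centers[j]) <= max_gap:
                seen.add(j)
                frontier.append(j)
    return len(seen) == len(group)


def count_groups(centers, max_gap):
    """Return (number of all candidate groups, number kept by the filter)."""
    total = 0
    kept = 0
    for group in all_groups(len(centers)):
        total += 1
        if is_spatially_connected(group, centers, max_gap):
            kept += 1
    return total, kept


# Four strokes forming two well-separated pairs.
centers = [(0, 0), (1, 0), (10, 0), (11, 0)]
total, kept = count_groups(centers, max_gap=1.5)
print(total, kept)
```

With four strokes there are 15 candidate groups, and in this toy layout only the four single strokes and the two nearby pairs pass the filter. Brute-force enumeration like this is feasible only for very small N, which is why the search space has to be pruned in practice.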
To avoid the exponentially growing recognition cost, several constraints are leveraged, such as temporal constraints [1] and spatial constraints [4]. However, the strokes extracted from sketch images might be very messy, resulting in many more groups. Moreover, the unavailability of the temporal constraint makes it harder to reduce the search space.

To bridge the gap between sketch images and digital objects, in this work, we first propose an effective stroke ex-