PPTLens: Create Digital Objects with Sketch Images

Changcheng Xiao*
Shanghai Jiao Tong University, Shanghai, China
xchangcheng@gmail.com

Changhu Wang
Microsoft Research, Beijing, China
chw@microsoft.com

Liqing Zhang
Shanghai Jiao Tong University, Shanghai, China
zhang-lq@cs.sjtu.edu.cn

* Changcheng Xiao performed this work while an intern at Microsoft Research Asia. Corresponding author.

ABSTRACT

In this work, we introduce the PPTLens system, which converts sketch images captured by smart phones into digital flowcharts in PowerPoint. Unlike existing sketch recognition systems, which are based on hand-drawn strokes, PPTLens lets users provide sketch images directly as inputs. This is more challenging because strokes extracted from sketch images may be very messy and also lack the temporal information of the drawing. To implement the 'Image to Object' (I2O) scenario, we propose a novel sketch image recognition framework, including an effective stroke extraction strategy and a novel offline sketch parsing algorithm. By accepting sketch images as inputs, our system makes flowchart/diagram production much more convenient.

Categories and Subject Descriptors

H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing; H.5.2 [User Interfaces]: User-centered Design

Keywords

Offline Sketch Recognition; Sketched Flowchart Recognition

1. INTRODUCTION

Drawing flowcharts and diagrams on a whiteboard or tablet is an effective way for users to convey and express information. Although the spread of touch-screen devices has prompted growing research on hand-drawn flowchart and diagram recognition, most work targets the interpretation of online drawings with temporal information (online sketch recognition), with little effort devoted to understanding sketch images captured by smart phones (offline sketch parsing). However, flowchart/diagram production would be much more
convenient if sketch images can be used as inputs, especially given the popularity of smart phones.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author(s). Copyright is held by the owner/author(s).
MM'15, October 26–30, 2015, Brisbane, Australia.
ACM 978-1-4503-3459-4/15/10.
DOI: http://dx.doi.org/10.1145/2733373.2807974

[Figure 1: Illustration of the PPTLens system. Users can take a photo of a sketched flowchart, which is automatically converted to a digital flowchart in PowerPoint, with shape symbols and arrows replaced by digital objects. Figure labels: Sketch Image, User UI, Recognition Engine, Digital Output.]

In spite of the increasing interest in building systems that can automatically recognize sketch images, there is still a huge gap between sketch images and digital objects. This gap mainly comes from the following two aspects: 1) The challenges of extracting strokes from sketch images. Unlike online sketch recognition [5], in which strokes can be obtained directly from the user interface, extracting strokes from sketch images is very challenging. On the one hand, sketch images may have uneven drawing quality and large variations of luminous intensity, resulting in low stroke extraction recall. On the other hand, the absence of drawing order leads to a high error rate in stroke construction. 2) The challenges of offline sketch parsing. To recognize the symbols of a sketch, a typical approach in the literature is to first generate a number of candidate stroke groups and then recognize each group [3, 4].
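The generate-then-recognize approach can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stroke representation (a list of (x, y) points), the helper names, and the proximity filter with its `max_gap` threshold are all assumptions made for illustration. The sketch enumerates every non-empty group of stroke indices, and optionally keeps only groups whose strokes form a single spatially connected cluster.

```python
from itertools import combinations

# Assumption: a stroke is a list of (x, y) points; this is an illustrative
# representation, not the one used by PPTLens.

def min_distance(a, b):
    """Smallest Euclidean distance between any point of stroke a and any point of stroke b."""
    return min(((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
               for ax, ay in a for bx, by in b)

def is_connected(group, strokes, max_gap):
    """True if the strokes in `group` form one cluster when strokes
    closer than `max_gap` are treated as linked."""
    group = list(group)
    seen = {group[0]}
    frontier = [group[0]]
    while frontier:
        i = frontier.pop()
        for j in group:
            if j not in seen and min_distance(strokes[i], strokes[j]) <= max_gap:
                seen.add(j)
                frontier.append(j)
    return len(seen) == len(group)

def candidate_groups(strokes, max_gap=None):
    """Yield candidate stroke groups as tuples of stroke indices.

    With max_gap=None every non-empty subset is produced; otherwise only
    spatially connected subsets are kept (a hypothetical pruning rule).
    """
    n = len(strokes)
    for size in range(1, n + 1):
        for group in combinations(range(n), size):
            if max_gap is None or is_connected(group, strokes, max_gap):
                yield group

# Two touching strokes near the origin and two touching strokes far away.
example_strokes = [
    [(0, 0), (1, 0)],
    [(1, 0), (1, 1)],
    [(10, 10), (11, 10)],
    [(11, 10), (11, 11)],
]
print(len(list(candidate_groups(example_strokes))))               # 15 groups
print(len(list(candidate_groups(example_strokes, max_gap=0.5))))  # 6 groups
```

Without any filter, the four example strokes already yield 15 candidate groups, each of which would have to be passed to a recognizer; the proximity filter reduces this to the four single strokes and the two touching pairs.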
Theoretically, for $N$ strokes, there are $2^N$ different stroke groups to recognize. To avoid the exponentially growing recognition cost, several constraints are leveraged, such as the temporal constraint [1] and the spatial constraint [4]. However, the strokes extracted from sketch images might be very messy, resulting in many more groups. Moreover, the unavailability of the temporal constraint makes it harder to reduce the search space.

To bridge the gap between sketch images and digital objects, in this work, we first propose an effective stroke ex-