International Journal of Multimedia and Ubiquitous Engineering
Vol.15, No.1 (2020), pp.35-48
http://dx.doi.org/10.21742/ijmue.2020.15.1.04
Print ISSN: 1975-0080, eISSN: 2652-1954
Copyright 2020 Global Vision Press (GV Press)

EOLO: Deep Machine Learning Algorithm for Embedded Object Segmentation that Only Looks Once

Longfei Zeng 1 and Sabah Mohammed 2
Lakehead University, Smart Health FabLab, Department of Computer Science, Canada
1 lzeng3@lakeheadu.ca, 2 mohammed@lakeheadu.ca

Abstract

In this paper, we introduce an anchor-free, single-shot instance segmentation method that is conceptually simple, consists of three independent branches, is fully convolutional, and can easily be embedded into mobile and embedded devices. Our method, referred to as EOLO, reformulates instance segmentation as the joint problem of predicting semantic segmentation and distinguishing overlapping objects, through instance center classification and 4D distance regression at each pixel. Moreover, we propose an effective loss function for sampling high-quality center-of-gravity examples and optimizing the 4D distance regression, which significantly improves mAP performance. Without any bells and whistles, EOLO achieves 27.7% mask mAP at IoU50 and reaches 30 FPS on a 1080Ti GPU, with single-model and single-scale training/testing on the challenging COCO2017 dataset. We first contrast the different views of instance segmentation taken by recent methods, in terms of the top-down, bottom-up, and direct-prediction paradigms. We then describe our model and present the related experiments and results. We hope that the proposed EOLO framework can serve as a fundamental baseline for the single-shot instance segmentation task in real-time industrial scenarios.

Keywords: Deep machine learning, Image segmentation, Instance segmentation, Embedded platforms

1. Introduction

Instance segmentation is a more complex task than object detection and semantic segmentation.
It requires predicting, for each instance, not only an approximate location but also a pixel-level segmentation. Recent instance segmentation networks tend to be lighter while trying to retain state-of-the-art performance. Although anchor-free, one-stage detectors have improved inference speed, these advanced algorithms are still not small enough, and their inference is too slow, for most industrial application scenarios. It remains a challenge to implement a faster and smaller instance segmentation network on a computationally limited platform. To break through this dilemma, this paper proposes an efficient and succinct instance segmentation network for embedded vision application scenarios.

Instance segmentation algorithms fall into four categories along two axes: two-stage versus one-stage paradigms, and top-down versus bottom-up paradigms. Mask R-CNN [1] and its derivatives follow the top-down, two-stage paradigm. They first detect objects with bounding boxes and

Article history: Received (March 2, 2020), Review Result (April 9, 2020), Accepted (May 11, 2020)
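To make the per-pixel formulation from the abstract concrete, the sketch below decodes a center-classification map and a 4D distance regression map into boxes. This is a minimal illustration, not the paper's actual implementation: it assumes an FCOS-style (left, top, right, bottom) encoding of the four distances, and the function name, stride, and threshold are hypothetical.

```python
import numpy as np

def decode_centers(center_scores, distances, stride=4, score_thresh=0.5):
    """Decode per-pixel 4D distance regressions (l, t, r, b) into boxes.

    center_scores: (H, W) center-classification scores in [0, 1].
    distances:     (H, W, 4) predicted distances, in input-image pixels.
    Returns boxes (N, 4) as [x1, y1, x2, y2] and their scores (N,).
    """
    ys, xs = np.nonzero(center_scores > score_thresh)
    # Map feature-map coordinates back to input-image coordinates.
    cx, cy = xs * stride, ys * stride
    l, t, r, b = distances[ys, xs].T
    boxes = np.stack([cx - l, cy - t, cx + r, cy + b], axis=1)
    return boxes, center_scores[ys, xs]

# Toy example: a single confident center pixel at feature cell (3, 4).
scores = np.zeros((8, 8)); scores[3, 4] = 0.9
dists = np.zeros((8, 8, 4)); dists[3, 4] = [5, 6, 7, 8]  # l, t, r, b
boxes, kept = decode_centers(scores, dists)
# cx = 4*4 = 16, cy = 3*4 = 12, so the box is [11, 6, 23, 20]
```

In a full pipeline, each decoded center would then be paired with the semantic segmentation branch to separate overlapping instances, as the paper's three-branch design describes.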