Design of a Multi-Modal End-Effector and Grasping System
How Integrated Design Helped Win the Amazon Robotics Challenge

S. Wade-McCue 1,2, N. Kelly-Boxall 1,2, M. McTaggart 1,2, D. Morrison 1,2, A.W. Tow 1,2, J. Erskine 1,2, R. Grinover 1,2, A. Gurman 1,2, T. Hunn 1,2, D. Lee 1,2, A. Milan 1,3, T. Pham 1,3, G. Rallos 1,2, A. Razjigaev 1,2, T. Rowntree 1,3, R. Smith 1,2, K. Vijay 1,3, Z. Zhuang 1,4, C. Lehnert 2, I. Reid 1,3, P. Corke 1,2, and J. Leitner 1,2

Abstract— We present the grasping system behind Cartman, the winning robot in the 2017 Amazon Robotics Challenge. The system makes strong use of redundancy in design by implementing complementary tools: a suction gripper and a parallel gripper. This multi-modal end-effector is combined with three grasp synthesis algorithms to accommodate the range of objects provided by Amazon during the challenge. We provide a detailed system description and an evaluation of its performance, before discussing the broader nature of the system with respect to the key aspects of robotic design initially proposed by the winners of the first Amazon Picking Challenge. To address the principal nature of our grasping system and the reason for its success, we propose an additional robotic design aspect, 'precision vs. redundancy'. The full design of our robotic system, including the end-effector, is open sourced and available at http://juxi.net/projects/AmazonRoboticsChallenge/.

I. INTRODUCTION

Amazon offers approximately 400 million products to the US through their on-line marketplace [1], and is able to offer same-day shipping on many items through their Amazon Prime service. This feat is a testament to the logistical capabilities of Amazon and showcases their state-of-the-art warehouse automation technology. However, technological limitations have kept Amazon from entirely automating their supply chain, with the bulk of item pick-and-place tasks in warehouses still performed by humans.
This research was supported by the Australian Research Council Centre of Excellence for Robotic Vision (ACRV) (project number CE140100016). The participation at the ARC was supported by Amazon Robotics LLC. Contact: sean.wademccue@hdr.qut.edu.au
S. Wade-McCue and N. Kelly-Boxall contributed equally to this work.
1 Authors are with the Australian Centre for Robotic Vision (ACRV).
2 Authors are with the Queensland University of Technology (QUT).
3 Authors are with the University of Adelaide.
4 ZZ is with the Australian National University (ANU).

Fig. 1. Top: Model of the wrist, including (A) the suction tool, (B) the tool-change motor and (C) the parallel jaw gripper. Bottom: the end-effector in use during the Amazon Robotics Challenge, picking an item with the suction tool (left) and another with the gripper (right).

Despite the strong advancement of robot and computer vision technology in recent years [2], [3], pick-and-place robotics for unstructured warehouse settings is still in its infancy. Amazon fosters development in this space by hosting an annual competition, the Amazon Robotics Challenge (ARC), previously the Amazon Picking Challenge. The ARC requires teams to develop autonomous warehouse manipulation systems that perform the warehouse tasks of stocking shelves and fulfilling orders into shipping boxes.

Traditional warehouses consist of static shelves in which items are stored. In such an arrangement, workers must travel to the shelves when stocking and picking items. The 'goods-to-man' Kiva systems implemented by Amazon removed this requirement by having the shelves move around the warehouse autonomously [4]. This design allows shelves to be packed tightly (saving floor space) and pick-and-place operations to be performed at static, distributed locations. Amazon also employs a chaotic warehouse structure, where each bin in a warehouse shelf holds a large variety of items.
Initially designed to improve the speed of human picking, this feature also reduces the potential for bottlenecks when item sales peak. Developing a system that can pick items from static, cluttered bins is the challenge presented by the ARC. Autonomous grasping from clutter requires a robot that can handle items of varying size, weight, shape, texture and physical occlusion. On the perception front, a robot must be capable of distinguishing each item from the others. Coherent integration of both hardware and software is required to overcome the challenge of grasping items in clutter.

We present here the grasping system of our ARC-winning robot, Cartman [5]. The grasping system consists of a hybrid end-effector with a suction tool and a parallel gripper, and the software, a multi-level grasp point detection algorithm de-
arXiv:1710.01439v3 [cs.RO] 19 Jun 2018