A Strategy for Efficient Observation Pruning in Multi-Objective 3D SLAM

Jaime Valls Miro, Weizhen Zhou and Gamini Dissanayake

Abstract— An efficient automatic solution to the feature-based simultaneous localisation and mapping (SLAM) of mobile robots operating in conditions where a number of competing objectives operate simultaneously is proposed. The formulation quantitatively measures the merit of incoming data with respect to multiple priorities, automatically adjusting the amount of observations to be used in the estimation process for the best possible combined outcome. The methodology enables a selection mechanism which can efficiently exploit the observations available to the robot to best fulfil the objectives of differing tasks throughout the course of a mission, e.g. localisation, mapping, exploration, feature distribution, searching for specific objects or victims, etc. The work is particularly motivated by navigation in three-dimensional terrains, and an example considering the objectives of robot localisation and map expansion in a search and rescue environment using an RGB-D camera is utilised for discussion and results.

I. MOTIVATION

The abstraction of an environment through a two-dimensional (2D) projection has been proven to provide sufficient insight for a robot to navigate and act in a structured environment. This simple representation, however, is not necessarily adequate when operating in more complex, unstructured environments. An example is an Urban Search and Rescue (USAR) scenario, where the environment is highly challenging to navigate, has few or no levelled pathways, and control sequences based on a planar assumption may easily lead a robot towards an untraversable region, thereby risking expensive equipment and life-saving payload. In this context, a planar projection simply cannot represent the real spatial configuration of the environment.
Upholding such an assumption poses the risk of overlooking potentially hazardous situations, and is therefore limited in its capacity to provide information which is useful and understandable for the human rescuers that are to follow. Such limitations have encouraged research into generating richer and more descriptive three-dimensional (3D) maps of unstructured environments, mainly using stereoscopic cameras [1], or tilting laser range finders combined with standard cameras [2], [3], [4]. In recent years, this volume of work has benefited enormously from the wider availability of compact and affordable three-dimensional visual sensors, such as the SwissRanger, the PMD or the PrimeSense/Kinect depth cameras. The proposition in this paper follows on from this body of work by looking at the problem of producing 3D maps with a combination of visual and range information, and proposes a mechanism to efficiently refine the output of the SLAM process in the context of a variety of robot objectives to be met during the course of a mission.

Centre of Excellence for Autonomous Systems, Faculty of Engineering and IT, University of Technology Sydney (UTS), NSW 2007, Australia. jaime.vallsmiro, weizhen.zhou, gamini.dissanayake@cas.edu.au

II. INTRODUCTION AND RELATED WORK

The synthesis of range and vision sensors has allowed features extracted from visual images to be augmented with ranging information, which can then be used in the SLAM context to produce dense 3D maps. While combining the sensors provides more accurate observations than either type of sensor alone, it inevitably bears the same computational encumbrance faced by most SLAM algorithms. Many approaches to improving efficiency have been devoted to restraining the quantity of landmarks incorporated and maintained by the filter.
While the most intuitive step is probably to simply remove those that quickly fall outside the camera's field of view, from a statistical perspective the natural proposition is to remove landmarks with a high degree of uncertainty in their position [5]. Thus, strategies combining frequency count with estimates of position certainty are often used to measure landmark quality. In terms of limiting the growth of the system state, these strategies are straightforward and effective. However, the quality of the landmark position alone is not necessarily the best criterion for deciding whether a landmark should be removed, or whether new ones are suitable to be introduced into the filter. A landmark with higher uncertainty in a sparsely known region is often more useful than a precisely located landmark in a well-known region. Selection criteria beyond the geometric properties of a landmark, such as where it can be observed or the distance to neighbouring landmarks, can therefore also be considered to assess the quality of the measurement. For example, in [6] a rating scheme was proposed by first dividing all landmarks into groups by k-means clustering. The difference in information content of the landmarks within each group was then calculated. The group exhibiting the greatest information difference was selected as a candidate for landmark removal, the rationale being an assumption that the bigger the information difference, the smaller the loss of localisation support within the cluster's represented region. Finally, from this cluster, the landmark with the lowest information content was removed, effectively incurring the smallest degradation in the quality of the final map's geometry. While this scheme guarantees that landmarks will be removed sparsely across the map, leaving no blank spot for robot localisation, in general selection criteria to conclusively single out a landmark or group of landmarks from the observations are not easily derived. Furthermore,
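The clustering-based removal scheme of [6] described above can be sketched as follows. This is an illustrative sketch only: the information measure (the negative log-determinant of each landmark's covariance), the minimal k-means routine, and all function names are assumptions for the purpose of illustration, not the exact formulation used in [6].

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: returns a cluster label for each landmark position."""
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), k, replace=False)].copy()
    for _ in range(iters):
        # Assign each landmark to its nearest cluster centre.
        labels = np.argmin(
            np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2),
            axis=1)
        # Recompute each centre as the mean of its assigned landmarks.
        for j in range(k):
            if np.any(labels == j):
                centres[j] = points[labels == j].mean(axis=0)
    return labels

def select_landmark_to_remove(positions, covariances, k=2):
    """Pick one landmark to prune, following the rationale of [6]:
    group landmarks spatially, find the group with the largest spread in
    information content, and drop that group's least informative member.

    Information content is approximated here as -log det(covariance)
    (an assumption); higher values mean a more precisely known landmark.
    """
    info = np.array([-np.linalg.slogdet(C)[1] for C in covariances])
    labels = kmeans(positions, k)
    # The group with the greatest difference between its most and least
    # informative landmarks is the removal candidate.
    spreads = [info[labels == j].max() - info[labels == j].min()
               for j in range(k)]
    target = int(np.argmax(spreads))
    members = np.flatnonzero(labels == target)
    # Within that group, remove the landmark carrying the least information.
    return int(members[np.argmin(info[members])])
```

Note that removal is confined to a single cluster per invocation, which is what keeps pruning sparse across the map and avoids creating blank regions devoid of localisation support.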