SLAP: Simultaneous Localization and Planning Under Uncertainty via Dynamic Replanning in Belief Space

Ali-akbar Agha-mohammadi, Saurav Agarwal, Sung-Kyun Kim, Suman Chakravorty, and Nancy M. Amato

Abstract—Simultaneous Localization and Planning (SLAP) is a crucial ability for an autonomous robot operating under uncertainty. In its most general form, SLAP induces a continuous POMDP (partially observable Markov decision process), which needs to be repeatedly solved online. This paper addresses this problem and proposes a dynamic replanning scheme in belief space. The underlying POMDP, which is continuous in state, action, and observation space, is approximated offline via sampling-based methods, but operates in a replanning loop online to admit local improvements to the coarse offline policy. This construct enables the proposed method to cope with changing environments and large localization errors, even when the change alters the homotopy class of the optimal trajectory. It further outperforms the state-of-the-art FIRM (Feedback-based Information RoadMap) method by eliminating unnecessary stabilization steps. Applying belief space planning to physical systems brings with it a plethora of challenges. A key focus of this paper is to implement the proposed planner on a physical robot and to demonstrate the performance of the SLAP solution under uncertainty, in changing environments, and in the presence of large disturbances, such as a kidnapped-robot situation.

Index Terms—Motion planning, belief space, robust, POMDP, uncertainty, mobile robots, rollout

I. INTRODUCTION

Simultaneous Localization and Planning (SLAP) refers to the problem of (re)planning dynamically every time the localization module updates the probability distribution on the robot's state. For autonomous navigation, solving SLAP and enabling a robot to perform online (re)planning under uncertainty is a key step toward reliable operation in changing, uncertain real-world environments.
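The SLAP loop just described — a filter updates the belief, and the planner replans from that belief at every step — can be sketched on a toy problem. The following minimal sketch uses a 1-D grid world with a discrete Bayes filter and a greedy one-step "replanner"; the grid size, goal cell, motion/sensor noise levels, and the greedy planner are all illustrative assumptions, not the models used in this paper.

```python
import numpy as np

# Toy setting (assumed for illustration): a robot on a 1-D corridor of
# N_CELLS cells. The belief is a pdf over the cells, updated by a
# discrete Bayes filter; the "planner" is rerun from the belief each step.
N_CELLS = 10
GOAL = 9  # hypothetical goal cell

def predict(belief, action):
    """Motion update: the commanded step (-1, 0, +1) succeeds w.p. 0.8."""
    shifted = np.roll(belief, action)
    if action == 1:       # undo np.roll's wrap-around at the corridor ends:
        shifted[-1] += shifted[0]   # mass at the last cell stays put
        shifted[0] = 0.0
    elif action == -1:
        shifted[0] += shifted[-1]   # mass at the first cell stays put
        shifted[-1] = 0.0
    new = 0.2 * belief + 0.8 * shifted
    return new / new.sum()

def correct(belief, z):
    """Measurement update: a noisy sensor reports the true cell w.p. 0.7."""
    likelihood = np.full(N_CELLS, 0.3 / (N_CELLS - 1))
    likelihood[z] = 0.7
    posterior = likelihood * belief
    return posterior / posterior.sum()

def replan(belief):
    """Greedy one-step planner: step toward the goal from the belief mean."""
    mean = belief @ np.arange(N_CELLS)
    return 1 if mean < GOAL else (-1 if mean > GOAL else 0)

# SLAP-style loop: every filter update is followed by replanning.
rng = np.random.default_rng(0)
true_state = 0
belief = np.full(N_CELLS, 1.0 / N_CELLS)  # start fully uncertain
for _ in range(30):
    u = replan(belief)                     # (re)plan from the current belief
    if rng.random() < 0.8:                 # simulate noisy motion
        true_state = int(np.clip(true_state + u, 0, N_CELLS - 1))
    belief = predict(belief, u)
    z = true_state if rng.random() < 0.7 else int(rng.integers(N_CELLS))
    belief = correct(belief, z)            # simulate noisy sensing

print("MAP estimate:", int(np.argmax(belief)), "| true state:", true_state)
```

Even this toy version shows why the belief space is hard to search: the belief over just 10 cells is already a point in a 10-dimensional continuous space, and the loop must replan from a new such point after every observation.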
For example, consider a low-cost mobile robot operating in an office-like environment with a changing obstacle map (e.g., office doors switching between open and closed) and responding to changing goals (tasks) assigned online based on user requests. Such changes in the obstacle map or in the goal location often call for global replanning to accommodate changes in the homotopy class of the optimal solution. What makes the problem more challenging is that such replanning has to happen online and fast, in partially observable environments with motion and sensing uncertainties.

Agha is with NASA-JPL, Caltech. Agarwal and Chakravorty are with the Dept. of Aerospace Eng. and Amato is with the Dept. of Computer Science & Eng. at Texas A&M University. Kim is with the Robotics Institute at Carnegie Mellon University. Emails: aliagha@jpl.nasa.gov, {sauravag,schakrav,amato}@tamu.edu, kimsk@cs.cmu.edu

arXiv:1510.07380v3 [cs.RO] 13 May 2018

Fig. 1. (a) Belief tree: forward construction. A typical search tree in belief space, corresponding to a very small problem with three actions U = {u1, u2, u3} and two observations Z = {z1, z2}. Each posterior belief (probability distribution) branches into |U| priors, and each prior belief branches into |Z| posteriors; thus the tree grows exponentially with the search depth (referred to as the curse of history). (b) Belief graph: backward construction. The idea of using funnels (local feedback controllers) in belief space, which break this exponential growth by funneling a large set of posteriors into pre-computed beliefs (red circles). A graph is thus formed in belief space, with funnels as edges and the pre-computed beliefs as nodes; this graph grows linearly with the search depth.

In general, decision making and control under uncertainty are ubiquitous challenges in many robotic applications. For an autonomous robot to operate reliably, it needs to perceive sensory measurements, infer its situation (state) in the environment, plan, and take actions accordingly. In partially observable environments, where the state of the system cannot be determined exactly (due to imperfect and noisy measurements), a filtering module (e.g., a Kalman filter) can provide an estimate of the state, i.e., a probability distribution function (pdf) over all possible system states. This pdf describing the localization uncertainty is referred to as the belief or information-state. At each time-step, actions are chosen based on the robot's belief. To find the optimal policy that maps beliefs to actions, we cast the problem in its principled form as a Partially Observable Markov Decision Process (POMDP) [1], [2]. There are a number of challenges in dealing with POMDPs:

• Curse of dimensionality [3], which refers to the high dimension of the belief space. If the underlying robotic system evolves in a discrete grid world with n cells, the corresponding belief space is an n-dimensional continuous space.

• Curse of history [3], [4], which refers to the exponential growth of the number of future outcomes with the search depth (see Fig. 1(a)).

Methods such as [3]–[13] alleviate these issues and take