IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 18, NO. 2, APRIL 2002

Visual Motion Planning for Mobile Robots

Hong Zhang, Associate Member, IEEE, and James P. Ostrowski, Member, IEEE

Abstract—This paper presents a novel framework for image-based motion planning, which we term Visual Motion Planning. The method skips the step of transferring image features back to robot pose, and hence makes motion plans directly in the image plane. Analogous to visual servo control, the visual motion planning concept takes advantage of the image features to achieve a direct and fast motion planning solution. It provides a “virtual” trajectory in the image plane for the robot to track with standard visual servoing techniques. In this paper, we show the results of applying the idea to simulated 2-D and 3-D mobile robot systems. Within this motion planning paradigm, we also discuss mechanisms for taking advantage of surplus features and for incorporating image-based constraints, such as requiring that the image features remain in the field of view. Experimental results using a Pioneer AT ground mobile robot are presented, showing excellent agreement with theory.

Index Terms—Mobile robotics, motion planning, visual servoing.

I. INTRODUCTION

MOTION planning for robotics has long been an extensively studied field [2], [20], [23], for which a wide variety of techniques have been developed. Some research has focused on the use of methods such as potential fields [18] (or extensions thereof [32]) or randomized approaches [17], while others have taken different approaches such as differential geometry [4], graph theory [8], and game theory [21], [42]. A common ground for most of them is that all of these methods seek a solution in the robot’s configuration space, or C-space [19].
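As an illustration of such configuration-space planning, the potential-field approach cited above descends the gradient of an attractive potential toward the goal plus a repulsive potential near obstacles. The following is a minimal sketch only; the gains, radii, step size, and Khatib-style potential shapes are our own assumptions, not parameters from the cited work:

```python
import numpy as np

def potential_field_step(q, goal, obstacles, k_att=1.0, k_rep=1.0,
                         rho0=1.5, step=0.05):
    """One gradient-descent step on an attractive + repulsive potential.

    All gains and radii here are illustrative assumptions.
    """
    grad = k_att * (q - goal)                  # attractive term: pull toward goal
    for obs in obstacles:
        d = np.linalg.norm(q - obs)
        if 0.0 < d < rho0:                     # repulsion acts only near an obstacle
            grad += k_rep * (1.0 / rho0 - 1.0 / d) / d**3 * (q - obs)
    return q - step * grad                     # move downhill on the potential

q = np.array([0.0, 0.0])                       # start configuration
goal = np.array([5.0, 5.0])
obstacles = [np.array([2.0, 3.5])]             # one point obstacle near the path
for _ in range(500):
    q = potential_field_step(q, goal, obstacles)
```

After the loop, `q` has descended to (a neighborhood of) the goal while being deflected away from the obstacle; the well-known failure mode of this family of methods is stalling in a local minimum of the combined potential.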
Manuscript received April 16, 2001; revised December 4, 2001. This paper was recommended for publication by Associate Editor D. Kriegman and Editor A. De Luca upon evaluation of the reviewers’ comments.
H. Zhang is with the Department of Mechanical Engineering, Rowan University, Glassboro, NJ 08028-1701 USA (e-mail: zhang@galaxy.eng.rowan.edu).
J. P. Ostrowski is with the General Robotics, Automation, Sensing and Perception (GRASP) Laboratory, University of Pennsylvania, Philadelphia, PA 19104-6228 USA (e-mail: jpo@grasp.cis.upenn.edu).
Publisher Item Identifier S 1042-296X(02)04303-3.

On the other hand, in order to observe the environment, most modern robots are equipped with different kinds of sensors, such as cameras (or, more generally, visual sensors) [30], [41], sonar sensors [1], [38], and tactile sensors [24]. Although certain applications may require the use of a particular sensor, we note that vision-based sensors provide a very powerful and versatile sensing mechanism for general-purpose applications. Today, many robots are equipped with at least one camera, and often even carry a stereo rig; examples include the Nomad 2000 [34], Sony’s AIBO dog-like robot [11], and unmanned blimps [41].

A plausible research direction, therefore, is to integrate these sensors, especially visual sensors, into the study of motion planning. There are scores of sensor-based motion planning methods available now, but most have not explicitly taken advantage of the distinct properties of different types of sensors. In traditional motion planning, we generally assume that we know (or at least partially know) the pose (position and orientation) of the target and/or the robot. Motion plans are then computed based on the pose information only.
Meanwhile, most current sensor-based motion planning methods simply extend this idea by using the sensors to obtain the pose of objects, or as feedback for estimating the location of the robot with respect to a map (see, e.g., [6]). It is assumed that, given any kind of sensor, its output can always be transferred back to pose information. However, the direct outputs of the sensors are generally not position information, but sensor signals. In particular, when visual sensors are used, the outputs are image features, which are always distorted due to projection [15], and which are generally limited to the places where they can be detected, owing to restrictions on the field of view [7]. In order to determine the global position and orientation, or even just the relative pose of one object, we need various calibration and transformation algorithms. Hence, solving the sensor-based (especially the vision-based) motion planning problem is usually a two-step process: first the robot transfers the sensor features back to pose information, and then it makes a motion plan in the pose space based on this information. This parallels the early development of image-based control, in which the process was divided into the sequential steps of vision-based pose reconstruction and control [13].

Motivated by image-based visual servoing [9], [16] and relevant recent work [7], [12], [26], [27], where the computation of the control inputs is performed directly in the image plane, we propose in this paper the idea of motion planning in the image plane, or visual motion planning, in which the computation of motion plans is executed in the image plane using image features extracted directly from the visual data. It is well known that we can drive a robot to track landmarks [22], [41] or follow a given trajectory [25].
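To make the point about sensor signals versus poses concrete, a calibrated pinhole camera maps a 3-D point in the camera frame to pixel coordinates; a two-step planner must invert this mapping (and recover the lost depth) before it can plan in pose space. The sketch below uses an assumed focal length and principal point for illustration, not parameters from this paper:

```python
import numpy as np

def project(point_cam, f=500.0, cx=320.0, cy=240.0):
    """Pinhole projection: camera-frame point (X, Y, Z) -> pixel (u, v).

    f, cx, cy are assumed intrinsics, chosen only for illustration.
    """
    X, Y, Z = point_cam
    return np.array([f * X / Z + cx, f * Y / Z + cy])

# A landmark 2 m in front of the camera: the sensor reports only (u, v).
p = np.array([0.2, -0.1, 2.0])
u, v = project(p)
print(u, v)   # 370.0 215.0
```

Note that `project(2 * p)` yields exactly the same pixel: depth is lost in the projection, which is why the pose-reconstruction step needs calibration and additional structure, and why planning directly on (u, v) features sidesteps it.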
However, there is no inherent mechanism to guarantee that the tracked features remain unoccluded and in the field of view, and there is also no way to incorporate such basic needs as obstacle avoidance. These problems arise frequently when the initial and final poses are widely separated, and when rotational controls are being executed, which may move the target out of the field of view [9], [35]. Several problems that arise when using a visual servoing feedback scheme to drive the control, such as large, undesired motions of the robot in its pose space, have been pointed out by Chaumette [5].

Our idea, then, is to generate a virtual trajectory in the feature space for the robot to follow. This virtual trajectory should satisfy criteria such as minimizing cost or avoiding obstacles, and can be generated using standard motion planning algorithms [17]. Its advantage is analogous to that of visual servo control: by removing intermediate transformations, we save computation time and eliminate the computation and modeling errors that accompany them.

Currently, only a few researchers are working on related topics. For example, Singh et al. [33] applied image morphing

1042-296X/02$17.00 © 2002 IEEE
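The simplest instance of such a virtual trajectory is a straight-line interpolation between the current and desired feature vectors in the image plane. The planner described in this paper imposes richer criteria (cost, obstacle avoidance, field-of-view limits), so the following is a hedged simplification, with made-up feature values:

```python
import numpy as np

def feature_trajectory(s_init, s_goal, n_steps):
    """Waypoints of a straight-line virtual trajectory in feature space.

    s_init, s_goal: stacked image features, e.g. (u1, v1, u2, v2).
    Returns an (n_steps + 1, dim) array running from s_init to s_goal.
    """
    lam = np.linspace(0.0, 1.0, n_steps + 1)[:, None]   # interpolation parameter
    return (1.0 - lam) * s_init + lam * s_goal

s0 = np.array([100.0, 120.0, 300.0, 125.0])   # features in the current image
sg = np.array([280.0, 200.0, 480.0, 205.0])   # desired features at the goal
waypoints = feature_trajectory(s0, sg, 4)
print(waypoints[2])   # midpoint waypoint: [190. 160. 390. 165.]
```

Each waypoint can then be handed to a standard image-based visual servo controller as its momentary feature reference, so the robot tracks the virtual trajectory rather than jumping directly to the final feature configuration.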