Velocity estimation of a UAV using visual and IMU data in a GPS-denied environment

Rafik Mebarki, Jonathan Cacace, and Vincenzo Lippiello
PRISMA Lab – Department of Electrical Engineering and Information Technology, University of Naples Federico II, via Claudio 21, 80125, Naples, Italy
Email: rafik.mebarki@unina.it, jonathan.cacace@unina.it, vincezo.lippiello@unina.it

Abstract—This paper proposes two methods for UAV translational velocity estimation based on onboard sensing only. Spherical image measurements provided by a single onboard camera, along with IMU data, constitute the main information feeding the estimators. The first algorithm consists of a nonlinear observer designed via Lyapunov synthesis, while the second is based on the Unscented Kalman filtering technique. Unlike existing approaches, the velocity is estimated directly from the onboard image without fully estimating the vehicle's 3D pose. The low computational requirements make the proposed techniques suitable for applications where execution time is of prominent importance and no powerful hardware is available, as is the case with UAV systems. Experimental results validate the algorithms using only four image features.

I. INTRODUCTION

The last two decades have witnessed an increasing interest in Unmanned Aerial Vehicles (UAVs). This is particularly true for Vertical Take-Off and Landing (VTOL) UAVs, which offer appealing features: in addition to flying without a pilot onboard, and thus potentially being of small size, they can hover and pass through narrow and cluttered environments. This makes them aerial vessels enabling numerous potential applications, such as search and rescue [1], aerial imagery and inspection [2], surveillance and exploration [3], load transportation [4], and aerial manipulation [5], to name a few.
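The second estimator named in the abstract builds on the Unscented Kalman filtering technique. Purely as an illustrative sketch of that filter's core building block (not the paper's implementation, and with the standard Julier/Uhlmann weight parameters assumed), the unscented transform propagates a Gaussian mean and covariance through a nonlinear function via deterministically chosen sigma points:

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1e-3, beta=2.0, kappa=0.0):
    """Propagate a Gaussian (mean, cov) through a nonlinear function f
    using 2n+1 sigma points (standard scaled unscented transform).
    Illustrative only -- not the estimator proposed in the paper."""
    n = mean.size
    lam = alpha**2 * (n + kappa) - n
    # Matrix square root of (n + lambda) * cov via Cholesky factorization.
    S = np.linalg.cholesky((n + lam) * cov)
    # Sigma points: the mean, plus/minus each column of the square root.
    sigmas = np.vstack([mean, mean + S.T, mean - S.T])
    # Mean and covariance weights.
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1 - alpha**2 + beta)
    # Propagate sigma points and recombine.
    Y = np.array([f(s) for s in sigmas])
    y_mean = wm @ Y
    d = Y - y_mean
    y_cov = (wc[:, None] * d).T @ d
    return y_mean, y_cov
```

For a linear function the transform is exact: passing the identity through it returns the input mean and covariance unchanged, which is a quick sanity check of the weights.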
The measurement of the translational velocity is crucial for autonomous navigation [6]. Moreover, in a visual servoing formalism the high-level visual control scheme generates desired velocity set-points, which must be tracked by comparison with the actual velocity. Although the Global Positioning System (GPS) would be a natural candidate for this purpose, a number of conditions hinder its use: it is unreliable in cloudy weather and when flying at low altitude, is precluded in urban and indoor environments, is subject to communication loss with the satellites, is not passive, and is not accurate enough to envisage fine motion control. Vision, on the other hand, provides rich information with pixel-order resolution, and cameras are lightweight, passive, and cheap [7]. In [8], the control system is augmented with virtual states for quadrotor stabilization without velocity measurements, but vision is not considered. In [9], a Vicon motion capture system is employed, so the resulting autonomous flight is not self-contained. For velocity estimation, [10] presents an approach in which image feature coordinates are combined with accelerometer readings over three consecutive camera positions. The approach consists of a closed-form solution for extracting the camera velocity, but robustness to image noise is not reported.

(Footnote: The research leading to these results has been supported by the ARCAS and SHERPA collaborative projects, which have received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreements ICT-287617 and ICT-600958, respectively. The authors are solely responsible for its content. It does not represent the opinion of the European Community, and the Community is not responsible for any use that might be made of the information contained therein.)
Similarly, but adopting spherical visual features, the velocity of a camera-equipped UAV is estimated in [11], [12]. In [13]–[16], IMU accelerometer data are used in nonlinear observers for velocity, pose, and pose-and-velocity estimation, respectively. However, in [13] and [14] vision is not adopted. Work [17] proposes a velocity-free controller for a fleet of UAVs, but vision is not considered and the controller needs the position of each vehicle. In [18]–[20], optical flow along with inertial measurements is exploited for velocity estimation. Optical flow, however, suffers from image noise, is prone to drift, and requires a textured environment. In [21], the velocity is inferred from a full state estimation in which stereo camera information and inertial measurements are fused through a SLAM algorithm and an Unscented Kalman Filter (UKF). However, the system is computationally demanding, since the SLAM algorithm must process hundreds of points and match them across consecutive frames, and a stereo vision system is required.

In this paper we propose an image-based estimation of the translational velocity, where the estimate is inferred directly from the onboard image without requiring a full 3D pose estimation, hence avoiding the computational burden the latter imposes on the system. For instance, the pose is indeed not required for visual servoing; as for navigation, it can be reconstructed at a lower rate than that at which the velocity required for control is estimated. Two different algorithms sharing the same formalism are proposed. The first one is designed using Lyapunov synthesis, while the second one is based on Kalman filtering. Spherical image coordinates from a single onboard camera, along with inertial measurements, constitute the main information feeding the algorithms. Note that we do not use the accelerometer readings, which are

978-1-4799-0880-6/13/$31.00 © 2013 IEEE
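The spherical image coordinates referred to above are unit-norm projections of image features onto a sphere centered at the camera's optical center. As a minimal sketch (assuming a standard pinhole camera with hypothetical intrinsic parameters fx, fy, cx, cy; this is not the paper's code), a pixel can be mapped to its spherical feature as follows:

```python
import numpy as np

def pixel_to_sphere(u, v, fx, fy, cx, cy):
    """Project a pixel (u, v) onto the unit sphere centered at the
    camera's optical center, given pinhole intrinsics (fx, fy, cx, cy).
    Hypothetical helper for illustration of spherical image features."""
    # Normalized image-plane coordinates (pinhole model).
    x = (u - cx) / fx
    y = (v - cy) / fy
    ray = np.array([x, y, 1.0])
    # Unit-norm ray: the spherical projection of the feature.
    return ray / np.linalg.norm(ray)
```

By construction the result always has unit norm, and the principal point (cx, cy) maps to the optical axis [0, 0, 1]; this bounded representation is one reason spherical features are attractive for estimation.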