Visual Odometry Algorithm Using an RGB-D Sensor and IMU in a Highly Dynamic Environment

Deok-Hwa Kim, Seung-Beom Han, and Jong-Hwan Kim
Department of Electrical Engineering, KAIST
291 Daehak-ro, Yuseong-gu, Daejeon, 305-701, Republic of Korea
dhkim@rit.kaist.ac.kr, sbhan@rit.kaist.ac.kr, johkim@rit.kaist.ac.kr

Abstract. This paper proposes a robust visual odometry algorithm using a Kinect-style RGB-D sensor and an inertial measurement unit (IMU) in a highly dynamic environment. Based on the SURF (Speeded-Up Robust Features) descriptor, the proposed algorithm generates 3-D feature points by incorporating depth information into the RGB color information. Using an IMU, the generated 3-D feature points are rotated so that two consecutive images share the same rigid body rotation component. Before the rigid body transformation matrix between successive images from the RGB-D sensor is calculated, the generated 3-D feature points are classified as dynamic or static using motion vectors. From the static feature points, the rigid body transformation matrix is finally computed by the RANSAC (RANdom SAmple Consensus) algorithm. The experiments demonstrate that the proposed algorithm successfully obtains visual odometry for a subject and a mobile robot in a highly dynamic environment. A comparative study between the proposed method and a conventional visual odometry algorithm clearly shows the reliability of the approach for computing visual odometry in a highly dynamic environment.

1 Introduction

Nowadays, many different kinds of robots are used in various environments such as homes, museums, and schools. To carry out tasks in such environments, autonomous navigation systems have become increasingly important [1, 2]. In autonomous navigation, humanoid robots [3, 4] and aerial vehicles [5, 6, 7] may not be able to use their encoder values directly for odometry information, whereas wheeled robots can.
For this reason, relying on encoder sensors for odometry information limits the range of platforms that can conduct various tasks. Because of this limitation, visual odometry, which is largely independent of the platform type, has become more important than classical encoder-based odometry. Indeed, in robotics and computer vision, many techniques and algorithms have been developed for 3-D mapping and visual odometry using monocular cameras [8, 9], fish-eye vision sensors [10], stereo cameras [11], and RGB-D sensors [12, 13].
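To make the pipeline summarized in the abstract concrete, the following minimal Python sketch (not the paper's implementation) assumes 3-D point correspondences between two frames are already given and that the IMU has been used to remove the rotation between them. Under that assumption, `filter_static` stands in for the motion-vector classification of dynamic versus static points, and `ransac_transform` estimates the rigid body transformation from the static points with a Kabsch solver inside a simple RANSAC loop. All function names and thresholds are illustrative.

```python
import numpy as np

def rigid_transform(P, Q):
    """Least-squares R, t with Q ~= P @ R.T + t (Kabsch algorithm)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)               # cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp

def filter_static(P, Q, thresh=0.05):
    """Classify points as static by motion-vector coherence.

    After IMU rotation compensation, static points share one translation,
    so a point whose motion vector deviates from the median is dynamic.
    """
    v = Q - P
    return np.linalg.norm(v - np.median(v, axis=0), axis=1) < thresh

def ransac_transform(P, Q, iters=100, inlier_tol=0.02, seed=0):
    """Rigid transform from correspondences, robust to residual outliers."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(iters):
        idx = rng.choice(len(P), size=3, replace=False)  # minimal sample
        R, t = rigid_transform(P[idx], Q[idx])
        err = np.linalg.norm(P @ R.T + t - Q, axis=1)
        inliers = err < inlier_tol
        if best is None or inliers.sum() > best.sum():
            best = inliers
    return rigid_transform(P[best], Q[best])  # refit on the consensus set
```

In the paper's setting, the IMU rotation compensation is what makes the median-based motion-vector test meaningful: once rotation is removed, static scene points all exhibit (approximately) the same translational motion vector, while points on moving objects do not.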