IEEE SENSORS JOURNAL, VOL. 15, NO. 1, JANUARY 2015 37
Using Scale Coordination and Semantic
Information for Robust 3-D Object
Recognition by a Service Robot
Yan Zhuang, Member, IEEE, Xueqiu Lin, Huosheng Hu, Senior Member, IEEE, and Ge Guo, Member, IEEE
Abstract— This paper presents a novel 3-D object recognition
framework that enables a service robot to eliminate false
detections in cluttered office environments, where objects come
in a great diversity of shapes and are difficult to represent with
exact models. Laser point clouds are first converted to bearing
angle images, and a Gentleboost-based approach is then deployed
for multiclass object detection. To cope with the variable scales
of objects during detection, a scale coordination technique is
applied in every subscene, where the subscenes are segmented
from the whole scene according to the spatial distribution of the
3-D laser points. Moreover, semantic information (e.g., ceilings,
floors, and walls) extracted from the raw 3-D laser points is
utilized to eliminate false detection results. Finally, K-means
clustering and the Mahalanobis distance are deployed to
accurately segment objects in the 3-D laser point cloud.
Experiments were conducted on a real mobile robot to
demonstrate the validity and performance of the proposed method.
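The final segmentation step mentioned above, K-means clustering followed by
Mahalanobis-distance gating, can be sketched roughly as follows. This is only an
illustrative reconstruction under assumed conventions (the 3-sigma threshold,
the covariance regularization, and all function names are our own choices), not
the implementation used in the paper:

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style K-means on an (N, 3) point array.

    Returns per-point cluster labels and the (k, 3) centroids.
    """
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)].astype(float)
    for _ in range(iters):
        # Distance of every point to every centroid, shape (N, k).
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

def mahalanobis_filter(points, labels, k, threshold=3.0):
    """Keep only points within `threshold` Mahalanobis distance of their cluster."""
    keep = np.zeros(len(points), dtype=bool)
    for j in range(k):
        cluster = points[labels == j]
        if len(cluster) < 4:          # too few points for a stable covariance
            continue
        mean = cluster.mean(axis=0)
        cov = np.cov(cluster.T) + 1e-9 * np.eye(3)   # regularize for invertibility
        inv = np.linalg.inv(cov)
        diff = cluster - mean
        d2 = np.einsum('ij,jk,ik->i', diff, inv, diff)  # squared Mahalanobis dist.
        idx = np.flatnonzero(labels == j)
        keep[idx[np.sqrt(d2) < threshold]] = True
    return keep
```

Clustering first groups the laser points around candidate objects; the
Mahalanobis gate then discards points that are statistical outliers with respect
to their cluster's shape, which tightens the object boundary.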
Index Terms— Active environment perception, robust 3-D
object recognition, scale coordination, semantic information, 3-D
laser scanning, service robot.
I. INTRODUCTION
ACTIVE environment perception and 3-D object recog-
nition are two fundamental tasks for service robots to
operate in cluttered indoor environments [1]–[3], including
detecting tables, chairs and sofas in normal operations [4], [5],
and detecting hazardous objects in search and
rescue operations [6], [7]. A variety of computer vision algo-
rithms and novel RGB-D vision sensors have been developed
to implement visual object recognition, resulting in significant
progress in 3-D object recognition [8]–[10]. Lai et al. define a
view-to-object distance where a novel view is compared simul-
taneously to all views of a previously seen object [11]. This novel
distance is based on a weighted combination of feature dif-
ferences between views, which leads to superior classification
performance on object category and instance recognition in
the context of RGB-D cameras. Herbst et al. use a PrimeSense
camera to provide color and depth, and fuse information from
multiple sensing modalities to detect changes between two 3-D
maps [12].

Manuscript received May 26, 2014; accepted July 4, 2014. Date of publication July 15, 2014; date of current version November 5, 2014. This work was supported by the National Natural Science Foundation of China under Grant 61375088 and Grant 61035005. The associate editor coordinating the review of this paper and approving it for publication was Prof. Subhas Chandra Mukhopadhyay.
Y. Zhuang and X. Lin are with the School of Control Science and Engineering, Dalian University of Technology, Dalian 116024, China (e-mail: zhuang@dlut.edu.cn; 373930883@qq.com).
H. Hu is with the School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, U.K. (e-mail: hhu@essex.ac.uk).
G. Guo is with the School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China (e-mail: geguo@dlut.edu.cn).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSEN.2014.2336987
To effectively perform active environment perception, col-
ored 3-D point cloud data obtained from a range sensor and an
associated color camera has been successfully used in object
detection and recognition. In [5], objects were detected by
using Iterative Closest Point (ICP) with a database of known
point cloud models to guarantee accurate results. By using a
3-D point cloud and an associated color image, a fast scene
analysis scheme was presented in [13], which can rapidly parse
a scene into a collection of planar surfaces so that a robot was
able to quickly detect relevant objects such as walls, doors,
windows, tables and chairs. To automatically search for objects
in an indoor environment, Kanezaki et al. developed a system
that can collect 3-D-scene data by transforming both color and
range images into a set of color voxel data. 3-D features in
each bounding box region were extracted for computing the
similarity between these features and the features of a target
object. Then a global search of the collected 3-D scene data
was conducted for quick object detection [14].
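The model-matching idea in [5] rests on Iterative Closest Point (ICP)
registration. As an illustrative aside, and not code from any of the cited
systems, a basic point-to-point ICP loop can be written with brute-force
nearest-neighbor correspondences and a closed-form SVD (Kabsch) alignment step:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Closed-form least-squares rotation R and translation t mapping src onto dst."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:               # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cd - R @ cs
    return R, t

def icp(source, target, iters=30):
    """Point-to-point ICP aligning `source` (N, 3) to `target` (M, 3).

    Returns the accumulated rotation, translation, and the aligned points.
    """
    src = source.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iters):
        # Brute-force nearest neighbor in the target for each source point.
        d = np.linalg.norm(src[:, None, :] - target[None, :, :], axis=2)
        matches = target[d.argmin(axis=1)]
        R, t = best_rigid_transform(src, matches)
        src = src @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total, src
```

In a database-driven detector of the kind described in [5], each candidate
cluster would be registered against every stored model this way, and a low
residual after convergence signals a match; production systems replace the
brute-force correspondence search with a k-d tree.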
Although fusing a 3-D point cloud with an associated color
image (e.g., from an RGB-D (Kinect-style) depth camera) is a
very useful technique, these sensors still have two limitations
for object detection and recognition. First, the RGB data
acquired from an RGB-D camera are susceptible to varying
lighting conditions; second, the measurement range is limited
and the field of view is narrow. Therefore, many 3-D laser
range finders have been deployed for handling large-scale
scenes, such as the Leica HDS 3000 terrestrial laser scanner,
Velodyne's HDL-64E LiDAR sensor, and 2-D SICK and
Hokuyo laser sensors mounted on rotating platforms. These
3-D laser range finders provide a wider field of view and
detailed 3-D range information and, most importantly, can
work in dark environments.
In recent years, laser range point clouds have been widely
deployed in object detection and recognition. A common
approach for finding objects in 3-D laser range point clouds
is to use a bottom-up procedure where planes and curves are
located first and then fit to known models. Rather than
associating point data with a priori hypothesized objects, an
alternative approach is to reconstruct object models directly
from the point data and to explore relaxations of the exact
likelihood function [15]. In order to extract effective features
from 3-D point
cloud data for object recognition, Steder et al. proposed