3D Vision-Based Control On An Industrial Robot

Mana Saedan and Marcelo H. Ang Jr.*
Department of Mechanical Engineering
National University of Singapore, Singapore 119260
*mpeangh@nus.edu.sg

Abstract

This paper investigates relative target-object (rigid body) pose estimation for vision-based control. A closed-form target pose estimation algorithm is developed and implemented. Moreover, a PI-based visual controller is designed and implemented in the camera (sensor) frame to minimize the effect of errors in the extrinsic parameters of the camera. The performance of the vision-based control algorithm has been verified on a 7-DOF industrial robot.

1. Introduction

Industrial robots are designed for tasks such as pick and place, welding, and painting. The environment and working conditions for those tasks are well defined. If the working conditions change, those robots may not be able to work properly. Therefore, external sensors are necessary to enhance the robot's capability to work in a dynamic environment. A vision sensor is an important sensor that can be used to extend the robot's capabilities. Images of objects of interest can be extracted from their environment, and information from these images can then be computed to control the robot. Control that uses images as feedback signals is known as vision-based control. Recently, vision-based control has become a major research field in robotics.

Vision-based control 1 can be classified into two main categories. The first approach, feature-based visual control, uses image features of a target object from image (sensor) space to compute error signals directly. The error signals are then used to compute the required actuation signals for the robot. The control law is also expressed in the image space. Many researchers in this approach use a mapping function (called the image Jacobian) from the image space to the Cartesian space.
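The paper does not write out its image Jacobian explicitly. As a minimal sketch, assuming a pinhole camera with normalized image coordinates, the commonly used interaction matrix for a single point feature relates the camera's spatial velocity to the feature's image-plane velocity, and depends on exactly the quantities listed above: the image coordinates and the feature depth.

```python
import numpy as np

def image_jacobian(x, y, Z):
    """Interaction matrix (image Jacobian) for one point feature.

    x, y : image-plane coordinates of the feature, normalized by the
           focal length (so the focal length does not appear explicitly)
    Z    : depth of the point along the camera's optical axis

    Maps the camera spatial velocity (vx, vy, vz, wx, wy, wz) to the
    feature's image-plane velocity (xdot, ydot).
    """
    return np.array([
        [-1.0 / Z, 0.0,      x / Z, x * y,          -(1.0 + x * x), y],
        [0.0,      -1.0 / Z, y / Z, 1.0 + y * y,    -x * y,         -x],
    ])
```

Stacking one such 2x6 block per feature gives the full image Jacobian that feature-based schemes invert (or pseudo-invert) to turn image-space errors into camera velocity commands.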
The image Jacobian is, in general, a function of the focal length of the camera lens, the depth (the distance between the camera (sensor) frame and the target features), and the image features themselves. In contrast, position-based visual control constructs the spatial relationship, i.e., the target pose 2, between the camera frame and the target-object frame from target image features. Many construction algorithms have been proposed, each with different assumptions and limitations.

1 Some researchers use the term visual servo control.
2 The position and orientation of the target object.

There are a number of works on these two approaches. Feddema et al. [1], Hashimoto et al. [2] [3], and Papanikolopoulos et al. [4] are some of the interesting works on the feature-based approach. On the position-based side, Chaumette et al. [5], Wilson and colleagues [6] [7] [8], and Martinet and Gallice [9] reported position-based approaches that could achieve the same performance as feature-based approaches.

In this paper, a position-based approach is presented. The advantage of this approach is that the servo control structure is independent of the target pose reconstruction. Usually, the desired control values are specified in Cartesian space, so they are easy to visualize. One main issue in the position-based approach is target pose reconstruction. To construct the pose of a target object from two-dimensional image feature points, two cameras are needed. Image feature points in the two images have to be matched, and the 3-D coordinates of the target object and its feature points can then be computed by triangulation. A one-camera system can, however, determine 3-D information if the geometry of the target object is known beforehand. The distances between the feature points on the target object, for example, can be used to help compute the 3-D position and orientation of the target with respect to the camera.
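The two-camera triangulation step mentioned above can be sketched with the standard linear (DLT) method. This is an illustrative sketch, not the paper's algorithm: it assumes two calibrated cameras given as 3x4 projection matrices and a matched pair of image points, and recovers the 3-D point as the null vector of a small linear system.

```python
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Linear (DLT) triangulation of one 3-D point from two views.

    P1, P2   : 3x4 projection matrices of the two calibrated cameras
    uv1, uv2 : matched image coordinates (u, v) of the same feature point

    Returns the 3-D point in the common world frame.
    """
    u1, v1 = uv1
    u2, v2 = uv2
    # Each view contributes two linear constraints A x = 0 on the
    # homogeneous point x = (X, Y, Z, 1).
    A = np.array([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    # The smallest right singular vector minimizes ||A x|| with ||x|| = 1.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

With one camera and a known target geometry, the analogous computation is a perspective pose (PnP-style) problem rather than triangulation, which is the situation the closed-form estimation in this paper addresses.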
Several estimation methods have been proposed using different techniques. Moving a camera to different positions (and orientations) can extract depth information from the target image without knowledge of the actual target-object geometry (e.g., dimensions). This method, however, has a significant depth estimation error; to reduce the error, many different camera positions should be used. Thus, the method is not suitable for tracking a moving object. Another method, proposed by Wilson et al. [8], derives the relationship between the target pose and the image feature points in a recursive form, based on the assumption that the actual target-object features, i.e., the positions of the feature points with respect to the target frame, are known. The target pose can then be estimated using a Kalman filter. This method gives an accurate estimate when the vision system can operate at a high sampling rate, e.g., 61 Hz [8]. The main disadvantage of Wilson's method is that the plant error covariance needed by the Kalman filter is not easy to identify; in addition, it can only be estimated for some specific cases. Hashimoto et al. [2] used the closed-form pose estimation method to find the

M Saedan and M H Ang Jr, "3D Vision-Based Control of an Industrial Robot", Proceedings of the IASTED International Conference on Robotics and Applications, Nov 19-22, 2001, Florida, USA, pp. 152-157.
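To make concrete why the plant error covariance matters in the recursive approach, the sketch below tracks a single pose coordinate with a constant-velocity Kalman filter. It is a deliberate simplification of Wilson's method, which uses an extended Kalman filter over the full 6-DOF pose with image-feature measurements; here one coordinate with a direct position measurement keeps the example short. The names, the sampling period, and the noise variances q and r are illustrative choices, with q standing in for the plant error covariance the text says is hard to identify.

```python
import numpy as np

class PoseKalman1D:
    """Minimal constant-velocity Kalman filter for one pose coordinate."""

    def __init__(self, dt, q, r):
        self.x = np.zeros(2)                        # state: [position, velocity]
        self.P = np.eye(2)                          # state covariance
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity model
        self.Q = q * np.eye(2)                      # plant (process) error covariance
        self.H = np.array([[1.0, 0.0]])             # we measure position only
        self.R = np.array([[r]])                    # measurement noise variance

    def step(self, z):
        # Predict the state forward one sampling period.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Correct with the new measurement z.
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]
```

A poorly chosen Q makes the filter either sluggish or noisy, which is exactly the identification difficulty cited above and a motivation for the closed-form estimation pursued in this paper.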