Using Kinect™ and a Haptic Interface for Implementation of Real-Time Virtual Fixtures

Fredrik Rydén, Howard Jay Chizeck, Sina Nia Kosari, Hawkeye King and Blake Hannaford

Abstract—The use of haptic virtual fixtures is a potential tool for improving the safety of robotic and telerobotic surgery. Virtual fixtures can "push back" on the surgeon to prevent unintended surgical tool movements into protected zones. Previous work has suggested generating virtual fixtures from preoperative images such as CT scans; however, these are difficult to establish and register in dynamic environments. This paper demonstrates the automatic generation of real-time haptic virtual fixtures using a low-cost Xbox Kinect™ depth camera connected to a virtual environment. This allows generation of virtual fixtures and calculation of haptic forces, which are then passed on to a haptic device. This paper demonstrates that haptic forces can be successfully rendered from real-time environments containing both non-moving and moving objects. This approach has the potential to generate virtual fixtures from the patient in real time during robotic surgery.

I. INTRODUCTION

This paper presents a potential method for improving the safety and efficacy of robotic and telerobotic surgery. Provision of haptic feedback to the surgeon, so that a "sense of touch" is available to assist the surgeon, is an ongoing topic of research at several labs. Of particular interest are methods to constrain robot tool movements and to provide the surgeon with a haptic indication of "don't cut" zones. Such constraints are often called virtual fixtures [10]. They have been discussed in the context of telerobotics and surgical robotics for the past two decades. Abbott et al. [1] suggest that virtual fixtures can be divided into two categories: Guidance Virtual Fixtures, which guide the operator of the haptic device along a specified path; and Forbidden-Region Virtual Fixtures, which "push back" on unintended movements into certain protected zones.
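To illustrate the "push back" behavior of a Forbidden-Region Virtual Fixture, the following is a minimal sketch of a penalty (spring) force against a planar forbidden region. The function name, the planar geometry, and the stiffness value are illustrative assumptions for exposition, not the implementation described in this paper:

```python
import numpy as np

def forbidden_region_force(tool_pos, plane_point, plane_normal, stiffness=200.0):
    """Penalty force pushing the haptic tool out of a forbidden half-space.

    The region is the half-space behind a plane given by a point and an
    outward unit normal. Outside the region the force is zero; inside,
    a Hooke's-law spring proportional to penetration depth pushes the
    tool back along the plane normal.
    """
    n = np.asarray(plane_normal, dtype=float)
    n = n / np.linalg.norm(n)                     # ensure unit normal
    penetration = np.dot(np.asarray(plane_point) - np.asarray(tool_pos), n)
    if penetration <= 0.0:                        # tool is on the allowed side
        return np.zeros(3)
    return stiffness * penetration * n            # spring push-back force

# Example: forbidden region is the half-space below z = 0.
# A tool 1 cm inside feels 200 * 0.01 = 2 N along +z.
f = forbidden_region_force(tool_pos=np.array([0.0, 0.0, -0.01]),
                           plane_point=np.array([0.0, 0.0, 0.0]),
                           plane_normal=np.array([0.0, 0.0, 1.0]))
```

In practice, haptic devices render such forces at high update rates (commonly around 1 kHz), so per-frame force computation must stay cheap; a simple penalty model like this one satisfies that constraint.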
In this paper, we propose a novel method for the automatic generation of Forbidden-Region Virtual Fixtures, based upon images obtained from a depth camera.

Virtual fixtures for telerobotics are most commonly specified as simple geometric shapes. However, for use in surgical robotics, these fixtures may need to represent (or be derived from) portions of the patient's anatomy. Thus a challenge arises: How can virtual fixtures be appropriately specified relative to the patient, and can this be done in real time, so as to compensate for movements and deformations during a surgical procedure?

Li et al. [7] suggest that shapes generated from pre-operative CT scans (via 3D Slicer [3]) can be used as virtual fixtures. This approach can work for surgery when there is little or no movement, if the CT scan can be adequately registered to the patient. However, when robotically-assisted minimally invasive surgery results in relative motion of different organs and tissues, this use of pre-operative imagery is problematic. Repeating CT scans during the surgical procedure presents logistical complications, as well as increased radiation exposure to the patient.

A solution to this problem is to generate haptic virtual fixtures from depth camera imagery in real time. Haptic rendering from RGB-D data was first investigated by Cha et al. [4]. Their method requires pre-processing of recorded camera data and is therefore not suitable for real-time rendering of dynamic environments. This paper demonstrates the generation of real-time haptic virtual fixtures using the Xbox Kinect, a low-cost device developed for gaming.

In Section II, depth cameras and RGB-D cameras are discussed. In Section III, the implementation of the proposed approach is presented. Experimental evaluations are contained in Section IV. Finally, Section V provides suggestions for future work.

II.
DEPTH CAMERAS

RGB-D cameras specify color information for each pixel and also specify an estimated distance from the camera to the pixel. This depth estimation is most commonly calculated using time-of-flight, active stereo, or projected patterns.

Fig. 1. IR dot pattern of a scene with a Styrofoam head held up against a plain surface (right). Color representation of the depth estimation (left). Note the black parts of the image, where there are no projected dots and hence no depth estimation.

The Xbox Kinect is a low-cost RGB-D camera (with an approximate retail price of US$150) that has been developed for use with video games. The Kinect consists of one infrared (IR) projector, one IR camera and one regular RGB camera. The IR projector emits a dot pattern, which is known for a certain depth, into the room (see Figure 1). The IR camera
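The per-pixel depth described above can be turned into geometry for haptic rendering. The sketch below shows two common steps, neither taken from this paper: converting the Kinect's raw 11-bit disparity values to metric depth (using an approximate empirical fit popularized by the OpenKinect community; the constants are assumptions that per-device calibration would refine), and back-projecting a depth image into a 3D point cloud with a pinhole camera model (the intrinsics fx, fy, cx, cy are placeholders):

```python
import numpy as np

def kinect_raw_to_meters(raw_disparity):
    """Convert an 11-bit Kinect raw disparity value to depth in meters.

    Uses an empirical fit (linear in inverse depth) commonly used by the
    OpenKinect community; constants are approximate, not device-calibrated.
    The value 2047 is reported when no depth estimate is available
    (e.g. the black IR-shadow regions in Figure 1).
    """
    if raw_disparity >= 2047:
        return None
    return 1.0 / (raw_disparity * -0.0030711016 + 3.3309495161)

def depth_to_point_cloud(depth_m, fx, fy, cx, cy):
    """Back-project a depth image (in meters) into an N x 3 point cloud.

    Standard pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with zero depth (no estimate) are discarded.
    """
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    z = depth_m
    valid = z > 0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=-1)
```

A point cloud produced this way can then serve directly as the surface against which forbidden-region forces are computed, which is what makes per-frame (rather than pre-processed) fixture generation possible for dynamic scenes.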