978-1-4673-5637-4/13/$31.00 ©2013 IEEE

Abstract. This paper presents the design and investigation of a human gesture recognition system based on a Kinect sensor. In the presented research, the Kinect device is used as a 3D data scanner: the 3D coordinates are calculated directly from depth images. The system's hardware and the computation method for 3D human gesture identification are described. Ten single-hand motion gestures, each repeated several times by seven different people, were recorded and used in the experiments. Gesture recognition and interpretation are performed by a trained neural classifier in two ways: in the first, single-hand motion gestures are captured in free 3D space; in the second, the 3D coordinates of each person's head are used as a reference point for the recorded hand gestures. This approach provides easy adaptation and flexibility in gesture interpretation. The structure of the classifier was determined by trial and error.

Keywords: 3D gesture recognition, Kinect, neural network.

I. INTRODUCTION

The traditional interface between humans and electronic devices does not sufficiently exploit the advantages of nonverbal information. Today keyboards, manipulators and touch screens prevail as control devices, but they transfer only small amounts of data between humans and devices. Usually, these devices work only as long as the user is in direct contact with them. To process ever increasing amounts of information efficiently, users of computers with 3D application software need more natural and effective interaction methods. Human physical abilities and the development of modern electronics make it possible to design and develop new interaction methods between humans and any device or system.
The use of human gestures is a noteworthy alternative to current interface devices for human-computer interaction (HCI) and robot control. In particular, visual recognition and interpretation of human gestures offer the contact-free and natural operation desirable in a system's interface. According to [1], numerous approaches to gesture recognition have been developed. A large variety of techniques have been used to track the hand in 2D pictures, with recognition commonly performed using Hidden Markov Models (HMM) and Kalman filters. However, it is difficult to recognize the hand in color images when the background color matches the skin color of the hand or when the lighting changes [16]. In the past two years, low-cost depth-sensing cameras have become commercially available, including the well-known Microsoft Kinect 3D scanner [2, 3]; the latter makes it possible to sense not only the hand but the whole body, without any markers or hand-held devices [4, 5, 6]. Kinect devices do not, however, work reliably in areas lit by direct sunlight. The OpenNI organization has emerged to promote the standardization of such natural interaction devices and has released an open-source framework for developers. The study in [13] revealed that the aircraft marshalling hand motion gestures used in the military air force can be recognised with a ~99% recognition rate on the testing data set and an 83% recognition rate on a data set with unintentional (unseen) gestures. Wang et al. [14], [15] used a Hidden Markov Model (HMM) for the recognition of seven 2D hand motion gestures and obtained a ~95% recognition rate.

The research presented here focuses on the design of a system in which hand gestures are identified as commands for robot control. The experiment was split into two parts: first, 10 different gestures (commands) made by a single hand were captured in free 3D space.
Following this, the coordinates of the human head are used as a reference point in space to bound hand movements at different distances. The 3D coordinates of human body parts are obtained with the Kinect sensor as described in [4-6]. Such a system can be adapted to different systems as the human-system interface.

For gesture interpretation we applied a Neural Network (NN) classifier with tapped delay lines (TDL), which proved sufficiently effective and robust in gesture classification. The NN itself is a static data-mapping structure; adding TDLs to the NN allows the dynamics of the input data to be expressed. The series of delays breaks the input up in time, and the delayed values are fed into the network. Based on these inputs, the NN generates an estimated output value, i.e. a predefined class label. The artificial neural network registers the sequence of hand gesture points (e.g. 10 reference points) in real time and then generates the outputs.

The paper contains five main sections. The second section surveys the experimental setup. The third section gives more information about the experiment and the gesture data. Experimental results are presented in section four. Finally, conclusions are presented in section five.

3D Human Hand Motion Recognition System
Kestas Rimkus 1, Audrius Bukis 2, Arunas Lipnickas 3, Saulius Sinkevicius 1
1 Department of Control Technology, 2 Department of Process Control, 3 The Mechatronics Centre for Studies, Information and Research, Kaunas University of Technology, Lithuania
kestas.rimkus@gmail.com, arunas.lipnickas@ktu.lt, saulius.sinkevicius@stud.ktu.lt
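The head-relative normalization and the tapped-delay input construction described above can be sketched as follows. This is a minimal illustration assuming NumPy; the function names, the delay count, and the sample head position are the author's assumptions for illustration, not values from the paper.

```python
import numpy as np

def head_relative(hand_xyz, head_xyz):
    # Express hand coordinates relative to the head reference point.
    return np.asarray(hand_xyz) - np.asarray(head_xyz)

def tapped_delay_input(sequence, n_delays):
    """Stack each sample with its n_delays predecessors into one NN input row.

    sequence : (T, 3) array of 3D hand points.
    Returns  : (T - n_delays, 3 * (n_delays + 1)) array of delayed input vectors.
    """
    seq = np.asarray(sequence)
    T = seq.shape[0]
    # Column block i holds the sequence shifted by i samples.
    windows = [seq[i : T - n_delays + i] for i in range(n_delays + 1)]
    return np.hstack(windows)

# Hypothetical example: a 10-point gesture normalized against the head.
rng = np.random.default_rng(0)
gesture = rng.normal(size=(10, 3))   # 10 recorded 3D hand points
head = np.array([0.0, 1.6, 2.0])     # assumed head position (metres)
X = tapped_delay_input(head_relative(gesture, head), n_delays=3)
print(X.shape)                       # (7, 12): 7 delayed input vectors
```

Each row of `X` concatenates the current sample with its three predecessors, so a static feed-forward classifier sees a short window of the gesture's dynamics at every time step.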