Sign Language Gesture Recognition using Doppler Radar and Deep Learning

Hovannes Kulhandjian, Prakshi Sharma, Michel Kulhandjian, Claude D'Amours

Department of Electrical and Computer Engineering, California State University, Fresno, Fresno, CA 93740, U.S.A.
E-mail: hkulhandjian@csufresno.edu, prakshi1993@mail.fresnostate.edu
School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Ontario, K1N 6N5, Canada
E-mail: mkk6@buffalo.edu, cdamours@uottawa.ca

Abstract—In this paper, we study American sign language (ASL) hand gesture recognition using Doppler radar. A set of ASL hand gesture motions is captured as micro-Doppler signals using a microwave X-band Doppler radar transceiver. We apply joint time-frequency analysis and observe the presence of micro-Doppler signatures in the spectrogram. The micro-Doppler signatures of different hand gestures are analyzed using Matlab, and each hand gesture is observed to contain unique spectral characteristics. Based on these unique spectral characteristics, we investigate the classification of essential short ASL phrases, including emergency signals. To recognize and characterize the micro-Doppler signatures present in the spectrogram, we explore a deep convolutional neural network (DCNN) algorithm. We show that the DCNN can classify different sign language gestures based on the presence of micro-Doppler signatures in the spectrogram with fairly high accuracy. Experimental results reveal that, using 80% of the data for training and the remaining 20% for validation, the DCNN achieves a validation accuracy of 87.5%. To further improve the recognition system, we apply the very deep VGG-16 network using transfer learning, which improves the validation accuracy to 95%.

Index Terms—Detection and classification, American sign language (ASL) gesture recognition, Doppler radar, micro-Doppler signatures, deep convolutional neural network (DCNN), VGG-16 algorithm.

I. INTRODUCTION

Hand gesture recognition has many applications ranging from medical, gaming, and human-machine interaction to sign language interpretation [1]–[3]. The problem of hand gesture recognition consists of identifying a given gesture performed by hand movements. Hand gesture recognition can be performed in various ways, ranging from video or image processing to radar motion detection and tracking [4]. A number of research works have studied sign language hand gesture recognition using video or image signal processing combined with machine learning.

In [5], radar is used to enable gesture recognition based on the micro-Doppler signatures associated with different movements. Five handcrafted micro-Doppler features are used for gesture recognition, and a simple k-nearest neighbor (kNN) classifier [6] is applied to evaluate the importance of the five features. The overall classification accuracy of the proposed framework was 84%. In [7], a method is presented to classify four different kinds of hand gestures, namely snapping fingers, flipping fingers, hand rotation, and calling, using a radar micro-Doppler sensor. Two different kinds of micro-Doppler features are extracted from the time-frequency spectrum, and a support vector machine (SVM) [6] is applied to classify the four kinds of gestures. Experimental results reveal that the proposed method's classification accuracy was 88.6%. In [8], a deep neural network is applied to American sign language (ASL) fingerspelling (posture) translation. The 'Kaggle' ASL letter database of hand gestures was used to evaluate the framework, and performance validation shows high posture-translation accuracy. In [9], a real-time ASL fingerspelling translator based on a convolutional neural network (CNN) is presented. One model is developed that classifies the letters a–e correctly with first-time users, and another that classifies the letters a–k correctly in the majority of cases.
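The joint time-frequency analysis used throughout this line of work can be sketched with a short-time Fourier transform. The snippet below is a minimal illustration, not the paper's processing chain: the sampling rate, window parameters, and the synthetic phase-modulated return standing in for a recorded micro-Doppler signal are all assumptions for demonstration.

```python
import numpy as np
from scipy import signal

fs = 2000  # assumed sampling rate of the radar baseband output, in Hz

# Synthetic stand-in for a recorded micro-Doppler return: a hand moving
# toward and away from the radar yields an oscillating Doppler shift.
t = np.arange(0, 2.0, 1 / fs)
f_doppler = 100 * np.sin(2 * np.pi * 1.5 * t)      # time-varying Doppler frequency (Hz)
x = np.cos(2 * np.pi * np.cumsum(f_doppler) / fs)  # phase-modulated radar return

# Joint time-frequency analysis via the short-time Fourier transform.
f, tt, Sxx = signal.spectrogram(x, fs=fs, nperseg=256, noverlap=192)

# Log-scaled spectrogram: this image is what a classifier would consume.
S_db = 10 * np.log10(Sxx + 1e-12)
print(S_db.shape)
```

The sinusoidal trace visible in such a spectrogram is the micro-Doppler signature; distinct hand motions produce distinct traces, which is what makes image-based classification of the spectrograms feasible.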
In [10], hand gesture recognition using radar micro-Doppler signature envelopes is presented. A kNN classifier with the Manhattan (ℓ1) distance measure [11] is used to distinguish the envelope values. The algorithm uses energy-based thresholding to separately extract the positive and negative frequency envelopes present in the spectrogram. The proposed method does not make use of a deep learning algorithm. In [12], a vision-based application is created that can offer sign language translation. The proposed method extracts temporal and spatial features from the video sequences: a CNN is used for spatial feature recognition, and a recurrent neural network (RNN) is trained on the temporal features. In [13], a method is presented that uses a deep convolutional neural network (DCNN) to classify images of the letters and digits in ASL. A data set of 25 images from five different people was collected using a camera. An accuracy of 82.5% is achieved on the alphabet gestures, and 97% on the digits.

Unlike the previous studies, which mainly focus on ASL letter or digit recognition, in this paper we investigate recognition of ten essential hand gesture phrases, including emergency signals. In an emergency situation, a first responder who may be unfamiliar with ASL can use the proposed
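As an illustration of the kNN classification with the Manhattan (ℓ1) distance used in [10], the following minimal sketch classifies a query vector by majority vote among its nearest neighbors. The feature vectors here are toy values, not the envelope features of [10], and the function name is chosen for this example.

```python
import numpy as np

def knn_manhattan(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among the k nearest training
    samples under the Manhattan (l1) distance."""
    d = np.abs(train_X - query).sum(axis=1)  # l1 distance to each training sample
    nearest = np.argsort(d)[:k]              # indices of the k closest samples
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]         # majority label among neighbors

# Toy two-dimensional feature vectors for two gesture classes.
X = np.array([[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [0.9, 1.0]])
y = np.array([0, 0, 1, 1])
print(knn_manhattan(X, y, np.array([0.1, 0.05])))
```

Such distance-based classifiers need no training beyond storing the feature vectors, which is why [10] can avoid a deep learning algorithm entirely; the trade-off is that accuracy depends heavily on how discriminative the handcrafted envelope features are.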