DEEP-LEARNING-BASED HUMAN INTENTION PREDICTION WITH DATA AUGMENTATION

Shengchao Li 1, Lin Zhang 2, and Xiumin Diao 3

1 Arrow Electronics, Centennial, CO, USA
2 Department of Physics & Astronomy, University of Central Arkansas, Conway, AR, USA
3 School of Engineering Technology, Purdue University, West Lafayette, IN, USA

ABSTRACT

Data augmentation has been broadly applied in training deep-learning models to increase the diversity of data. This study investigates the effectiveness of different data augmentation methods for deep-learning-based human intention prediction when only limited training data is available. In our experiment, a human participant pitches a ball to one of nine potential targets, and we aim to predict which target the participant is pitching the ball to. Firstly, the effectiveness of 10 data augmentation groups is evaluated on a single-participant data set using RGB images. Secondly, the best data augmentation method on the single-participant data set (i.e., random cropping) is further evaluated on a multi-participant data set to assess its generalization ability. Finally, the effectiveness of random cropping on fusion data of RGB images and optical flow is evaluated on both single- and multi-participant data sets. Experimental results show that: 1) data augmentation methods that crop or deform images can improve the prediction performance; 2) random cropping generalizes to the multi-participant data set (prediction accuracy improves from 50% to 57.4%); and 3) random cropping with fusion data of RGB images and optical flow further improves the prediction accuracy from 57.4% to 63.9% on the multi-participant data set.

KEYWORDS

Human Intention Prediction, Data Augmentation, Human-Robot Interaction, Deep Learning

1. INTRODUCTION

Humans can predict the intentions of others by observing their actions.
We would also expect robots to be able to predict human intentions so that human-robot interactions can be safer and more efficient [1][2][3], just as humans do when collaborating with one another. Besides human-robot interaction, human intention prediction is also a core technology for a variety of applications (e.g., rehabilitation devices that predict trainees’ intention to slow down [4], prediction of pedestrians’ intention to cross the road [5], driving assistance systems that predict drivers’ intention to change lanes [6], and surveillance and security systems that predict the intentions behind detected abnormal human activities [7]).

Human intention prediction is closely related to human action recognition, but the purpose of the classification differs. Action recognition [8][9] classifies different actions based on observed action sequences. Intention prediction [10][11][12], by contrast, infers the intention behind an action from the subtle motion patterns of the same action. Figure 1 illustrates the difference between action recognition and intention prediction. For action recognition, one needs to distinguish different action sequences, such as shooting an arrow and pitching a ball. The spatial and temporal patterns of these two action sequences are very different. For intention prediction, one predicts which target the participant is pitching the ball to. The two pitching sequences have similar spatial and temporal patterns, which makes intention prediction challenging.

International Journal of Artificial Intelligence and Applications (IJAIA), Vol.13, No.1, January 2022. DOI: 10.5121/ijaia.2022.13101