Mini-Batched Online Incremental Learning Through Supervisory Teleoperation with Kinesthetic Coupling*

Hiba Latifee 1, Affan Pervez 2, Jee-Hwan Ryu 1 and Dongheui Lee 3

Abstract—We propose an online incremental learning approach through teleoperation which allows an operator to partially modify a learned model, whenever necessary, during task execution. Compared to conventional incremental learning approaches, the proposed approach is applicable to teleoperation-based teaching and needs only a partial demonstration, without any need to obstruct the task execution. Dynamic authority distribution and kinesthetic coupling between the operator and the agent help the operator correctly perceive the exact instance at which a modification needs to be asserted in the agent's behaviour online, using a partial trajectory. For this, we propose a variation of the Expectation-Maximization algorithm for updating the original model through mini-batches of the modified partial trajectory. The proposed approach reduces human workload and latency for a rhythmic peg-in-hole teleoperation task where online partial modification is required during task operation.

I. INTRODUCTION

Since Ray Goertz first proposed a pantograph-based mechanical teleoperation system [1], there has been substantial progress in teleoperation. In the early days, teleoperation systems were limited to applications where humans' physical access is restricted, including deep underwater, outer space, and nuclear power plants [2]. However, recent advancements in human-computer interfaces and artificial intelligence technology have broadened its application domains [3, 4] to more general areas, including telerobotic surgery, treatments and diagnosis [5], robotic gripping/grasping [6, 7], search and rescue (SAR) [8], and robotic rehabilitation [9].
Teleoperation tasks often demand highly trained operators, and the operator's mental workload grows as the complexity of the teleoperation task increases and more general applications are introduced [10]. Shared teleoperation [11], which combines local autonomy with direct teleoperation, can relieve the operator's mental burden by shifting some of the workload, especially for repetitive tasks, from the operator to the local autonomy of the slave.

There have been efforts to shift repetitive tasks from a human operator to an artificial agent through Learning from Demonstrations (LfD) via teleoperation [12, 13, 14]. Although LfD through teleoperation showed the possibility of relieving the operator's workload, at least for repetitive tasks, a slight change in the task or the environment requires retraining from the beginning, which can be time-consuming and computationally expensive. A human operator working in parallel with an agent can handle these variations in the task or the environment [15], hence avoiding task failure. However, due to the operator's active participation on the control layer alongside the agent, the issue of heavy mental workload remains prevalent.

*This research is partially supported by the project "Toward the Next Generation of Robotic Humanitarian Assistance and Disaster Relief: Fundamental Enabling Technologies (10069072)" and the Helmholtz Association.
1 Hiba Latifee and Jee-Hwan Ryu are with the Department of Civil and Environmental Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea. h.o.latifee@gmail.com, jhryu@kaist.ac.kr
2 Affan Pervez is with Intech Process Automation, Lahore, Pakistan. affan.pervez@intechww.com
3 Dongheui Lee is with the Department of Electrical and Computer Engineering, Technical University of Munich (TUM), Munich, Germany, and also with the Institute of Robotics and Mechatronics, German Aerospace Center (DLR), Germany. dhlee@tum.de
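To make the shared-teleoperation idea concrete, the sketch below blends the human's command with the agent's autonomous command according to a control-authority factor. This is an illustrative toy scheme, not the authors' actual controller: the exponential authority rule, the gain `k`, and the function names are assumptions made for the example.

```python
import numpy as np

def authority(human_cmd, agent_cmd, k=2.0):
    """Hypothetical authority rule: the more the operator's command deviates
    from the agent's prediction, the more authority shifts to the human.
    Returns alpha in [0, 1): 0 = full autonomy, -> 1 = full teleoperation."""
    discrepancy = np.linalg.norm(np.asarray(human_cmd, float)
                                 - np.asarray(agent_cmd, float))
    return 1.0 - np.exp(-k * discrepancy)

def blended_command(human_cmd, agent_cmd, k=2.0):
    """Convex blend of human and agent commands (shared control)."""
    alpha = authority(human_cmd, agent_cmd, k)
    return (alpha * np.asarray(human_cmd, float)
            + (1.0 - alpha) * np.asarray(agent_cmd, float))
```

Under such a scheme, when the operator agrees with the agent, alpha stays near zero and the local autonomy carries the repetitive task; a deliberate deviation raises alpha and hands control back to the operator.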
In such cases, incremental learning from human demonstrations helps to update the already learned task [16, 17, 18]. In [19], the authors update the model incrementally through an iterative Expectation-Maximization (EM) algorithm, and in [20] local data points were utilized to incrementally learn a new model at each time step in a simulated environment. However, even for partially modifying the behaviour of the initially learned agent, a completely new full trajectory has been required, and the model was updated offline once the full demonstration terminated [21]. Moreover, as can be noted in the aforementioned studies, both the robot and the human demonstrator were physically co-located to incrementally update the learned model [22]. While [23] allows autonomous refinement of the learned model through shared teleoperation, it requires multiple task demonstrations from the human operator to assert a modification, which can be both costly and cumbersome.

One of the major limitations of conventional incremental learning approaches (most of which are, moreover, limited to kinesthetic teaching) is that they do not support updating the agent's behaviour through a partial demonstration without obstructing the task. Especially for a partial modification in repetitive tasks, e.g. when one bolting position out of ten is changed, updating the model on the fly would provide a large benefit: the task does not need to be obstructed, which matters when it is highly time-critical, and the quality of the already well-trained model can be maintained. However, the impact of partial demonstrations on incremental learning has been understudied, particularly for rhythmic/repetitive tasks. Oftentimes, a complete new demonstration is not even needed (as only a part of the motion needs to be updated) or even available [13].
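To illustrate how a mini-batch of a partial trajectory can update an already learned model, the sketch below shows one plausible form of a mini-batch EM step for a Gaussian mixture model encoding the task: an E-step over the batch, an M-step on the batch statistics, and a convex blend with the existing parameters so that components unsupported by the batch are left untouched. This is a generic stepwise/online EM sketch under stated assumptions, not the paper's exact algorithm; the function names, blending rate `lr`, and support threshold are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate Gaussian density evaluated at each row of x (N x D)."""
    d = mean.shape[0]
    diff = x - mean
    inv = np.linalg.inv(cov)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    expo = -0.5 * np.einsum('nd,dk,nk->n', diff, inv, diff)
    return np.exp(expo) / norm

def minibatch_em_update(batch, weights, means, covs, lr=0.2, reg=1e-6):
    """One mini-batch EM step on a GMM: E-step over the batch, M-step on the
    batch statistics, then a convex blend with the previous parameters so
    that only the locally relevant part of the model is modified.

    batch   : (N, D) points from the modified partial trajectory
    weights : (K,), means : (K, D), covs : (K, D, D) current GMM parameters
    lr      : blending rate between old parameters and batch statistics
    """
    K, N = len(weights), len(batch)
    # E-step: responsibilities of each component for each batch point
    resp = np.stack([w * gaussian_pdf(batch, m, c)
                     for w, m, c in zip(weights, means, covs)], axis=1)
    resp /= resp.sum(axis=1, keepdims=True) + 1e-300  # guard against underflow
    Nk = resp.sum(axis=0)

    new_w, new_means, new_covs = weights.copy(), means.copy(), covs.copy()
    for k in range(K):
        if Nk[k] < 1e-3 * N:  # component unsupported by this batch: keep it
            continue
        new_w[k] = Nk[k] / N
        new_means[k] = resp[:, k] @ batch / Nk[k]
        diff = batch - new_means[k]
        new_covs[k] = ((resp[:, k, None] * diff).T @ diff / Nk[k]
                       + reg * np.eye(batch.shape[1]))

    # Blend old and batch parameters, then renormalise the mixture weights
    weights = (1.0 - lr) * weights + lr * new_w
    weights /= weights.sum()
    means = (1.0 - lr) * means + lr * new_means
    covs = (1.0 - lr) * covs + lr * new_covs
    return weights, means, covs
```

Because the blend only moves components that carry responsibility for the batch, the remainder of the well-trained model is preserved while the modified portion of the motion is absorbed online, batch by batch.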
Although there have been research efforts to carry out online model refinement through kinesthetic teaching [24, 25], there has been no prior investigation into an online model update through a partial teleoperated demonstration without obstructions. In this paper, we propose an online incremental learn-