Markus Philipp, Anna Alperovich, Alexander Lisogorov, Marielena Gutt-Will, Andrea Mathis, Stefan Saur, Andreas Raabe, Franziska Mathis-Ullrich

Annotation-efficient learning of surgical instrument activity in neurosurgery

https://doi.org/10.1515/cdbme-2022-0008

Abstract: Machine learning-based solutions rely heavily on the quality and quantity of the training data. In the medical domain, the main challenge is to acquire rich and diverse annotated datasets for training. We propose to decrease the annotation effort and further diversify the dataset by introducing an annotation-efficient learning workflow. Instead of costly pixel-level annotations, we require only image-level labels; the remainder is covered by simulation. Thus, we obtain a large-scale dataset with realistic images and accurate ground-truth annotations. We use this dataset for the instrument activity localization task together with a student-teacher approach. We demonstrate the benefits of our workflow compared to state-of-the-art instrument localization methods that are trained only on clinical datasets fully annotated by human experts.

Keywords: Annotation-efficient learning, neurosurgery, instrument localization, medical deep learning

1 Introduction

The lack of large annotated datasets is one of the main challenges in medical deep learning. It stems from the fact that creating such datasets is constrained by cost- and time-intensive annotation, which often requires medical expertise. Annotations are especially expensive at the pixel level, such as segmentations or bounding boxes. To address this constraint, annotation-efficient learning has become an important topic in medical deep learning [1].

We focus on the problem of localizing surgical instrument activity in neurosurgical microscope video data, see Fig. 1 (a), which is a cornerstone of computer-assisted surgery. To train deep learning models in our prior work [2], annotators manually labelled instrument tips with bounding boxes, from which we computed instrument activity labels, see Fig. 1 (b). Creating a medium-sized annotated dataset took hundreds of hours and many annotation rounds; a large-scale dataset would require even more time and human effort. In this work, we investigate annotation-efficient learning to save annotation labour for similar problems in the future.

Contributions. We propose an annotation-efficient learning workflow for surgical instrument activity localization. We abstain from costly pixel-level bounding box annotations and resort to cheaper image-level labels, which merely require annotators to decide whether an instrument is present in a given frame. Based on these image-level annotations, we create a hybrid-synthetic data domain in which instrument activity labels can be computed automatically. In this way, we combine the advantages of human-made image-level annotations and machine-made pixel-level annotations. This approach speeds up the annotation process and diversifies the dataset with more instrument shapes and positions. We then formulate a student-teacher approach to learn instrument activity localization, using our hybrid-synthetic data domain as a proxy to guide the student. While we achieve competitive results compared to a model trained on a dataset with costly manual bounding box annotations, our approach saves approximately 75% of the annotation work.
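As a minimal illustration of the post-processing step referenced in Fig. 1 (b), the sketch below derives an activity label from a manually annotated tip bounding box. The exact post-processing is not specified in this section; the assumption here (a binary activity region around the tip centre) and the function name `activity_label_from_box` are hypothetical placeholders, not the authors' implementation.

```python
# Hypothetical sketch: turning a manual tip bounding box into an
# instrument-activity label (cf. Fig. 1 (b)). Assumes the activity label
# is a binary mask around the annotated tip centre.
import numpy as np

def activity_label_from_box(box, frame_shape, radius=40):
    """box: (x1, y1, x2, y2) tip bounding box in pixels.
    frame_shape: (height, width) of the video frame.
    Returns a binary mask marking the activity region around the tip."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0   # tip centre
    h, w = frame_shape
    ys, xs = np.ogrid[:h, :w]
    # Mark all pixels within `radius` of the tip centre as "active".
    mask = (xs - cx) ** 2 + (ys - cy) ** 2 <= radius ** 2
    return mask.astype(np.uint8)

label = activity_label_from_box((310, 220, 350, 260), frame_shape=(540, 960))
```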
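The student-teacher formulation is only outlined above; the following PyTorch sketch shows one plausible reading, in which a teacher trained on the hybrid-synthetic domain supplies soft activity maps as a proxy signal for a student trained on unlabelled clinical frames. The network definitions, data loader, and loss choice are assumptions for illustration, not the authors' implementation.

```python
# Minimal student-teacher sketch under the assumptions stated above.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in fully convolutional nets; the paper's architectures are not given here.
def make_net():
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 1, 1),            # one activity-logit channel
    )

teacher = make_net().eval()   # assumed pre-trained on the hybrid-synthetic domain
student = make_net()
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

# A clinical loader would yield batches of unlabelled frames;
# a random batch stands in so the sketch runs end to end.
clinical_loader = [torch.rand(2, 3, 128, 128)]

for frames in clinical_loader:
    with torch.no_grad():
        pseudo = torch.sigmoid(teacher(frames))     # teacher's soft activity maps
    logits = student(frames)
    # Distil the teacher's localization signal into the student.
    loss = F.binary_cross_entropy_with_logits(logits, pseudo)
    opt.zero_grad()
    loss.backward()
    opt.step()
```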
*Corresponding author: F. Mathis-Ullrich: Health Robotics and Automation (IAR-HERA), Karlsruhe Institute of Technology (KIT), Karlsruhe, DE, e-mail: franziska.ullrich@kit.edu
M. Philipp: Health Robotics and Automation (IAR-HERA), KIT, Karlsruhe, DE & Carl Zeiss Meditec AG, Oberkochen, DE
A. Alperovich: Carl Zeiss AG, Oberkochen, DE
A. Lisogorov, S. Saur: Carl Zeiss Meditec AG, Oberkochen, DE
M. Gutt-Will, A. Mathis, A. Raabe: University Hospital Bern, CH

Figure 1: (a) A neurosurgical scene (left) with surgical instrument activity as a yellow overlay (right). (b) Bounding box annotation for the same scene (top) and post-processing to obtain surgical activity labels (bottom).