An Interactive Learning Approach to Histology Image Segmentation

Michaël Derde (a), Laura Antanas (a), Luc De Raedt (a), Fabian Guiza Grandas (b)

(a) Katholieke Universiteit Leuven, Department of Computer Science
(b) Katholieke Universiteit Leuven, Laboratory of Intensive Care Medicine

Abstract

Histology image analysis using computer-aided diagnosis systems has become increasingly important in recent years. One reason is the need to alleviate the heavy workload of medical experts. In this paper, we introduce a general-purpose framework that can solve histology analysis problems without being restricted to a specific type of tissue or task, exploit local information in microscopic images, interact with medical experts, and iteratively incorporate direct user feedback. The framework is general enough to learn models that adapt to several learning tasks and detect several types of regions of medical interest. We evaluate our framework on real-world datasets collected from patients in the intensive care unit. It considerably outperforms image processing techniques commonly used in such medical imaging tasks.

1 Introduction

Histology is the anatomical study of the microscopic structure of tissues. It is regarded as a gold standard for clinical diagnosis of diseased tissue (e.g., cancer) and for the identification of therapy effects [10]. Histological analysis is performed by examining a thin section of tissue under a microscope [13, 20, 16], after applying a sequence of procedures for tissue preparation: fixation, dehydration, clearing, infiltration, embedding, sectioning and staining [14]. It reveals information about cells and tissue with a high level of detail. Despite the great care taken in their preparation, histology images are prone to several artifacts, e.g., folding of the tissue section, overlap among cell boundaries, noise introduced by the microscope or slides, and blurry sections.
As a result, the analysis of histology tissues remains largely a manual endeavor that relies heavily on the expertise of the medical expert. This manual work is, however, very time consuming and prone to subjective interpretation. Therefore, computer-assisted diagnosis (CAD) systems are becoming crucial in histology analysis, as they could automatically identify regions of medical interest. Their main advantage is the ability to provide immediate results in a consistent and objective manner, thereby reducing the workload of the medical experts.

With few exceptions [24], current image analysis tools focus on specific tasks, such as nuclei and cell counting, and lack the flexibility to deal with the variety of tissue types that might be of interest to medical experts. Moreover, most systems make use of traditional image processing approaches such as global thresholding, region growing, region splitting and merging, and active contours (for more details see [18, 28]). Their main drawback is that they fail to account for local variations (e.g., in brightness or staining intensity) within a single image that are introduced by the microscope or the lighting conditions.

In this paper we present a new CAD framework which learns to automatically detect regions of interest in histology images. It overcomes the drawbacks of current approaches by combining an interactive learning technique that adapts to the specific user-defined medical task with a local approach that takes into account local variations of particular regions of interest. In the basic supervised learning paradigm, the expert is asked to label examples and a predictor is then learned from these targets, without any further explicit interaction. Our framework instead uses expert knowledge to interactively feed training instances to the learning system. Once a new instance has been added by the expert, a completely new model is built from scratch, as a new supervised learning step.
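The interactive loop described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the tiny regression tree, the single hypothetical intensity feature, the `expert_label` function standing in for the human expert, and the rule of correcting the instance with the largest current error are all assumptions made for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared errors around the mean (the split cost)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(X, y, depth=3):
    """Fit a tiny regression tree; a leaf is a float, an inner node a dict."""
    if depth == 0 or len(y) < 2 or len(set(y)) == 1:
        return mean(y)
    best = None
    for f in range(len(X[0])):
        # Candidate thresholds: every distinct value except the smallest,
        # so that both sides of the split are non-empty.
        for t in sorted({row[f] for row in X})[1:]:
            left = [i for i, row in enumerate(X) if row[f] < t]
            right = [i for i, row in enumerate(X) if row[f] >= t]
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    if best is None:  # all instances share the same feature values
        return mean(y)
    _, f, t, left, right = best
    return {
        "feature": f,
        "threshold": t,
        "left": fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1),
        "right": fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1),
    }


def predict(tree, x):
    while isinstance(tree, dict):
        branch = "left" if x[tree["feature"]] < tree["threshold"] else "right"
        tree = tree[branch]
    return tree


def interactive_training(pool, expert_label, max_rounds):
    """Add one expert-provided instance per round and rebuild the model."""
    X, y = [], []
    model = 0.0  # before any feedback: a constant prediction (assumption)
    for _ in range(max_rounds):
        candidates = [x for x in pool if x not in X]
        if not candidates:
            break
        # Simulated expert: corrects the instance the current model gets most wrong.
        x_new = max(candidates, key=lambda x: abs(predict(model, x) - expert_label(x)))
        if predict(model, x_new) == expert_label(x_new):
            break  # the expert has nothing left to correct
        X.append(x_new)
        y.append(expert_label(x_new))
        model = fit_tree(X, y)  # a completely new model, built from scratch
    return model, X, y


# Toy data: one hypothetical intensity feature per pixel; the simulated
# expert marks pixels with intensity >= 0.5 as belonging to the region.
pool = [(i / 10,) for i in range(10)]


def expert_label(x):
    return 1.0 if x[0] >= 0.5 else 0.0


model, X, y = interactive_training(pool, expert_label, max_rounds=10)
print(len(X), "labeled instances were needed")
```

In this toy setting the loop stops once the simulated expert finds no prediction left to correct, so far fewer instances are labeled than the pool contains. A real session would replace `expert_label` with feedback given on the displayed segmentation.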
In this way, by changing the learning targets after each iteration, our framework can incorporate real-time feedback on the current model predictions, reducing both the training time and the amount of training data required. Once the model has been trained, it can be used to automatically process any number of images. We formalize the supervised learning problem as a regression task and we employ regression trees to represent