Roman Bruch*, Rüdiger Rudolf, Ralf Mikut, and Markus Reischl
Evaluation of semi-supervised learning using
sparse labeling to segment cell nuclei
https://doi.org/10.1515/cdbme-2020-3103
Abstract: The analysis of microscopic images from cell cultures plays an important role in the development of drugs. The segmentation of such images is a basic step to extract the viable information on which further evaluation steps are built. Classical image processing pipelines often fail under heterogeneous conditions. In recent years, deep neural networks have gained attention due to their great potential in image segmentation. One main pitfall of deep learning is the amount of labeled data required to train such models. Especially for 3D images, the process of generating such data is tedious and time-consuming, and is thus seen as a possible reason why deep learning models are not yet established for 3D data. Efforts have been made to minimize the time needed to create labeled training data or to reduce the amount of labels needed for training. In this paper, we present a new semi-supervised training method for the image segmentation of microscopic cell recordings, based on an iterative approach that utilizes unlabeled data during training. This method helps to further reduce the amount of labels required to effectively train deep learning models for image segmentation. By labeling less than one percent of the training data, 90% of the performance of a full annotation with 342 nuclei can be achieved.
Keywords: Sparse labeling, Deep learning, Iterative training,
Semi-supervised learning, Semantic segmentation
1 Introduction
Cell cultures can be used to examine the effectiveness and selectivity of an anti-cancer drug without the need to sacrifice animals. A large part of such studies relies on the evaluation of microscopic images, since they offer a wealth of information. From basic matters like the proliferation of cells up to more advanced aspects like the state of individual cells, many questions can be answered from cell imaging.
*Corresponding author: Roman Bruch, Institute of Molecular
and Cell Biology, Faculty of Biotechnology, Mannheim University of
Applied Sciences, Mannheim, Germany, e-mail:
r.bruch@hs-mannheim.de
Rüdiger Rudolf, Institute of Molecular and Cell Biology, Faculty of
Biotechnology, Mannheim University of Applied Sciences,
Mannheim, Germany
Ralf Mikut, Markus Reischl, Institute for Automation and Applied
Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
To extract this information in a quantitative and objective manner, algorithms are needed. The segmentation of nuclei in microscopic images is a fundamental step on which further actions like cell counting or the co-localization of other fluorescent markers depend. Algorithms like built-in FIJI plugins [13], CellProfiler [9], Mathematica pipelines [14], TWANG [15] and XPIWIT [1] are well established for this task. Typically, these pipelines require properties such as object size and shape and therefore have to be reparameterized for different recording conditions or cell lines. In extreme cases, such as the segmentation of apoptotic cells, parameterization is not sufficient and special algorithms need to be designed [8].
In recent years, deep learning models like the U-Net [11] have gained attention in the biological field due to their great modeling power: U-Nets can outperform classical segmentation methods [3], but to do so they need rich training data sets.
Newly emerging 3D cell cultures represent the living organism more closely than 2D cultures. Data sets are given as stacked image series. Furthermore, new difficulties such as decreasing brightness along the z-axis arise. Classical approaches that are robust against intensity fluctuations exist, but still suffer from the need for parameterization [15]. Deep learning methods like the U-Net can also be used for 3D data [12], but their adoption is hindered by the effort required to create 3D training data sets. The process of generating training data for 3D images is time-consuming and burdensome: since the visualization is effectively limited to 2D slices, it is hard to grasp the object dimensions, which often leads to a loss of overview. In a fully manual approach, each plane of the 3D image has to be labeled individually. It is also challenging to achieve consistent segment borders over consecutive planes [6].
As an example, labeling a single 128×128×32 image patch containing 200 nuclei with the help of an interactive labeling method [16] took 7.5 h. Creating a training data set with only ten such image patches would therefore take 75 h.
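The extrapolation above is linear in the number of patches. As a minimal sketch (the 7.5 h per patch is the figure quoted above; patch counts other than ten are illustrative):

```python
# Estimate manual 3D labeling effort from the figure quoted above:
# one 128x128x32 patch with ~200 nuclei took 7.5 h of interactive labeling.
HOURS_PER_PATCH = 7.5

def labeling_hours(n_patches: int) -> float:
    """Linear extrapolation of annotation time for n_patches patches."""
    return n_patches * HOURS_PER_PATCH

for n in (1, 10, 100):
    print(f"{n:3d} patches -> {labeling_hours(n):6.1f} h")
```

Even this optimistic linear estimate ignores annotator fatigue, which makes the case for reducing the number of required labels all the more pressing.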
Thus, many methods have been developed to reduce the 3D labeling effort. They can be divided into three major approaches: interactive labeling [5, 16], weakly supervised learning [19] and artificial training data [2, 7, 17]. The goal of interactive labeling is to accelerate the annotation process by supporting the user in a semi-automatic manner. Weakly supervised
learning uses different annotations like point or scribble an-
DE GRUYTER Current Directions in Biomedical Engineering 2020;6(3): 20203103
Open Access. © 2020 Roman Bruch et al., published by De Gruyter. This work is licensed under the Creative Commons Attribution 4.0 License.