Medical Image Segmentation by Using Reinforcement Learning Agent
Mahsa Chitsaz
Faculty of Computer Science and Information
Technology, University of Malaya
mchitsaz@perdana.um.edu.my
Woo Chaw Seng
Faculty of Computer Science and Information
Technology, University of Malaya
cswoo@um.edu.my
Abstract— Image segmentation still requires improvement despite decades of research. This is due to several factors. Firstly, most image segmentation solutions are problem-specific. Secondly, medical image segmentation methods are generally limited because medical images have very similar gray levels and textures among the objects of interest. The goal of this work is to design a framework that extracts several objects of interest from Computed Tomography (CT) images simultaneously. Our method does not need a large training set or a priori knowledge. The learning phase is based on reinforcement learning (RL). The input image is divided into several sub-images, and an RL agent works on each of them to find the suitable value for each object in the image. Each state in the environment has a set of associated actions, and a reward function computes a reward for each action of the RL agent. Finally, the valuable information is stored in a Q-Matrix, and the final result can be applied to the segmentation of new, similar images. The experimental results for cranial CT images demonstrated segmentation accuracy above 93%.
Keywords- Biomedical image segmentation; multi-agent
system; reinforcement learning system
I. INTRODUCTION
Image segmentation is an invaluable task in many domains, for example quantification of tissue volumes, medical diagnosis, pathological localization, anatomical structure study, treatment planning, partial volume correction of functional imaging data, and computer-integrated surgery[1]. Image segmentation separates an image into disjoint partitions whose union reconstructs the original image. It remains an open problem despite the many research efforts of the last few decades[2]. First of all, every solution for image segmentation is problem-specific. Secondly, medical image segmentation methods are generally limited because medical images have very similar gray levels and textures among the objects of interest; therefore, significant segmentation errors may occur. Another difficulty arises from the lack of sufficient training samples: some supervised segmentation methods require training samples prepared by field experts. Consequently, a more universal segmentation approach should reduce the level of user interaction and require a minimal training data set.
Bearing in mind the above obstacles of medical image segmentation, we propose a new algorithm based on reinforcement learning (RL). An agent can learn to perform segmentation over time by systematic trial and error[3]. The RL agent is trained by receiving rewards or punishments based on its actions in an environment. Due to its dynamic nature, an RL agent is suitable for segmenting images of high complexity. The goal of the RL agent is to find an optimal way to reach the best answer, given the signals obtained after each action.
The states and actions must be defined when applying RL to medical image segmentation; a state can be defined as a region within the image. Firstly, the agent takes an image and applies some values to it. The input image is divided into several sub-images, and an RL agent works on each of them to find the suitable value for each object in the image. Each state in the environment has a set of associated actions, and a reward function computes a reward for each action of the RL agent. The agent therefore tries to learn which actions gain the highest reward. Finally, the gained information is used to segment new, similar images.
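As a rough illustration of this learning scheme, the following sketch applies one-step Q-learning where each sub-image is a state and each action selects a candidate value for it. The state encoding, the placeholder reward function, and all parameter values are our own assumptions for illustration, not the paper's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each sub-image is a state, and an action picks one of
# several candidate values for that sub-image.
n_states = 16          # number of sub-images
n_actions = 8          # candidate values per sub-image
alpha, gamma, eps = 0.1, 0.9, 0.2

Q = np.zeros((n_states, n_actions))   # the Q-Matrix mentioned in the text

def reward(state, action):
    # Placeholder reward: in the paper this would score the quality of the
    # resulting segmentation; here a fixed optimum stands in for it.
    best = state % n_actions
    return 1.0 if action == best else -0.1

for episode in range(5000):
    s = int(rng.integers(n_states))
    # Epsilon-greedy action selection.
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
    r = reward(s, a)
    # One-step Q-learning update; the next state is taken to be the next sub-image.
    s_next = (s + 1) % n_states
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

# After training, the greedy action per state gives the learned value index.
policy = np.argmax(Q, axis=1)
```

The stored Q-Matrix is what allows the learned policy to be reused on new, similar images, as the text describes.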
The main purpose of this work is to segment several different regions of interest in a medical image simultaneously. This is a significant advantage compared to other approaches, because many objects within an image can be segmented concurrently. In addition, our method does not need a large training set or a priori knowledge.
We present a short description of reinforcement learning
in Section II. Section III gives the details of the approach and
discusses algorithms used in this research. Section IV
analyses the experimental results. Finally, Section V
concludes our work.
II. BACKGROUND
A. Reinforcement-Learning Model
Learning to act in ways that are rewarded is a sign of
intelligence. For example, a circus elephant can be trained by rewarding it when it responds correctly to a command, a principle long studied in experimental psychology[4]. In the standard reinforcement learning
model, an agent interacts with its environment via perception
and action as depicted in Figure 1. On each step of
interaction the agent receives as input, i, the current state, s,
of the environment; the agent then chooses an action, a, to
generate an output. The action changes the state of the
environment and the value of this state transition is then
received by the agent through a reinforcement signal
(reward/punishment), r. The agent's behavior, B, should
International Conference on Digital Image Processing
978-0-7695-3565-4/09 $25.00 © 2009 IEEE
DOI 10.1109/ICDIP.2009.14
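The perception-action loop of this standard model can be sketched as follows. The toy environment, its states, and the random behavior are illustrative placeholders only, not the segmentation task itself.

```python
import random

class Environment:
    """Toy environment: states 0..4; an action of +1/-1 moves the state,
    and reaching state 4 is rewarded (all names are illustrative)."""
    def __init__(self):
        self.state = 0

    def step(self, action):
        self.state = max(0, min(4, self.state + action))
        r = 1.0 if self.state == 4 else -0.1   # reinforcement signal r
        return self.state, r

env = Environment()
total = 0.0
for t in range(20):
    s = env.state                  # input i: the current state s
    a = random.choice([-1, +1])    # behavior B: here, a random policy
    s, r = env.step(a)             # the action changes the environment's state
    total += r                     # the agent accumulates the reinforcement signal
```

In a learning agent, the random choice of `a` would be replaced by a policy that is updated from the received rewards, as in the Q-learning scheme used in this work.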