Retinal Image Registration Based on Keypoint Correspondences,
Spherical Eye Modeling and Camera Pose Estimation
Carlos Hernandez-Matas¹,², Xenophon Zabulis¹ and Antonis A. Argyros¹,²
Abstract— In this work, an image registration method for
two retinal images is proposed. The proposed method utilizes
keypoint correspondences and assumes a spherical model of
the eye. Image registration is treated as a pose estimation
problem, which requires estimation of the rigid transformation
that relates the two images. Using this estimate, one image can
be warped so that it is registered to the coordinate frame of the
other. Experimental evaluation shows improved accuracy over
state-of-the-art approaches as well as robustness to noise and
spurious keypoint correspondences. Experiments also indicate
the method’s applicability to diagnostic image enhancement and
comparative analysis of images from different examinations.
I. INTRODUCTION
Assessment of small vessel structure and function can lead
to more accurate and timely diagnosis of diseases whose
common denominator is vasculopathy, e.g., hypertension and
diabetes [1]. Small vessels exist in all internal and external
organs; among these, the retina provides an open and accessible
window for assessing their condition. Retinal vessels are
imaged through fundoscopy, an efficient and non-invasive
imaging technique that is suitable for screening. Accurate
image registration is of interest in the comparison of images
from different examinations [2] and in the combination of
multiple images into larger [3] or enhanced [4] ones.
Image registration has been employed frequently on
slightly overlapping images of the same examination, to
create mosaic images of large tissue areas, e.g., [3]. Small
overlap increases examination efficiency, but increases registration
difficulty, as registration must then rely on less data. This difficulty
is tackled by strong registration cues, such as keypoint
correspondences, e.g., [5].
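As a concrete illustration of the keypoint cue, the sketch below performs nearest-neighbour descriptor matching with a ratio test, a common way to suppress spurious correspondences. The helper `match_descriptors` and the toy descriptors are illustrative assumptions, not code or data from the paper.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b,
    keeping only matches whose best distance is clearly smaller than the
    second-best (ratio test), which rejects ambiguous correspondences."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j, k = np.argsort(dists)[:2]        # nearest and second-nearest
        if dists[j] < ratio * dists[k]:     # keep only distinctive matches
            matches.append((i, int(j)))
    return matches

# Two distinctive descriptors match unambiguously:
a = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0], [5.0, 5.0]])
print(match_descriptors(a, b))  # → [(0, 1), (1, 0)]
```

Correspondences that survive such a test are exactly the "strong cue" exploited by methods like [5] and the one proposed here.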
Less frequently, image registration has been employed to
register images of (approximately) the same retinal region.
The motivation is twofold: first, to combine images from the
same examination into an image of higher resolution, facilitating
more precise measurements [6], [7], [4]; second, to
register images from different examinations and comparatively
analyze them [2], [8].
In this work, the image registration problem refers to a
pair of images, the reference and the test image. Its solution
is the aligning transformation that warps the test image so
that the same physical points are imaged in the same pixel
coordinates as in the reference image. Henceforth, image
registration methods that provide a solution by means of
transformation(s) upon the image plane are characterized
as “2D”, while methods that account for the retina as a
surface imaged from different views are characterized as “3D”.

This research was made possible by a Marie Curie grant from the
European Commission in the framework of the REVAMMAD ITN (Initial
Training Research Network), Project number 316990.
¹ Institute of Computer Science, Foundation for Research and Technology
– Hellas (FORTH), Heraklion, Greece.
² Computer Science Department, University of Crete, Heraklion, Greece.
{carlos, zabulis, argyros} at ics.forth.gr
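To make the "2D" case concrete, the following minimal sketch maps test-image pixel coordinates through a 3×3 projective transform on the image plane, the kind of solution a 2D method produces. The helper `warp_points` is a hypothetical illustration, not code from the paper.

```python
import numpy as np

def warp_points(H, pts):
    """Map Nx2 pixel coordinates through a 3x3 projective transform H."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coords
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # dehomogenize

# A pure translation by (5, -3) pixels:
H = np.array([[1.0, 0.0,  5.0],
              [0.0, 1.0, -3.0],
              [0.0, 0.0,  1.0]])
pts = np.array([[10.0, 20.0], [0.0, 0.0]])
print(warp_points(H, pts))  # each point shifted by (5, -3)
```

A 3D method, in contrast, does not stop at such an image-plane transform: it explains the two images through a surface model and camera poses.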
The proposed method focuses on the image registration
cue provided by keypoint correspondences. The additional value of
other cues, e.g., edge and bifurcation matching, is acknowledged;
the proposed framework is open to such additional cues, and their
adoption is left for future work.
II. RELATED WORK
For retinal image registration, overlapping image regions
have been matched using similarity of intensities over spatial
regions [9] or in the frequency domain [10], keypoint feature
correspondences [5], or retinal feature matching, e.g., vessel trees
[11] and bifurcations [12]. Feature-based approaches are preferred
in 3D approaches, as point correspondences comprise
a relatively stronger cue for estimating the motion between
two images and, in addition, are robust to local image differences.
Retinal image registration has been studied using 2D and
3D transformation models. 2D models do not explicitly
account for perspectivity [11], though some [12] employ non-
linear transformations for this purpose. 3D models account
for perspectivity, but require the shape of the imaged surface.
Consideration of perspectivity improves image registration.
Even simple surface models, such as a planar patch, were shown to
promote registration accuracy [4]. At the other end of the spectrum,
in [5], the retinal surface is reconstructed to achieve registration.
However, this requires a stereo reconstruction of the retina,
which is inaccurate for significantly overlapping images due
to the very short baseline.
Fundus imaging has been modeled by the pinhole camera
model [5]. Usually, lens distortion has been judged negligible,
due to the fine optics of fundus cameras. Visual distortions
due to the cornea, the eye lens, and the vitreous humor,
as well as pulsation, have also been approximated as negligible.
We follow these approximations, acknowledging that
compensating for the pertinent distortions would further increase the
accuracy of the proposed method.
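Under the pinhole model, a 3D surface point X is mapped to pixel coordinates through the camera intrinsics K and a pose {R, t}. The sketch below illustrates this projection; the helper `project` and the intrinsic values are illustrative assumptions, not the paper's calibration.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of Nx3 world points under pose {R, t}."""
    Xc = X @ R.T + t             # world -> camera frame
    x = Xc @ K.T                 # apply intrinsics
    return x[:, :2] / x[:, 2:3]  # perspective division

K = np.array([[800.0,   0.0, 320.0],   # focal length and principal
              [  0.0, 800.0, 240.0],   # point (illustrative values)
              [  0.0,   0.0,   1.0]])
R, t = np.eye(3), np.zeros(3)
X = np.array([[0.0, 0.0, 1.0]])        # point on the optical axis
print(project(K, R, t, X))             # → [[320. 240.]]
```

With the retina modeled as a sphere, the X fed to this projection lie on the spherical surface, and registration reduces to recovering {R, t}.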
The proposed method utilizes a 3D cost optimization
method that is robust to correspondence errors and copes
with local minima. Efficiency is supported by a parallel
implementation, and evaluation shows improved performance
with respect to the state of the art. The method is open to the
addition of more visual cues (e.g., due to edges or intensity).
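One common way to obtain such robustness to correspondence errors (an illustrative choice, not necessarily the paper's exact objective) is a Huber-type penalty on reprojection residuals: quadratic near zero, linear for large residuals, so spurious correspondences cannot dominate the cost.

```python
import numpy as np

def huber(residuals, delta=2.0):
    """Huber penalty: quadratic for |r| <= delta, linear beyond, so
    gross outlier residuals grow only linearly in the total cost."""
    r = np.abs(residuals)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

# An inlier (0.5 px) vs a gross outlier (50 px) reprojection error:
print(huber(np.array([0.5, 50.0])))  # 0.125 and 98.0
```

Summing this penalty over all keypoint correspondences yields a cost in {R, t} that a global or multi-start optimizer can minimize while tolerating mismatches.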
III. METHOD
The proposed method estimates the rigid transformation
{R, t} that relates the reference (F_0) and the test (F_r) image,
978-1-4244-9270-1/15/$31.00 ©2015 IEEE 5650