An HPC Infrastructure for Processing and
Visualizing Neuro-anatomical Images Obtained by
Confocal Light Sheet Microscopy
Alessandro Bria∗, Giulio Iannello†, Paolo Soda†, Hanchuan Peng‡,
Giovanni Erbacci§, Giuseppe Fiameni§, Giacomo Mariani§, Roberto Mucci§, Marco Rorro§,
Francesco Pavone¶, Ludovico Silvestri¶, Paolo Frasconi‖ and Roberto Cortini‖
∗Department of Electrical and Information Engineering, University of Cassino and Lazio Meridionale, Cassino (FR), Italy
†Integrated Research Center, University Campus Bio-Medico of Rome, Italy
‡Allen Institute for Brain Science, Seattle, WA, USA and Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
§SuperComputing Applications and Innovation Department, Cineca - Interuniversity Consortium, Casalecchio di Reno (BO), Italy
¶European Laboratory for Non-Linear Spectroscopy (LENS), University of Florence, Italy
‖Information Engineering Department, University of Florence, Italy
Abstract—Scientific problems involving the processing of large amounts of data require the integration of suitable services and applications that let researchers interact with high performance computing resources. Easier access to these resources has a profound impact on research in neuroscience, leading to advances in the management and processing of neuro-anatomical images. An ever increasing amount of data is constantly collected, with a consequent demand for top-class computational resources to process it. In this paper, an HPC infrastructure for the management and processing of neuro-anatomical images is presented, together with the effort made to optimize and integrate specific applications in order to fully exploit the available resources.
Keywords—HPC, Data, Neuroscience, Visualisation, Human
Brain Project, Confocal Microscopy
I. INTRODUCTION
Contemporary science has to cope with an ever increasing amount of data, and the biological sciences are no exception. Indeed, the automation of imaging techniques such as optical and electron microscopy is making it possible to collect larger and larger image datasets, which nowadays easily exceed one terabyte each. In parallel with the technical developments improving the speed and throughput of data generation, new computational paradigms are needed to cope with these large datasets and extract new insights from them. The research field concerned with the computational analysis of biological images has recently been named Bioimage Informatics [1], and several tools are now available to deal with important problems in bioimage analysis. However, even state-of-the-art tools generally cannot cope with images exceeding tens of gigabytes in size, so novel tools are needed, specifically designed to operate on terabyte-sized datasets. A further issue that arises when dealing with very large images is that the whole processing pipeline, from image acquisition to storage and retrieval, has to be carefully designed and implemented to keep resource requirements and response times within acceptable limits. In particular, high performance computing techniques have to be extensively employed to meet application requirements.
To cope with the increasing complexity of manipulating and processing these very large datasets, an IT infrastructure has been set up to provide data management and high performance computing capabilities. The data handled are mouse brain images obtained using CLSM (Confocal Light Sheet Microscopy) [2], a confocal ultra-microscopy technique in which selectively labelled neurons are imaged by light-sheet based microscopy [3], [4] with micron-scale resolution. The data obtained from an experiment conducted on a mouse brain (1 cubic cm at micrometric resolution) can be in the range of 1 terabyte or more.
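This order of magnitude follows directly from the imaging geometry: assuming a voxel size of about 1 µm and one byte per voxel, a 1 cm³ volume yields (10⁴)³ = 10¹² voxels, i.e. roughly 1 terabyte, before accounting for multiple channels or overlapping tiles.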
Specific applications have been implemented in a toolkit made available to scientists in order to perform: 1) fully automated 3D stitching starting from the acquired raw data; 2) semi-automatic extraction of morphological characteristics (e.g., neuron localization) [2], [5]; and 3) interactive visualization and annotation of the images. Data and software tools, as well as the processing algorithms, are made available through a dedicated storage and computational infrastructure operated by Cineca [6], the largest Italian computing center.
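As an illustration, the batch stages of such a toolkit can be driven by a thin orchestration layer running against the shared filesystem. The following minimal Python sketch chains the first two stages; the command names (stitch_volume, locate_neurons) and paths are hypothetical placeholders, not the actual tools of the infrastructure described here:

    # Hypothetical orchestration of the batch stages of the toolkit;
    # command names and paths are illustrative placeholders.
    import subprocess
    from pathlib import Path

    RAW = Path("/gpfs/data/raw_tiles")      # raw tiles from the microscope
    STITCHED = Path("/gpfs/data/stitched")  # output of the 3D stitching stage
    RESULTS = Path("/gpfs/data/results")    # extracted neuron positions

    def run(cmd):
        """Run one pipeline stage, aborting the whole pipeline on failure."""
        print("->", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1) fully automated 3D stitching of the acquired raw data
    run(["stitch_volume", str(RAW), "--out", str(STITCHED)])

    # 2) semi-automatic extraction of morphological characteristics
    run(["locate_neurons", str(STITCHED), "--out", str(RESULTS / "neurons.csv")])

    # 3) interactive visualization and annotation are then performed on the
    #    stitched volume with a 3D viewer, outside this batch script.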
Datasets originating from the CLSM at the European Laboratory for Non-Linear Spectroscopy (LENS) [7] are transferred to Cineca using a high performance protocol (i.e., GridFTP) and successively processed.
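GridFTP transfers are typically launched with the globus-url-copy client of the Globus Toolkit. The sketch below shows how such a transfer could be driven from Python; the endpoint hostname and dataset paths are hypothetical placeholders:

    # Minimal sketch of a GridFTP transfer via globus-url-copy (Globus Toolkit);
    # the endpoint hostname and paths are hypothetical placeholders.
    import subprocess

    src = "file:///data/clsm/brain_acq_001/"  # local dataset directory at the acquisition site
    dst = "gsiftp://gridftp.example.cineca.it/archive/clsm/brain_acq_001/"

    subprocess.run([
        "globus-url-copy",
        "-vb",                 # report transfer performance while copying
        "-r",                  # recurse into the dataset directory
        "-p", "8",             # use 8 parallel TCP streams
        "-tcp-bs", "4194304",  # 4 MB TCP buffer for high-latency WAN links
        src, dst,
    ], check=True)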