Standardization of the MSA/MAS/AMAS Hyper-Dimensional File Format
Aaron Torpy¹, Mike Kundmann², Nick Wilson¹, Colin MacRae¹, and Nestor J. Zaluzec³
1. Microbeam Lab, CSIRO Mineral Resources, Private Bag 10, Clayton South, VIC, 3169, Australia.
2. e-Metrikos, P.O. Box 5506, Pleasanton, CA 94566, USA.
3. Electron Microscopy Center, Argonne National Laboratory, Argonne, IL 60439 USA.
In 2010, the Standards Committee of the Microscopy Society of America (MSA) formed a working
group comprising members of the MSA, the Microbeam Analysis Society (MAS) and the Australian
Microbeam Analysis Society (AMAS) to develop a standardized file format to facilitate the exchange of
microscopy datasets with high dimensionality, such as hyperspectral maps. The proposed file format,
known as the MSA/MAS/AMAS hyper-dimensional data file (HMSA, for short), was presented to the
community for comment at the M&M 2011 meeting in Nashville, TN [1], and revisions incorporating
feedback from researchers and vendors were presented at subsequent meetings [2-5]. After a further
round of generalisation and simplification, a finalised specification (HMSA v1.01) was approved by the
MSA Standards Committee in 2016 [6], which has now commenced the formal standardisation process
with the International Organization for Standardization (ISO).
The primary consideration when designing the HMSA format was to ensure the greatest ease of
implementation, so as to enable the rapid and widespread adoption of the format by vendors and
researchers. However, simplicity was not the only design criterion: to be fit for purpose, the format
must also be reasonably compact, and sufficiently flexible to support a wide range of experimental
techniques and measurement modes, including those not yet envisioned. To balance these competing
considerations, the HMSA format was split into two files: a simple binary file (file extension
‘HMSA’) that efficiently stores the hyper-dimensional data in full fidelity, and an accompanying
eXtensible Markup Language (XML) file [7] that defines the size, location and ordering of data in the
binary HMSA file and describes any experimental parameters that may be relevant.
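The role of the XML file can be illustrated with a minimal sketch in Python. The element and
attribute names below (Descriptor, Dataset, Offset, DatumType, Dimension) are invented for this
illustration and are not taken from the HMSA specification; the sketch only shows how a text
descriptor might convey the size, location and ordering of a dataset stored in a separate binary file.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor: every element and attribute name here is an
# assumption made for illustration, NOT the schema defined by HMSA.
DESCRIPTOR = """\
<Descriptor>
  <Dataset name="map">
    <Offset>8</Offset>
    <DatumType>uint16</DatumType>
    <Dimension name="Y">2</Dimension>
    <Dimension name="X">3</Dimension>
    <Dimension name="Channel">4</Dimension>
  </Dataset>
</Descriptor>
"""

# Size in bytes of each datum type used in this sketch (assumed names).
ITEM_SIZES = {"uint8": 1, "uint16": 2, "uint32": 4, "float32": 4, "float64": 8}


def read_layout(xml_text):
    """Return the offset, datum type, shape and byte length of each dataset."""
    layouts = {}
    root = ET.fromstring(xml_text)
    for dataset in root.findall("Dataset"):
        # Dimensions are listed slowest-varying first in this sketch.
        shape = tuple(int(d.text) for d in dataset.findall("Dimension"))
        dtype = dataset.findtext("DatumType")
        count = 1
        for extent in shape:
            count *= extent
        layouts[dataset.get("name")] = {
            "offset": int(dataset.findtext("Offset")),
            "dtype": dtype,
            "shape": shape,
            "nbytes": count * ITEM_SIZES[dtype],
        }
    return layouts


layout = read_layout(DESCRIPTOR)
print(layout["map"])
```

Because the descriptor is plain text, the same information could equally be read or corrected by
hand, which is the property the two-file design relies on.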
The binary HMSA file contains one or more datasets, which are represented as regular arrays (of any
dimensionality) stored as raw uncompressed data. Whilst this simple format may be easily implemented
in any programming language or environment, it is also extremely flexible, as it can support spectra
(1D), images (2D), hyperspectral maps or serial-section images (3D), ‘hyper-image’ maps with an image
at every pixel (4D), and higher dimensional data. Conveniently, this structure of the binary dataset also
enables efficient data processing on very large files by permitting random seeking through the dataset on
disk. In the interests of simplicity, the binary HMSA file does not support internal data compression.
The format was deliberately designed to avoid any dependence on external algorithms or libraries,
which compression may require and which may not be available in all data processing environments or
operating systems. Hence, the format should be applicable to the broadest range of platforms, both
now and in the future.
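The random-seeking property described above can be sketched as follows, using only the Python
standard library. The layout is an assumption made for this example (a row-major Y, X, channel array
of little-endian 16-bit unsigned integers beginning at a placeholder byte offset); in practice these
details would be taken from the accompanying XML file.

```python
import os
import struct
import tempfile

# Assumed layout (hypothetical, for illustration only): a row-major
# (Y, X, channel) array of little-endian uint16 values starting at OFFSET.
HEIGHT, WIDTH, CHANNELS = 2, 3, 4
OFFSET = 8                      # placeholder offset, not an HMSA constant
ITEM = struct.calcsize("<H")    # bytes per datum


def read_spectrum(path, y, x):
    """Read the spectrum at pixel (x, y) without loading the rest of the file."""
    # A regular, uncompressed array lets the byte position be computed directly.
    start = OFFSET + ((y * WIDTH + x) * CHANNELS) * ITEM
    with open(path, "rb") as f:
        f.seek(start)
        raw = f.read(CHANNELS * ITEM)
    return list(struct.unpack("<%dH" % CHANNELS, raw))


# Build a small synthetic file: OFFSET placeholder bytes, then values 0..23.
values = list(range(HEIGHT * WIDTH * CHANNELS))
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(b"\x00" * OFFSET)
    tmp.write(struct.pack("<%dH" % len(values), *values))
    path = tmp.name

spectrum = read_spectrum(path, y=1, x=2)
os.remove(path)
print(spectrum)
```

Only the bytes of the requested spectrum are read, so the cost of the lookup does not depend on the
total size of the file. This direct address calculation would not be possible if the data were
compressed.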
The XML file defines the layout of the binary data in the HMSA file in a human-readable
format, which would not be possible if this information were stored within the binary file itself. This
information may be read (or written) using a basic text editor, which allows researchers to create
or interpret these files without the need for any specialised software. The XML language is an
doi:10.1017/S1431927617006122
Microsc. Microanal. 23 (Suppl 1), 2017
© Microscopy Society of America 2017