Group analyses of connectivity-based cortical parcellation using repeated k-means clustering
Luca Nanetti a,b, Leonardo Cerliani a,b, Valeria Gazzola a,b, Remco Renken a,c, Christian Keysers a,b,⁎
a BCN NeuroImaging Center, University of Groningen, The Netherlands
b Department of Neuroscience, Section Social Brain Lab, University Medical Center Groningen, The Netherlands
c Department of Neuroscience, University Medical Center Groningen, The Netherlands
Article history: Received 20 August 2008; revised 7 May 2009; accepted 3 June 2009
Keywords: Anatomical connectivity; K-means clustering; K-means; SMA–preSMA; Insula
Abstract
K-means clustering has become a popular tool for connectivity-based cortical segmentation using Diffusion
Weighted Imaging (DWI) data. A sometimes ignored issue is, however, that the output of the algorithm
depends on the initial placement of starting points, and that different sets of starting points therefore could
lead to different solutions. In this study we explore this issue. We apply k-means clustering a thousand times
to the same DWI dataset collected in 10 individuals to segment two brain regions: the SMA–preSMA on the
medial wall, and the insula. At the level of single subjects, we found that in both brain regions, repeatedly
applying k-means indeed often leads to a variety of rather different cortical parcellations. By assessing
the similarity and frequency of these different solutions, we show that ∼256 k-means repetitions are needed
to accurately estimate the distribution of possible solutions. Using nonparametric group statistics, we then
propose a method to employ the variability of clustering solutions to assess the reliability with which certain
voxels can be attributed to a particular cluster. In addition, we show that the proportion of voxels that can be
significantly attributed to either cluster is higher in the SMA–preSMA than in the insula, and
discuss how this difference may relate to differences in the anatomy of these regions.
© 2009 Elsevier Inc. All rights reserved.
Introduction
Diffusion-weighted magnetic resonance imaging (DW-MRI)
infers information about white matter structure in the brain from
the differential attenuation of the spin echo signal, as modulated by
the local spatial microstructure of the surrounding medium, and by
the strength and direction of the applied magnetic diffusion gradient
(Basser et al., 1994; Pierpaoli et al., 1996).
Using this method in conjunction with probabilistic tractography,
one can estimate, for each individual voxel of the brain (the seed),
whether or not it is connected to each of the other voxels of the brain
(the targets). This information is called the binarized tractogram (also
known as the binarized connectivity profile) of that voxel (Behrens et al.,
2003; Hosey et al., 2005).
Johansen-Berg et al. (2004) first illustrated how this information can be
used to divide the medial motor wall into two subregions, which on the
basis of their location and functional properties were likely to
represent the supplementary and presupplementary motor areas
(SMA and preSMA). They used probabilistic tractography for all voxels
within the SMA–preSMA complex to define the corresponding
tractograms, took the correlation between the tractograms of each
pair of voxels in the SMA–preSMA as a measure of their similarity, and
calculated the full cross-correlation matrix (cc-matrix) of all
tractograms. They then reordered the cc-matrix using a spectral reordering
algorithm (Higham et al., 2007). Visual inspection then revealed a sudden
discontinuity in the reordered matrix. Remapping the location of the
voxels on either side of this discontinuity, they found that they fell
within putative SMA and preSMA, respectively.
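The cc-matrix and its reordering can be illustrated with a minimal sketch. It assumes toy binarized tractograms generated at random, and it orders voxels by the Fiedler vector of a graph Laplacian, which is one common form of spectral reordering and not necessarily the exact variant used in the cited work. All names (`cc_matrix`, `spectral_order`, `true_group`) are hypothetical.

```python
import numpy as np

def cc_matrix(tractograms):
    """Pearson correlation between the tractograms (rows) of every pair of seed voxels."""
    return np.corrcoef(tractograms)

def spectral_order(cc):
    """Order voxels by the Fiedler vector of the graph Laplacian built from `cc`.

    Assumption: similarities are shifted to be non-negative before building
    the Laplacian; other normalizations are possible.
    """
    w = cc - cc.min()                 # non-negative similarity weights
    np.fill_diagonal(w, 0.0)          # no self-connections
    lap = np.diag(w.sum(axis=1)) - w  # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(lap)     # eigenvalues returned in ascending order
    return np.argsort(vecs[:, 1])     # sort by the second eigenvector

# Toy data (an assumption, not real DWI): 30 seed voxels, 200 target voxels,
# two underlying connectivity profiles with 10% of the entries flipped.
rng = np.random.default_rng(0)
n_targets = 200
profile_a = rng.random(n_targets) < 0.3
profile_b = rng.random(n_targets) < 0.3
true_group = rng.permutation(np.repeat([0, 1], 15))
base = np.where(true_group[:, None] == 0, profile_a, profile_b)
flip = rng.random(base.shape) < 0.1
tractograms = np.logical_xor(base, flip).astype(float)

cc = cc_matrix(tractograms)
order = spectral_order(cc)
reordered = cc[np.ix_(order, order)]  # block structure becomes visible here
print(true_group[order])              # group membership along the new ordering
```

In the reordered matrix, voxels with similar tractograms sit next to each other, so a boundary between two groups shows up as a discontinuity. Where to place that boundary is still left to the observer.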
This method attracted much attention, because it is plausible that
if two subregions have different connectivities, they may also have
different functions. Requiring an experimenter to decide where to
place the border between the regions is, however, undesirable. To
circumvent this limitation, researchers turned towards k-means
clustering (Hartigan, 1975; Hartigan and Wong, 1979) to decide where to
place the border between clusters. K-means clustering is an iterative
algorithm which, for the case of connectivity profiles, ultimately
divides the voxels of a seed region into k non-overlapping clusters of
voxels (the experimenter needs to decide the value of k based on
functional and anatomical considerations). This is done in a
hyperspace with as many dimensions as there are voxels in the seed region.
Each seed voxel is represented as a point whose coordinates are the
correlations of its tractogram with all the other voxels' tractograms.
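The feature space just described, and the way a k-means result can depend on where the algorithm starts, can be sketched as follows. This is a minimal illustration under stated assumptions: the data are random toy tractograms, the update rule is the plain Lloyd iteration and not the Hartigan and Wong procedure, and the initial centroids are k data points drawn at random. The names `kmeans` and `canonical` are hypothetical.

```python
import numpy as np
from collections import Counter

def kmeans(points, k, rng, n_iter=100):
    """Lloyd-style k-means; initial centroids are k randomly chosen data points."""
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels

def canonical(labels):
    """Relabel clusters by order of first appearance so that label swaps compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(l, len(mapping)) for l in labels.tolist())

# Toy seed region (an assumption, not real DWI): 30 voxels, 200 targets.
data_rng = np.random.default_rng(0)
n_voxels, n_targets, k = 30, 200, 2
tractograms = (data_rng.random((n_voxels, n_targets)) < 0.3).astype(float)

# Each voxel becomes a point whose coordinates are its row of the cc-matrix,
# giving as many dimensions as there are seed voxels.
features = np.corrcoef(tractograms)

# Repeat k-means with different random starting points and tally the solutions.
n_repetitions = 20
solutions = Counter(
    canonical(kmeans(features, k, np.random.default_rng(seed)))
    for seed in range(n_repetitions)
)
for solution, count in solutions.most_common(3):
    print(count, solution)
```

If the tally holds more than one distinct labeling, the starting points have influenced the outcome, and a single run would not be enough to characterize the parcellation.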
Different strategies exist to choose the initial putative centroids for the
algorithm. Hartigan and Wong (1979) propose to randomly choose k
points (i.e. voxels in our case) from the initial set, the coordinates of
which become the centers of the k clusters. A frequently used
NeuroImage xxx (2009) xxx–xxx
⁎ Corresponding author. Antonius Deusinglaan 2, 9713 AW Groningen, The Netherlands.
Fax: +315036387500
E-mail address: c.m.keysers@rug.nl (C. Keysers).
1053-8119/$ – see front matter © 2009 Elsevier Inc. All rights reserved.
doi:10.1016/j.neuroimage.2009.06.014