ORIGINAL ARTICLE

Quantitative Arbor Analytics: Unsupervised Harmonic Co-Clustering of Populations of Brain Cell Arbors Based on L-Measure

Yanbin Lu, Lawrence Carin, Ronald Coifman, William Shain, and Badrinath Roysam

© Springer Science+Business Media New York 2014
Neuroinform, DOI 10.1007/s12021-014-9237-2

Y. Lu, B. Roysam (*)
Department of Electrical & Computer Engineering, University of Houston, Houston, TX, USA
e-mail: broysam@Central.UH.EDU

L. Carin
Department of Electrical & Computer Engineering, Duke University, Durham, NC, USA

R. Coifman
Department of Mathematics, Yale University, New Haven, CT, USA

W. Shain
Center for Integrative Brain Research, Seattle Children's Research Institute, Seattle, WA, USA

Abstract

This paper presents a robust unsupervised harmonic co-clustering method for profiling arbor morphologies for ensembles of reconstructed brain cells (e.g., neurons, microglia) based on quantitative measurements of the cellular arbors. Specifically, this method can identify groups and sub-groups of cells with similar arbor morphologies, and simultaneously identify the hierarchical grouping patterns among the quantitative arbor measurements. The robustness of the proposed algorithm derives from the use of the diffusion distance measure for comparing multivariate data points, harmonic analysis theory, and a Haar-like wavelet basis for multivariate data smoothing. This algorithm is designed to be practically usable, and is embedded into the actively linked three-dimensional (3-D) visualization and analytics system in the free and open source FARSIGHT image analysis toolkit for interactive exploratory population-scale neuroanatomic studies. Studies on synthetic datasets demonstrate its superiority in clustering data matrices compared to recent hierarchical clustering algorithms. Studies on heterogeneous ensembles of real neuronal 3-D reconstructions drawn from the NeuroMorpho database show that the proposed method identifies meaningful grouping patterns among neurons based on arbor morphology, and reveals the underlying morphological differences.

Keywords: Neuron reconstruction · L-Measure (RRID:nif-0000-00003) · Quantitative arbor analytics · Harmonic co-clustering · Population profiling

Introduction

The phrase "quantitative arbor analytics" is introduced here to refer to the application of high-dimensional multivariate bio-informatics algorithms to profile quantitative measurements of the arbors of large populations of brain cells, for example, neurons, astrocytes, and microglia. Our motivations stem from the fact that the functional properties and/or the activation states of these cells are related to their arbor morphology (Jinushi-Nakao et al. 2007), and from the unmet need to analyze populations rather than individual cells.

This work is made possible by a convergence of recent advances. First, computer-assisted methods for reconstructing cellular arbors from microscope images (Meijering 2010; Peng et al. 2011; Wang et al. 2011), methods to inspect and edit automated reconstructions (Luisi et al. 2011), commercial reconstruction systems (e.g., MBF Biosciences Inc., and BitPlane Inc.), and a variety of semi-automated reconstruction methods (Halavi et al. 2012; Ho et al. 2011) have collectively enabled neuroscientists to compile large collections of arbor reconstructions, notably the NeuroMorpho database (www.neuromorpho.org). These reconstructions can be exported to standard file formats for visualization, computational simulation, and further analysis.

The next advance relates to quantifying arbor morphology. Given the reality that no single number can capture all