A comparative study of skull stripping methods in relapsing-remitting multiple sclerosis: Emergence of a new automatic segmentation algorithm

Jean-Christophe Souplet, Sophia Antipolis, France; Christine Lebrun, Nice, France; Pierre Clavelou, Clermont-Ferrand, France; William Camu, Montpellier, France; Jean Pelletier, Marseille, France; Stéphane Chanalet, Nice, France; Nicholas Ayache, Sophia Antipolis, France; and Grégoire Malandain, Sophia Antipolis, France.

Objective: To obtain an automatic and robust brain segmentation method on a multi-site prospective database of a homogeneous population of relapsing-remitting (RR) multiple sclerosis (MS) patients.

Background: Skull-stripping is usually a required step before morphometric measurements on brain MRI. Manual delineation is tedious and subject to inter- and intra-expert variability. Several automatic methods are available, but there is no gold standard. Most methods have not been evaluated on MRI of MS patients, or they require lesion delineation.

Design/Methods: Twenty-five MS patients from different sites underwent MR examination at baseline and follow-up. Five skull-stripping methods, BET (Smith 2002), HWA (Segonne 2004), AnaT1toBrainMask (Brainvisa), EM-BrainMask (Dugas 2004) and 3dIntracranial (Ward 1999), were run on 30 sets of MRI sequences (T1, T2 FSE, PD). From these five segmentations, the Staple algorithm (Warfield 2004) was used to produce a probabilistic reference segmentation for each set. This segmentation was validated visually by an expert and compared with manual delineation when possible. The Staple framework makes it possible to assess any segmentation method by its sensitivity and specificity. All individual methods and all combinations of methods were tested. For each combination of methods, a binary segmentation was obtained by automatically optimized thresholding of the corresponding Staple probabilistic segmentation.
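As an illustration of the Staple step described above, the following is a minimal sketch of the binary expectation-maximization scheme, not the implementation used in this study. It takes several binary masks, estimates a per-voxel brain probability together with a sensitivity and specificity for each input mask, and then thresholds the probability map. The function names, the flat-list voxel layout, the global prior, the initial value of 0.9 and the fixed threshold of 0.5 are all assumptions made for the example; the thresholding criterion actually used in the study is not specified here.

```python
def staple_binary(decisions, prior=None, n_iter=50, init=0.9):
    """Simplified binary Staple-style EM (illustrative sketch).

    decisions: list of raters (methods), each a flat list of 0/1 voxel labels.
    Returns (weights, sens, spec): per-voxel foreground probabilities and the
    estimated sensitivity and specificity of each rater.
    """
    n_raters = len(decisions)
    n_voxels = len(decisions[0])
    if prior is None:
        # Assumption: global foreground prior = mean label over all raters.
        prior = sum(sum(d) for d in decisions) / (n_raters * n_voxels)
    sens = [init] * n_raters  # assumed initial sensitivities
    spec = [init] * n_raters  # assumed initial specificities
    weights = [0.0] * n_voxels

    for _ in range(n_iter):
        # E-step: probability that each voxel is foreground, given the
        # raters' labels and the current performance estimates.
        for i in range(n_voxels):
            a = prior
            b = 1.0 - prior
            for j in range(n_raters):
                if decisions[j][i]:
                    a *= sens[j]
                    b *= 1.0 - spec[j]
                else:
                    a *= 1.0 - sens[j]
                    b *= spec[j]
            weights[i] = a / (a + b) if a + b > 0 else 0.0

        # M-step: re-estimate each rater's sensitivity and specificity
        # against the current probabilistic reference.
        total_fg = sum(weights)
        total_bg = n_voxels - total_fg
        for j in range(n_raters):
            if total_fg > 0:
                sens[j] = sum(
                    w for w, d in zip(weights, decisions[j]) if d
                ) / total_fg
            if total_bg > 0:
                spec[j] = sum(
                    1.0 - w for w, d in zip(weights, decisions[j]) if not d
                ) / total_bg
    return weights, sens, spec


def threshold_mask(weights, threshold=0.5):
    """Binarize a probabilistic segmentation (0.5 is an assumed threshold)."""
    return [1 if w >= threshold else 0 for w in weights]


# Toy example: three hypothetical methods labelling eight voxels.
toy_masks = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0],
]
toy_weights, toy_sens, toy_spec = staple_binary(toy_masks)
toy_consensus = threshold_mask(toy_weights)
```

In this toy case the first mask agrees with the majority at every voxel, so its estimated sensitivity and specificity end up highest. On real volumes the same logic would be applied to the flattened voxel arrays of each method's output.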
Results: The (sensitivity, specificity) pairs range from (0.838, 0.763) to (0.985, 0.993) across all methods and combinations of methods. Taking additional criteria into account (average execution time, ease of software installation, robustness…), the best segmentation is obtained by combining three methods (BET, EM-BrainMask, 3dIntracranial), with a sensitivity of 0.980 and a specificity of 0.951. This new method has been tested and validated by an expert on all sets in the database.

Conclusions/Relevance: Using the Staple probabilistic framework, different skull-stripping methods have been compared. An original, reproducible, automatic skull-stripping method has been obtained. This preliminary step is essential for atrophy and lesion load measurements.