SPNet: Shape Prediction using a Fully Convolutional Neural Network

S M Masudur Rahman Al Arif 1, Karen Knapp 2 and Greg Slabaugh 1
1 City, University of London
2 University of Exeter

Abstract. Shape has been widely used in medical image segmentation algorithms to constrain a segmented region to a class of learned shapes. Recent methods for object segmentation mostly use deep learning algorithms. The state-of-the-art deep segmentation networks are trained with loss functions defined in a pixel-wise manner, which is not suitable for learning topological shape information or for constraining segmentation results. In this paper, we propose a novel shape predictor network for object segmentation. The proposed deep fully convolutional neural network learns to predict shapes instead of performing pixel-wise classification. We apply the novel shape predictor network to X-ray images of cervical vertebrae, where shape is of utmost importance. The proposed network is trained with a novel loss function that computes the error in the shape domain. Experimental results demonstrate the effectiveness of the proposed method in achieving state-of-the-art segmentation, with correct topology and accurate fitting that matches expert segmentation.

1 Introduction

Shape is a fundamental topic in medical image computing and is particularly important for segmentation of known objects in images. Shape has been widely used in segmentation methods, such as the statistical shape model (SSM) [1] and level set methods [2], to constrain a segmentation result to a class of learned shapes. Recently proposed deep fully convolutional neural networks show excellent performance in segmentation tasks [3, 4]. However, these networks are trained with a pixel-wise loss function, which fails to learn high-level topological shape information and often fails to constrain the object segmentation results to plausible shapes (see Fig. 1a, 1b and 1c).
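To make the contrast concrete, the following is a minimal NumPy sketch (an illustration only, not the paper's exact formulation) of the two kinds of loss: a pixel-wise cross-entropy that scores each pixel independently, and a shape-domain loss that compares predicted landmark coordinates directly against a ground-truth contour. The function names and the toy data are hypothetical.

```python
import numpy as np

def pixelwise_bce(pred_mask, true_mask, eps=1e-7):
    """Mean binary cross-entropy over pixels. Each pixel is scored
    independently, so the topology of the shape is invisible to the loss."""
    p = np.clip(pred_mask, eps, 1 - eps)
    return float(-np.mean(true_mask * np.log(p)
                          + (1 - true_mask) * np.log(1 - p)))

def shape_domain_loss(pred_points, true_points):
    """Mean Euclidean distance between corresponding shape points.
    The error is measured in the shape domain, on the contour itself,
    rather than on independent pixel labels."""
    return float(np.mean(np.linalg.norm(pred_points - true_points, axis=1)))

# Toy example: a 4x4 binary mask and a 3-point contour (hypothetical data).
true_mask = np.array([[0, 0, 0, 0],
                      [0, 1, 1, 0],
                      [0, 1, 1, 0],
                      [0, 0, 0, 0]], dtype=float)
pred_mask = np.full((4, 4), 0.5)          # maximally uncertain prediction

true_pts = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 2.0]])
pred_pts = true_pts + 0.1                 # small uniform displacement

print(pixelwise_bce(pred_mask, true_mask))   # log(2) ~ 0.693
print(shape_domain_loss(pred_pts, true_pts)) # 0.1*sqrt(2) ~ 0.141
```

The pixel-wise loss treats a topologically broken mask and a cleanly connected one identically as long as the per-pixel errors match, whereas the shape-domain loss is defined only over valid contour points, which is the property the proposed network exploits.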
Incorporating shape information in deep segmentation networks is a difficult challenge. In [6], a deep Boltzmann machine (DBM) was used to learn a shape prior from a training set; the trained DBM was then used in a variational framework to perform object segmentation. A multi-network approach for incorporating shape information into the segmentation results was proposed in [7]. It uses a convolutional network to localize the segmentation object, an autoencoder to infer the shape of the object, and finally deformable models, a version of SSM, to achieve segmentation of the target object. Another method for localization of shapes using a deep network is proposed in [8], where the final segmentation is performed using SSM. All these methods consist of multiple components which are not trained in an end-to-end fashion and thus cannot fully utilize the excellent representation learning capability of neural networks.