CT Super-resolution GAN Constrained by the Identical, Residual, and Cycle Learning Ensemble (GAN-CIRCLE)

Chenyu You, Guang Li, Yi Zhang, Senior Member, IEEE, Xiaoliu Zhang, Hongming Shan, Shenghong Ju, Zhen Zhao, Zhuiyang Zhang, Wenxiang Cong, Michael W. Vannier, Member, IEEE, Punam K. Saha, Senior Member, IEEE, and Ge Wang*, Fellow, IEEE

Abstract—Computed tomography (CT) is widely used in screening, diagnosis, and image-guided therapy for both clinical and research purposes. Since CT involves ionizing radiation, an overarching thrust of related technical research is the development of novel methods that enable ultrahigh-quality imaging with fine structural details while reducing the X-ray radiation. In this paper, we present a semi-supervised deep learning approach to accurately recover high-resolution (HR) CT images from low-resolution (LR) counterparts. Specifically, with the generative adversarial network (GAN) as the building block, we enforce the cycle-consistency in terms of the Wasserstein distance to establish a nonlinear end-to-end mapping from noisy LR input images to denoised and deblurred HR outputs. We also include the joint constraints in the loss function to facilitate structural preservation. In this deep imaging process, we incorporate deep convolutional neural network (CNN), residual learning, and network-in-network techniques for feature extraction and restoration. The current trend of increasing network depth and complexity to boost CT imaging performance limits real-world applications by imposing considerable computational and memory overheads. In contrast, we apply a parallel 1 × 1 CNN to compress the output of the hidden layer and optimize the number of layers and the number of filters for each convolutional layer. Quantitative and qualitative evaluations demonstrate that our proposed model is accurate, efficient, and robust for super-resolution (SR) image restoration from noisy LR input images.
In particular, we validate our composite SR networks on three large-scale CT datasets, and obtain promising results as compared to the other state-of-the-art methods.

Index Terms—Computed tomography (CT), super-resolution, noise reduction, deep learning, adversarial learning, residual learning.

Asterisk indicates corresponding author.
C. You is with the Departments of Bioengineering and Electrical Engineering, Stanford University, Stanford, CA, 94305 USA (e-mail: uniycy@stanford.edu).
G. Li, H. Shan, W. Cong, and G. Wang* are with the Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, 12180 USA (e-mail: lig10@rpi.edu, shanh@rpi.edu, congw@rpi.edu, wangg6@rpi.edu).
Y. Zhang is with the College of Computer Science, Sichuan University, Chengdu, 610065 China (e-mail: yzhang@scu.edu.cn).
X. Zhang is with the Department of Electrical and Computer Engineering, University of Iowa, Iowa City, IA, 52246 USA (e-mail: xiaoliu-zhang@uiowa.edu).
S. Ju and Z. Zhao are with the Jiangsu Key Laboratory of Molecular and Functional Imaging, Department of Radiology, Zhongda Hospital, Medical School, Southeast University, Nanjing, 210009 China (e-mail: jsh0836@hotmail.com, zhaozhen8810@126.com).
Z. Zhang is with the Department of Radiology, Wuxi No.2 People’s Hospital, Wuxi, 214000 China (e-mail: zhangzhuiyang@163.com).
M. W. Vannier is with the Department of Radiology, University of Chicago, Chicago, IL, 60637 USA.
P. K. Saha is with the Department of Electrical and Computer Engineering and Radiology, University of Iowa, Iowa City, IA, 52246 USA (e-mail: pksaha@engineering.uiowa.edu).

I. INTRODUCTION

X-ray computed tomography (CT) is one of the most popular medical imaging methods for screening, diagnosis, and image-guided intervention [1]. Potentially, high-resolution (HR) CT (HRCT) imaging may enhance the fidelity of radiomic features as well. Therefore, super-resolution (SR) methods in the CT field are receiving major attention [2], [3].
The image resolution of a CT imaging system is constrained by the x-ray focal spot size, detector element pitch, reconstruction algorithms, and other factors. While physiological and pathological units in the human body are on the order of 10 microns, the in-plane and through-plane resolutions of clinical CT systems are on the order of submillimeter or 1 mm [3], [4]. Even though modern CT imaging and visualization software can generate arbitrarily small voxels, the intrinsic resolution is still far lower than what is ideal in important applications such as early tumor characterization and coronary artery analysis [5]. Consequently, how to produce HRCT images at a minimum radiation dose level is a holy grail of the CT field.

In general, there are two strategies for improving CT image resolution: (1) hardware-oriented and (2) computational. First, more sophisticated hardware components can be used, including an x-ray tube with a fine focal spot size, detector elements of small pitch, and better mechanical precision for CT scanning. These hardware-oriented methods are generally expensive, increase the CT system cost and radiation dose, and compromise the imaging speed. In particular, it is well known that a high x-ray radiation dose in a patient could induce genetic damage and cancerous diseases [6], [7]. As a result, the second type of method for resolution improvement [8]–[14], which obtains HRCT images from low-resolution CT (LRCT) images, is more attractive. This computational deblurring task is a major challenge, representing a seriously ill-posed inverse problem [3], [15]. Our neural network approach proposed in this paper is computational, utilizing advanced network architectures. More details are as follows.

To reconstruct HRCT images, various algorithms have been proposed.
arXiv:1808.04256v3 [eess.IV] 6 Sep 2018

These algorithms can be broadly categorized into the following classes: (1) Model-based reconstruction methods [16]–[20]: These techniques explicitly model the image degradation process and regularize the reconstruction according to the characteristics of projection data. These algorithms promise an optimal image quality under the as-