IEEE SIGNAL PROCESSING LETTERS, VOL. 25, NO. 1, JANUARY 2018 85
Training-Free, Single-Image Super-Resolution Using
a Dynamic Convolutional Network
Aritra Bhowmik, Suprosanna Shit, and Chandra Sekhar Seelamantula, Senior Member, IEEE
Abstract—The typical approach for solving the problem of
single-image super-resolution (SR) is to learn a nonlinear mapping
between the low-resolution (LR) and high-resolution (HR) repre-
sentations of images in a training set. Training-based approaches
can be tuned to give high accuracy on a given class of images, but
they call for retraining if the HR → LR generative model deviates
or if the test images belong to a different class, which limits their
applicability. In contrast, we propose a solution that does
not require a training dataset. Our method relies on constructing
a dynamic convolutional network (DCN) to learn the relation be-
tween the consecutive scales of Gaussian and Laplacian pyramids.
The relation is in turn used to predict the detail at a finer scale,
thus leading to SR. Comparisons with state-of-the-art techniques
on standard datasets show that the proposed DCN approach re-
sults in about 0.8 and 0.3 dB gain in peak signal-to-noise ratio for
2× and 3× SR, respectively. The structural similarity index is on
par with the competing techniques.
Index Terms—Convolutional neural network (CNN), deep learn-
ing, dynamic convolutional network (DCN), Gaussian/Laplacian
pyramids, super-resolution (SR).
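The Gaussian/Laplacian pyramid relation at the heart of the proposed method can be sketched in a few lines of numpy. This is an illustrative sketch only: a 5-tap binomial kernel stands in for the exact Gaussian filter, and nearest-neighbour upsampling stands in for whatever interpolation the method actually uses.

```python
import numpy as np

def blur(img):
    # Separable 5-tap binomial kernel as a stand-in Gaussian filter.
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    img = np.apply_along_axis(np.convolve, 0, img, k, 'same')
    return np.apply_along_axis(np.convolve, 1, img, k, 'same')

def upsample(img, shape):
    # Nearest-neighbour 2x upsampling, cropped to the target shape.
    up = img.repeat(2, axis=0).repeat(2, axis=1)
    return up[:shape[0], :shape[1]]

def pyramids(img, levels):
    # Gaussian pyramid: blur and decimate. Each Laplacian level holds the
    # detail lost between consecutive scales: L_k = G_k - upsample(G_{k+1}).
    gauss = [img]
    for _ in range(levels - 1):
        gauss.append(blur(gauss[-1])[::2, ::2])
    lap = [g - upsample(gn, g.shape) for g, gn in zip(gauss[:-1], gauss[1:])]
    lap.append(gauss[-1])  # coarsest residual closes the pyramid
    return gauss, lap
```

The premise of the letter is that the relation between consecutive (Gaussian, Laplacian) scales can be learned from the image itself and extrapolated one scale finer to predict the missing detail.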
I. INTRODUCTION
SINGLE-IMAGE super-resolution (SR) is an important tool
in applications such as biomedical imaging [1] and face
hallucination [2]. In single-image SR [3], [4], one infers local
image properties from the low-resolution (LR) image or
learns them over a collection of given high-resolution (HR)–LR
pairs. The learning approaches employ dictionaries or neural
networks, which capture the LR–HR association. In this letter,
we develop a technique that infers HR image features starting
from the LR image without going through the standard process
of training, obviating the need for a training dataset. Before
proceeding further, we review recent techniques that specifi-
cally tackle the single-image SR problem, and highlight their
strengths and weaknesses. Some of these techniques will be used for making performance comparisons. A thorough review of single-image SR techniques is available in [5].
Manuscript received June 28, 2017; revised August 17, 2017; accepted September 4, 2017. Date of publication September 15, 2017; date of current version November 29, 2017. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. S. Channappayya. (Aritra Bhowmik and Suprosanna Shit contributed equally to this work and must be treated as joint first authors.) (Corresponding author: Chandra Sekhar Seelamantula.)
The authors are with the Department of Electrical Engineering, Indian Institute of Science, Bangalore 560012, India (e-mail: aritra0593@gmail.com; suprosanna93@gmail.com; chandra.sekhar@ieee.org).
This letter has supplementary downloadable material available at http://ieeexplore.ieee.org.
Color versions of one or more of the figures in this letter are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/LSP.2017.2752806
A. Related Literature
A landmark contribution was made recently by Yang et al.
[6], [7], who trained dictionaries for LR and HR image patches.
The key assumption is that the LR and HR patches have the same
sparse representation in respective dictionaries. The sparse rep-
resentation corresponding to an LR patch from an unseen image
is used to synthesize the corresponding HR patch, thus lead-
ing to SR. With this as the central idea, Yang et al. [8] also
developed a coupled LR/HR dictionary model optimized us-
ing a joint cost function. Kim and Kwon used kernel ridge
regression and incorporated image priors to suppress ringing
artifacts [9]. Timofte et al. proposed anchored neighborhood
regression [10] in which they solve a ridge regression prob-
lem with neighborhood constraints, leading to a closed-form
solution for the regression coefficients. This approach is fast
and leads to qualitatively the same results as the compet-
ing techniques. In [11], they developed an advanced version
of the algorithm, which learns from the patches in the local
neighborhood of the anchor patch from the training dataset
and not from the dictionary. These were by far the best per-
forming techniques before the advent of neural network SR
approaches.
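The closed-form solution behind anchored neighborhood regression [10] can be sketched directly. The variable names and shapes below are illustrative, not from the authors' code: for one anchor, the columns of n_lr and n_hr are the coupled LR/HR neighborhood atoms, and the projection P = n_hr (n_lr^T n_lr + λI)^{-1} n_lr^T is precomputed offline, so each test patch is super-resolved with a single matrix multiply.

```python
import numpy as np

def anr_projection(n_lr, n_hr, lam=0.1):
    # Ridge regression with neighborhood constraints, per anchor:
    #   alpha = argmin ||y - n_lr @ alpha||^2 + lam * ||alpha||^2
    # which has the closed form
    #   P = n_hr @ (n_lr.T @ n_lr + lam * I)^{-1} @ n_lr.T
    k = n_lr.shape[1]
    gram = n_lr.T @ n_lr + lam * np.eye(k)
    return n_hr @ np.linalg.solve(gram, n_lr.T)

# At test time an LR feature y maps to its HR patch in one multiply:
#   x_hr = anr_projection(n_lr, n_hr) @ y
```

Precomputing P per anchor is what makes the approach fast: all regression cost is paid offline.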
Within the learning paradigm, the SR problem is essentially
posed as one of discovering the nonlinear association between
the LR and HR patches. Dong et al. used a convolutional neural
network (CNN) [12], [13] to learn the end-to-end mapping be-
tween LR and HR pairs [14], [15]. The training is data-intensive
and time-consuming, whereas the run-time complexity is low,
leading to fast SR. Recently, Dong et al. proposed a threefold
acceleration strategy:
1) introducing a deconvolution layer at the end of the CNN;
2) reducing the dimensionality of the input features; and
3) employing smaller filter sizes.
These changes together resulted in a 40× speed-up.
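For context, the SRCNN-style end-to-end mapping of [14], [15] — patch extraction, nonlinear mapping, and reconstruction — can be mimicked with plain numpy correlations. The filter counts and sizes below are illustrative, and the weights are random rather than trained.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def corr2d(x, k):
    # 'valid' 2-D cross-correlation via a sliding-window view.
    return np.einsum('ijkl,kl->ij', sliding_window_view(x, k.shape), k)

def conv_layer(maps, weights, biases, relu=True):
    # maps: (n_in, H, W); weights: (n_out, n_in, f, f) -> (n_out, H', W')
    out = []
    for w, b in zip(weights, biases):
        acc = sum(corr2d(m, k) for m, k in zip(maps, w)) + b
        out.append(np.maximum(acc, 0.0) if relu else acc)
    return np.stack(out)

def srcnn(y, params):
    # Three stages in the SRCNN spirit: 9x9 patch extraction, 1x1
    # nonlinear mapping, 5x5 linear reconstruction (no ReLU at the end).
    (w1, b1), (w2, b2), (w3, b3) = params
    h = conv_layer(y[None], w1, b1)
    h = conv_layer(h, w2, b2)
    return conv_layer(h, w3, b3, relu=False)[0]
```

All the cost of such a network is in training; once the filters are fixed, inference is a handful of small convolutions, which is why the run-time complexity is low.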
This method is referred to as fast SRCNN [16]. Shi et al.
developed an efficient subpixel CNN to perform SR [17]. The
early layers operate on the LR image, whereas the final layer
relies on subpixel convolution to upscale the image. Kim et al.
proposed a very deep CNN architecture that learns the residual
images instead of the HR image [18]. They showed that the
very deep SR technique overcomes the limitations of SRCNN.
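The residual formulation can be summarized in two lines: the network predicts only the high-frequency detail, and the HR estimate is the interpolated LR image plus that detail. Nearest-neighbour upscaling below is a crude stand-in for the bicubic interpolation such methods actually use.

```python
import numpy as np

def upscale(lr, r=2):
    # Crude nearest-neighbour stand-in for bicubic interpolation.
    return lr.repeat(r, axis=0).repeat(r, axis=1)

def residual_reconstruct(lr, predicted_residual, r=2):
    # Residual learning: the network outputs the detail image, which is
    # added back to the interpolated LR to form the HR estimate.
    return upscale(lr, r) + predicted_residual
```

Learning the residual rather than the full HR image leaves the network with a sparser, easier target, which is what allows very deep architectures to train effectively.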
They also proposed a method called deep recursive CNN, which
incorporates skip connections between each hidden layer and
1070-9908 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.