1: \begin{abstract}
2: One impressive advantage of convolutional neural networks (CNNs) is their ability to automatically learn feature representation from raw pixels, eliminating the need for hand-designed procedures. However, recent methods for single image super-resolution (SR) fail to maintain this advantage. They utilize CNNs in two decoupled steps, \emph{i.e.}, first upsampling the low resolution (LR) image to the high resolution (HR) size with hand-designed techniques (\emph{e.g.}, bicubic interpolation), and then applying CNNs on the upsampled LR image to reconstruct HR results. In this paper, we seek an alternative and propose a new image SR method, which jointly learns the feature extraction, upsampling and HR reconstruction modules, yielding a completely end-to-end trainable deep CNN.
3: %As opposed to existing approaches, the proposed method conducts upsampling in latent feature space with learned filters, which is optimized for the task of image SR.
4: As opposed to existing approaches, the proposed method conducts upsampling in the latent feature space with filters that are optimized for the task of image SR.
5: In addition, the HR reconstruction is performed in a multi-scale manner to simultaneously incorporate both short- and long-range contextual information, ensuring more accurate restoration of HR images. To facilitate network training, a new training approach is designed, which jointly trains the proposed deep network with a relatively shallow network, leading to faster convergence and more superior performance. The proposed method is extensively evaluated on widely adopted data sets and improves the performance of state-of-the-art methods with a considerable margin. Moreover, in-depth ablation studies are conducted to verify the contribution of different network designs to image SR, providing additional insights for future research.
6: \end{abstract}
7: