Internal methods, which rely only on the visual information contained in a single image, have been widely studied in the computer vision community. Deep internal learning has recently come into the limelight, especially because it removes the need to collect large-scale datasets, which usually involves intensive human labor for labelling. However, many obstacles remain before deep internal learning can be used in practice. For example, most existing deep internal methods either (1) are image-specific or task-specific, or (2) require long training time. In this thesis, we push the limits of deep internal learning by proposing SinIR, a reconstruction-based framework trained on a single image for general image manipulation. SinIR is trained with cascaded multi-scale learning, where the network at each scale is responsible for image reconstruction. With reconstruction as its training objective, SinIR trains much faster and more robustly. However, naively using reconstruction leads to unsatisfactory visual quality due to its innate characteristics. To mitigate this problem, we apply random pixel shuffling, a simple solution inspired by the Denoising Autoencoder that effectively enriches the learning process. SinIR solves various computer vision problems, including super-resolution, editing, harmonization, paint-to-image, photo-realistic style transfer, and artistic style transfer. Quantitative evaluation shows that SinIR's performance is competitive with that of dedicated external methods. SinIR is also found to train 33.5 times faster than SinGAN, which solves similar tasks (measured on 500 x 500 images).
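The abstract does not spell out how random pixel shuffling is performed, so the following is only a minimal sketch of one plausible reading: a fraction of pixel positions is picked at random and the values at those positions are permuted among themselves, in the spirit of the input corruption used by a Denoising Autoencoder. The function name `random_pixel_shuffle` and the `rate` parameter are hypothetical and are not taken from the thesis.

```python
import random


def random_pixel_shuffle(image, rate, rng):
    """Return a copy of `image` with a random subset of its pixels permuted.

    `image` is a list of rows, each row a list of pixel values.
    `rate` (hypothetical parameter) is the fraction of positions to disturb.
    """
    height, width = len(image), len(image[0])
    flat = [pixel for row in image for pixel in row]   # row-major copy
    count = int(round(rate * len(flat)))
    positions = rng.sample(range(len(flat)), count)    # distinct positions to disturb
    sources = positions[:]
    rng.shuffle(sources)                               # permute values among those positions
    original = flat[:]
    for dst, src in zip(positions, sources):
        flat[dst] = original[src]
    return [flat[r * width:(r + 1) * width] for r in range(height)]


# Usage: corrupt the input and keep the clean image as the reconstruction target.
rng = random.Random(0)
clean = [[(r, c) for c in range(4)] for r in range(3)]
noisy = random_pixel_shuffle(clean, rate=0.5, rng=rng)
```

Under this assumption the corrupted image keeps exactly the same set of pixel values as the clean one, so a network trained to map `noisy` back to `clean` has to learn spatial structure rather than a plain identity mapping.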
| Date of Award | 2021 |
|---|---|
| Original language | English |
| Awarding Institution | The Hong Kong University of Science and Technology |
| Supervisor | Qifeng CHEN (Supervisor) |
SinIR : efficient general image manipulation with single image reconstruction
YOO, J. H. (Author). 2021
Student thesis: Master's thesis