The use of multimedia signals such as images, video, and sound has increased immensely in daily life. These signals are contaminated by many kinds of distortion during acquisition, compression, transmission, and/or display. Human vision is the ultimate receiver of the visual signals among them. Consequently, visual-perception-based Image Quality Assessment (IQA) and Just Noticeable Difference (JND) estimation have become important, as they can predict signal quality and highlight the regions of importance in a way that is consistent with human vision. With this view, in this thesis we investigate how human vision perceives information and build models of human vision. These models are used for image quality assessment and just noticeable difference estimation. The main contributions of this thesis are three different visual-perception-based IQA and JND algorithms, and an exploration of the use of perceptual IQA metrics in generic image reconstruction. The four works are summarized below.

The first work focuses on the estimation of the Just Noticeable Difference for natural images. Contrast Sensitivity (CS), Luminance Adaptation (LA), and Contrast Masking (CM) are important contributing factors for JND in images. Most existing pixel-domain JND algorithms are based only on LA and CM and cannot incorporate CS during JND estimation. Research shows that human vision depends significantly on CS, yet an underlying assumption in the existing algorithms is that CS cannot be estimated in the pixel domain. For natural images, this assumption does not hold: recent studies on human vision suggest that CS can be estimated via the Root Mean Square (RMS) contrast in the pixel domain.
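As a rough illustration of the RMS contrast mentioned above, the sketch below computes it in the pixel domain using one common definition: the standard deviation of intensities normalized to [0, 1]. The exact definition, block size, and the mapping from RMS contrast to CS used in the thesis are not given here, so the function names, the 8-bit intensity range, and the 8x8 block size are assumptions made for illustration only.

```python
import numpy as np


def rms_contrast(patch):
    """RMS contrast of a grayscale patch.

    Assumes 8-bit intensities (0..255) and uses the common definition:
    the standard deviation of intensities normalized to [0, 1].
    """
    p = np.asarray(patch, dtype=np.float64) / 255.0
    return float(p.std())


def local_rms_contrast(image, block=8):
    """Per-block RMS contrast map over non-overlapping block x block tiles.

    The block size is a hypothetical choice; rows and columns that do not
    fill a complete tile are ignored.
    """
    img = np.asarray(image, dtype=np.float64) / 255.0
    h, w = img.shape
    h2, w2 = h - h % block, w - w % block
    tiles = img[:h2, :w2].reshape(h2 // block, block, w2 // block, block)
    # Standard deviation over the two within-tile axes.
    return tiles.std(axis=(1, 3))
```

A flat patch has zero RMS contrast under this definition, while a patch that alternates between the darkest and brightest intensities has the largest.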
With this perspective, we propose the first pixel-based JND algorithm that includes CS, a very important component of human vision, by measuring RMS contrast. We also propose a feedback mechanism to alleviate the under- and over-estimation of contrast masking. This feedback mechanism is based on the relationship between CS and RMS contrast. Experiments validate that the proposed JND algorithm matches human perception well and produces significantly better results than existing pixel-domain JND algorithms.

In our second work, we propose to use visual-perception-based IQA metrics for image reconstruction. Most image reconstruction algorithms in the literature are application-specific and generalize poorly, because they require parameter tuning and the level of distortion in the signal is unknown. To address this problem, we propose an efficient, perceptually motivated generic framework for image reconstruction based on Maximum A Posteriori (MAP) estimation. The proposed algorithm can be applied in a wide variety of applications where the goal is to improve edge accuracy or suppress visible artifacts. Recent research in IQA suggests that gradient magnitudes are generally insensitive to moderate levels of noise, and we use this property to find pixels with similar edge semantics in the neighborhood. With this view, we incorporate gradient-magnitude-based IQA metrics into the MAP formulation to enhance reconstruction accuracy. The proposed generic algorithm, which requires no manual parameter tuning, is shown to produce better (and in a few cases competitive) reconstruction quality than state-of-the-art application-specific algorithms for most image processing applications.

The third work discusses the quality assessment of screen content images.
In this work, we address issues associated with free-energy-principle-based IQA algorithms for objectively assessing the quality of Screen Content (SC) images. The existing IQA algorithms do not give sufficient emphasis to the textual regions in SC images and assume that these regions do not contribute to the quality of an SC image. However, this contradicts how human vision works. Since our eyes are well trained to discern text in daily life, human vision has prior information about text regions and can sense small distortions in these regions. With this view, we propose a new reduced-reference IQA algorithm for SC images based upon a more perceptually relevant prediction model, which overcomes the problem described above by giving more emphasis to the textual regions. Experiments validate that the proposed algorithm estimates the quality of SC images better than recently developed reduced-reference and full-reference IQA algorithms.

The fourth research work is related to quality assessment of depth image-based rendering (DIBR)-synthesized views. Free-viewpoint video (FVV) is synthesized via the DIBR procedure in a 'blind' environment (i.e., without reference images), so a blind quality evaluation and monitoring system is urgently required. FVV images are used in several technologies, such as virtual reality, augmented reality, and mixed reality. Existing assessment metrics do not reflect human judgments faithfully, mainly because DIBR introduces geometric distortions. To this end, this work proposes a novel referenceless quality metric for DIBR-synthesized images using an autoregression (AR)-based local image descriptor. It was found that, after AR prediction, the reconstruction error between a DIBR-synthesized image and its AR-predicted image can accurately capture the geometric distortion.
Experiments validated the superiority of our no-reference quality method compared with prevailing full-reference approaches.
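
The idea of an AR prediction error map, as used in the fourth work, can be sketched as follows. This is a simplified illustration and not the thesis's descriptor: it assumes a single AR model fitted by least squares over the whole image with an 8-neighbour window, whereas the thesis describes a local descriptor whose window size, fitting procedure, and pooling into a quality score are not specified here. The function name `ar_residual` and the `radius` parameter are hypothetical.

```python
import numpy as np


def ar_residual(image, radius=1):
    """Absolute AR prediction error for the interior pixels of an image.

    Each pixel is predicted as a linear combination of its neighbours in a
    (2*radius+1) x (2*radius+1) window (centre excluded). One coefficient
    vector is fitted by least squares over the whole image, which is a
    simplification of a locally fitted AR model.
    """
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    r = radius
    offsets = [(dy, dx)
               for dy in range(-r, r + 1)
               for dx in range(-r, r + 1)
               if (dy, dx) != (0, 0)]
    center = img[r:h - r, r:w - r]
    # One column per neighbour offset, one row per interior pixel.
    X = np.stack(
        [img[r + dy:h - r + dy, r + dx:w - r + dx].ravel() for dy, dx in offsets],
        axis=1,
    )
    y = center.ravel()
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    pred = (X @ coeffs).reshape(center.shape)
    return np.abs(center - pred)
```

Smooth, well-structured regions are predicted almost exactly from their neighbours, so the error map stays small there, while locally inconsistent pixels stand out. How such a map is turned into a single quality score is left open in this sketch.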
| Date of Award | 2016 |
|---|---|
| Original language | English |
| Awarding Institution | The Hong Kong University of Science and Technology |
Visual perception and its application in image processing
Jakhetiya, V. (Author). 2016
Student thesis: Doctoral thesis