VLLCVD Subjective Image Fidelity Criteria and Its Applications

Day-Fann Shen, Jen-Hsing Sung, W.C. Fan, M.Z. Lin, J.G. Fun, Z.H. Shen, F.K. Tsai and Y.Z. Shin
Signal Compression and Multimedia Communication Laboratory
Department of Electrical Engineering
Yunlin University of Science and Technology, Douleo, Yunlin, Taiwan

Abstract

Because the human eye is often the final receiver of digital images, the human visual system plays an important role in digital image processing. Three issues in image processing related to human visual perception are (1) criteria for the evaluation of image fidelity, (2) the definition of a visually lossless image in image coding or an invisibly watermarked image in watermarking, and (3) criteria for the evaluation of human visual models. Many criteria, such as SNR, PSNR, and MSE, are available for the first issue, but these criteria have well-known drawbacks in reflecting the true perceived image quality. Even worse, criteria for the second and third issues are seldom reported. In this paper, we resolve the three problems simultaneously under a unified and accurate subjective VLLCVD (Visually LossLess Critical Viewing Distance) criterion.

1. Introduction

Since the human visual system (HVS) is the final receiver of images and video, it is very important to incorporate human visual properties (HVPs) in various image-processing tasks, such as image coding, image segmentation, and invisible image watermarking. In image coding applications, perceptual image quality can be optimized by allocating bits effectively according to the HVPs, so that the same number of bits achieves better perceptual image quality. [1-10] Image segmentation is a process required by MPEG-4 as well as by image analysis and computer vision tasks. HVP-based pixel/region merge criteria have proven to be a simple and effective alternative to the complicated statistically based merge criteria in image segmentation.
[11] In invisible image watermarking, a sufficient amount of watermark signal can be added to an image without being noticed by human perception, while increasing robustness to attacks from pirates. [12-16] All of the above applications require effective incorporation of visual properties.

The HVS, and consequently the HVPs, are quite complicated; [19] a complete description of the HVS and HVPs is neither practical nor necessary. Instead, an efficient and effective representation of the HVPs in terms of a human visual model (HVM) is normally sufficient for a particular image-processing task. Among the HVPs, the Just Noticeable Difference (JND) property is the most frequently used. The JND property states that the HVS’s sensitivity to visual stimuli is not unlimited: it cannot detect the difference between two visual components if the difference is smaller than a certain threshold. Further, the HVS’s sensitivity is not linear across visual stimuli; that is, visual contents of different luminance and frequencies are weighted differently by the HVS. In addition, other factors, including the ambient lighting condition, the quality of the display monitor, the viewing distance, and personal eyesight, may influence the HVS’s sensitivity to image distortions and therefore the perceived image fidelity or quality.

Many JND-based HVMs have been proposed; [1-16] these HVMs were derived by researchers using different approaches and under different viewing conditions. For example, Watson et al. proposed HVMs in the form of DCT quantization matrices for individual images, [9b] the visibility of DCT basis functions, [9c] and the visibility of wavelet quantization noise, [9a] which were derived from DWT (Discrete Wavelet Transform) basis function stimuli and DWT uniform quantization noise stimuli. Chou et al. [8] derived a JND/MND profile in the spatial domain for an image by considering the local properties of each pixel, including the background luminance as well as texture masking effects.
The JND/MND profile is then transformed into (DCT or wavelet) subbands, and each subband is assigned a different weight according to the HVS’s sensitivity to that subband. [8] Shen et al. [6] derived a JND model based on measurements of JND thresholds on square waves of different frequencies and directions; [6,7] the result is a set of JND thresholds for each wavelet subband.

Three issues in image processing related to human visual perception are (1) criteria for the evaluation of image fidelity, i.e., given two decoded lossy images, how do we judge which image has better fidelity? (2) the definition of a visually lossless image, i.e., given a lossy image claimed to be “visually lossless”, “near-lossless”, or “invisibly watermarked”, how do we verify whether the claim is true? and (3) criteria for the evaluation of
IS&T's 2001 PICS Conference Proceedings 310