Improved Visible Differences Predictor Using a Complex Cortex Transform

Alexey Lukin
Laboratory of Mathematical Methods of Image Processing,
Department of Computational Mathematics and Cybernetics,
Moscow Lomonosov State University, Russia
lukin@graphics.cs.msu.ru

Abstract

Prediction of visible differences involves modeling of the human visual response to distortions in the image data. Following the approach of Daly [1], this paper introduces several algorithm improvements allowing for more accurate calculation of threshold elevation and modeling of the facilitation effect during phase-coherent masking. This is achieved by introducing a complex-valued cortex transform that separates the response magnitude from the instantaneous phase within each band of the cortex transform. The magnitude component is used for calculation of the mask contrast. The phase component takes part in modeling of the facilitation effect.

Keywords: Visible Differences Predictor, VDP, Cortex Transform, Image Quality Assessment, Complex Cortex Transform, Human Visual System, HVS, Masking.

1. INTRODUCTION

Prediction of visible differences means estimation of subjective visibility of distortions in the image data. Algorithms for prediction of such differences are important in automated quality assessment of imaging systems, including lossy compression of video signals, assessment of transmission channel distortions, optimization of realistic image synthesis algorithms, etc.

Many image quality metrics have been proposed in the literature. The most successful objective metrics include models of the Human Visual System (HVS) for prediction of such effects as non-uniform sensitivity to spatial frequencies and visual masking, like the Visual Differences Predictor (VDP) proposed by Daly [1].

Daly’s VDP uses a (modified) cortex transform [2] to decompose the image into subbands of different spatial frequency and orientation.
This decomposition allows modeling of frequency-dependent and orientation-dependent masking in the human visual system. For each cortex transform band, the contrast of the difference signal and the contrast of the masking signal are evaluated. Threshold elevations are calculated from the contrast of the mask signal. They are used to calculate the probability of detection of the difference signal, subject to visual masking. The detection probabilities are summed over all cortex transform subbands.

The work of Mantiuk et al. [3] proposes several improvements to the model of Daly, including evaluation of contrast in JND (just noticeable difference) units and variation of the CSF (contrast sensitivity function) depending on the local luminance adaptation level.

A shortcoming of the “traditional” cortex transform is its inability to accurately model phase-invariant masking (explained in the next section). For example, the chirp image signal in Fig. 1a would produce an oscillating signal in each cortex band, as in Fig. 1b. This, in turn, would produce an oscillating mask contrast signal and an oscillating threshold elevation, as in Fig. 1c.

In this paper, a modification of the cortex transform is introduced to obtain phase-independent estimates of the masking contrast, as in Fig. 1d. Section 2 describes the cortex transform and its phase variance. Section 3 introduces the Complex Cortex Transform (CCT) and its computation algorithm. Section 4 illustrates the use of the CCT for evaluation of masking thresholds in VDP. Section 5 presents the computational results of threshold elevations.

Figure 1: (a) A chirp image; (b) response of a “traditional” cortex filter (only the positive part of the signal is shown); (c) threshold elevation image (or mask contrast) produced using a “traditional” cortex filter; (d) threshold elevation image produced using the “complex” cortex filter proposed in this paper.

2. CORTEX TRANSFORM IN VDP

The cortex transform was first described by Watson in [2] as an efficient means of modeling the neural response of retinal cells to visual stimuli. The cortex filter in the frequency domain is the product of two filters: the ‘dom’ filter, which provides frequency selectivity, and the ‘fan’ filter, which provides orientation selectivity:

$$\mathrm{cortex}_{k,l}(\rho, \theta) \equiv \mathrm{dom}_k(\rho) \cdot \mathrm{fan}_l(\theta), \qquad (1)$$

where $k$ is the index of the frequency band, $l$ is the index of orientation, and $(\rho, \theta)$ are polar coordinates in the frequency space (the corresponding Cartesian coordinates will later be denoted as $(\omega_1, \omega_2)$). Fig. 2 illustrates the frequency responses of several cortex filters, and Fig. 3a shows an example impulse response (point spread function) of a cortex filter.

The cortex transform decomposes the input image $I(i, j)$ into a set of subband images $B_{k,l}(i, j)$ (cortex bands) as follows:

$$B_{k,l}(i, j) \equiv \mathcal{F}^{-1}\left\{ \mathrm{cortex}_{k,l}(\rho, \theta) \cdot \mathcal{F}\{ I(i, j) \} \right\}, \qquad (2)$$

where $\mathcal{F}$ is the 2D discrete Fourier transform.
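
As a rough illustration of Eqs. (1) and (2), the following Python/NumPy sketch multiplies the image spectrum by a radial (‘dom’) filter for frequency band k and an angular (‘fan’) filter for orientation l, then inverts the DFT to obtain each cortex band. The filter shapes are assumptions made for brevity: the Gaussian `mesa` low-pass filters with octave-spaced cutoffs, the raised-cosine `fan` profile, and the defaults `num_bands=4` and `num_orients=6` are hypothetical stand-ins, not the exact filters of Watson [2] or Daly [1].

```python
import numpy as np

def polar_grid(shape):
    """Polar coordinates of every 2D DFT bin: rho in cycles/pixel, theta in degrees."""
    w1 = np.fft.fftfreq(shape[0])[:, None]   # vertical frequency axis
    w2 = np.fft.fftfreq(shape[1])[None, :]   # horizontal frequency axis
    rho = np.hypot(w1, w2)
    theta = np.degrees(np.arctan2(w1, w2))
    return rho, theta

def mesa(rho, k):
    """Assumed low-pass filter: all-pass for k = 0, Gaussian with octave cutoffs otherwise."""
    if k == 0:
        return np.ones_like(rho)
    cutoff = 0.5 / 2 ** k
    return np.exp(-0.5 * (rho / cutoff) ** 2)

def dom(rho, k):
    """Radial band-pass filter formed as the difference of two low-pass filters."""
    return mesa(rho, k - 1) - mesa(rho, k)

def fan(theta, l, num_orients):
    """Raised-cosine orientation filter centred at l * 180 / num_orients degrees."""
    width = 180.0 / num_orients
    # Angular distance to the centre, wrapped to [-90, 90) because orientation
    # is 180-degree periodic.
    d = (theta - l * width + 90.0) % 180.0 - 90.0
    return np.where(np.abs(d) < width, 0.5 * (1.0 + np.cos(np.pi * d / width)), 0.0)

def cortex_transform(image, num_bands=4, num_orients=6):
    """Return a dict {(k, l): band image} and the low-pass baseband image."""
    rho, theta = polar_grid(image.shape)
    spectrum = np.fft.fft2(image)
    bands = {}
    for k in range(1, num_bands + 1):
        for l in range(num_orients):
            cortex_kl = dom(rho, k) * fan(theta, l, num_orients)          # Eq. (1)
            bands[(k, l)] = np.real(np.fft.ifft2(cortex_kl * spectrum))   # Eq. (2)
    baseband = np.real(np.fft.ifft2(mesa(rho, num_bands) * spectrum))
    return bands, baseband

# Usage: decompose a random test image and rebuild it from its bands.
rng = np.random.default_rng(0)
image = rng.random((32, 48))
bands, baseband = cortex_transform(image)
reconstruction = baseband + sum(bands.values())
```

With these assumed filters, the ‘dom’ filters and the baseband sum to one at every frequency, and so do the ‘fan’ filters at every orientation, so adding all bands to the baseband recovers the input image up to rounding error. Each band is a real-valued, oscillating signal, which is the phase-dependent behavior of the “traditional” transform discussed in this paper.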