Near Lossless Source Coding with Side Information at the Decoder: Beyond Conditional Entropy

En-hui Yang
Department of Electrical and Computer Engineering
University of Waterloo
Waterloo, ON N2L 3G1, Canada
Email: ehyang@uwaterloo.ca

Da-ke He
Department of Multimedia Technologies
IBM TJ Watson Research Center
Yorktown Heights, NY 10598, USA
Email: dakehe@us.ibm.com

This work was supported in part by the Natural Sciences and Engineering Research Council of Canada under Grants RGPIN203035-02 and RGPIN203035-06, and by the Canada Research Chairs Program.

Abstract— In near lossless source coding with decoder-only side information, i.e., Slepian-Wolf coding (with one encoder), a source $X$ with finite alphabet $\mathcal{X}$ is first encoded and then later decoded, subject to a small error probability, with the help of side information $Y$ with finite alphabet $\mathcal{Y}$ available only to the decoder. The classical result by Slepian and Wolf shows that the minimum average compression rate achievable asymptotically subject to a small error probability constraint for a memoryless pair $(X, Y)$ is given by the conditional entropy $H(X|Y)$. In this paper, we look beyond conditional entropy and investigate the tradeoff between compression rate and decoding error spectrum in Slepian-Wolf coding when the decoding error probability goes to zero exponentially fast. It is shown that when the decoding error probability goes to zero at the speed of $2^{-\delta n}$, where $\delta$ is a positive constant and $n$ denotes the length of the source sequences, the minimum average compression rate achievable asymptotically is strictly greater than $H(X|Y)$, regardless of how small $\delta$ is. More specifically, the minimum average compression rate achievable asymptotically is lower bounded by a quantity called the intrinsic conditional entropy $H_{\mathrm{in}}(X|Y,\delta)$, which is strictly greater than $H(X|Y)$ and is also asymptotically achievable for small $\delta$.

I. INTRODUCTION

Let $(X, Y)$ be a pair of random variables taking values in finite alphabets $\mathcal{X}$ and $\mathcal{Y}$, respectively. Let $\{(X_i, Y_i)\}_{i=1}^{\infty}$ denote a sequence of independent copies of $(X, Y)$. For convenience, the memoryless sources $\{X_i\}_{i=1}^{\infty}$ and $\{Y_i\}_{i=1}^{\infty}$ are also referred to simply as the sources $X$ and $Y$, respectively. In the near lossless source coding of $X$ with decoder-only side information $Y$, the source $X$ is first encoded and then later decoded, subject to a small error probability, with the help of the side information $Y$ available only to the decoder. As such, a coding scheme of this type is described by a pair $C_n = (f_n, g_n)$, where $f_n$, acting as an encoder, encodes $X^n = X_1 X_2 \cdots X_n$ into a binary codeword $f_n(X^n)$, and $g_n$, acting as a decoder, decodes $f_n(X^n)$ into $g_n(f_n(X^n), Y^n)$, an estimate of $X^n$, with the help of the side information sequence $Y^n = Y_1 Y_2 \cdots Y_n$. The performance of such a coding scheme is then measured by its compression rate and its decoding error probability $\epsilon_n = \Pr\{X^n \neq g_n(f_n(X^n), Y^n)\}$.

The above source coding paradigm was first proposed and studied by Slepian and Wolf [1]. It was shown [1] that the minimum average compression rate achievable asymptotically subject to a small error probability constraint $\epsilon_n = o(1)$ for a memoryless pair $(X, Y)$ is still given by the conditional entropy $H(X|Y)$ of $X$ given $Y$, the same rate as in the case where the side information $Y$ is available to both the encoder and the decoder. Recently, it was further shown [3] that this result remains valid no matter how fast $\epsilon_n$ goes to 0, as long as $-\log \epsilon_n = o(n)$.
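As a concrete toy instance of such a pair $(f_n, g_n)$ (our illustration, not a construction from the paper), the following Python sketch implements classical syndrome coding with the (7,4) Hamming code: under the assumed correlation model that $X^7$ and $Y^7$ differ in at most one position, the encoder sends only the 3-bit syndrome of $X^7$, and the decoder recovers $X^7$ exactly from that syndrome and $Y^7$.

```python
import numpy as np

# Parity-check matrix of the (7,4) Hamming code; column j (0-indexed)
# is the binary expansion of j+1, most significant bit in the top row.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]], dtype=int)

def f_n(x):
    """Encoder: compress x^7 to its 3-bit syndrome H x (mod 2)."""
    return H @ x % 2

def g_n(s, y):
    """Decoder: estimate x^7 from the syndrome s and side information y^7,
    assuming x and y differ in at most one position."""
    e_syn = (s + H @ y) % 2                 # syndrome of the error pattern x XOR y
    pos = int("".join(map(str, e_syn)), 2)  # nonzero syndrome = 1-based flip position
    x_hat = y.copy()
    if pos > 0:
        x_hat[pos - 1] ^= 1
    return x_hat

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=7)
x = y.copy()
x[rng.integers(0, 7)] ^= 1                  # x differs from y in exactly one bit
assert np.array_equal(g_n(f_n(x), y), x)    # 3 bits sent instead of 7
```

Note that the zero-error behavior at rate 3/7 bit per symbol is possible here only because this toy model assigns zero probability to all but eight error patterns; the results discussed next concern joint distributions without such zero entries.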
On the other hand, if the decoding error probability $\epsilon_n$ is required to be exactly 0, then Witsenhausen demonstrated [2] that for a memoryless pair $(X, Y)$ whose joint distribution has no zero entries, the minimum average compression rate achievable asymptotically is no longer the conditional entropy $H(X|Y)$, but rather the entropy $H(X)$ of $X$.

If we interpret $\lim_{n\to\infty} \frac{-\log \epsilon_n}{n}$ as the spectrum of the decoding error probability (i.e., the error exponent), then it is clear that the above results correspond to the two far ends of the spectrum. The conditional entropy $H(X|Y)$ is the minimum average compression rate asymptotically achievable at the spectrum of 0, which is one of the first, and perhaps one of the most telling and inspiring, results in network information theory. The entropy $H(X)$ is the minimum average compression rate asymptotically achievable at the spectrum of $\infty$, which is, unfortunately, a negative result in the sense that side information available only to the decoder does not help at all at this end of the spectrum. What is missing is an achievability result for the entire open spectrum between these two end points.

The purpose of this paper is to address the above problem, i.e., the tradeoff between compression rate and error spectrum. Specifically, we shall show that at any error spectrum $\delta$ with $0 \le \delta < \infty$, the minimum average compression rate achievable asymptotically is lower bounded by a quantity called the intrinsic conditional entropy $H_{\mathrm{in}}(X|Y,\delta)$, which is strictly greater than $H(X|Y)$ regardless of how small $\delta > 0$ is. Furthermore, we shall show that the intrinsic conditional entropy $H_{\mathrm{in}}(X|Y,\delta)$ is also asymptotically achievable for small $\delta$. At $\delta = 0$, $H_{\mathrm{in}}(X|Y,\delta)$ is equal to $H(X|Y)$. As $\delta$ increases and passes a certain point, $H_{\mathrm{in}}(X|Y,\delta)$ is flat and equal to $H(X)$.
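In symbols, the claims above can be restated as follows (the threshold $\delta^{*}$ is our shorthand for the unspecified "certain point", not notation from the paper): for a sequence of schemes with error probabilities $\{\epsilon_n\}$ and error spectrum
\[
  \delta \;=\; \lim_{n\to\infty} \frac{-\log \epsilon_n}{n},
\]
the intrinsic conditional entropy behaves as
\[
  H_{\mathrm{in}}(X|Y,0) \;=\; H(X|Y), \qquad
  H_{\mathrm{in}}(X|Y,\delta) \;>\; H(X|Y) \ \text{ for all } \delta > 0, \qquad
  H_{\mathrm{in}}(X|Y,\delta) \;=\; H(X) \ \text{ for all } \delta \ge \delta^{*}.
\]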