On the Codebook-Level Duality Between Slepian-Wolf Coding and Channel Coding

Jun Chen, Da-ke He
IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA
Email: {junchen, dakehe}@us.ibm.com

En-hui Yang
University of Waterloo, Waterloo, Ontario, Canada N2L 3G1
Email: E.Yang@ece.uwaterloo.ca

Abstract— A codebook-level duality between Slepian-Wolf coding and channel coding is established. Specifically, it is shown that using linear codes over $\mathbb{Z}_M$ (the ring of integers mod $M$), each Slepian-Wolf coding problem is equivalent to a channel coding problem for a semi-symmetric additive channel under optimal decoding, belief propagation decoding, and minimum entropy decoding. Various notions of symmetric channels are discussed and their connections with semi-symmetric additive channels are clarified.

I. INTRODUCTION

Consider the problem (see Fig. 1) of encoding $\{X_i\}_{i=1}^{\infty}$ with side information $\{Y_i\}_{i=1}^{\infty}$ at the decoder. Here $\{(X_i, Y_i)\}_{i=1}^{\infty}$ is a memoryless process with joint probability distribution $P_{XY}$ on $\mathcal{X} \times \mathcal{Y}$. Throughout this paper, $\mathcal{X}$ and $\mathcal{Y}$ are assumed to be finite with $\mathcal{X} = \mathbb{Z}_M$ and $\mathcal{Y} = \mathbb{Z}_N$ unless specified otherwise; for any positive integer $K$, $+_K$ and $-_K$ denote modulo-$K$ addition and subtraction, respectively, while $a =_K b$ means $a -_K b = 0$. Slepian and Wolf [1] proved the surprising result$^1$ that the minimum rate for reconstructing $\{X_i\}_{i=1}^{\infty}$ at the decoder with asymptotically zero error probability is $H(X|Y)$, which is the same as in the case where the side information $\{Y_i\}_{i=1}^{\infty}$ is also available at the encoder.

[Fig. 1. Slepian-Wolf coding. Block diagram: $X^n$ enters the Encoder, which sends a message at rate $R$ to the Decoder; the Decoder also receives $Y^n$ and outputs $\hat{X}^n$.]

Shortly after Slepian and Wolf’s seminal work, Wyner [3] pointed out the possibility of using linear codes for Slepian-Wolf coding. The scheme works as follows: given the source sequence $X^n$, the encoder sends $X^n H$ to the decoder, where $H$ is the parity check matrix of a linear code $\mathcal{C}$; the decoder then tries to recover $X^n$ from $X^n H$ given the side information $Y^n$.
Wyner also noticed an intriguing connection between Slepian-Wolf coding and channel coding in a simple example. Suppose that both $X$ and $Y$ are binary, and that the correlation between $X$ and $Y$ can be modelled by a binary symmetric channel with parameter $p \in [0, 0.5)$ (BSC($p$)), i.e., $X_i = Y_i +_2 Z_i$ for all $i \geq 1$, where $Z_i$ denotes a binary random variable (independent of $Y_i$) that takes value 1 with probability $p$. Let $H$ be an $n \times k$ parity check matrix of a binary linear channel code $\mathcal{C}$ for which there exists a decoding function $g(\cdot)$ such that $Z^n = (Z_1, Z_2, \cdots, Z_n)$ can be decoded from its syndrome $Z^n H$ with error probability $\epsilon$. Now in the Slepian-Wolf problem, upon receiving the syndrome $S^k = X^n H$, the decoder calculates $S^k +_2 Y^n H = (X^n +_2 Y^n) H = Z^n H$ and then uses $g(\cdot)$ to recover $Z^n$ with error probability $\epsilon$. Since $X^n = Y^n +_2 Z^n$, $X^n$ can also be recovered with error probability $\epsilon$. It is well known [4] that the capacity of the binary symmetric channel is achievable with linear codes. Therefore, we can let the rate $\frac{n-k}{n}$ of the channel code $\mathcal{C}$ be arbitrarily close to the channel capacity $1 - H_b(p)$ while maintaining any prescribed error probability $\epsilon > 0$. Hence, the compression rate $\frac{k}{n}$ of Wyner’s coding scheme can be arbitrarily close to $H(X|Y) = H_b(p)$, which is exactly the Slepian-Wolf limit. Throughout this paper, $H_b(\cdot)$ stands for the binary entropy function, i.e., $H_b(p) = -p \log p - (1-p) \log(1-p)$, and the logarithm function is to base 2.

$^1$ The original problem considered by Slepian and Wolf is more general, but it can be shown that the general problem can be reduced to this special case via time-sharing or source-splitting [2].

If we view $g(\cdot)$ as the maximum likelihood (ML) decoding function for BSC($p$), then it is not hard to verify that the decoding in the aforementioned example is exactly the maximum a posteriori (MAP) decoding for Slepian-Wolf coding.
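As an illustration of the syndrome-based scheme in the binary example, the following is a minimal sketch rather than the construction analysed in this paper. It assumes a toy (7,4) Hamming code, chosen here only because its minimum-weight (coset-leader) decoder is easy to write down; the names `H`, `syndrome`, `g`, `sw_encode`, `sw_decode` and `exhaustive_check` are hypothetical. The matrix `H` is stored as an $n \times k$ array with $n = 7$ and $k = 3$, matching the row-vector convention $X^n H$, and the decoder is only guaranteed to succeed when $Z^n$ contains at most one 1.

```python
from itertools import product

N, K = 7, 3  # block length n and syndrome length k (toy values, assumed)

# n x k parity check matrix of the (7,4) Hamming code:
# row i (0-based) is the binary expansion of i + 1, most significant bit first.
H = [[(i >> b) & 1 for b in (2, 1, 0)] for i in range(1, N + 1)]


def syndrome(v):
    """Return the row-vector product v H over Z_2 as a length-k tuple."""
    return tuple(sum(v[i] * H[i][j] for i in range(N)) % 2 for j in range(K))


def g(s):
    """Return the minimum-weight z whose syndrome is s (coset leader).

    For p < 0.5 this is the ML estimate of the noise on a BSC(p).
    """
    z = [0] * N
    idx = s[0] * 4 + s[1] * 2 + s[2]  # read the syndrome as an integer
    if idx:
        z[idx - 1] = 1  # a single flip at position idx - 1 has this syndrome
    return z


def sw_encode(x):
    """Slepian-Wolf encoder: send only the k-bit syndrome S^k = X^n H."""
    return syndrome(x)


def sw_decode(s, y):
    """Slepian-Wolf decoder: recover x from its syndrome s and side info y."""
    # S^k + Y^n H = (X^n + Y^n) H = Z^n H, all arithmetic modulo 2.
    sz = tuple((a + b) % 2 for a, b in zip(s, syndrome(y)))
    z_hat = g(sz)
    return [(a + b) % 2 for a, b in zip(y, z_hat)]


def exhaustive_check():
    """Check recovery for every y and every z with at most one 1."""
    noises = [[0] * N] + [[int(i == j) for i in range(N)] for j in range(N)]
    for y in product((0, 1), repeat=N):
        for z in noises:
            x = [(a + b) % 2 for a, b in zip(y, z)]
            if sw_decode(sw_encode(x), list(y)) != x:
                return False
    return True
```

In this sketch the encoder transmits 3 bits for every 7 source bits, so the compression rate is $k/n = 3/7$, while the underlying channel code has rate $(n-k)/n = 4/7$. Approaching $H_b(p)$ would require long capacity-approaching codes rather than this short code.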
Therefore, Wyner’s simple example suggests that a general linear codebook-level duality may exist between Slepian-Wolf coding and channel coding under optimal decoding, and that Slepian-Wolf code design might be reduced to channel code design. Unfortunately, designing practical capacity-approaching channel codes was still a formidable task at that time. As a result, Wyner’s observation had relatively little impact on the design of practical Slepian-Wolf codes.

Inspired by the potential applications of distributed data compression in various networks and multimedia systems, Slepian-Wolf code design has received much attention in recent years. Moreover, due to the revolutionary advance in the development of capacity-approaching channel codes (e.g.,