CAPACITY OF RANDOM NETWORK CODING UNDER A PROBABILISTIC ERROR MODEL

Danilo Silva*, Frank R. Kschischang
Dept. of Electrical & Computer Engineering, University of Toronto
{danilo, frank}@comm.utoronto.ca

Ralf Kötter
Institute for Communications Engineering, Technical University of Munich
ralf.koetter@tum.de

ABSTRACT

A probabilistic error model for random network coding is considered. An upper bound on the capacity is obtained for any channel parameters, and asymptotic capacity expressions are provided in the limits of long packet length and/or large field size. A simple and efficient coding scheme is presented that achieves capacity in both limiting cases. The scheme has zero error probability and a probability of failure that decreases exponentially both in the packet length and in the field size in bits.

1. INTRODUCTION

Random linear network coding [1] has been proposed as a simple and effective tool for information dissemination over networks; however, it is not robust to pollution attacks, as it allows corrupt packets to be combined with genuine packets throughout the network, causing error propagation that can potentially disrupt communication. One particularly attractive way to handle this problem is end-to-end coding, where internal nodes operate independently of any outer code, simply transporting information in the usual manner of random network coding, and where only the source and destination nodes apply error control techniques.

There has been an increasing amount of research on end-to-end coding under a variety of error models [2–6]. For each error model, it is natural to ask for the ultimate limit on the rate of information that can be reliably transmitted from source to destination, i.e., the channel capacity.
The common transmission model is that of a matrix channel given by Y = AX + Z, where X is an n × m matrix whose rows are the transmitted packets, Y is an N × m matrix whose rows are the received packets, and Z is a matrix of rank at most t corresponding to the injected error packets after propagation over the network. The transfer matrix A is typically assumed square (N = n) and nonsingular, corresponding to the case where the underlying network code is feasible.

A "pessimistic" error model, where an all-powerful adversary may have knowledge of A and X and complete control over Z, can be addressed with the concepts introduced by Kötter and Kschischang [3]. Recently, Montanari and Urbanke [6] have considered the opposite, "optimistic", scenario, where the error matrix Z is random. Under the assumption that the transmitted matrix X must contain an n × n identity sub-matrix as a header, they compute the maximal mutual information in the limit of long packet length and present an iterative coding scheme with decoding complexity O(n^3 m) that asymptotically achieves this rate.

The present paper is motivated by [6]. We consider a slightly more general scenario by removing the assumptions on packet headers. We provide an upper bound on the capacity for any channel parameters and derive capacity expressions in the limits of large field size and/or long packet length. We also present a very simple coding scheme with a significantly reduced decoding complexity of O(n^2 m).

The paper is organized as follows. In Section 2, we provide general considerations on the type of channels studied in this paper. Then, our analysis proceeds gradually from simple to more complex channels. In Section 3, we start by solving a multiplicative channel given by Y = AX, where A is random. In Section 4, we compute the capacity of a channel subject to additive errors, given by Y = X + Z.

* Supported by CAPES Foundation (Brazil).
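The channel model above can be made concrete with a small simulation. The sketch below (an illustration, not from the paper) works over F_2 (q = 2) and draws a random nonsingular transfer matrix A and an error matrix Z of rank at most t, constructed as the product of random n × t and t × m factors; the helper names `rank_gf2`, `random_nonsingular`, and `low_rank_error` are our own.

```python
# Minimal sketch of the matrix channel Y = A X + Z over F_2 (assumption: q = 2).
import numpy as np

rng = np.random.default_rng(seed=1)

def rank_gf2(M):
    """Rank of a 0/1 matrix over F_2, via Gaussian elimination with XOR."""
    M = M.copy() % 2
    r, (rows, cols) = 0, M.shape
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i, c]), None)
        if pivot is None:
            continue
        M[[r, pivot]] = M[[pivot, r]]       # bring pivot row into place
        for i in range(rows):
            if i != r and M[i, c]:
                M[i] ^= M[r]                # eliminate column c elsewhere
        r += 1
    return r

def random_nonsingular(n):
    """Uniformly random nonsingular n x n transfer matrix A (rejection sampling)."""
    while True:
        A = rng.integers(0, 2, size=(n, n))
        if rank_gf2(A) == n:
            return A

def low_rank_error(n, m, t):
    """Error matrix Z of rank at most t: product of n x t and t x m factors."""
    return (rng.integers(0, 2, size=(n, t)) @ rng.integers(0, 2, size=(t, m))) % 2

n, m, t = 4, 8, 1
X = rng.integers(0, 2, size=(n, m))   # n transmitted packets of length m
A = random_nonsingular(n)             # feasible (nonsingular) network code
Z = low_rank_error(n, m, t)           # at most t injected error packets
Y = (A @ X + Z) % 2                   # received packets
assert rank_gf2(Z) <= t
```

The rank-≤t construction of Z mirrors the model's interpretation: each of the t injected packets contributes one rank-one term after linear propagation through the network.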
We also present a coding scheme that achieves capacity asymptotically in some channel parameters. The complete channel Y = AX + Z is addressed in Section 5, where we present our main results concerning capacity expressions and an efficient coding scheme. Finally, a discussion of these results is provided in Section 6. Due to space constraints, the proofs of many results are omitted; they can be found in a full version of this paper [7].

We will make use of the following notation. Let F_q be the finite field with q elements. We use F_q^{n×m} and T_{n×m,t} to denote the set of all n × m matrices over F_q and the set of all n × m matrices of rank t over F_q, respectively. We also use the notation T_{n×m} = T_{n×m,min{n,m}} for the set of all full-rank n × m matrices. The n × m all-zero matrix and the n × n identity matrix are denoted by 0_{n×m} and I_{n×n}, respectively, where the subscripts may be omitted when there is no risk of confusion.

2. MATRIX CHANNELS

In a matrix channel, the input variable X and the output variable Y are matrices. Here, we consider the case where both X and Y are n × m matrices over F_q; channels will differ only in the probability law relating X and Y. The capacity of a matrix channel is defined as

    C(q, n, m) = max_{p_X} I(X; Y),

where p_X denotes the input distribution. We will usually represent the capacity in q-ary units per channel use, i.e., we will use a base-q logarithm. Note that the capacity is a function of the channel parameters q, n and m, although this dependency will often be omitted to simplify notation. We are also interested in a normalized capacity, given by

    C̄ = (1/(nm)) C,

represented in q-ary units per transmitted symbol.

Achieving capacity involves, in general, using the channel many times. If the channel is used ℓ times, then the block length of a code is ℓ, and capacity may be achieved by letting ℓ → ∞. More precisely, a codeword will be a sequence of ℓ matrices.
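The full-rank sets T_{n×m} introduced above have a well-known size: for n ≤ m, |T_{n×m}| = ∏_{i=0}^{n−1} (q^m − q^i), since the i-th row must avoid the q^i combinations of the rows already chosen. This count is a standard fact, not stated in the paper; the snippet below checks it by brute force for the tiny case q = 2, n = m = 2.

```python
# Brute-force check (our own illustration) of |T_{n x m}| = prod_{i<n} (q^m - q^i)
# over F_2, with each matrix row encoded as an integer bitmask of width m.
import itertools

def rank_gf2(rows, m):
    """Rank over F_2 of a matrix given as a tuple of row bitmasks of width m."""
    rows, r = list(rows), 0
    for c in range(m):
        bit = 1 << c
        pivot = next((i for i in range(r, len(rows)) if rows[i] & bit), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i] & bit:
                rows[i] ^= rows[r]          # XOR-eliminate column c
        r += 1
    return r

q, n, m = 2, 2, 2
formula = 1
for i in range(n):
    formula *= q**m - q**i                  # product of (q^m - q^i) factors

# Enumerate every n x m matrix over F_2 and count the full-rank ones.
brute = sum(1 for rows in itertools.product(range(2**m), repeat=n)
            if rank_gf2(rows, m) == n)
assert brute == formula == 6                # |GL(2, F_2)| = 6
```

Such cardinalities are exactly what enters capacity computations for matrix channels, since the number of distinguishable full-rank inputs bounds the achievable rate.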
978-1-4244-1946-3/08/$25.00 ©2008 IEEE — 24th Biennial Symposium on Communications