208 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 47, NO. 2, FEBRUARY 1999

Breadth-First Maximum Likelihood Sequence Detection: Basics

Tor M. Aulin, Fellow, IEEE

Abstract—The problem of performing breadth-first maximum likelihood sequence detection (MLSD) under given structural and complexity constraints is solved and results in a family of optimal detectors. Given a trellis with a number of states, these are partitioned into classes, where paths into each class are selected recursively in each symbol interval. The derived result is to retain only those paths which are closest to the received signal in the Euclidean (respectively, Hamming) distance sense. Each member of the SA family of sequence detectors (SA denotes search algorithm) performs complexity-constrained MLSD for the additive white Gaussian noise (AWGN) channel (respectively, the BSC). The unconstrained solution is the Viterbi Algorithm (VA). Analysis tools are developed for each member of the SA class, and the asymptotic (high signal-to-noise ratio, SNR) probability of losing the correct path is associated with a new Euclidean distance measure for the AWGN case, the vector Euclidean distance (VED). The traditional Euclidean distance is a scalar special case of this, termed the scalar Euclidean distance (SED). The generality of this VED is pointed out. Some general complexity reductions exemplify those associated with the VA approach.

Index Terms—Asymptotic analysis, maximum likelihood sequence detection, MLSD, optimal sequence detection, reduced complexity, vector Euclidean distance.

I. INTRODUCTION AND BACKGROUND

A good model of a digital communication system is achieved by using the Markovian property. This is often induced by selecting codes and modulations which have this property. Sometimes nature itself imposes this property by introducing memory via the transmission medium. All these Markov-type systems can be described by the use of a trellis [1] with a certain number of states.
The components (channel encoder, modulator, and channel) can all be separately described in this way, and by concatenation of the state vectors for the components, a joint trellis description is achieved [2]. Hence the whole system is a sequential machine with a finite number of states whose input is the sequence of information digits to be transmitted and whose output is observed corrupted by noise. A mild restriction on the noise model is to assume that it is additive, sometimes also memoryless. It is now the detector’s task to find this sequence of information digits, based upon observation of the noisy channel symbols. An advantage of such a joint model is that there will be no model losses imposed by keeping the subsystems separate and applying optimal strategies for each one of them.

Paper approved by S. S. Pietrobon, the Editor for Coding Theory and Techniques of the IEEE Communications Society. Manuscript received April 11, 1997; revised March 10, 1998 and June 10, 1998. This paper was presented in part at the International Symposium on Information Theory, San Antonio, TX, USA, January, 1993. This work was supported by NUTEK, The Swedish National Board for Industrial and Technical Development under Grants 93-2331 and 94-6169. The author is with the Department of Computer Engineering, Telecommunication Theory, Chalmers University of Technology, S-412 96 Göteborg, Sweden (e-mail: tor@ce.chalmers.se). Publisher Item Identifier S 0090-6778(99)01921-2.

In the early days of trellis coding/decoding, channel symbols were modeled as numbers from a finite field and the channel described as a discrete memoryless channel (DMC). The decoding problem was attacked by heuristic methods [3], [4]. The formulated algorithms were of depth-first type [5]. The breadth-first approach was taken by Viterbi [6] who formulated a dynamic programming algorithm for decoding of convolutional codes.
In [2], Omura showed that the algorithm actually performs maximum likelihood sequence detection (MLSD) of the states the sequential machine is traversing. This famous procedure is known as the Viterbi Algorithm (VA) and has had a tremendous impact on both digital transmission theory and application. Its use is by no means limited to these areas; what does limit its use is its complexity.

In a breadth-first approach [5] the number of detector operations is the same in all information symbol intervals, whereas for the Fano (or stack) algorithm, this number can vary heavily from interval to interval depending upon the channel noise [3], [4]. This causes the need for buffering and also a random time delay in the delivered sequence of digits. The probability of buffer overflow can be several orders of magnitude larger than the probability of erroneous decisions when the detector operates close to the computational cutoff rate [7].

By defining the complexity of a sequence detection (SD) algorithm as the number of paths being traced in the trellis, the complexity of the VA is equal to the number of states in the trellis. In a joint trellis model of a whole system, the number of states is typically very large. This is because the state space in a joint description is the Cartesian product of the state spaces of the underlying component trellises. The cardinality of the overall state space is the product of the cardinalities of the component state spaces, which explains why the number of states easily grows to astronomical numbers.

While depth-first SD algorithms use very few, but a random number of, operations per detected information symbol, the VA has the advantage of using a constant, but very large, number of operations in each symbol interval. The latter also minimizes the delay between input and detected sequence. It is natural to try to combine these two properties into one and have a general algorithm which is of breadth-first type and which uses few paths in the trellis [8]–[13].
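The two quantitative points above (one retained path per trellis state in the VA, and a joint state count equal to the product of the component state counts) can be illustrated with a minimal Python sketch. It is a textbook-style illustration, not code from the paper. The names `joint_state_count`, `viterbi`, and `TOY_BRANCHES` are hypothetical, and the two-state trellis is an assumed toy channel with output a_k + 0.5 a_{k-1} for amplitudes a_k in {-1, +1}, started with previous amplitude -1.

```python
import math


def joint_state_count(component_sizes):
    """Size of a joint state space: the product of the component sizes."""
    return math.prod(component_sizes)


def viterbi(received, branches, start_state=0):
    """Breadth-first sequence detection with a squared Euclidean metric.

    branches[state][symbol] = (next_state, noiseless_channel_output)
    Returns (detected_symbols, accumulated_metric) of the best path.
    """
    # Exactly one survivor per reachable state: state -> (metric, history).
    survivors = {start_state: (0.0, [])}
    for r in received:
        candidates = {}
        for state, (metric, history) in survivors.items():
            for symbol, (next_state, output) in enumerate(branches[state]):
                new_metric = metric + (r - output) ** 2
                best = candidates.get(next_state)
                # Keep only the closest path entering each state.
                if best is None or new_metric < best[0]:
                    candidates[next_state] = (new_metric, history + [symbol])
        survivors = candidates
    metric, history = min(survivors.values(), key=lambda s: s[0])
    return history, metric


# Assumed toy trellis: state = previous binary digit (0 -> -1, 1 -> +1),
# output = a_k + 0.5 * a_{k-1}.
TOY_BRANCHES = [
    [(0, -1.5), (1, 0.5)],   # from state 0 (previous amplitude -1)
    [(0, -0.5), (1, 1.5)],   # from state 1 (previous amplitude +1)
]

if __name__ == "__main__":
    # Digits 1, 0, 1, 1 give noiseless outputs 0.5, -0.5, 0.5, 1.5.
    noisy = [0.7, -0.3, 0.4, 1.2]
    print(viterbi(noisy, TOY_BRANCHES))
    # Three hypothetical components with 4, 8, and 16 states.
    print(joint_state_count([4, 8, 16]))
```

In this sketch the number of survivors never exceeds the number of states, which is the complexity measure used in the text. For the noisy sequence shown, the detector returns the transmitted digits 1, 0, 1, 1, and the three assumed components combine into 512 joint states.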
A special case of the