Abstract

This paper describes the design and implementation of a Viterbi decoder using FPGAs. We explore an FPGA-based implementation methodology for rapidly prototyping designs, and we use high-level synthesis to achieve this. We also discuss some of the implementation issues related to the Viterbi decoder, such as the organization of the path memory, decision memory reading techniques, and the clocking mechanism.

1 Introduction

The Viterbi decoder is an important block in any CDMA modem. Because CDMA systems are interference based, they use forward error correction schemes such as convolutional encoding to increase cell capacity. The Viterbi algorithm may be viewed as a solution to the problem of maximum a posteriori probability estimation of the state sequence of a finite-state, discrete-time Markov process observed in memoryless noise. A tutorial on the Viterbi algorithm can be found in [1].

One of the main blocks of a modem used during forward-link demodulation is the Viterbi decoder. A typical Viterbi decoder for a constraint length of 9 (256 states) uses more than 15 kb of memory and can occupy as much as one third of the modem chip area. Our aim was to design a 19.2 kbps, 256-state Viterbi decoder with the added capability of handling higher input data rates.

To the best of our knowledge, none of the literature discusses Viterbi decoder implementation based on high-level synthesis targeted at Field Programmable Gate Arrays (FPGAs). This is the focus of the present paper. Several important design issues, such as the organization of the path memory and decision memory, the decision memory reading techniques, and the clocking mechanism, have not been made available because of the proprietary nature of system implementations. For example, some implementations use two's complement representation for the path metrics. However, this can only be achieved at the cost of an important metric used for input bit sequence synchronization.
We aimed at retaining this bit-synchronization metric, even though it made normalization of the path metrics essential.

Broadly, VLSI architecture implementations of the Viterbi algorithm can be divided into two categories: node parallel and node serial. In the node-parallel approach, the computations for all nodes in the trellis are done concurrently, so the decoding speed is high. In the node-serial approach, these computations are done sequentially by a single node, and the decoding operation is much slower. These nodes, or ACS (Add Compare Select) units, decide which of the possible information sequences entering a state is more likely to survive. The ACS units need to be interconnected according to the trellis diagram; this gives rise to a different kind of problem, in which reducing the wiring area on the silicon chip is the main aim. To reduce global wiring, which may occupy as much as 37% of the Viterbi decoder area, architectures with reduced global wiring area and fewer ACS units have been proposed [2].

Figure 1 Viterbi decoder block diagram

In our Viterbi decoder implementation, we use two ACS pairs to process the data serially. Using a single ACS unit was found to increase the interconnections, and the benefit of the butterfly structure of the Viterbi algorithm is also lost. Decisions from the ACS units are stored in a partitioned internal path memory. A traceback of depth 64 through the path memory outputs a single data bit. The path memory is partitioned into even-parity and odd-parity memories, and further subdivided into four blocks. The main drawback of this configuration is the complicated memory decoding structure.
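
As an illustration of the add-compare-select, normalization, and traceback steps described above, the following Python sketch models them in software. It is not the FPGA design of this paper. The rate-1/2 generator polynomials (753 and 561 octal), the hard-decision Hamming branch metric, the state numbering, normalization by subtracting the minimum metric, and all function names are assumptions made for the example; only the constraint length of 9 and the traceback depth of 64 are taken from the text.

```python
K = 9                    # constraint length from the text (256 states)
POLYS = (0o753, 0o561)   # ASSUMPTION: rate-1/2 generators, not stated in the paper
DEPTH = 64               # traceback depth from the text
INF = 1 << 20            # stand-in metric for states that cannot be the start state


def parity(x):
    return bin(x).count("1") & 1


def encode(bits, k=K, polys=POLYS):
    """Reference encoder; the state holds the last k-1 input bits."""
    state, out = 0, []
    for b in bits:
        reg = (b << (k - 1)) | state              # newest bit in the MSB
        out.append(tuple(parity(reg & g) for g in polys))
        state = reg >> 1
    return out


def branch_metric(rx, reg, polys):
    """Hamming distance between received symbols and the branch output."""
    return sum(r != parity(reg & g) for r, g in zip(rx, polys))


def acs_step(pm, rx, k=K, polys=POLYS):
    """One trellis step: add, compare, select, then normalize."""
    n = 1 << (k - 1)
    half = n >> 1
    new_pm, dec = [0] * n, [0] * n
    for j in range(half):                         # one butterfly (ACS pair)
        p0, p1 = 2 * j, 2 * j + 1                 # the two predecessor states
        for b in (0, 1):
            s = j + b * half                      # successor for input bit b
            m0 = pm[p0] + branch_metric(rx, (b << (k - 1)) | p0, polys)
            m1 = pm[p1] + branch_metric(rx, (b << (k - 1)) | p1, polys)
            if m1 < m0:
                new_pm[s], dec[s] = m1, 1         # decision = LSB of survivor
            else:
                new_pm[s], dec[s] = m0, 0
    floor = min(new_pm)                           # keep unsigned metrics bounded
    return [m - floor for m in new_pm], dec


def traceback(rows, start_state, k=K):
    """Follow stored decisions from newest to oldest; return bits oldest first."""
    mask = (1 << (k - 1)) - 1
    s, bits = start_state, []
    for row in reversed(rows):
        bits.append(s >> (k - 2))                 # input bit that led into s
        s = ((s << 1) | row[s]) & mask            # step to surviving predecessor
    bits.reverse()
    return bits


def decode(symbols, k=K, polys=POLYS, depth=DEPTH):
    n = 1 << (k - 1)
    pm = [0] + [INF] * (n - 1)                    # ASSUMPTION: encoder starts in state 0
    window, out = [], []
    for rx in symbols:
        pm, dec = acs_step(pm, rx, k, polys)
        window.append(dec)
        if len(window) == depth:                  # one output bit per traceback
            out.append(traceback(window, pm.index(min(pm)), k)[0])
            window.pop(0)
    if window:                                    # flush the remaining bits
        out.extend(traceback(window, pm.index(min(pm)), k))
    return out
```

Each pass of the outer loop in `acs_step` handles one butterfly, that is, one ACS pair sharing two predecessor states. The partitioning of the path memory into parity banks and the clocking scheme are not modelled here.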
However, this is

[Figure 1 block labels: symbol input; synchronizing unit; branch metric unit; two pairs of ACS units; path metric memory (even parity); path metric memory (odd parity); control unit; decision memory (4 banks); stack one; stack two; output decision device; decoder output]

Design and Implementation of a Viterbi Decoder Using FPGAs

Bupesh Pandita, Analog Devices, Bangalore, India. email: bupesh.pandita@analog.com
Subir K Roy, Indian Institute of Technology, Kanpur, Kanpur, India. email: skroy@flab.fujitsu.co.jp