A HARDWARE EFFICIENT REALISATION OF NUMBER THEORETIC CONVOLVERS

Wan-Chi SIU
Department of Electronic Engineering, Hong Kong Polytechnic, Hung Hom, Hong Kong.

A.G. CONSTANTINIDES
Department of Electrical Engineering, Imperial College, London SW7 2BT, England

ABSTRACT

In this paper, we propose hardware realisations of Number Theoretic Transforms (NTTs) that are based on the transformation of their fundamental relationships into recursive filter forms with single integer poles. Furthermore, use is made of Read-Only Memory (ROM) to effect the multiplications by the root of unity, $\alpha$. Suitable NTTs are then suggested for the fast computation of cyclic convolutions using multi-dimensional and multi-modular techniques. The required ROM size in the proposed realisations is small and the control of data flow is simple and straightforward. This new class of Number Theoretic Transforms can relax considerably the normal sequence length and wordlength constraints for the NTT.

INTRODUCTION

Advances in fabrication technology have produced high-speed and inexpensive read-only memories, which in turn lead to further possibilities for designing new and efficient algorithms for the computation of convolutions using NTTs [1-3]; this is a point originally suggested by Pollard [4]. Jullien, Miller and Nagpal [5] have also proposed to implement the NTT with a sequence length of a power of 2 using arrays of ROMs. However, the previous methods usually require a very large ROM size [4-6] and use a relatively complicated technique for the multiplications by powers of $\alpha$ (the root of unity of order N).

In this paper we present the results of our study into a new technique for using ROM to realize NTT convolvers for fast cyclic convolutions. This new technique allows a very flexible choice of short and long sequence lengths of the transform. The ROM size required is reasonably small and is well within practical limits.

PRIME SEQUENCE LENGTH NUMBER THEORETIC TRANSFORMS

The general equations for a length-N Number Theoretic Transform pair defined over the field or ring of integers modulo M can be written as follows:

$$X(k) = \Big\langle \sum_{n=0}^{N-1} x(n)\,\alpha^{nk} \Big\rangle_M \tag{1}$$

$$x(n) = \Big\langle N^{-1} \sum_{k=0}^{N-1} X(k)\,\alpha^{-nk} \Big\rangle_M \tag{2}$$

for k, n = 0, 1, ..., N-1. The expression $\langle C \rangle_M$ means the residue of the number C modulo M.

Let us consider the modulus of eqns. 1 and 2 to be a prime number, q, i.e. M = q. Let us also choose a prime sequence length, P. The relationship between P and q is given by [3]:

$$P \mid (q-1) \tag{3}$$

For example, q = 41 and P = 5 form a possible selection of q and P, for 5 | 40 and both 5 and 41 are prime numbers. In this case one possible root of unity is 10.

Let $\langle nk \rangle_P = m$ for k = 1, 2, ..., P-1, so that $n = \langle m k^{-1} \rangle_P$, where $k^{-1}$ is the multiplicative inverse of k modulo P. We can write

$$X(0) = \Big\langle \sum_{n=0}^{P-1} x(n) \Big\rangle_q \tag{4}$$

and

$$X(k) = \Big\langle x(0) + \sum_{m=1}^{P-1} x(\langle m k^{-1} \rangle_P)\,\alpha^{m} \Big\rangle_q \quad \text{for } k = 1, 2, \ldots, P-1 \tag{5}$$

This is the same approach suggested by Siu and Constantinides [7] for the hardware realisation of Mersenne Number Transforms for fast digital convolution, and so eqns. 5 and 6 are the generalization of the corresponding equations in [7]. Eqn. 5 can also be written as:

$$X(k) = \Big\langle x(0) + \big[ \cdots \big[ x(\langle k^{-1}(P-1) \rangle_P)\,\alpha + x(\langle k^{-1}(P-2) \rangle_P) \big]\alpha + \cdots + x(\langle k^{-1} \cdot 1 \rangle_P) \big]\alpha \Big\rangle_q \tag{6}$$

The essential feature of this equation is that the multiplier has the constant value $\alpha$ irrespective of which X(k) is being evaluated. It is clear that the transform can actually be considered as a recursive filter with a simple integer pole, $\alpha$.

The sequence of data $x(\langle m k^{-1} \rangle_P)$ has to be generated as shown in eqn. 5 by the expression $\langle m k^{-1} \rangle_P$ for m, k = 1, 2, ..., P-1. The P values of the signal x(n) are usually stored in RAM or buffered registers. The term $k^{-1}$ can, of course, simply be found by the method of finite continued fractions. However, for the implementation of real-time and high-speed digital signal processors this computational procedure is slightly complicated. It is more convenient to store all (P-1) values of the set $\{ \langle k^{-1} \rangle_P : k = 1, 2, \ldots, P-1 \}$ in a ROM. Note that,
ICASSP 86, TOKYO. CH2243-4/86/0000-0237 $1.00 © 1986 IEEE
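As a rough illustration of the recursive evaluation with a constant multiplier and of the stored table of index inverses described in this section, the following Python sketch uses the example values from the text (q = 41, P = 5, root of unity 10). The function names, the Python list standing in for the ROM, and the use of Fermat's little theorem to fill that table and to obtain the inverses needed by the inverse transform are assumptions made for illustration only; the sketch models the arithmetic, not the proposed hardware.

```python
# Example parameters quoted in the text: modulus q = 41, length P = 5,
# root of unity alpha = 10 (10**5 = 1 mod 41).
Q, P, ALPHA = 41, 5, 10


def ntt_direct(x, alpha=ALPHA, p=P, q=Q):
    """Defining sum: X(k) = < sum_n x(n) * alpha^(n*k) >_q."""
    return [sum(x[n] * pow(alpha, n * k, q) for n in range(p)) % q
            for k in range(p)]


def build_inverse_rom(p=P):
    """Software stand-in for the ROM holding <k^-1>_P, k = 1..P-1.

    Index 0 is unused. The entries are filled with Fermat's little
    theorem (k^(P-2) mod P), which is valid because P is prime; this
    is an illustrative choice, not the method named in the text.
    """
    return [0] + [pow(k, p - 2, p) for k in range(1, p)]


def ntt_recursive(x, alpha=ALPHA, p=P, q=Q):
    """Nested (recursive-filter) evaluation with constant multiplier alpha."""
    rom = build_inverse_rom(p)
    X = [sum(x) % q]                      # X(0) is a plain sum
    for k in range(1, p):
        k_inv = rom[k]                    # table lookup instead of computing k^-1
        acc = 0
        for m in range(p - 1, 0, -1):     # m = P-1, P-2, ..., 1
            # add the next reordered sample, then multiply by alpha
            acc = ((acc + x[(m * k_inv) % p]) * alpha) % q
        X.append((x[0] + acc) % q)
    return X


def intt_direct(X, alpha=ALPHA, p=P, q=Q):
    """Inverse transform: x(n) = < P^-1 * sum_k X(k) * alpha^(-n*k) >_q."""
    p_inv = pow(p, q - 2, q)              # q is prime, so Fermat applies
    alpha_inv = pow(alpha, q - 2, q)
    return [(p_inv * sum(X[k] * pow(alpha_inv, n * k, q) for k in range(p))) % q
            for n in range(p)]


def cyclic_convolve(x, h, p=P, q=Q):
    """Length-P cyclic convolution modulo q via pointwise products of NTTs."""
    Xs, Hs = ntt_recursive(x), ntt_recursive(h)
    return intt_direct([(a * b) % q for a, b in zip(Xs, Hs)])


def cyclic_convolve_direct(x, h, p=P, q=Q):
    """Reference cyclic convolution computed from its definition."""
    return [sum(x[m] * h[(n - m) % p] for m in range(p)) % q
            for n in range(p)]
```

For any length-5 input, `ntt_recursive(x)` should agree with `ntt_direct(x)`, and `cyclic_convolve(x, h)` should agree with `cyclic_convolve_direct(x, h)`. The results are residues modulo 41, so they equal the ordinary cyclic convolution only when every output value is below the modulus.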