A HARDWARE EFFICIENT REALISATION OF
NUMBER THEORETIC CONVOLVERS
Wan-Chi SIU
Department of Electronic Engineering,
Hong Kong Polytechnic
Hung Hom, Hong Kong.
ABSTRACT
In this paper, we propose hardware
realisations of Number Theoretic Transforms that
are based on the transformation of their
fundamental relationships into recursive filter
forms with single integer poles. Furthermore, use
is made of Read-Only Memory (ROM) to effect the
multiplications by the root of unity, a. Suitable
NTTs are then suggested for the fast computation
of cyclic convolutions using multi-dimensional and
multi-modular techniques. The required ROM size in
the proposed realisations is small and the control
of data flow is simple and straightforward. This
new class of Number Theoretic Transforms can relax
considerably the normal sequence length and
wordlength constraints for the NTT.
INTRODUCTION
Advances in fabrication technology have
produced high-speed and inexpensive read-only
memories, which in turn lead to further
possibilities for designing new and efficient
algorithms for the computation of convolutions
using NTTs [1-3]; this is a point originally
suggested by Pollard [4]. Jullien, Miller and
Nagpal [5] have also proposed to implement the NTT
with a sequence length of a power of 2 using arrays
of ROMs. However, the previous methods usually
require a very large ROM size [4-6] and use a
relatively complicated technique for the
computation involving multiplications by powers of
a (the root of unity of order N).
In this paper we present the results of
our study of a new technique that uses ROM
to realize NTT convolvers for fast cyclic
convolutions. This new technique allows a very
flexible choice of short and long sequence lengths
of the transform. The ROM size required is
reasonably small and is well within practical
limits.
PRIME SEQUENCE LENGTH NUMBER THEORETIC TRANSFORMS
The general equations for a length-N
Number Theoretic Transform pair defined over the
field or ring of integers modulo M can be written
as follows:
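$$X(k) = \Big\langle \sum_{n=0}^{N-1} x(n)\, a^{nk} \Big\rangle_M \tag{1}$$
$$x(n) = \Big\langle N^{-1} \sum_{k=0}^{N-1} X(k)\, a^{-nk} \Big\rangle_M \tag{2}$$
for k, n = 0, 1, ..., N-1.
A.G. CONSTANTINIDES
Department of Electrical Engineering,
Imperial College
London, SW7 2BT, England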
The expression $\langle C \rangle_M$ means the residue of the
number C modulo M.
Let us consider the modulus of eqns. 1
and 2 to be a prime number, q, i.e. M = q. Let us
also choose a prime sequence length, P. The
relationship between P and q is given by [3]:
$$P \mid (q-1) \tag{3}$$
For example, q = 41 and P = 5 form a possible
selection of q and P, for 5 | 40 and both 5 and 41
are prime numbers. In this case one possible root
of unity is a = 10.
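As an illustrative check of this example, the following Python sketch evaluates eqns. 1 and 2 directly for q = 41, P = 5 and a = 10, and uses the transform pair to form a cyclic convolution. The function names `ntt`, `intt` and `cyclic_convolve` and the test sequences are assumptions introduced here for illustration; they do not represent the hardware structure proposed in this paper. Python 3.8 or later is assumed for the modular inverse via `pow(x, -1, q)`.

```python
def ntt(x, a, q):
    """Direct evaluation of eqn. 1: X(k) = <sum_n x(n) a^(nk)>_q."""
    N = len(x)
    return [sum(x[n] * pow(a, n * k, q) for n in range(N)) % q
            for k in range(N)]


def intt(X, a, q):
    """Direct evaluation of eqn. 2: x(n) = <N^-1 sum_k X(k) a^(-nk)>_q."""
    N = len(X)
    n_inv = pow(N, -1, q)   # N^-1 modulo q (Python 3.8+)
    a_inv = pow(a, -1, q)   # a^-1 modulo q
    return [(n_inv * sum(X[k] * pow(a_inv, n * k, q) for k in range(N))) % q
            for n in range(N)]


def cyclic_convolve(x, h, a, q):
    """Cyclic convolution modulo q via pointwise products of the transforms."""
    X, H = ntt(x, a, q), ntt(h, a, q)
    return intt([(Xk * Hk) % q for Xk, Hk in zip(X, H)], a, q)


q, P, a = 41, 5, 10

# Eqn. 3: P must divide q - 1.
assert (q - 1) % P == 0
# a = 10 has order exactly P = 5 modulo 41.
assert pow(a, P, q) == 1
assert all(pow(a, i, q) != 1 for i in range(1, P))

x = [1, 2, 3, 4, 5]   # illustrative data, not from the paper
h = [1, 0, 2, 0, 0]   # illustrative data, not from the paper

# The transform pair is invertible.
assert intt(ntt(x, a, q), a, q) == x

# Compare with the cyclic convolution computed directly modulo q.
direct = [sum(x[m] * h[(n - m) % P] for m in range(P)) % q for n in range(P)]
assert cyclic_convolve(x, h, a, q) == direct
```

The result is exact only while the true convolution values stay below q, which is the wordlength constraint referred to in the abstract.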
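Let $\langle nk \rangle_P = m$ for k = 1, 2, ..., P-1. We can
write
$$X(0) = \Big\langle \sum_{n=0}^{P-1} x(n) \Big\rangle_q \tag{4}$$
and
$$X(k) = \Big\langle x(0) + \sum_{m=1}^{P-1} x(\langle m k^{-1} \rangle_P)\, a^{m} \Big\rangle_q \tag{5}$$
for k = 1, 2, ..., P-1.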
This is the same approach suggested by Siu and
Constantinides [7] for the hardware realisation of
Mersenne Number Transforms for fast digital
convolution, and so eqns. 5 and 6 are the
generalization of the corresponding equations in
[7].
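Eqn. 5 can also be written as:
$$X(k) = \Big\langle x(0) + \big[\cdots\big[\,x(\langle k^{-1}(P-1)\rangle_P)\,a + x(\langle k^{-1}(P-2)\rangle_P)\big]a + \cdots + x(\langle k^{-1}\cdot 1\rangle_P)\big]a \Big\rangle_q \tag{6}$$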
The essential feature of this equation
is that the multiplier has the constant value a
irrespective of the value of X(k) to be evaluated.
It is clear that the transform can actually be
considered as a recursive filter with a simple
integer pole, a. The sequence of data $x(\langle m k^{-1} \rangle_P)$
has to be generated as shown in eqn. 5 by the
expression $\langle m k^{-1} \rangle_P$ for m, k = 1, 2, ..., P-1.
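The recursion can be sketched in software as follows. This is a minimal Python model under stated assumptions, not the hardware realisation itself: the names `ntt_direct`, `ntt_recursive` and `inv_table` are hypothetical, the table of inverses is built with Python's `pow(k, -1, P)` (Python 3.8+), and the input data are illustrative. The inner loop applies only the constant multiplier a, mirroring a first-order recursive filter fed with x in the order m = P-1, ..., 1.

```python
def ntt_direct(x, a, q):
    """Reference: direct evaluation of X(k) = <sum_n x(n) a^(nk)>_q."""
    P = len(x)
    return [sum(x[n] * pow(a, n * k, q) for n in range(P)) % q
            for k in range(P)]


def ntt_recursive(x, a, q):
    """Evaluate X(0) by summation and X(k), k >= 1, by the nested form."""
    P = len(x)
    # Hypothetical look-up table of <k^-1>_P for k = 1, ..., P-1.
    inv_table = {k: pow(k, -1, P) for k in range(1, P)}

    X = [sum(x) % q]                      # X(0): plain sum of the inputs
    for k in range(1, P):
        k_inv = inv_table[k]
        acc = 0
        # Feed x(<m k^-1>_P) for m = P-1 down to 1; each step adds one
        # sample and multiplies by the constant a (the integer pole).
        for m in range(P - 1, 0, -1):
            acc = ((acc + x[(m * k_inv) % P]) * a) % q
        X.append((x[0] + acc) % q)        # add x(0) outside the recursion
    return X


q, P, a = 41, 5, 10
x = [1, 2, 3, 4, 5]                       # illustrative data
assert ntt_recursive(x, a, q) == ntt_direct(x, a, q)
```

After the last step the accumulator holds the sum of $x(\langle m k^{-1} \rangle_P)\, a^m$ over m = 1, ..., P-1, so adding x(0) gives X(k) as in eqn. 5.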
The P values of the signal x(n) are usually
stored in RAM or buffered registers. The term $k^{-1}$
can, of course, simply be found by the method of
finite continued fractions. However, for the
implementation of real-time and high-speed digital
signal processors this computational procedure is
slightly complicated. It is more convenient to
store all (P-1) values of the set
$\{\langle k^{-1} \rangle_P : k = 1, 2, \ldots, P-1\}$
into a ROM. Note that,
ICASSP 86, TOKYO CH2243-4/86/0000-0237 $1.00 © 1986 IEEE 237