FPGA Implementation of the M-ary Modular Exponentiation

Anane Nadjia
CDTA (Centre de Développement des Technologies Avancées), Algiers, Algeria

Anane Mohamed
ESI (Ecole nationale Supérieure d’Informatique), Algiers, Algeria

Abstract—Modular exponentiation is a key operation of the RSA cryptosystem and is very time consuming for large operands. It is performed using successive modular multiplications. This paper describes a hardware architecture for the m-ary modular exponentiation with a reduced number of Montgomery modular multiplications. This architecture has been implemented on a Virtex-2 FPGA circuit and performs well in terms of computation time and occupied resources.

Keywords—FPGA; Montgomery modular exponentiation; RSA.

I. INTRODUCTION

The rapid and continuous development of communications over open networks such as the Internet has created a growing need to encrypt sensitive or confidential data before transfer. RSA [1] is the best-known public-key cryptography algorithm. It is based on modular exponentiation, which is performed using successive modular multiplications. RSA keys are 1024 bits long or more, and computing a modular exponentiation with such large numbers is extremely computation-intensive, so both the number of multiplications and the execution time of each modular multiplication need to be reduced.

The main contribution of this paper is a hardware architecture that efficiently computes the m-ary modular exponentiation, which reduces the number of multiplications by 39% compared to the well-known binary modular exponentiation, at the expense of some pre-computations. This architecture is based on Montgomery modular multiplication and consumes few resources, while allowing the (m-2) pre-computed values to be stored in the memory available on the targeted Xilinx FPGA circuit of the Virtex-2 family.

The remainder of this paper is organized as follows. In Section 2, we detail the m-ary modular exponentiation.
In Section 3, the architecture for the 64-ary modular exponentiation is presented. In Section 4, we summarize the implementation results of the architecture on FPGA. Finally, a conclusion is given in Section 5.

II. M-ARY MODULAR EXPONENTIATION

The modular exponentiation is defined by $C = M^e \bmod N$, where M is the plaintext, C is the ciphertext, e is the public key, and N is the modulus. It is generally performed by the binary method, which scans the binary representation of the exponent bit by bit [2]. If r bits are scanned at once, with $m = 2^r$, it is called the m-ary method [3].

The m-ary modular exponentiation breaks the exponent into r-bit windows and then performs as many multiplications as there are nonzero windows. Hence the number of modular multiplications is reduced at the expense of some pre-computations [4]. The m-ary modular exponentiation is based on three steps:
1- Partitioning the binary exponent e into r-bit windows;
2- Pre-computing the necessary powers of M;
3- Squaring the partial result r times to shift it over, and then multiplying it by the power indicated by the next window if that window is different from zero; this is repeated for every window.

The complexity of the m-ary modular exponentiation depends on both the number of multiplications and the storage needed for the pre-computed values. For $m = 2^r$ and an n-bit exponent e, the m-ary modular exponentiation requires MM multiplications, with

$MM = 2^r - 2 + \left(\frac{n}{r} - 1\right)(r + 1)$.

For each exponent size n, there exists an optimal window size r* for which the number of multiplications required is minimum. For exponents of 1024 bits, r* is equal to 6. The pre-computed values $M^w$ for w = 2, 3, …, m-1 must be stored in memory, and the storage required also depends on m and n. For a 1024-bit exponent e, minimizing the number of modular multiplications while storing all the pre-computed powers of M in the available memory of the target FPGA circuit, the XC2V1000, requires r = 6, which gives $m = 2^6 = 64$.

III. M-ARY MODULAR EXPONENTIATION ARCHITECTURE

The architecture of the 64-ary modular exponentiation is based on two main steps:
1. Pre-computation of the 62 powers of the message M for 6-bit exponent blocks, which are stored in memory.
2. Computation of the 64-ary modular exponentiation using the pre-computed powers of M stored in memory.

To do this, a single Montgomery modular multiplier [5] is used, first to perform the pre-computations and then to compute the 64-ary modular exponentiation using the pre-computed powers of M: $M^2, \ldots, M^{63}$. The multiplier reads its two operands from two memories (Mem_A) and (Mem_B), and the result is stored in the output memory (Mem_C).

978-1-4799-3525-3/13/$31.00 ©2013 IEEE
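The two steps of the architecture (pre-computing the powers of M, then scanning the exponent window by window) can be illustrated with a minimal software sketch. It is not the hardware design described in this paper. The function names `mary_modexp` and `mm_count` are hypothetical, and an ordinary modular multiplication `(a * b) % N` stands in for the Montgomery multiplier, which is an assumption made only to keep the example short. The sketch also counts the modular multiplications so that the count can be compared with the formula for MM.

```python
def mm_count(n, r):
    """Multiplication count from the formula MM = 2^r - 2 + (n/r - 1)(r + 1)."""
    return 2**r - 2 + (n / r - 1) * (r + 1)


def mary_modexp(M, e, N, r=6):
    """Return (M**e mod N, number of modular multiplications used).

    Assumption: plain modular multiplication replaces the Montgomery
    multiplier of the paper; only the m-ary control flow is illustrated.
    """
    if e == 0:
        return 1 % N, 0

    m = 1 << r          # m = 2^r, e.g. 64 for r = 6
    count = 0

    def modmul(a, b):
        nonlocal count
        count += 1
        return (a * b) % N

    # Pre-computation: table[w] = M^w mod N for w = 0 .. m-1.
    # Only M^2 .. M^(m-1) cost a multiplication (m - 2 in total).
    table = [1 % N, M % N]
    for w in range(2, m):
        table.append(modmul(table[w - 1], table[1]))

    # Partition the exponent into r-bit windows, most significant first.
    num_windows = (e.bit_length() + r - 1) // r
    windows = [(e >> (r * i)) & (m - 1) for i in reversed(range(num_windows))]

    # Start from the power selected by the top window (nonzero for e > 0).
    C = table[windows[0]]
    for w in windows[1:]:
        for _ in range(r):      # r squarings shift the partial result
            C = modmul(C, C)
        if w != 0:              # multiply only for nonzero windows
            C = modmul(C, table[w])
    return C, count
```

For a 1024-bit exponent with all bits set and r = 6, this sketch uses 171 windows and 62 + 170 × 7 = 1252 multiplications, close to the value of about 1250 given by the formula, in which n/r is not rounded up. If the binary method is assumed to need up to 2(n-1) = 2046 multiplications, this matches the reduction of roughly 39% stated in the introduction.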