IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, VOL. 60, NO. 1, JANUARY 2013 217

Efficient Elliptic Curve Point Multiplication Using Digit-Serial Binary Field Operations

Gustavo D. Sutter, Member, IEEE, Jean-Pierre Deschamps, and José Luis Imaña

Abstract: This paper details the design of a new high-speed point multiplier for elliptic curve cryptography using either field-programmable gate array or application-specific integrated circuit technology. Different levels of digit-serial computation were applied to the data path of Galois field (GF) multiplication and division to explore the resulting performance and find an optimal digit size. We provide results for the five curves recommended by the National Institute of Standards and Technology, outperforming previously published results. In $GF(2^{163})$, we achieve a point multiplication in 19.38 μs on a Xilinx Virtex-E. Using the modern Xilinx Virtex-5, the point multiplication times in $GF(2^m)$ for m = 163, 233, 283, 409, and 571 are 5.5, 17.8, 33.6, 102.6, and 384 μs, respectively, which are the fastest figures reported to date.

Index Terms: Digit-serial computation, elliptic curve cryptography (ECC), field-programmable gate array (FPGA), public key cryptography.

I. INTRODUCTION

Elliptic curve cryptography (ECC) is gaining popularity because it offers security similar to that of traditional systems, such as the Rivest, Shamir, and Adleman (RSA) system, but with significantly smaller key lengths. For example, 163-b ECC is considered equivalent to 1024-b RSA [1], [2]. This feature makes it highly suited for implementation in resource-constrained environments. Implementing the computationally intensive operations needed in ECC in hardware (HW) using field-programmable gate array (FPGA) [3]–[6] technology is justified by the performance and cost efficiency of today's FPGA devices and, above all, by the ability to easily update the cryptographic algorithm (for example, to change the underlying field).
The IEEE has standardized the use of ECC in digital signature algorithms and key agreement [7]. It is a further testimony to its potential that the National Security Agency has endorsed the use of ECC in its Suite B set of algorithms [8]. The underlying operation in elliptic curve cryptosystems is scalar point multiplication $Q = k \cdot P$, i.e., the multiplication of an elliptic curve point $P$ by a scalar $k$ to give the resultant point $Q$ [9], [10]. This involves many basic arithmetic operations in the underlying finite field, which are optimized in this work.

Manuscript received February 23, 2011; revised July 18, 2011, October 15, 2011, and December 20, 2011; accepted January 15, 2012. Date of publication January 26, 2012; date of current version September 6, 2012. This work was supported in part by the Comisión Interministerial de Ciencia y Tecnología of Spain under Grants TIN2008-00508 and TEC2009-13385.
G. D. Sutter is with the School of Engineering, Universidad Autónoma de Madrid, 28049 Madrid, Spain (e-mail: gustavo.sutter@uam.es).
J.-P. Deschamps is with the University Rovira i Virgili, 43007 Tarragona, Spain (e-mail: jeanpierre.deschamps@urv.cat).
J. L. Imaña is with the Faculty of Physics, Complutense University of Madrid, 28040 Madrid, Spain (e-mail: jluimana@dacya.ucm.es).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIE.2012.2186104

Several FPGA-based elliptic curve HW accelerators and cryptographic processors have been presented in the literature [11]–[23], showing different acceleration techniques to improve the performance of ECC operations. The optimization goal is typically to reduce the latency of the ECC operation (the point multiplication) in terms of the number of required cycles.
Most of these implementation efforts concentrate on algorithm optimization or on improved arithmetic architectures that duplicate arithmetic blocks to exploit parallelism [15], [16], [23]; other works concentrate on design techniques used in modern high-performance processors, such as out-of-order execution, data forwarding, deep pipelining, and instruction-level parallelism [12], [23].

In this paper, the underlying Galois field (GF) operations merit special attention. The digit-serial approach is used in GF multiplication and GF division in order to construct an efficient elliptic curve multiplier using projective coordinates.

This paper is organized as follows. Section II introduces the concepts and arithmetic of finite fields, ECC, coordinate systems, and the scalar point multiplication algorithm, with emphasis on the Montgomery ladder algorithm and projective coordinates. Section III studies possibilities to improve on the naïve implementation by modifying and improving the original algorithm. Section IV reviews the finite-field primitives, mainly field multiplication and division. Section V gives the FPGA implementation results, and Section VI compares them with the main previous results. Finally, Section VII summarizes some conclusions.

II. BACKGROUND

A. Finite Field

A finite field, or GF, is a finite set of elements, typically denoted $GF(q)$. Modular arithmetic can be performed on field elements; consequently, the field is closed under addition, subtraction, multiplication, and inversion (of nonzero elements). The most used finite fields are of characteristic 2 and characteristic $p$, for some large prime $p$. Characteristic 2 is quite simple and can be implemented in HW using more efficient modulo-2 arithmetic. A finite field is said to have characteristic 2 if $q = 2^m$. It is denoted $GF(2^m)$. Elements of $GF(2^m)$ can be represented using a polynomial basis, i.e.,

$$a(x) = \sum_{i=0}^{m-1} a_i \cdot x^i, \quad a_i \in \{0, 1\}.$$
This is advantageous from a HW implementation perspective, as a field element can be represented in $m$ bits. All mathematical operations are performed modulo a degree-$m$ irreducible polynomial $f(z)$. In Section III, the finite-field operations will be analyzed.

0278-0046/$31.00 © 2012 IEEE
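
The polynomial-basis arithmetic described in Section II-A can be illustrated with a minimal software sketch. It assumes that a field element is stored as a Python integer whose bit i holds the coefficient a_i, so that addition is a bitwise XOR and multiplication is a shift-and-add followed by reduction modulo the irreducible polynomial. The function names `gf2m_add`, `gf2m_mul`, and `gf2m_inv` and the constants `F16` and `F163` are hypothetical and chosen only for illustration. The multiplier shown is bit-serial, and the inversion uses Fermat's little theorem; neither is the digit-serial multiplier or divider designed in this paper. The degree-163 polynomial is the NIST reduction polynomial for the 163-bit binary curves and is an assumption not stated in this section.

```python
# Sketch of GF(2^m) arithmetic in a polynomial basis (illustrative only).
# An element a(x) = sum a_i x^i is stored as an int with bit i equal to a_i.

# Small example field GF(2^4) with f(x) = x^4 + x + 1 (irreducible).
F16 = 0b10011
# Assumed NIST reduction polynomial for m = 163: x^163 + x^7 + x^6 + x^3 + 1.
F163 = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def gf2m_add(a: int, b: int) -> int:
    """Addition (and subtraction) is coefficient-wise modulo 2, i.e. XOR."""
    return a ^ b


def gf2m_mul(a: int, b: int, f: int) -> int:
    """Bit-serial multiplication of a and b modulo f.

    Both operands must already be reduced (degree < m, where m = deg f).
    """
    m = f.bit_length() - 1
    result = 0
    while b:
        if b & 1:          # current coefficient of b is 1: accumulate a
            result ^= a
        b >>= 1
        a <<= 1            # multiply a by x
        if (a >> m) & 1:   # degree reached m: reduce modulo f
            a ^= f
    return result


def gf2m_inv(a: int, f: int) -> int:
    """Inverse of a nonzero element via a^(2^m - 2) (square-and-multiply)."""
    if a == 0:
        raise ZeroDivisionError("zero has no multiplicative inverse")
    m = f.bit_length() - 1
    exponent = (1 << m) - 2
    result, base = 1, a
    while exponent:
        if exponent & 1:
            result = gf2m_mul(result, base, f)
        base = gf2m_mul(base, base, f)
        exponent >>= 1
    return result


if __name__ == "__main__":
    # x * x^3 = x^4 = x + 1 (mod x^4 + x + 1)
    print(bin(gf2m_mul(0b0010, 0b1000, F16)))
```

Processing one bit of the multiplier per iteration corresponds to a digit size of 1; a digit-serial design instead consumes several bits per clock cycle, trading area for latency.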