IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, VOL. 60, NO. 1, JANUARY 2013, p. 217

Efficient Elliptic Curve Point Multiplication Using Digit-Serial Binary Field Operations

Gustavo D. Sutter, Member, IEEE, Jean-Pierre Deschamps, and José Luis Imaña

Abstract—This paper details the design of a new high-speed point multiplier for elliptic curve cryptography using either field-programmable gate array or application-specific integrated circuit technology. Different levels of digit-serial computation were applied to the data path of Galois field (GF) multiplication and division to explore the resulting performance and find an optimal digit size. We provide results for the five National Institute of Standards and Technology recommended curves, outperforming previously published results. In $GF(2^{163})$, we achieve a point multiplication in 19.38 μs on a Xilinx Virtex-E. Using the modern Xilinx Virtex-5, the point multiplication times in $GF(2^m)$ for m = 163, 233, 283, 409, and 571 are 5.5, 17.8, 33.6, 102.6, and 384 μs, respectively, which are the fastest figures reported to date.

Index Terms—Digit-serial computation, elliptic curve cryptography (ECC), field-programmable gate array (FPGA), public key cryptography.

I. INTRODUCTION

Elliptic curve cryptography (ECC) is gaining popularity because it offers security similar to that of traditional systems, such as Rivest, Shamir, and Adleman (RSA), but with significantly smaller key lengths. For example, 163-b ECC is considered equivalent to 1024-b RSA [1], [2]. This feature makes it highly suited for implementation in resource-constrained environments. Implementing the computationally intensive operations needed in ECC in hardware (HW) using field-programmable gate array (FPGA) technology [3]–[6] is justified by the performance and cost efficiency of today's FPGA devices and, above all, by the ability to easily update the cryptographic algorithm (for example, to change the underlying field).
Manuscript received February 23, 2011; revised July 18, 2011, October 15, 2011, and December 20, 2011; accepted January 15, 2012. Date of publication January 26, 2012; date of current version September 6, 2012. This work was supported in part by the Comisión Interministerial de Ciencia y Tecnología of Spain under Grants TIN2008-00508 and TEC2009-13385.
G. D. Sutter is with the School of Engineering, Universidad Autónoma de Madrid, 28049 Madrid, Spain (e-mail: gustavo.sutter@uam.es).
J.-P. Deschamps is with the University Rovira i Virgili, 43007 Tarragona, Spain (e-mail: jeanpierre.deschamps@urv.cat).
J. L. Imaña is with the Faculty of Physics, Complutense University of Madrid, 28040 Madrid, Spain (e-mail: jluimana@dacya.ucm.es).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIE.2012.2186104

The IEEE has standardized the use of ECC for digital signatures and key agreement [7]. As further testimony to its potential, the National Security Agency has endorsed the use of ECC in its Suite B set of algorithms [8].

The underlying operation in elliptic curve cryptosystems is scalar point multiplication $Q = k \cdot P$, i.e., the multiplication of an elliptic curve point P by a scalar k to give the resultant point Q [9], [10]. This involves many basic arithmetic operations in the underlying finite field, which are optimized in this work.

Several FPGA-based elliptic curve HW accelerators and cryptographic processors have been presented in the literature [11]–[23], showing different acceleration techniques to improve the performance of the ECC operations. The optimization goal is typically to reduce the latency of the ECC operation (the point multiplication) in terms of the number of required cycles.
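To make the operation $Q = k \cdot P$ concrete, the following minimal sketch computes a scalar multiple with the textbook left-to-right double-and-add method. It is only an illustration under stated assumptions: the function name `scalar_multiply` and its parameters are hypothetical, the group operations are passed in as callables, and the integers modulo 97 stand in for an elliptic curve group so that the result can be checked by ordinary arithmetic. It is not the multiplier architecture described in this paper.

```python
def scalar_multiply(k, point, add, double, identity):
    """Return k * point in an additive group using left-to-right double-and-add.

    `add`, `double`, and `identity` describe the group; they are supplied by
    the caller (hypothetical interface, chosen for illustration only).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    result = identity
    # Scan the bits of k from most significant to least significant.
    for bit in bin(k)[2:]:
        result = double(result)            # one doubling per bit of k
        if bit == "1":
            result = add(result, point)    # one addition per set bit of k
    return result


# Stand-in group (assumption): integers modulo 97 under addition.
N = 97

def add_mod(a, b):
    return (a + b) % N

def double_mod(a):
    return (2 * a) % N


Q = scalar_multiply(13, 5, add_mod, double_mod, 0)
print(Q)  # 65, since 13 * 5 = 65 (mod 97)
```

Each bit of k costs one doubling and, when the bit is set, one addition. On a real curve every one of these group operations expands into several finite-field operations, which is why the latency of the field arithmetic dominates the cycle count.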
Most of these implementation efforts concentrate on algorithm optimization or on improved arithmetic architectures that duplicate arithmetic blocks to exploit parallelism [15], [16], [23]; other works focus on design techniques used in modern high-performance processors, such as out-of-order execution, data forwarding, deep pipelining, and instruction-level parallelism [12], [23].

In this paper, the underlying Galois field (GF) operations merit special attention. The digit-serial approach is used in GF multiplication and GF division in order to construct an efficient elliptic curve multiplier using projective coordinates.

This paper is organized as follows. Section II introduces the concepts and arithmetic of finite fields, ECC, coordinate systems, and the scalar point multiplication algorithm, with emphasis on the Montgomery ladder algorithm and projective coordinates. Section III studies ways to improve on the naive implementation by modifying the original algorithm. Section IV reviews the finite-field primitives, mainly field multiplication and division. Section V gives the FPGA implementation results, and Section VI compares them with the main previous results. Finally, Section VII summarizes some conclusions.

II. BACKGROUND

A. Finite Field

A finite field, or GF, is a finite set of elements, typically denoted $GF(q)$. Modular arithmetic can be performed on field elements; consequently, the field is closed under addition, subtraction, multiplication, and inversion of nonzero elements. The most widely used finite fields are those of characteristic 2 and of characteristic p, for some large prime p. Characteristic-2 fields are simple to implement in HW because they rely on efficient modulo-2 arithmetic. A finite field is said to have characteristic 2 if $q = 2^m$; it is denoted $GF(2^m)$. Elements of $GF(2^m)$ can be represented using a polynomial basis, i.e., $a(x) = \sum_{i=0}^{m-1} a_i \cdot x^i$, $a_i \in \{0, 1\}$.
This is advantageous from a HW implementation perspective, as a field element can be represented in m bits. All field operations are performed modulo a degree-m irreducible polynomial $f(z)$. The finite-field operations are analyzed in Section IV.

0278-0046/$31.00 © 2012 IEEE
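The polynomial-basis representation and the reduction modulo an irreducible polynomial can be sketched in a few lines of Python. This is a software illustration only, with several assumptions: a field element is stored as an integer whose bit i holds the coefficient $a_i$, the names `gf2m_add`, `gf2m_mul`, `F_16`, and `F_163` are hypothetical, and the multiplier is bit-serial (one bit of the second operand per iteration), which is simpler than the digit-serial data path studied in this paper. The small field uses $x^4 + x + 1$; `F_163` encodes $x^{163} + x^7 + x^6 + x^3 + 1$, the reduction polynomial NIST specifies for its 163-bit binary curves.

```python
def gf2m_add(a: int, b: int) -> int:
    """Addition in GF(2^m): coefficient-wise addition modulo 2, i.e. XOR."""
    return a ^ b


def gf2m_mul(a: int, b: int, m: int, f: int) -> int:
    """Bit-serial shift-and-add multiplication in GF(2^m).

    a, b: field elements as integers below 2**m (bit i = coefficient of x^i).
    f:    irreducible polynomial of degree m, including its x^m term.
    """
    result = 0
    for _ in range(m):
        if b & 1:              # current coefficient of b is 1
            result ^= a        # accumulate a * x^i (already reduced)
        b >>= 1
        a <<= 1                # multiply a by x
        if (a >> m) & 1:       # degree reached m: subtract (XOR) f
            a ^= f
    return result


# GF(2^4) with f = x^4 + x + 1 (small example field, chosen for illustration).
F_16 = 0b10011

# GF(2^163) with f = x^163 + x^7 + x^6 + x^3 + 1.
F_163 = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1

product = gf2m_mul(0b0010, 0b1000, 4, F_16)  # x * x^3 = x^4 = x + 1
print(bin(product))  # 0b11
```

Because addition is a plain XOR with no carry propagation, the only costly primitives are multiplication and division. Processing several bits of `b` per iteration instead of one trades area for fewer cycles, which is the idea behind a digit-serial multiplier.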