Efficient Bit-Parallel Multipliers in Composite Fields

Chiou-Yng Lee
Lunghwa University of Science and Technology
Email: lchiou@ieee.org

Pramod Kumar Meher
School of Computer Engineering, Nanyang Technological University, Singapore-639798
Email: aspkmeher@ntu.edu.sg

Abstract: Hardware implementation of multiplication in the finite field $GF(2^m)$ based on sparse polynomials is found to be advantageous in terms of both space complexity and time complexity. To design multipliers for composite fields, we have found another permutation polynomial that converts irreducible polynomials into like-trinomials of the forms $(x^2+x+1)^m + (x^2+x+1)^n + 1$, $(x^2+x)^m + (x^2+x)^n + 1$ and $(x^4+x+1)^m + (x^4+x+1)^n + 1$. The proposed bit-parallel multiplier over $GF(2^{4m})$ is found to save about 33% of the multiplications and 42.8% of the additions compared with the corresponding existing architectures.

Keywords: permutation polynomial, finite field arithmetic, cryptography

I. INTRODUCTION

Efficient design and implementation of finite field multipliers have received considerable attention in recent years because of their applications in elliptic curve cryptography (ECC) and error control coding. The finite field of characteristic two, i.e. $GF(2^m)$, is a popular choice since it allows efficient hardware implementation in terms of both silicon area and execution time. The basis used to represent field elements in $GF(2^m)$ plays a primary role in the complexity of the arithmetic circuits. Compared with the normal basis and dual basis representations, the polynomial basis (PB) representation is found to provide more efficient implementation of multiplication. The ordered set $\{1, \alpha, \alpha^2, \ldots, \alpha^{m-1}\}$ is called the PB of $GF(2^m)$, where $\alpha$ is a root of an irreducible polynomial $F(x)$ of degree $m$. Several PB finite field multipliers have been suggested in the literature, following the first parallel PB multiplier by Bartee and Schneider [1].
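The PB representation described above can be illustrated with a minimal software sketch. It is only a bit-serial shift-and-add model, not the bit-parallel hardware architecture studied in this paper. The function name `gf2m_mul`, the integer encoding (bit $i$ holds the coefficient of $\alpha^i$) and the choice of $GF(2^4)$ with $F(x) = x^4 + x + 1$ are assumptions made for illustration.

```python
def gf2m_mul(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m), polynomial basis, by shift-and-add.

    Elements are integers whose bit i is the coefficient of alpha^i.
    `poly` encodes the degree-m irreducible polynomial F(x), including
    its leading x^m term.
    """
    result = 0
    for _ in range(m):
        if b & 1:              # current coefficient of b is 1:
            result ^= a        # add (XOR) the shifted copy of a
        b >>= 1
        a <<= 1                # multiply a by alpha
        if (a >> m) & 1:       # degree reached m: reduce modulo F(x)
            a ^= poly
    return result


# Illustrative field (an assumption, not taken from the paper):
# GF(2^4) built from F(x) = x^4 + x + 1.
F = 0b10011
M = 4
alpha = 0b0010                 # the basis element alpha

# alpha^3 * alpha = alpha^4, which reduces to alpha + 1 since F(alpha) = 0
alpha_to_4 = gf2m_mul(0b1000, alpha, F, M)
print(bin(alpha_to_4))
```

In hardware, each XOR in the sketch corresponds to a bit-level addition and each conditional accumulation to a bit-level multiplication (AND), which is why the gate counts quoted for PB multipliers are expressed in XOR and AND gates.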
A systematic method is proposed in [4] for modified Mastrovito multiplication in Galois fields based on general irreducible polynomials. The choice of the irreducible polynomial used to construct the field, however, has a great impact on the time and space complexities of the resulting multiplier. For example, Sunar and Koc [2] have presented a Mastrovito multiplication algorithm in which the multiplier for irreducible trinomials of the form $x^m + x^n + 1$ with $1 \le n < m/2$ requires only $(m^2 - 1)$ XOR gates and $m^2$ AND gates. When the generating trinomial is of the form $x^m + x^{m/2} + 1$ (with even degree $m$), it requires only $(m^2 - m/2)$ XOR gates. In [3] and [13], it is shown that the space complexity of the multipliers for some special irreducible pentanomials is less than that for general pentanomials. Utilizing the basic characteristics of trinomials, efficient systolic multipliers were derived by Lee [5], [6] and Meher [7]. Bit-parallel and digit-serial systolic Montgomery multipliers for trinomials have also been reported recently by Lee et al. in [8], [9].

It is easy to see that if the reduction polynomial is a binomial of the form $x^{m+1} + 1$ over $GF(2)$, the resulting multiplier has a simpler architecture than the one for a trinomial. Although a finite field $GF(2^m)$ cannot be constructed from this polynomial, efficient multiplication algorithms and low-complexity multipliers for the fields based on all-one polynomials of the form $F(x) = x^m + x^{m-1} + \cdots + x + 1$ over $GF(2)$ can be derived by using the binomial $x^{m+1} + 1$, since $(x+1)F(x) = x^{m+1} + 1$. Recently, Ahmadi and Menezes [10] have shown that, for finite fields based on polynomials of the form $F(x) = x^m + x^{m-1} + \cdots + x^{n+1} + x^{n-1} + \cdots + x + 1$, we can use the quadrinomial $G(x) = (x+1)F(x) = x^{m+1} + x^{n+1} + x^n + 1$ for reduction. Reduction using $G(x)$, which has weight 4, instead of $F(x)$ is found to be as efficient as reduction using a pentanomial.
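The two identities above, $(x+1)F(x) = x^{m+1} + 1$ for an all-one polynomial and $(x+1)F(x) = x^{m+1} + x^{n+1} + x^n + 1$ for an all-one polynomial with the $x^n$ term removed, can be checked numerically. The following is a minimal sketch; the helper names `clmul`, `all_one_poly` and `weight`, and the values $m = 6$, $n = 2$, are assumptions chosen for illustration. The irreducibility of $F(x)$ is not checked here, since the identities hold for any $m$ and $n$.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two polynomials over GF(2) stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a        # add (XOR) a shifted by the current bit position
        a <<= 1
        b >>= 1
    return result


def all_one_poly(m: int) -> int:
    """Bit mask of x^m + x^(m-1) + ... + x + 1."""
    return (1 << (m + 1)) - 1


def weight(p: int) -> int:
    """Number of nonzero terms of a GF(2) polynomial."""
    return bin(p).count("1")


# Illustrative parameters (hypothetical, not taken from the paper).
m, n = 6, 2
x_plus_1 = 0b11

# All-one polynomial: (x + 1) F(x) collapses to the binomial x^(m+1) + 1.
aop = all_one_poly(m)
binomial = clmul(x_plus_1, aop)

# All-one polynomial with the x^n term removed: (x + 1) F(x) is a quadrinomial.
F = aop ^ (1 << n)
G = clmul(x_plus_1, F)

print(weight(aop), weight(binomial))   # weights before and after multiplying by x + 1
print(weight(F), weight(G))
```

The point of the sketch is the drop in weight: the dense polynomial $F(x)$ is replaced by a sparse multiple, at the cost of working with one extra degree during reduction.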
Negre [11] has presented a multiplier using general quadrinomials of the form $x^{m+1} + x^n + x^k + 1$.

Permutation polynomials have been studied extensively (see Lidl and Niederreiter [19], Chapter 7) and have numerous applications [20]. A method for mapping elements of $GF(2^k)$ into the field of composite degree $GF(2^{mn})$ for $k = nm$ can be found in the literature [15], [21], [26]. Finite fields of composite degree have applications in elliptic curve [22] and hyperelliptic curve [23] cryptosystems. The specific field $GF(2^8)$ has been standardized for space communication by ESA and NASA [24], and is used in CD players and the Advanced Encryption Standard (AES) [12].

In this paper we show that, for the composite fields $GF(2^{2m})$ and $GF(2^{4m})$, like-trinomials of the forms $(x^2+x+1)^m + (x^2+x+1)^n + 1$, $(x^2+x)^m + (x^2+x)^n + 1$ and $(x^4+x+1)^m + (x^4+x+1)^n + 1$ can be used to replace irreducible polynomials. Applying such like-trinomials, we derive bit-parallel multipliers over composite fields with lower complexity than the existing multipliers [2], [3], [15]. It is shown that, for the finite field $GF(2^k)$ with $k = 4m$, the proposed multiplier requires only 6 bit-level multiplications and 12 bit-level additions over $GF(2^m)$, while an existing multiplier [15] requires 9 bit-level multiplications and 21 bit-level additions over $GF(2^m)$.

2008 IEEE Asia-Pacific Services Computing Conference 978-0-7695-3473-2/08 $25.00 © 2008 IEEE DOI 10.1109/APSCC.2008.103 686