Microprocessors and Microsystems xxx (2006) xxx–xxx
www.elsevier.com/locate/micpro
0141-9331/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.micpro.2006.09.001
Article in press, 11 October 2006

Design and performance evaluation of a Programmable Packet Processing Engine (PPE) suitable for high-speed network processors units

K. Vlachos a,*, T. Orphanoudakis b, Y. Papaeftathiou c, N. Nikolaou d, D. Pnevmatikatos c, G. Konstantoulakis e, J.A. Sanchez-P f

a Computer Engineering and Informatics Department, University of Patras, GR-26500, Rio, Greece
b Telecommunication Science and Technology Department, University of Peloponnese, Greece
c Electronic and Computer Engineering Department, Technical University of Crete, 73100 Chania, Greece
d Ellemedia Technologies, GR 17121, Athens, Greece
e InAccess Networks, GR 17672, Athens, Greece
f Greek Research and Technology Network (GRNET) GR-11527, Athens, Greece

Abstract

In this paper, we present a Programmable Packet Processing Engine suitable for deep header processing in high-speed networking systems. The engine, which has been fabricated as part of a complete network processor, consists of a typical RISC CPU, whose register file has been modified in order to support efficient context switching, and two simple special-purpose processing units. The engine can be used in a number of network processing units (NPUs), as an alternative to the typical design practice of employing a large number of simple general-purpose processors, or in any other embedded system designed to process mainly network protocols.
To assess the performance of the engine, we profiled typical networking applications and carried out a series of experiments. We also compared the performance of our processing engine to that of two widely used NPUs and show that the proposed packet-processing engine can run specific applications up to three times faster. Moreover, the engine is simpler to fabricate and less complex in hardware, while remaining easy to program.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Embedded networking systems; Network processor; Special-purpose processor

* Corresponding author. Tel.: +30 2610 996990; fax: +30 2610 969 007. E-mail address: kvlachos@ceid.upatras.gr (K. Vlachos).

1. Introduction

The explosive growth of the Internet has created an insatiable demand for bandwidth. The emergence of Wavelength Division Multiplexing (WDM) has increased the backbone capacity to terabits per second, shifting the bottleneck back to the network processing systems, namely routers and related switching equipment. Furthermore, the convergence of voice and data networks, as well as the introduction of new Quality-of-Service (QoS) mechanisms and recently developed protocols, requires network flexibility that the currently employed hardware units probably cannot provide.

A promising solution to both problems was provided by the introduction of a new class of embedded integrated circuits called Network Processing Units (NPUs). NPUs are becoming the silicon core of every network system that requires a high degree of flexibility to support evolving network services at extremely high packet rates [1–3]. Whereas legacy architectures for building networking equipment were based either on general-purpose processors (GPPs), which offer high flexibility due to software programmability