International Journal For Emerging Trends in Engineering and Management Research (IJETEMR) – Volume II Issue 1 – 21st January 2016 – http://www.ijetemr.com

A Pipelined Scheduler Architecture for Network on Chip

Abhijit Gadge #1, Prof. Supratim Saha *2, Prof. Parag Jawarkar #3
# Department of Electronics & Telecommunication Engineering, Tulsiramji Gayakwad Patil College of Engineering, RTMNU, Nagpur, India
abhijitgadge23@gmail.com, supratimsaha2012@gmail.com, jawarkar@gmail.com

Abstract—Network on chip (NoC) is a paradigm for enabling efficient communication between processing elements in next-generation systems on chip made up of tens to hundreds of intellectual-property components. The network is composed of routers and links. The router, in turn, is made up of packet queues, a crossbar, and a scheduler. In this brief, we propose a folded pipelined architecture for the round-robin scheduler used in the NoC router.

Keywords—network on chip, router, scheduler, round robin, arbitration, pipelining

I. INTRODUCTION

Crossbar switches are the common building blocks for Internet routers, data-center and HPC interconnects, and on-chip networks [1, 2, 3]. The core switching fabric often has no buffers, saving memory area and buffer speed. Arriving packets issue requests to a central scheduler and are switched upon scheduler grants; meanwhile, packets wait in input packet buffers in front of the crossbar. In order to isolate traffic flows and provide the basis for proper congestion management, these input buffers must be organized in per-flow queues, forming what is widely known as virtual-output-queuing (VOQ) [4, 5] crossbars, as shown in Fig. 1.

[Figure 1: Crossbar switch based router]

The speed/switching-efficiency tradeoff of a VOQ crossbar critically depends on the design and implementation of its crossbar scheduler.
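To make the VOQ organization concrete, the following behavioral sketch models one input port of an N×N crossbar with a separate queue per destination output. The class and method names are illustrative, not taken from the paper or any particular RTL implementation.

```python
from collections import deque

class VOQInputStage:
    """One crossbar input port with per-output (virtual output) queues.

    Keeping a separate queue per destination avoids head-of-line
    blocking: a packet bound for a busy output cannot delay packets
    bound for idle outputs queued behind it.
    """
    def __init__(self, num_outputs):
        self.voq = [deque() for _ in range(num_outputs)]

    def enqueue(self, packet, output):
        self.voq[output].append(packet)

    def requests(self):
        # Request vector presented to the central scheduler:
        # one request per non-empty virtual output queue.
        return [len(q) > 0 for q in self.voq]

    def dequeue(self, output):
        # Called when the scheduler grants this input->output pair.
        return self.voq[output].popleft()

# Usage: input port 0 of a 4x4 switch holding traffic for outputs 2 and 3.
port0 = VOQInputStage(4)
port0.enqueue("pktA", 2)
port0.enqueue("pktB", 3)
print(port0.requests())  # [False, False, True, True]
```

In hardware these queues are typically partitions of one SRAM with per-queue head/tail pointers; the sketch only captures the request/grant interface the scheduler sees.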
Most commercial crossbars today rely on independent, per-input and per-output round-robin (RR) arbiters [6, 7], which yield maximal matchings after a few rounds of handshaking. The basic time complexity of these scheduling algorithms is approximately equal to that of two programmable priority arbiters [1], and increases linearly with the number of iterations; hence, iterations normally come with a port-speed penalty. Furthermore, although iterations improve delay performance, as measured under uniform traffic, they do not improve switch throughput under unfavorable, non-uniform traffic. In those cases, speedup is usually employed to cover the missing throughput; speedup, however, seriously affects the energy and the effective capacity of switching systems.

II. SCHEDULER

The scheduler acts as the central switch arbiter. It examines the occupied virtual output queues of each input_block and configures the input_blocks and interconnect muxes to connect inputs to outputs and allow serial data transfer across the switch. The scheduling algorithm attempts to achieve a large number of simultaneous connections while avoiding conflicts in which multiple inputs connect to a single output or a single input connects to multiple outputs. The
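The per-input/per-output RR arbitration and handshaking described above can be sketched as one grant/accept round of an iSLIP-style matcher. This is an illustrative model under assumed conventions (pointer update only on accept), not the paper's proposed pipelined architecture; each RR arbiter is just a programmable priority arbiter whose priority position is its pointer.

```python
def rr_arbiter(requests, pointer):
    """Round-robin arbiter: grant the first asserted request at or
    after `pointer`, wrapping around (a programmable priority
    arbiter whose highest priority is the pointer position)."""
    n = len(requests)
    for offset in range(n):
        i = (pointer + offset) % n
        if requests[i]:
            return i
    return None  # no request asserted

def one_iteration(request_matrix, grant_ptr, accept_ptr):
    """One grant/accept round of RR matching (illustrative sketch).

    request_matrix[i][j] is True when input i has a packet for output j;
    grant_ptr[j] / accept_ptr[i] are the per-output / per-input RR
    pointers. Returns a conflict-free list of (input, output) pairs."""
    n = len(request_matrix)
    # Grant phase: each output independently picks one requesting input.
    grants = {}  # output -> granted input
    for j in range(n):
        reqs = [request_matrix[i][j] for i in range(n)]
        winner = rr_arbiter(reqs, grant_ptr[j])
        if winner is not None:
            grants[j] = winner
    # Accept phase: each input picks one of the grants it received.
    matches = []
    for i in range(n):
        offers = [False] * n
        for j, g in grants.items():
            if g == i:
                offers[j] = True
        j = rr_arbiter(offers, accept_ptr[i])
        if j is not None:
            matches.append((i, j))
            # Updating pointers only on accept keeps the arbiters fair.
            accept_ptr[i] = (j + 1) % n
            grant_ptr[j] = (i + 1) % n
    return matches
```

Because each output grants at most one input and each input accepts at most one grant, the resulting match is conflict-free by construction; further iterations over the unmatched ports would grow it toward a maximal matching, at the linear time cost noted above.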