Event-Driven Configuration of a Neural Network CMP System over a Homogeneous Interconnect Fabric

M.M. Khan*, J. Navaridas†, A.D. Rast*, X. Jin*, L.A. Plana*, M. Luján*, J.V. Woods*, J. Miguel-Alonso† and S.B. Furber*
*School of Computer Science, The University of Manchester, UK
†University of The Basque Country, Spain
email: khanm@cs.man.ac.uk

Abstract—Configuring a million-core parallel system at boot time is a difficult process when the system has neither specialised hardware support for the configuration process nor a preconfigured default state that puts it in operating condition. SpiNNaker is a parallel Chip Multiprocessor (CMP) system for neural network (NN) simulation. Where most large CMP systems feature a sideband network to complete the boot process, SpiNNaker has a single homogeneous network interconnect for both application inter-processor communications and system control functions such as boot load and run-time user-system interaction. This network improves fault tolerance and makes it easier to support dynamic run-time reconfiguration; however, it requires a boot process that is transaction-level compatible with the application’s communications model. Since SpiNNaker uses event-driven asynchronous communications throughout, the loader operates with purely local control: there is no global synchronisation, state information, or transition sequence. A novel two-stage “unfolding” boot-up process efficiently configures the SpiNNaker hardware and loads the application using a high-speed flood-fill technique with support for run-time reconfiguration. SystemC simulation of a multi-CMP SpiNNaker system indicates an error-free CMP configuration time of 1.3 ms, while a high-level simulation of a full-scale system (64K CMPs) indicates a mean application-loading time of ∼20 ms (for a 100 KB application), which is virtually independent of the size of the system. We verified the CMP configuration process with hardware-level Verilog simulation.

I. INTRODUCTION

Flexible and efficient boot loading of distributed applications is an essential support process for the SpiNNaker multi-CMP massively parallel system, which is organised over a homogeneous communication fabric. The system must somehow break symmetry, assign and load memory resources, configure communications, and start up the processors, while balancing concurrency and resource contention for maximum efficiency. Where previous solutions [4][5] have typically used sideband communications or dedicated preconfigured resources, SpiNNaker confronts the challenge of configuring an isotropic, undifferentiated parallel processing system head-on. One approach would be to make no assumptions about the application and consider it as a problem in general-purpose computing, leading to a set of standardised, generic configuration techniques. However, since numerous studies indicate that parallel processing works best with specific applications having inherent parallelism, it seems reasonable to design parallel systems around a target application, whose boot process could be correspondingly specialised.

Fig. 1. Multi-CMP SpiNNaker System forming a 2D Toroidal Network.

SpiNNaker is a Chip Multiprocessor (CMP) for massively parallel spiking neural network applications. Simulating large, biologically realistic neural networks is an excellent candidate application for distributed processing systems: indeed, the consensus in the modelling community is that it may be necessary to use dedicated hardware with architectures that more closely resemble the biology for large-scale neural modelling within realistic resource limitations [7]. It is efficient to simulate a spiking neural network as an event-driven real-time application [8], a model quite different from typical parallel applications and more akin to embedded applications [11].
A system for neural network simulation will be, correspondingly, architecturally different from parallel systems designed mostly for general-purpose computing. Dedicated parallel systems such as SpiNNaker mostly adopt event-driven models of computation, and their boot-time configuration can make fewer assumptions about the initial state of the system than that of “conventional” parallel multiprocessor systems.

SpiNNaker provides no sideband communication channel for boot processes: the system boot must use the same communications fabric as the application. All processors on the chip are identical; there is no dedicated processor hard-wired