A Brief Overview of the Challenges of the Multicore Roadmap

Jacques Henri Collet
CNRS-LAAS, 7 av. du colonel Roche 31400 Toulouse, France
Université de Toulouse, LAAS, F-31400, France
jacques.collet@laas.fr

Abstract—Multicore chips are widely seen as the solution for continuing the race toward ever more processing power, or in short, the continuation of Moore's Law. However, this poses many difficult challenges at different abstraction levels, such as the preservation of dependability in the physical layer (related to the reduction of dimensions), the consumption and dissipation of energy, the selection of a physical architecture beyond current symmetric multiprocessing, the definition of a distributed operating system, and last but not least, the automatic and seamless parallelization of applications. Although many problems have been studied separately for a long time by academics, finding a satisfactory trade-off remains an open issue. We begin by reviewing the different challenges. Then, we focus on the problem of dependability in the physical layer, and especially on the Network on Chip (NoC). Here, the future lies in the generalization of total self-healing to NoCs, i.e., in self-detection of errors, self-diagnosis of faults, and self-repair at runtime, while preserving the efficiency of communications in the presence of an increasing percentage of defective elements. We briefly discuss the numerous constraints that apply to self-healing NoCs, and show that numerous tradeoffs are possible. The subject is widely open.

Keywords—multicore processing; reliability; fault-tolerant systems; routing protocols

I. INTRODUCTION

Virtually every sector of our society has become heavily dependent on the growth in processing performance.
To respond to this challenge, the industry has resorted to increasing the number of cores on the same chip and laid out a roadmap for multicore design that calls for doubling the number of cores on a chip with each silicon generation to satisfy performance growth. Of course, this vision makes sense only if it is possible to easily parallelize applications and execute an increasing number of threads on an increasing number of cores! At the moment, the situation is relatively satisfactory. To be sure, it remains the programmer's responsibility to analyze the application, to define the level and granularity of parallelization, to integrate additional mechanisms into the code to protect concurrent accesses to shared data, and to preserve the consistency of memory accesses. Parallelization is thus not a simple procedure, but the programmer does not have to care about the physical layer, and in particular the management of the physical memory. The default scheduler makes two crucial decisions transparent: 1) where to map threads in memory, and 2) how to select a logical processor on which to allocate a thread. For instance, with Windows, it is a breeze to create threads with simple calls to the functions CreateThread or AfxBeginThread and to allocate them to selected logical processors with the function SetProcessAffinityMask. Thus, everything seems not far from the best in the best of all possible worlds. However, the future seems more uncertain, because the multiplication of cores worsens several dramatic challenges at different abstraction levels that ultimately threaten the multithreading simplicity just described. We review the main challenges in Section II, namely the change of the physical architecture, the evolution of operating systems, the transparency of program parallelization, and last but not least, the preservation of reliability in the physical layer.
In the rest of the paper, we focus on a “small” challenge, namely, NoC reliability. In Section III, we review the main approaches for detecting errors and the actions conducted to tolerate transient and permanent faults in the physical layer. In Section IV, we review fault tolerance at the routing-protocol level. Then we address the question of adaptivity and self-healing in defective NoCs, that is to say, self-healing directly in the physical layer, conducted dynamically at runtime, with, ideally, the constraints that the connectivity between any two nodes is preserved, no message is lost, and the efficiency of routing protocols is not degraded.

II. THE CHALLENGES

Let us briefly review some challenges that must be solved to maintain hope for processing growth in accordance with Moore's Law:

Change of physical architecture: Perhaps the most fundamental reason that forces the abandonment of the architectures of today's multicore processors [1] is that increasing parallelization and multithreading (the idea behind multiplying the number of cores) exacerbate demands on the on-chip communication infrastructure that cannot be sustained by today's symmetric multiprocessors (SMP). Several architectural evolutions are possible. In the short term, a realistic evolution could consist in interconnecting a few nodes through a massively parallel ring or a hierarchy of rings (see the earlier studies in the nineties [2]). Each node could be a today's SMP processor, typically including 4 to 8 cores plus some SRAM and a network interface. Such a communication infrastructure needs no complicated routing algorithm but, on the other hand, it necessarily has smaller bandwidth than