IEEE TRANSACTIONS ON COMPUTERS, VOL. 38, NO. 9, SEPTEMBER 1989 1297

Reconfigurable Multipipelines for Vector Supercomputers

RAJIV GUPTA, ALESSANDRO ZORAT, AND I. V. RAMAKRISHNAN

Abstract-Supercomputers typically use pipelines in their processors to achieve high performance. These pipelines consist of several stages, and many such identical pipelines are used in vector supercomputers for vector operations. This paper addresses the problem of recovering multipipelines in the presence of faulty stages. The stages are assumed to be organized in rows and columns. We alternate the pipeline stages with reconfiguring circuitry, which is used for bypassing the faulty stages. The pipelines are configured by programming the switches in a distributed manner using fault information available locally. The reprogrammability of the switches enables us to tolerate dynamic faults. Our configuration algorithm is optimal in the sense that it recovers the maximum number of pipelines under any fault pattern. Probabilistic bounds on the delay (the number of bypassed faulty stages) and yield (the number of nonfaulty pipelines recovered) are derived. We show that the maximum signal delay in any of the pipelines is Θ(log m), where m is the initial number of pipelines. Furthermore, a constant fraction of these pipelines can be recovered with our scheme, as opposed to an exponentially decreasing number when no reconfiguration is used. Our reconfiguration scheme can also be used for providing fault-tolerant buses on a wafer.

Index Terms-Distributed algorithms, fault tolerance, multipipelines, reconfiguration, supercomputers.

I. INTRODUCTION

MULTIPIPELINING has been used in the architectures of many vector supercomputers for enhancing their performance ([12]-[15]). Fig. 1 (adapted from [12]) is a block diagram of a multipipeline vector supercomputer. The instruction processing unit delegates the vector operations to the vector controller.
The vector controller is responsible for setting up instructions on the vector access controller, filling up the vector buffer and the multipipelines, scheduling the instructions, and preempting them. The vector pipeline is a set of identical pipes, each of which can execute all the instructions.

Manuscript received February 26, 1988; revised July 28, 1988. This work was supported in part by NSF under Grant ECS-8404399 and in part by ONR Contract N00014-84-K-0530. R. Gupta is with the Department of Electrical Engineering-Systems, University of Southern California, Los Angeles, CA 90089. A. Zorat is with the Instituto di Informatica, Università degli Studi di Trento, 38100 Trento, Italy. I. V. Ramakrishnan is with the Department of Computer Science, State University of New York, Stony Brook, NY 11794. IEEE Log Number 8929133.

In this paper, we address the problem of recovering fault-free pipelines in the presence of faulty stages, thereby making the unit containing the multipipelines fault tolerant. We add very simple switching circuitry between the stages of the pipelines that allows bypassing the faulty units. We model the pipelines by rows of processing elements (PE's), each representing a stage of the pipeline, as shown in Fig. 2. The PE's in a column are identical. The stages are distinct, and a nonfaulty PE from one stage cannot be substituted by one from a different stage. The pipelines are configured by programming the switches in a distributed manner using fault information available locally (see Fig. 3). The reprogrammability of the switches enables us to tolerate dynamic faults. Our switch programming algorithm is optimal in the sense that it recovers the maximum number of pipelines under any fault pattern. A bound on the fault information that each switch must possess in order to program the switches optimally is also given in this paper.

Reconfiguration may require wires to span several rows in order to bypass faulty stages in a column.
This introduces extra delay (proportional to the number of bypassed stages) between two consecutive stages of a pipeline (for example, note the delay between stage 1 and stage 2 of the first pipeline in Fig. 3). We show that if the total number of pipelines is m, the maximum delay is Θ(log m).¹ We also show that the number of pipelines recoverable by our reconfiguration scheme (the yield) is cm (0 < c < 1), as opposed to an exponentially decreasing yield if no reconfiguration is used. Our treatment of bounds is theoretical in nature and was inspired by [5] and [8], where similar bounds for linear arrays, square meshes, and certain other related structures are derived. We use the laws of large numbers in probability theory to obtain these bounds. Simulation results (see Section V) confirm that these bounds also hold for small to medium size systems.

Throughout this paper, we assume that the switching circuitry is fault-free. We justify this assumption on the following grounds. First, compared to the stages, the switches are much simpler. In the absence of reconfiguration, the stages are the critical elements whose failure disrupts the normal functioning of the system. Our reconfiguration algorithm requires only the switches, which are much less susceptible to faults, to be fault-free. Second, the failure of any single switch does not reduce the yield to zero. The algorithm will still work, though with reduced yield. To that extent, the assumption that the switches are fault-free only aids us in simplifying the calculation of yield and delay.

¹ Θ(), O(), and Ω() stand for the exact, the upper, and the lower bounds, respectively.

0018-9340/89/0900-1297$01.00 © 1989 IEEE
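
The row-and-column model of the Introduction can be illustrated with a minimal Python sketch. It is not the switch programming algorithm of this paper, which this section does not specify. All function names are hypothetical, and the independent, identically distributed fault model in `make_grid` is an assumption made only for illustration. The sketch compares the yield without reconfiguration (a row survives only if all of its stages are nonfaulty) with the upper bound implied by the rule that a PE can be replaced only by a PE of the same column: every pipeline needs one nonfaulty PE per column, so the column with the fewest nonfaulty PE's caps the yield. The pairing of the k-th nonfaulty PE of each column, and the row span used as a rough proxy for wire delay, are likewise illustrative assumptions.

```python
import random


def make_grid(m, n, p_good, seed=0):
    """Return an m x n grid of booleans; True marks a nonfaulty PE.

    Assumption: faults are independent with probability 1 - p_good.
    """
    rng = random.Random(seed)
    return [[rng.random() < p_good for _ in range(n)] for _ in range(m)]


def yield_without_reconfiguration(grid):
    """Count the rows in which every stage is nonfaulty."""
    return sum(all(row) for row in grid)


def yield_upper_bound(grid):
    """Each pipeline needs one nonfaulty PE per column,
    so the scarcest column caps the number of pipelines."""
    n = len(grid[0])
    return min(sum(row[j] for row in grid) for j in range(n))


def kth_good_assignment(grid):
    """Illustrative pairing (an assumption, not the paper's algorithm):
    pipeline k uses the k-th nonfaulty PE, top to bottom, of each column.
    Returns one list of row indices per recovered pipeline."""
    n = len(grid[0])
    good_rows = [[i for i, row in enumerate(grid) if row[j]] for j in range(n)]
    k_max = min(len(col) for col in good_rows)
    return [[good_rows[j][k] for j in range(n)] for k in range(k_max)]


def max_row_span(assignment):
    """Largest row offset between consecutive stages of any pipeline;
    used here only as a rough proxy for the extra wire delay."""
    spans = (abs(a - b) for pipe in assignment for a, b in zip(pipe, pipe[1:]))
    return max(spans, default=0)


if __name__ == "__main__":
    grid = make_grid(m=64, n=8, p_good=0.9, seed=1)
    print("rows usable without reconfiguration:", yield_without_reconfiguration(grid))
    print("upper bound with same-column substitution:", yield_upper_bound(grid))
    print("max row span of k-th-good pairing:", max_row_span(kth_good_assignment(grid)))
```

Under this assumed fault model, the probability that a row of n stages is entirely nonfaulty is p_good raised to the n-th power, which shrinks exponentially as stages are added, whereas the column-wise bound depends only on the worst single column.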