DIGITAL VLSI IMPLEMENTATION OF A NEURAL PROCESSOR

F.Castillo, J.M.Moreno, J.Cabestany
Universidad Politecnica de Catalunya
Departamento de Ingenieria Electrónica
ETSE Telecomunicación
P.O.Box 30002, 08080 Barcelona, SPAIN
Phone: 34-3-401.67.42 Fax: 34-3-401.68.01

ABSTRACT: This work presents the ideas leading to the design of a digital Neural Processor based on a VLSI architecture. Each processor is an element of a systolic array capable of performing the Back Propagation (BP) algorithm. The processor's internal structure and its characteristics are discussed.

1.Introduction

The full potential of neural nets lies in their capability for parallelization. This implies that processes can be executed in a fraction of the time needed by serial computers. A neural processor has been developed which, when cascaded in a systolic array, executes the Back Propagation algorithm in an efficient manner. The speed advantage of the array is achieved thanks to the interconnection scheme between processors and also to the specialized processor design.

2.BP Parallelization

A particular ring systolic architecture which can execute the BP algorithm with few limitations with respect to connectivity, number of layers and layer sizes was first presented in [1]. A brief description of the architecture follows, since it is convenient to review some of its features in order to understand the ideas behind the processor design. The incentive behind the architecture is the use of a modular structure, easily expandable and flexible, which is capable of executing the BP algorithm without much demand in terms of connections between processors. The architecture emulates the total net using a layer-by-layer approach. In forward mode, the following equation:

    a_j = f( Σ_i w_ji · a_i )

is performed such that all processors simultaneously perform the multiplication of a particular activation by its corresponding weight for a given layer.
Afterwards, all processors add this product to the accumulated partial sum-of-products. In the next cycle, the activation is passed on to the neighboring processor and the operation is repeated. If each processor keeps the weight values of the neurons it receives as inputs, all operations can be kept local.

CH29645/91/0000-0307$01.00 ©1991 IEEE    307

In learning mode, the same can be said. In the following equation:

    δ_i = f'(net_i) · Σ_j w_ji · δ_j

only δ_j is the non-local variable, which needs data from other neurons:
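The forward-mode ring movement described above can be sketched as a small Python simulation. This is an illustrative model of the data flow, not the authors' VLSI design; the function name and data layout are invented for the example, and it assumes a square layer with one processor per neuron, each holding its own row of the weight matrix locally:

```python
def systolic_forward(W, activations):
    """Simulate one layer of the ring systolic forward pass.

    W[j][i] is the weight from input neuron i to output neuron j.
    'Processor' j keeps row j of W locally, so every multiply-accumulate
    uses only local data plus the activation circulating on the ring.
    """
    n = len(activations)                 # one processor per neuron
    acc = [0.0] * n                      # accumulated partial sums-of-products
    ring = list(enumerate(activations))  # (index, value) pairs on the ring
    for _ in range(n):                   # n systolic cycles
        for j, (i, a_i) in enumerate(ring):
            acc[j] += W[j][i] * a_i      # local weight, circulating activation
        ring = ring[1:] + [ring[0]]      # pass each activation to the neighbor
    return acc                           # acc[j] = sum_i W[j][i] * activations[i]
```

After n cycles every processor has seen every activation exactly once, so the result equals the ordinary matrix-vector product; the activation function f would then be applied locally at each processor.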
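The non-local dependence on δ can be resolved with the same ring movement: the δ values of the next layer circulate among the processors while the weights stay in place. The following sketch makes the additional assumption, for illustration only, that each processor also holds the corresponding column of the weight matrix; the actual weight storage arrangement of the architecture in [1] may differ:

```python
def systolic_backward(W, delta_next, fprime_net):
    """Simulate the ring accumulation of the back-propagated error.

    Computes delta[i] = f'(net_i) * sum_j W[j][i] * delta_next[j] by
    circulating the non-local delta_next values around the ring;
    processor i is assumed to hold column i of W (sketch assumption).
    """
    n = len(delta_next)
    err = [0.0] * n
    ring = list(enumerate(delta_next))   # circulating (j, delta_j) pairs
    for _ in range(n):                   # n systolic cycles
        for i, (j, d_j) in enumerate(ring):
            err[i] += W[j][i] * d_j      # transposed weight access, local data
        ring = ring[1:] + [ring[0]]      # pass each delta to the neighbor
    return [fprime_net[i] * err[i] for i in range(n)]
```

As in forward mode, only the circulating δ values travel between neighboring processors; all weight accesses and multiply-accumulates remain local.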