A Mixed-Mode Design for a Self-Programming Chip for Real-Time Estimation, Prediction, and Control

Khurram Waheed and Fathi M. Salam
Department of Electrical and Computer Engineering
Michigan State University
East Lansing, MI 48824-1226

Abstract-This paper gives an overview of the development of a self-learning computing chip in 0.18 micron copper technology. This chip supersedes, in its capabilities, present micro-computing paradigms (micro-processors, micro-controllers, and DSPs) in the application domains of process identification, modeling, prediction, and real-time control. In particular, specific domains of targeted potential applications include: (i) nano-level on-line bio-probing and actuation, (ii) image analysis and feature extraction, (iii) channel equalization for high-speed mobile communications, (iv) inertial navigation sensor fusion, and (v) network management for routers.

I. INTRODUCTION

The core of the chip is a neurally inspired, scalable (reconfigurable) array network designed for compatibility with VLSI. The chip is endowed with tested auto-learning capability, realized in hardware, to achieve global task auto-learning execution times in the microsecond-to-millisecond range. The core consists of basic building blocks of 4-quadrant multipliers, transconductance amplifiers, and active load resistances for the analog (forward) network processing and learning modules. Superimposed on the processing network are digital memory and control modules composed of D flip-flops, ADC, multiplying D/A converter (MDAC), and comparators for parameter (weight) storage and analog/digital conversions. The forward network (and the learning modules) process in analog continuous-time mode, while the (converged, steady-state) weights/parameters can be stored on chip in digital form. The overall architectural design also adopts engineering methods from adaptive networks and optimization principles [1].
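As a rough illustration of the analog/digital split described above, the following Python sketch quantizes a set of converged weights to integer codes (the digital storage role) and converts them back for use in a forward layer (the MDAC role). It is a behavioral sketch only, not a model of the actual circuits: the 5-bit width, the weight range, the tanh nonlinearity standing in for a saturating amplifier, the example weight values, and all function names are assumptions made for illustration.

```python
import math

def quantize_weight(w, bits=5, w_max=1.0):
    """Map an analog weight in [-w_max, w_max] to a signed integer code.

    bits and w_max are illustrative assumptions, not values from the paper.
    """
    levels = 2 ** (bits - 1) - 1          # largest positive code (15 for 5 bits)
    clipped = max(-w_max, min(w_max, w))  # saturate out-of-range weights
    return round(clipped / w_max * levels)

def restore_weight(code, bits=5, w_max=1.0):
    """Convert a stored integer code back to an analog-valued weight."""
    levels = 2 ** (bits - 1) - 1
    return code / levels * w_max

def forward_layer(inputs, weights):
    """One layer: weighted sums followed by a saturating tanh (assumed)."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)))
            for row in weights]

# Hypothetical converged weights for a small 2-input, 2-output layer.
analog_weights = [[0.42, -0.77], [0.12, 0.93]]

# "Store": keep only the integer codes, as digital memory would.
codes = [[quantize_weight(w) for w in row] for row in analog_weights]

# "Process": rebuild the weights from the codes and run the forward layer.
stored_weights = [[restore_weight(c) for c in row] for row in codes]

x = [0.5, -0.25]
y_analog = forward_layer(x, analog_weights)
y_stored = forward_layer(x, stored_weights)
```

Comparing `y_analog` with `y_stored` gives a feel for how much output error a given storage resolution introduces; in this sketch each restored weight differs from its original by at most half a quantization step.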
The 6-level Cu interconnect, single-poly, 0.18 micron process enables dense connectivity and a small die area for this highly interconnected network, resulting in a compact, powerful engine. Moreover, the low resistance and low capacitance of Cu permit the design to achieve high connectivity while still managing precise distributions of resistive and capacitive loads. These properties make it possible to predict performance and to limit signal time delays along the interconnect. The small feature size and the electrical properties of the copper interconnect are thus enablers for the realization of such a powerful chip with dense interconnectivity.

The resulting chip design would require no traditional programming or coding. In addition to the novel architectural design, the hardware itself carries the heavy computational burden by selectively realizing programmability as on-chip auto-learning modules. The resulting chip operates from a 1.5 V power source and would consume less than 1 mW.

We mention two prominent potential applications as examples. One example is a smart probe in the medical and biological fields for biological cell measurement and stimulation, where no reliable process model exists and where decisions have to be made on-line. Applications in this domain include drug injection, condition monitoring, and micro-surgery. Another example domain is determining or modeling the quality of combustion in vehicle engines in order to detect misfiring and its consequences for exhaust gases and the environment. Both of these application domains generate large volumes of signals or data and would require massive processing under standard computing paradigms. Similarly challenging problems exist in pattern matching, feature extraction, and data mining, to name a few.

II. DESIGN OVERVIEW

Some of the design guidelines we pursued in the design project include:

1) The forward network's processing and the learning modules are analog, while the weight storage and control signals are digital, which gives rise to a mixed-mode circuit implementation. The multilayer network uses 16 inputs and 16 outputs (see Fig. 1).

[Fig. 1: Block Architecture. 16 inputs feed an input layer, hidden layer I, hidden layer II, and an output layer producing 16 outputs, with 16-wide connections between blocks and an optional feedback path (for recurrent structure).]

2) The I/O specification is flexible, however, and can easily be reduced or expanded in our scalable design to input/output counts compatible with the available packages. Expansion can also be achieved by using several of the chips in cascade and parallel combinations [4].

3) The chip operates in four different modes: (i) learn, (ii) (on-chip) store, (iii) program read/write, and (iv) process (see Fig. 2).

(i) Learn: The chip activates the learning process based on the inputs and (desired) output targets supplied by the application or the user.

(ii) Store: Once the user is satisfied with the performance of the network in the learning mode, the store mode saves the computed weights in on-chip static digital memory.

(iii) Program: This mode was added to give the chip the capability of weight read-out or read-in. The read-in

Proc. 43rd IEEE Midwest Symp. on Circuits and Systems, Lansing MI, Aug 8-11, 2000 0-7803-6475-9/00/$10.00 ©IEEE 2000 798