A Mixed-Mode Design for a Self-Programming Chip for Real-Time Estimation,
Prediction, and Control
Khurram Waheed and Fathi M. Salam
Department of Electrical and Computer Engineering
Michigan State University
East Lansing, MI 48824-1226
Abstract-This paper gives an overview of the development of a
self-learning computing chip in 0.18 micron copper technology.
In its capabilities, this chip supersedes present micro-computing
paradigms (micro-processors, micro-controllers, and DSPs) in
the application domains of process identification, modeling,
prediction, and real-time control. Targeted potential
applications include:
(i) nano-level on-line bio-probing and actuation,
(ii) image analysis and feature extraction,
(iii) channel equalization for high speed mobile
communications,
(iv) inertial navigation sensor fusion, and network
management for routers.
I. INTRODUCTION
The core of the chip is a neurally inspired, scalable
(reconfigurable) array network designed for compatibility with
VLSI. The chip is endowed with a tested auto-learning
capability, realized in hardware, that achieves global task
auto-learning execution times in the microsecond-to-millisecond
range. The core consists of basic building blocks (4-quadrant
multipliers, transconductance amplifiers, and active load
resistances) for the analog forward-network processing and
learning modules. Superimposed on the processing network are
digital memory and control modules composed of D flip-flops,
ADC, Multiplying D/A Converter (MDAC), and comparators for
parameter (weight) storage and analog/digital conversions. The
forward network (and learning modules) process in analog
continuous-time mode, while the (converged, steady-state)
weights/parameters can be stored on chip in digital form. The
overall architectural design also adopts engineering methods
from adaptive networks and optimization principles [1].
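The split between analog processing and digital weight storage can be illustrated with a small behavioural sketch. It is not the authors' circuit: the 5-bit signed code, the weight range of [-1, 1], and the function names `quantize_weight` and `mdac_multiply` are all assumptions made only for illustration. The first function stands in for converting a converged analog weight to a stored digital code, and the second for an MDAC scaling an analog input by that code.

```python
def quantize_weight(w, bits=5, w_max=1.0):
    """Map an analog weight to a signed integer code (ADC-like step).

    bits and w_max are illustrative assumptions, not values from the paper.
    """
    levels = 2 ** (bits - 1) - 1              # symmetric code range, -15..15 for 5 bits
    w_clipped = max(-w_max, min(w_max, w))    # saturate out-of-range weights
    return round(w_clipped / w_max * levels)


def mdac_multiply(code, x, bits=5, w_max=1.0):
    """Scale an analog input x by a stored digital weight code (MDAC-like step)."""
    levels = 2 ** (bits - 1) - 1
    return (code / levels) * w_max * x


if __name__ == "__main__":
    learned = 0.37                            # hypothetical converged analog weight
    code = quantize_weight(learned)
    print(code, mdac_multiply(code, 1.0))
```

Under these assumptions the stored weight differs from the learned one by at most half a quantization step, which is the cost of holding the converged value in static digital memory.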
The 6-level Cu interconnect, single-poly, 0.18 micron
process enables dense connectivity and a compact die area for
this highly interconnected network, resulting in a compact,
powerful engine. Moreover, the low-resistance and
low-capacitance electrical properties of Cu permit the design
to achieve the high connectivity while still managing precise
distributions of resistive and capacitive loads. These
properties enable one to predict performance and limit signal
time-delays along the interconnect. The small feature size
and the electrical interconnect properties of copper are
enablers for the realization of such a powerful chip with dense
interconnectivity.
The resulting chip design would require no traditional
programming or coding. In addition to its novel architectural
design, the hardware also carries the heavy computational
burden by selectively realizing programmability as on-chip
auto-learning modules. The resulting chip super-machine
operates on a 1.5 V power source and would consume less than
1 mW.
We mention two prominent potential applications as
examples. One example is a smart probe in the medical and
biological fields for biological cell measurement and
stimulation, where no reliable process model exists and where
decisions have to be made on-line. Applications in this
domain include drug injections, condition monitoring, and
micro-surgery. Another example domain is determining or
modeling the quality of combustion in vehicle engines to detect
misfiring and its consequences for exhaust gases and the
environment. Both of these application domains generate
huge amounts of signals or data and would require massive
processing under standard computing paradigms. Similarly
challenging problems exist in pattern matching, feature
extraction, and data mining, to name a few.
II. DESIGN OVERVIEW
Some of the design guidelines we pursued in the design
project include:
1) The forward network's processing and the learning
modules are analog, while the weight storage and control
signals are digital, which gives rise to a mixed-mode circuit
implementation. The multilayer network uses 16 inputs
and 16 outputs (see Fig. 1).
[Figure: INPUTS, INPUT LAYER, HIDDEN LAYER I, HIDDEN LAYER II, OUTPUT LAYER, OUTPUTS, with connections labeled 16; OPTIONAL FEEDBACK PATH (FOR RECURRENT STRUCTURE)]
Fig. 1: Block Architecture
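The block architecture of Fig. 1 can be sketched as a simple behavioural model. This is only an illustration under stated assumptions: the use of `tanh` as the saturating nonlinearity, the random initial weights, the choice of three weight arrays between the four layers, the way the feedback path is added to the inputs, and all function names are hypothetical and not taken from the chip design.

```python
import math
import random

N = 16  # lines per layer, as labeled in Fig. 1


def make_layer(n_in, n_out, rng):
    """Create an n_out x n_in weight array with small random values (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def layer_forward(weights, x):
    """Weighted sums (multiplier array) followed by a saturating nonlinearity."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def network_forward(layers, x):
    """Propagate the inputs through each weight array in turn."""
    for weights in layers:
        x = layer_forward(weights, x)
    return x


def recurrent_step(layers, x, y_prev, feedback_gain=0.0):
    """Optional feedback path: add scaled previous outputs to the inputs."""
    x_eff = [xi + feedback_gain * yi for xi, yi in zip(x, y_prev)]
    return network_forward(layers, x_eff)


if __name__ == "__main__":
    rng = random.Random(0)
    layers = [make_layer(N, N, rng) for _ in range(3)]
    y = network_forward(layers, [0.1] * N)
    print(len(y))
```

With `feedback_gain` set to zero the model reduces to the plain feedforward structure; a nonzero gain gives one possible reading of the recurrent option shown in the figure.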
2) The I/O specification is, however, flexible and can easily
be reduced or expanded in our scalable design to input
counts compatible with the available packages. Expansion
can also be achieved by using several of the chips in
cascade and parallel combinations [4].
3) The chip operates in four different modes: (i) learn, (ii)
(on-chip) store, (iii) program read/write, and (iv)
process (see Fig. 2).
(i) Learn: The chip activates the learning process based
on the inputs and (desired) output targets supplied
by the application or the user.
(ii) Store: Once the user is satisfied with the
performance of the network in the learning mode,
the store mode saves the computed weights in on-
chip static digital memory.
(iii) Program: This mode was added to give the chip the
capability of weight read-out or read-in. The read-in
Proc. 43rd IEEE Midwest Symp. on Circuits and Systems, Lansing MI, Aug 8-11, 2000
0-7803-6475-9/00/$10.00 ©IEEE 2000