LOW POWER OPPORTUNITIES FOR A SIMD VLSI ARCHITECTURE INCORPORATING INTEGRATED OPTOELECTRONIC DEVICES

Huy H. Cat, John C. Eble¹, D. Scott Wills¹, Vivek K. De¹, Martin Brooke, and Nan Marie Jokerst

School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0250

¹ Sponsored by the Advanced Research Projects Agency (Contract: FY1123-95-08217)

Introduction

Integrated optoelectronic interconnect offers a potentially lower cost, higher density alternative to wire-based technologies for I/O. For most applications, low-cost IC packages provide an effective means of I/O in a system. However, some applications, such as image processing, require more off-chip I/O bandwidth than perimeter-bonded ICs can provide. The SIMD Pixel Processor (SIMPil) is a new image processing architecture that takes advantage of integrated optoelectronics to form a lightweight, high frame rate smart camera. To achieve high frame rates while maintaining adequate resolution, the SIMPil system consists of an array of SIMD nodes integrated with an array of optoelectronic detectors. This integration allows for a compact and lightweight system. However, such a system will require a high power level, resulting in a short battery life and possible thermal problems. Plausible low power opportunities for the SIMPil system are presented using early power estimation, technology scaling, and parallelism.

The SIMD Pixel Processor (SIMPil)

Current CCD arrays offer low-cost, high-resolution image capture. However, these devices require a different fabrication process than digital CMOS VLSI, and the charge coupled devices consume most of the available circuit area. This prohibits an integrated sensor / processing system.
Because the detectors and processors are decoupled, CCD-based systems cannot achieve high frame rates, and their decoupled nature overly complicates the design of “active detectors,” where a closely coupled processor continually adjusts a detector’s sensitivity, noise rejection, etc. A closely coupled system eliminates the high-capacitance interconnects that exist between the CCD array and the digital processing circuitry.

The SIMD Pixel Processor (SIMPil) system [1] being designed here at Georgia Tech uses integrated optoelectronic detectors for on-chip conversion and delivery of optical image data to the digital processors. Its SIMD (single instruction stream, multiple data streams) processors also provide greater processing power and programmability than the “smart-pixel” systems proposed in the optical computing field. By coupling an array of GaAs thin-film P-i-N detectors on top of (and electrically bonded to) Si VLSI-based SIMD processors, the SIMPil system operates at the image focal plane, providing high-performance image processing. This processing approach provides greater I/O bandwidth between detectors and processors. It also allows real-time interaction between sensors and processors (e.g., for sensitivity adjustment) that is not possible with non-integrated systems. While integrated silicon (Si) detectors have been demonstrated in VLSI processing systems, the thin-film technique allows Si processing circuitry to be placed underneath the GaAs detector area, providing a greater fill factor for better image coverage and more efficient detectors. This integrated approach can lead to extremely compact image processing systems.

Processor Architecture

The digital component of the SIMPil system consists of an array of SIMD processors. Figure 1 illustrates a block diagram of one SIMPil node.
[Figure 1: Block diagram of a SIMD Pixel Processor node. Blocks shown: Local Memory (64 words), NEWS Registers, Register File (8 words), Special Registers, S&H / ADC, Arithmetic Logical Shift Unit, Multiply Accumulate, Thin Film Detectors.]

The node includes a traditional processor datapath plus additional units for interfacing with the thin film detector array. The first implementation of the node includes an 8-bit datapath with an arithmetic, logical, and shift unit and a 16-bit multiply-accumulator (MACC), a unit used in many image processing applications. These functional units access an eight-word register file. Each node has 64 words of local memory. (Up to 256 words can be addressed in the instruction set.) SIMPil nodes communicate through a nearest neighbor NEWS (north, east, west, and south) network using special registers in the datapath.

Each SIMPil node interfaces to a sub-array of thin film detectors in the focal plane array. In the current SIMPil implementation, a processor addresses 16 detectors. Future SIMPil node implementations can address up to 256 detectors. Each processor includes circuitry to convert light intensities at the detectors into digital values for processing. The instruction set architecture provides a SAMPLE instruction to capture the light intensity at each detector of the focal plane array. Because all nodes execute the same instruction under the SIMD execution model, the entire image is sampled synchronously. Once the detector array data has been digitized, it can be processed on the SIMD nodes in a data parallel fashion.

System Integration

The focal plane processing approach to optical interconnect dispenses with the need to electrically convey input image matrices to integrated processing circuitry by incorporating photosensitive devices on the same substrate as the processing circuitry. The photodetectors provide I/O data to the processors underneath. Optical interconnect technology is ideal for image processing tasks, since it can be used for sampling incident images in real time and in parallel.
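To make the execution model concrete, the following Python sketch models a small array of nodes that sample their detectors in one step and then exchange values over the NEWS network. It is a toy model under stated assumptions, not the SIMPil implementation: the names `Node`, `SimdArray`, `quantize`, and `publish_mean` are hypothetical, the ADC is assumed to be linear, samples are assumed to occupy local memory words 0 to 15, and nodes on the array edge are assumed to read a fixed boundary value. Only the 8-bit word size, the 64-word local memory, and the 16 detectors per node come from the text.

```python
"""Toy model of a SIMPil-style node array (illustrative assumptions only)."""

WORD_BITS = 8            # datapath width given in the text
LOCAL_WORDS = 64         # local memory words per node
DETECTORS_PER_NODE = 16  # detectors addressed by one node

# Row/column offsets of the four NEWS neighbours.
OFFSETS = {"N": (-1, 0), "E": (0, 1), "W": (0, -1), "S": (1, 0)}


class Node:
    """One processing node: local memory plus a register visible to neighbours."""

    def __init__(self):
        self.memory = [0] * LOCAL_WORDS
        self.news = 0


def quantize(intensity):
    """Assumed ADC model: clamp to [0, 1], then map linearly to 8 bits."""
    clamped = min(max(intensity, 0.0), 1.0)
    return round(clamped * (2 ** WORD_BITS - 1))


class SimdArray:
    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.nodes = [[Node() for _ in range(cols)] for _ in range(rows)]

    def sample(self, light):
        """SAMPLE: every node digitizes its detectors in the same step.

        light[r][c] holds the 16 detector intensities seen by node (r, c).
        Samples are stored in memory words 0..15 (an assumed layout).
        """
        for r in range(self.rows):
            for c in range(self.cols):
                values = light[r][c][:DETECTORS_PER_NODE]
                for i, value in enumerate(values):
                    self.nodes[r][c].memory[i] = quantize(value)

    def for_all(self, op):
        """Apply one operation to every node (single instruction stream)."""
        for row in self.nodes:
            for node in row:
                op(node)

    def read_neighbor(self, direction, boundary=0):
        """Return, for each node, the NEWS value of its neighbour.

        Edge nodes with no neighbour in `direction` get `boundary`.
        """
        dr, dc = OFFSETS[direction]
        result = []
        for r in range(self.rows):
            row = []
            for c in range(self.cols):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < self.rows and 0 <= nc < self.cols
                row.append(self.nodes[nr][nc].news if inside else boundary)
            result.append(row)
        return result


def publish_mean(node):
    """Per-node operation: average the 16 samples and expose the result."""
    total = sum(node.memory[:DETECTORS_PER_NODE])  # at most 4080, fits 16 bits
    node.news = total // DETECTORS_PER_NODE


if __name__ == "__main__":
    array = SimdArray(1, 2)
    array.sample([[[0.0] * 16, [1.0] * 16]])  # dark node next to a bright node
    array.for_all(publish_mean)
    print(array.read_neighbor("E"))
```

The loops over rows and columns stand in for hardware in which all nodes act at once; `read_neighbor` returns a new grid rather than writing in place, so every node sees its neighbour's value from before the exchange, as a synchronous NEWS transfer would require.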