Vol.27 No.1    JOURNAL OF ELECTRONICS (CHINA)    January 2010

KEY OPTIMIZATION TECHNIQUES IN JPEG-LS IP CORE DESIGN¹

Zhang Xiaoyu   Chen Xinkai   Li Xiaowen   Jiang Hanjun*   Zhang Chun*   Wang Zhihua*
(Department of Electronic Engineering, Tsinghua University, Beijing 100084, China)
*(Institute of Microelectronics, Tsinghua University, Beijing 100084, China)

Abstract  This paper presents the key optimization techniques for an efficient accelerator implementation in an image encoder IP core design for real-time Joint Photographic Experts Group Lossless (JPEG-LS) encoding. Pipeline architecture and accelerator elements are used to enhance throughput. Improved parameter mapping schemes and resource sharing are adopted to achieve low complexity and a small die area. Module-level and fine-grained gating measures are used to achieve a low-power implementation. These hardware-oriented optimization techniques are shown to make the encoder meet the requirements of the IP core implementation. The proposed optimization techniques have been verified in the implementation of the JPEG-LS encoder IP, and then validated in a real wireless endoscope system.

Key words  Joint Photographic Experts Group Lossless (JPEG-LS); Low power; Resource sharing; Logarithm priority encoder

CLC index  TP751

DOI  10.1007/s11767-009-0026-2

I. Introduction

Compression algorithms for still images have been widely developed for information exchange in communications, e.g., the Internet, remote terminals, and wireless monitoring devices for healthcare. The lossless and near-lossless compression standard of the Joint Photographic Experts Group (JPEG-LS) [1,2] provides a way to satisfy high-fidelity requirements and is especially suitable for medical image communications. Additionally, IP core design of the JPEG-LS encoder is becoming increasingly important in medical care and healthcare applications.
In practice, there are always tradeoffs among compression ratio, speed, complexity and power. This paper analyzes key techniques in JPEG-LS algorithm implementation and presents a real-time processing architecture for IP core design with low power and low complexity. The optimization is based on a Very Large Scale Integration (VLSI) architecture which mainly consists of four parts: a pre-filter, a mode decision module, parallel pipelines for each mode, and a two-tier data packer. The architecture is practical for both Red-Green-Blue (RGB) format and Bayer Color Filter Array (CFA) format [3,4], which are the most popular data formats for current image sensors.

In the four parts (or stages) of the JPEG-LS implementation, the pre-filter can be designed to re-arrange pixels and smooth the edges of an image, which yields a higher compression ratio [5,6]; the mode decision module determines which pipeline the data stream is fed into; the parallel pipelines perform error prediction and calculation; the data packer receives data streams from the pipelines and packs them in sequence. Moreover, a data bus interface is designed for the incoming stream and a buffer interface for the outgoing stream.

¹ Manuscript received date: April 13, 2009; Revised date: October 12, 2009. Supported by National High Technology Research and Development Program (No. 2008AA010707). Corresponding author: Zhang Xiaoyu, 1981, male, Ph.D. candidate. Tsinghua University, Beijing 100084, China, E-mail: zhangxiaoyu00@mails.tsinghua.edu.cn.

The classic JPEG-LS algorithm description is software-oriented. Some key steps contain operators beyond basic arithmetic, which may add complexity in hardware design. To meet the requirements of adequate performance, low complexity and low power, we utilize the techniques of pipeline architecture, accelerator elements, improved mapping schemes, resource sharing, module-level gating and fine-grained gating.
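As an illustration of the mode decision and prediction stages mentioned above, the following is a minimal software sketch based on the general JPEG-LS scheme (local gradients select between run mode and regular mode; regular mode uses the median edge detector predictor). It is not the hardware design of this paper. The function names `select_mode`, `med_predict` and `process_pixel` are hypothetical, and the neighbor naming (a = left, b = above, c = above-left, d = above-right) and the `near` tolerance are assumptions following the usual JPEG-LS convention.

```python
def select_mode(a, b, c, d, near=0):
    """Choose the pipeline for the current pixel from its causal neighbors.

    a = left, b = above, c = above-left, d = above-right (assumed naming).
    near = 0 corresponds to lossless coding; near > 0 to near-lossless.
    """
    d1, d2, d3 = d - b, b - c, c - a  # local gradients
    if abs(d1) <= near and abs(d2) <= near and abs(d3) <= near:
        return "run"      # flat neighborhood: run-length pipeline
    return "regular"      # otherwise: prediction pipeline


def med_predict(a, b, c):
    """Median edge detector (MED) prediction of the current pixel."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def process_pixel(a, b, c, d, x, near=0):
    """Hypothetical glue: return (mode, prediction error or None)."""
    mode = select_mode(a, b, c, d, near)
    if mode == "run":
        return mode, None  # run-length counting is omitted in this sketch
    return mode, x - med_predict(a, b, c)
```

For example, `process_pixel(10, 20, 15, 20, 18)` selects the regular pipeline, because the gradients are not all zero, and returns the error between the pixel value 18 and its MED prediction. Context modeling, error mapping, Golomb coding and the data packer are left out of this sketch.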
Furthermore, the key step of the algorithm can be more