INTEGRATION, the VLSI journal 39 (2005) 1–11
www.elsevier.com/locate/vlsi

A high-throughput, memory efficient architecture for computing the tile-based 2D discrete wavelet transform for the JPEG2000

G. Dimitroulakos*, M.D. Galanis, A. Milidonis, C.E. Goutis

VLSI Design Laboratory, Electrical and Computer Engineering Department, University of Patras, 26500 Patras, Greece

* Corresponding author. Tel.: +30 2610 997324; fax: +30 2610 994798. E-mail address: dhmhgre@vlsi.ee.upatras.gr (G. Dimitroulakos).

Received 26 November 2003; received in revised form 1 November 2004; accepted 30 November 2004

0167-9260/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.vlsi.2004.11.002

Abstract

This paper describes the design and implementation of a hardware architecture, optimized for speed and memory requirements, that computes the tile-based 2D forward discrete wavelet transform for the JPEG2000 image compression standard. The proposed architecture is based on a well-known architecture template for calculating the 2D forward discrete wavelet transform. It is derived by replacing the filtering units of that template with our previously published throughput-optimized ones and by developing a scheduling algorithm suited to the special features of these filtering units. The architecture exhibits high-performance characteristics owing to the throughput-optimized filters. In addition, the extra clock cycles required by the tile-based version of the discrete wavelet transform are partially compensated by proper scheduling of the filters. The developed scheduling algorithm also results in reduced memory requirements compared with existing architectures.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Discrete wavelet transform; JPEG2000 standard; Tile-based 2D wavelet transform; Scheduling
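As a purely illustrative software reference for what "tile-based 2D forward discrete wavelet transform" means, the following minimal sketch splits an image into tiles and applies one decomposition level to each tile independently. It is not the hardware architecture or the scheduling algorithm of this paper. It assumes the reversible 5/3 lifting filter with symmetric boundary extension, a row-then-column processing order, even tile dimensions, and image dimensions that are multiples of the tile size. All function names (`dwt53_1d`, `dwt53_2d`, `tiled_dwt`) are hypothetical.

```python
def dwt53_1d(x):
    """One level of the reversible 5/3 lifting transform (assumed filter).

    x must have even length; edges use symmetric extension.
    Returns (low-pass samples, high-pass samples).
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch handles even-length signals only"
    half = n // 2
    # Predict step: high-pass from odd samples and their even neighbours.
    d = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]  # mirror at edge
        d.append(x[2 * i + 1] - (left + right) // 2)
    # Update step: low-pass from even samples and neighbouring high-pass.
    s = []
    for i in range(half):
        prev = d[i - 1] if i > 0 else d[0]  # mirror at edge
        s.append(x[2 * i] + (prev + d[i] + 2) // 4)
    return s, d


def dwt53_2d(tile):
    """One 2D level on a tile: rows first, then columns (assumed order)."""
    lo_rows, hi_rows = [], []
    for row in tile:
        s, d = dwt53_1d(list(row))
        lo_rows.append(s)
        hi_rows.append(d)

    def columns(block):
        # Transform every column, then transpose the results back to rows.
        cols = [dwt53_1d(list(c)) for c in zip(*block)]
        low = [list(r) for r in zip(*(s for s, _ in cols))]
        high = [list(r) for r in zip(*(d for _, d in cols))]
        return low, high

    ll, lh = columns(lo_rows)  # horizontally low-pass half
    hl, hh = columns(hi_rows)  # horizontally high-pass half
    return {"LL": ll, "HL": hl, "LH": lh, "HH": hh}


def tiled_dwt(image, tile_h, tile_w):
    """Transform each tile independently; keys are tile origins (row, col)."""
    out = {}
    for r in range(0, len(image), tile_h):
        for c in range(0, len(image[0]), tile_w):
            tile = [row[c:c + tile_w] for row in image[r:r + tile_h]]
            out[(r, c)] = dwt53_2d(tile)
    return out


if __name__ == "__main__":
    image = [[7] * 4 for _ in range(4)]  # constant 4x4 test image
    result = tiled_dwt(image, 2, 2)      # four independent 2x2 tiles
    print(result[(0, 0)])
```

Because each tile is extended and filtered on its own, every tile boundary incurs its own edge handling. In hardware this is one source of the extra clock cycles that the abstract attributes to the tile-based version of the transform.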