ORIGINAL RESEARCH PAPER

Efficient approximate core transform and its reconfigurable architectures for HEVC

Maher Jridi¹ · Ayman Alfalou¹ · Pramod K. Meher²

✉ Maher Jridi: maher.jridi@isen-bretagne.fr
Ayman Alfalou: Ayman.Alfalou@isen-bretagne.fr
¹ Equipe Vision, ISEN Brest, CS 42807, 29228 Brest, France
² School of Computer Engineering, Nanyang Technological University, Nanyang Avenue, N4-02B-69A, Singapore, Singapore

Received: 6 September 2017 / Accepted: 27 March 2018
© Springer-Verlag GmbH Germany, part of Springer Nature 2018
Journal of Real-Time Image Processing
https://doi.org/10.1007/s11554-018-0768-x

Abstract

This paper describes a new approximate transform for high efficiency video coding (HEVC). An 8 × 8 discrete cosine transform (DCT) approximation is proposed and then down-sampled or expanded to generate the 4 × 4, 16 × 16, and 32 × 32 approximate matrices. The proposed 8 × 8 approximation is carried out partly by neighbourhood, in order to take advantage of the correlation between adjacent pixels in natural images. Hence, rather than approximating the odd basis vectors of the DCT kernel by referring to their intrinsic values, we choose to quantize them by taking into account their signs and positions. The proposed approximation matrices respect the properties of transform matrices prescribed by HEVC, such as orthogonality and the bit-length of the basis vector elements. Furthermore, they have nearly the same arithmetic complexity and hardware requirements as recently proposed related methods, but involve significantly less error energy. Moreover, a reconfigurable design based on the 8 × 8 approximate transform is proposed to allow the simultaneous computation of eight 4-point, four 8-point, two 16-point, or one 32-point approximate DCTs. It is found that the reconfigurable design can involve nearly 26% less area-delay product (ADP) than the separate non-reconfigurable designs. Experimental results obtained from an FPGA prototype and HM simulations demonstrate the advantages of the proposed transforms.

Keywords: Discrete cosine transform (DCT) · Approximation · High efficiency video coding (HEVC) · FPGA-based hardware implementation

1 Introduction

Today, users turn to mobile devices for everything, especially for video content. Video usage and video quality (image size) are growing faster than bandwidth. Therefore, a common challenge in processing and communicating video content is the need for efficient video codecs, where efficiency has many dimensions. It concerns first the compression ratio, but also the quality of the decoded video, the real-time capability, which is related to the computational complexity of the codec, and, more importantly, the energy consumption. Energy consumption matters all the more for wearable devices and for high-quality video content, since the chip has to incorporate many functions and the battery lifetime is limited. Therefore, low-power design of video codecs has become a primary concern.

The High Efficiency Video Coding (H.265/HEVC) standard [1] provides twice the compression ratio of H.264, which allows a substantial reduction in bandwidth. However, the new coding tools in the HEVC codec, such as advanced predictors and additional intra-modes, increase the encoder and decoder complexities by factors of about 5.2 and 2.1, respectively [2]. Therefore, the constraints of the HEVC codec need to be alleviated by reducing area and power consumption while maintaining a high quality of experience (QoE).

Profiling results of the HEVC encoder and decoder are given in [3] to show the time spent in the various C++ classes of the HM reference software. It is indicated that in the all-intra configuration, a significant amount of encoding time (about a quarter of the total) is spent in the TComTrQuant