1051-8215 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCSVT.2019.2956455, IEEE Transactions on Circuits and Systems for Video Technology.

Hybrid Video Codec Based on Flexible Block Partitioning with Extensions to the Joint Exploration Model

Wei-Jung Chien∗, Muhammed Coban∗, Jie Dong∗, Hilmi E. Egilmez∗, Member, IEEE, Nan Hu∗, Marta Karczewicz∗, Amir Said∗, Fellow, IEEE, Vadim Seregin∗, Geert Van der Auwera∗, Senior Member, IEEE, Philippe Bordes†, Franck Galpin†, Fabrice Le Léannec†, Tangi Poirier†, Fabrice Urban†

(Invited Paper)

Abstract: This article describes the main video coding technologies included in a joint proposal submitted by Qualcomm and Technicolor, in response to a Call for Proposals (CfP) issued by ITU-T SG16 WP3 Q.6 (VCEG) and ISO/IEC JTC1/SC29/WG11 (MPEG) in Oct. 2017. The proposal contains the majority of the tools that have been adopted into the Joint Exploration Model (JEM), developed in the exploratory phase that preceded the CfP. A flexible multi-tree type (MTT) block-partitioning scheme is proposed to extend the quadtree and binary tree (QTBT) based partitioning in JEM by including triple tree (TT) and asymmetric binary tree (ABT) partitions. In addition, several JEM tools in intra and inter prediction, transforms and arithmetic coding are modified, and new tools such as sign prediction and motion compensated padding are proposed.
Objective standard dynamic range (SDR) gains of 43.1% and 15.5%, in terms of average luma BD-rate improvement, have been achieved for the CfP constraint set 1 (random-access configuration) relative to the HEVC/H.265 (HM) and JEM anchors, respectively. For the CfP constraint set 2 (low-delay configuration), the average luma BD-rate improvements are 33.7% relative to the HM anchor and 12.7% relative to the JEM anchor. The proposed codec scored highly in both subjective evaluations and objective metrics and was among the best-performing CfP proposals.

Index Terms: Video coding, video compression, video coding standards.

Earlier versions of this work [1], [2] were submitted as a response to the call for proposals (CfP) in [3].
∗ Authors are with Qualcomm Technologies Inc., San Diego, CA, USA. Corresponding author: G. Van der Auwera (e-mail: geertv@qti.qualcomm.com).
† Authors are currently with InterDigital, France, and were with Technicolor, France, when the earlier version of this work was submitted as a CfP response.
Manuscript received April, 2019; revised in August, 2019; second revision in October, 2019.
¹ Additional requirements and details regarding the CfP process, test sequence set, coding constraint conditions, subjective evaluation methodology, and anchors used are presented in [3], [4].

I. INTRODUCTION

This paper describes the video coding technology proposal jointly submitted by Qualcomm and Technicolor [1], [2], in response to a call for proposals (CfP) issued in Oct. 2017 by the ITU-T SG16 WP3 Q.6 (VCEG) and ISO/IEC JTC1/SC29/WG11 (MPEG) committees [3]. The CfP targets a new video coding technology that provides substantially higher compression capability than the HEVC standard without a significant increase in complexity¹. As part of this effort, the Joint Exploration Model (JEM) [5] has been developed to study and evaluate new video coding tools beyond HEVC.
Among other differences from HEVC discussed in [6], the main change in JEM is the adoption of binary tree (BT) based partitioning in addition to the quadtree (QT) partitioning of HEVC. Our proposal further extends the QTBT-based partitioning in JEM by introducing multi-tree type (MTT) partitioning, which adds two more partition types based on triple trees (TT) and asymmetric binary trees (ABT). Moreover, several new methods, enhancements and simplifications to JEM tools are proposed. The main ones are summarized below:
• A new method for sign prediction of transform coefficients is proposed to reduce the signalling overhead of sign information in coefficient coding.
• New padding techniques for motion compensation are proposed.
• In affine motion prediction, a 6-parameter affine model is introduced in addition to the 4-parameter model in JEM. Also, the number of initial motion candidates in JEM is reduced from 15 to 2.
• For the merge modes used in inter prediction, unlike JEM, the proposed merge candidate list includes affine merge and non-adjacent spatial candidates. Bilateral or template matching refinement is performed only for one candidate, identified by the signalled merge index, and sub-block level refinements based on template matching and bilateral refinement are removed.
• To improve intra prediction, the number of intra modes is increased to 131 by introducing new angular modes on top of the existing 65 modes in JEM.
• For context-adaptive binary arithmetic coding (CABAC), a more effective probability estimation method is proposed to allow different rates of adaptation on a per-context basis.
• An extended set of separable transforms, based on the adaptive multiple-core transform (AMT) scheme in JEM, is introduced to improve coding efficiency.
• Since the decoder-side motion vector refinement (DMVR) [7] in JEM does not provide sufficient benefit relative to the decoding complexity it adds, it is not included in our proposal.
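To make the difference between the 4-parameter and 6-parameter affine models mentioned in the list above concrete, the following is a minimal sketch and not the proposal's implementation. It assumes the commonly used control-point formulation, in which the motion vector at position (x, y) inside a block is interpolated from motion vectors at the top-left (v0), top-right (v1) and, for the 6-parameter case, bottom-left (v2) corners. The function names are hypothetical, and floating-point arithmetic is used for readability, whereas a real codec would use fixed-point sub-block motion vectors.

```python
from typing import Tuple

Vec = Tuple[float, float]  # (horizontal, vertical) motion vector components


def affine_mv_4param(v0: Vec, v1: Vec, width: int, x: float, y: float) -> Vec:
    """Motion vector at (x, y) under a 4-parameter model (assumed formulation).

    Two control points (top-left v0, top-right v1) describe translation,
    rotation and uniform zoom only.
    """
    a = (v1[0] - v0[0]) / width  # zoom-like term
    b = (v1[1] - v0[1]) / width  # rotation-like term
    return (a * x - b * y + v0[0], b * x + a * y + v0[1])


def affine_mv_6param(v0: Vec, v1: Vec, v2: Vec,
                     width: int, height: int, x: float, y: float) -> Vec:
    """Motion vector at (x, y) under a 6-parameter model (assumed formulation).

    A third control point (bottom-left v2) lets the vertical gradient differ
    from the horizontal one, so shear and non-uniform scaling are representable.
    """
    ax = (v1[0] - v0[0]) / width
    ay = (v1[1] - v0[1]) / width
    bx = (v2[0] - v0[0]) / height
    by = (v2[1] - v0[1]) / height
    return (ax * x + bx * y + v0[0], ay * x + by * y + v0[1])


if __name__ == "__main__":
    # Illustrative 16x16 block with made-up control-point vectors.
    v0, v1 = (0.0, 0.0), (4.0, 0.0)
    print(affine_mv_4param(v0, v1, 16, 8, 8))
    # Horizontal shear: only the 6-parameter model can express this field.
    print(affine_mv_6param((0.0, 0.0), (0.0, 0.0), (4.0, 0.0), 16, 16, 0, 16))
```

In this sketch, the 6-parameter model reduces to the 4-parameter one whenever v2 equals the value the 4-parameter model would predict at the bottom-left corner, which is one way to see that the extra control point only adds degrees of freedom.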
The overall coding efficiency of the proposed video codec is summarized in Table I. The table shows that