A Novel Approach to Distributed Quantization via Multivariate Information Bottleneck Method

Shayan Hassanpour, Dirk Wübben, and Armin Dekorsy
Department of Communications Engineering, University of Bremen, 28359 Bremen, Germany
Email: {hassanpour, wuebben, dekorsy}@ant.uni-bremen.de

Abstract—Consider the following setup: A number of observations of a data source shall be compressed jointly prior to forward transmission via several rate-limited links to a central processing unit. To design the respective quantizers, Mutual Information is chosen here as the fidelity criterion, and the broad-ranging framework of the Multivariate Information Bottleneck is aptly tailored to that purpose. This not only yields a novel design approach for the considered distributed scenario but also paves the way towards leveraging this flexible conceptual frame in a wide variety of applications in digital data transmission. In particular, it immediately enables addressing various extensions of the presumed arrangement, including the parallel construction of intertwined compression systems for several correlated sources.

I. INTRODUCTION

The joint compression of multiple observations of a given source is considered. This frequently appearing distributed setup is the underlying scenario in a variety of applications, e.g., decentralized inference in sensor networks, wherein a number of measured (sensed) values must be quantized ahead of transmission to the fusion center [1], cooperative relaying schemes with the Quantize-and-Forward strategy [2], and, last but not least, Cloud-based Radio Access Networks with rate-limited fronthaul links to the central processor in the cloud [3]. Most studies in the available literature on this setup follow the Rate-Distortion philosophy and propose algorithmic approaches to the quantizer design problem w.r.t. a specific distortion measure, e.g., the Mean-Squared Error (MSE) [4], the Ali-Silvey distance [5], or the Fisher Information [6].

Contrary to these previous investigations, here we employ the novel design paradigm of the Multivariate Information Bottleneck (MIB) [7]. The MIB is an immediate extension of the Information Bottleneck (IB) [8], which originally emerged in the Machine Learning context as a novel, information-theoretic approach to clustering, a fundamental task within Unsupervised Learning [9]. In a nutshell, the IB method is a variational principle for compressing a Random Variable (RV) such that it retains most of the information content w.r.t. another, relevant variable; interestingly, this preservation capability can be controlled by tuning a trade-off parameter. For an overview of the IB method and several related algorithmic approaches, interested readers are referred to [10]–[12].

There are a number of intriguing aspects that support deploying this framework for communication applications as well. As a directly related example, in the case of noisy source coding, following the IB philosophy yields a purely statistical design structure that directly incorporates the actual source into its formulation. Moreover, a major special instance of this principle boils down to designing quantizers that maximize the end-to-end data transmission rate for given input statistics, an objective pursued in (almost) all communication schemes.
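To make the latter principle concrete, its standard formulation from [8] can be stated as follows (the notation is ours: $X$ denotes the observed RV, $Y$ the relevant variable, and $Z$ the compressed representation, with the Markov chain $Y \leftrightarrow X \leftrightarrow Z$ assumed):

\[
\min_{p(z|x)} \; I(X;Z) \;-\; \beta\, I(Z;Y),
\]

where the trade-off parameter $\beta \geq 0$ is exactly the aforementioned control knob: small values of $\beta$ favor stronger compression, whereas large values enforce the preservation of relevant information about $Y$.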
In fact, the IB paradigm has already found its way into various aspects of modern transmission systems, from the construction of polar codes [13] to advanced discrete (channel) decoding concepts [14] with relatively low complexity and yet quite promising performance.

The MIB is a generic principle that not only enables treating cases in which the compression shall be relevant w.r.t. multiple variables, but also allows for the simultaneous construction of several systems of clusters. To that end, it utilizes the concept of Multi-Information, a natural extension of the pairwise concept of Mutual Information, over two Bayesian Networks (BNs). The first network stipulates the imposed constraints, i.e., the statistical independencies among the involved RVs, and identifies the set of compression variables. The second one specifies the relations that shall be retained. The general principle is then formulated as a trade-off between the multi-informations the two networks carry. The fascinating feature of this mathematical framework is that the optimal solution, and subsequently the pertinent algorithms, are derived formally, i.e., irrespective of the particular choice of BNs. This brings a great deal of flexibility into play and turns the MIB into a comprehensive framework that can be suitably applied to a wide range of applications, especially more sophisticated situations wherein multiple RVs are involved.

To vividly demonstrate the utility of the MIB, within this work we consider the distributed quantization setup described above and tailor the general MIB framework to it. An asymptotic case of this variational principle then aims at maximizing the mutual information between the given source and the random vector comprising all the compressed variables. This scenario has recently been investigated in [1], where it was shown to engender a set of quantizers that perform quite comparably to those designed exclusively for estimation and detection purposes. This can be reckoned another cogent argument for deploying the MIB. Indeed, it will be shown that our suggested algorithm not only outperforms the approach proposed in [1], but also broadens the scope of the underlying problem by establishing a fundamental trade-off between the achieved level of compression on the one hand and the amount of preserved relevant information on the other.
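For completeness, the MIB principle of [7] can be sketched compactly. Given an "input" BN $G_{\mathrm{in}}$ that encodes the compression constraints and an "output" BN $G_{\mathrm{out}}$ that encodes the relations to be retained, one minimizes

\[
\mathcal{L} \;=\; \mathcal{I}^{G_{\mathrm{in}}} \;-\; \beta\, \mathcal{I}^{G_{\mathrm{out}}},
\qquad
\mathcal{I}^{G} \;=\; \sum_{i} I\big(X_i;\,\mathbf{Pa}_{X_i}^{G}\big),
\]

where $\mathcal{I}^{G}$ is the multi-information carried by a network $G$ and $\mathbf{Pa}_{X_i}^{G}$ denotes the parents of node $X_i$ in $G$. As a rough sketch of how this specializes to the considered setup (the notation here is ours: $S$ is the source, $Y_1,\dots,Y_N$ the observations, and $Z_1,\dots,Z_N$ the compressed variables, each $Z_n$ generated locally from $Y_n$ alone), a natural choice of the two networks leads to the trade-off

\[
\min_{\{p(z_n|y_n)\}} \;\sum_{n=1}^{N} I(Y_n;Z_n) \;-\; \beta\, I(S;\,Z_1,\dots,Z_N),
\]

which, in the asymptotic regime $\beta \to \infty$, reduces to maximizing $I(S;Z_1,\dots,Z_N)$, i.e., the very scenario investigated in [1].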