Analysis of Set-theoretic and Stochastic Models for Fusion under Unknown Correlations

Marc Reinhardt, Benjamin Noack, Marcus Baum, and Uwe D. Hanebeck
Intelligent Sensor-Actuator-Systems Laboratory (ISAS), Institute for Anthropomatics, Karlsruhe Institute of Technology (KIT), Germany.
Email: marc.reinhardt@student.kit.edu, noack@kit.edu, marcus.baum@kit.edu, uwe.hanebeck@ieee.org

Abstract: In data fusion theory, multiple estimates are combined to yield an optimal result. In this paper, the set of all possible results is investigated for the case in which two random variables with unknown correlations are fused. As a first step, recursive processing of the set of estimates is examined. Besides set-theoretic considerations, the lack of knowledge about the unknown correlation coefficient is modeled as a stochastic quantity. In particular, a uniform model is analyzed, which provides a new optimization criterion for the covariance intersection algorithm in scalar state spaces. This approach is also generalized to multi-dimensional state spaces in an approximate, but fast and scalable way, so that consistent estimates are obtained.

Keywords: filtering, estimation, fusion, Bayesian, correlation coefficient.

I. INTRODUCTION

In many practical applications, distributed sensor systems are utilized in order to take advantage of different angles, distances, etc. By means of a Bayesian state estimator, the measurement information can be fused with the current estimate, uncertainties can be modeled and taken into account, and the obtained estimates can be predicted for further processing. Different distributed fusion architectures have been developed [1], ranging from a central architecture, in which all estimates and the correlations between them are managed centrally, to fully distributed approaches, in which data is collected and processed on different nodes and no information on cross-correlations is available.

In this paper, we focus on linear estimation problems in distributed fusion architectures [1]–[4]. Distributed fusion algorithms have the advantage of lower infrastructure costs, such as communication or data storage expenses, and are robust to failures. The main challenge is to handle cross-correlations between the estimates, since ignoring correlations and applying the standard Kalman filter equations for the fusion generally leads to inconsistent results. Consider, for example, a distributed sensor network in which node B receives information from node A and the data of both nodes are to be fused in node A. If we assume independence, the uncertainty is erroneously reduced by the fusion, although both nodes share the same information.

Different approaches to cope with the problem of unknown correlations have been developed. In particular, the covariance intersection algorithm (CI), which has been proposed by Julier and Uhlmann [5], [6], is often used as a baseline. Minimizing the determinant of the fused covariance matrix is the most commonly used optimization criterion for the CI algorithm. In scalar state spaces in particular, this implies that CI does not update an estimate unless information with smaller variance becomes available.

Since this is not desirable in most applications, we begin by discussing the impact of the cross-correlation on the fusion result, particularly in one-dimensional state spaces. We present closed-form equations for the interval of possible means and variances and show that the set of possible fusion results may diverge when cross-correlations are not restricted. In a next step, we model the lack of knowledge about the correlation coefficient by a uniform distribution, i.e., as a uniform random variable on the interval [−1, 1]. In order to provide a practical estimator, we also derive closed-form solutions for mean and variance by marginalizing out the correlation variable.
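
The two effects described above, the overconfidence of fusion under a wrongly assumed independence and the "no update" behavior of determinant-minimizing CI in the scalar case, can be checked numerically. The following is a minimal sketch and not the authors' implementation: the function names (`naive_fusion`, `actual_mse`, `scalar_ci`), the example variances, and the grid search over the CI weight are assumptions made for illustration only.

```python
import math


def naive_fusion(x, cx, y, cy):
    """Fuse two scalar estimates as if their errors were uncorrelated."""
    w = cy / (cx + cy)              # weight of x in the convex combination
    z = w * x + (1.0 - w) * y       # fused mean
    cz = cx * cy / (cx + cy)        # variance reported by the fusion rule
    return z, cz, w


def actual_mse(w, cx, cy, rho):
    """Actual error variance of w*x + (1-w)*y for correlation coefficient rho."""
    cxy = rho * math.sqrt(cx * cy)  # scalar cross-covariance
    return w**2 * cx + (1.0 - w)**2 * cy + 2.0 * w * (1.0 - w) * cxy


def scalar_ci(x, cx, y, cy, steps=1000):
    """Scalar covariance intersection; omega is chosen by a simple grid
    search that minimizes the fused variance (the determinant in 1-D)."""
    best = None
    for i in range(steps + 1):
        omega = i / steps
        cz = 1.0 / (omega / cx + (1.0 - omega) / cy)
        z = cz * (omega * x / cx + (1.0 - omega) * y / cy)
        if best is None or cz < best[1]:
            best = (z, cz, omega)
    return best


if __name__ == "__main__":
    x, cx, y, cy = 0.0, 1.0, 1.0, 2.0
    z, cz, w = naive_fusion(x, cx, y, cy)
    print("reported variance (independence assumed):", cz)
    print("actual variance for rho = 0.5:", actual_mse(w, cx, cy, 0.5))
    print("scalar CI (mean, variance, omega):", scalar_ci(x, cx, y, cy))
```

With these example values, the variance reported under the independence assumption is smaller than the actual error variance for a positive correlation coefficient, which is the inconsistency discussed above. The scalar CI search ends at the boundary weight, so the fused result coincides with the input estimate of smaller variance.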
Based on these solutions, we derive a new optimization criterion for CI and generalize it to multi-dimensional state spaces.

II. PROBLEM FORMULATION

In data fusion theory, estimates are combined to yield an optimal fused estimate. The estimates characterize uncertain quantities, which are modeled by random variables. This paper concentrates on the fusion of two estimates $x$ and $y$ into a resulting estimate $z$ when the correlation between $x$ and $y$ is unknown. We denote the mean vectors by $\hat{x}$, $\hat{y}$, and $\hat{z}$. The joint covariance matrix is
\[
C = \begin{bmatrix} C_x & C_{xy} \\ C_{yx} & C_y \end{bmatrix}, \tag{1}
\]
whereas the fused covariance matrix is $C_z$. Let $\bar{\xi}$ denote the true statistics; then the estimation errors are $\tilde{x} = \bar{\xi} - \hat{x}$ and $\tilde{y} = \bar{\xi} - \hat{y}$. Let $C_x^* = \mathrm{E}[\tilde{x}\tilde{x}^T]$ and $C_y^* = \mathrm{E}[\tilde{y}\tilde{y}^T]$ denote the unknown actual mean squared error (MSE) matrices. The input data are consistent estimates if $C_x - C_x^* \geq 0$ and $C_y - C_y^* \geq 0$, i.e., if each difference is a positive semi-definite matrix.

A central problem in distributed data fusion is to find an optimal estimate of the true statistics if the cross-correlation matrices $C_{xy} = C_{yx}^T$ are unknown. Ignoring the cross-