A probabilistic framework for image information fusion with an application to mammographic analysis

Marina Velikova a,*, Peter J.F. Lucas a, Maurice Samulski b, Nico Karssemeijer b
a Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands
b Department of Radiology, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands

Article history: Received 19 May 2011; received in revised form 20 November 2011; accepted 16 January 2012; available online 24 January 2012.

Keywords: Bayesian networks; Information fusion; Multi-view image analysis; Computer-aided detection; Mammography

Abstract

The recent increased interest in information fusion methods for solving complex problems, such as those arising in image analysis, is motivated by the wish to better exploit the multitude of information available from different sources to enhance decision-making. In this paper, we propose a novel method that advances the state of the art of fusing image information from different views, based on a special class of probabilistic graphical models called causal independence models. The strength of this method is its ability to systematically and naturally capture uncertain domain knowledge while performing information fusion in a computationally efficient way. We examine the value of the method for mammographic analysis and demonstrate its advantages in terms of explicit knowledge representation and accuracy (an increase of at least 6.3% and 5.2% in true positive detection rates at 5% and 10% false positive rates, respectively) in comparison with previous single-view and multi-view systems, and with benchmark fusion methods such as naïve Bayes and logistic regression.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

The increasing number and heterogeneity of information sources and techniques for data acquisition, especially in complex domains such as image analysis, has produced vast amounts of information, giving rise to an increasing demand for combined processing of information to extract valuable knowledge and make better decisions. The term information fusion is used when such merging of information from different sources is done automatically. Numerous definitions of information fusion have been given over the past 20 years (Boström et al., 2007); the key point is that the synergy of multiple sources should lead to more effective support, i.e., better decisions or actions, e.g. in terms of accuracy, than when these sources are used separately. Yet this should be achieved without sacrificing computational efficiency, as otherwise the fusion of information may become intractable. Often, in fusing image information, one also has to deal with the inherent uncertainty of the information, e.g. due to measurement or interpretation errors. The fusion of such uncertain image information is the subject of this paper.

Given the rich expertise of human interpreters and the computing power of intelligent computer-based systems, we believe that their integration is the main road to building smarter computer-aided detection (CAD) systems. In this work, we adopted this synergistic principle to develop a novel CAD system for automated multi-view image analysis by combining knowledge derived from the analysis of the way humans interpret images, on the one hand, and image information automatically extracted by a CAD system for image interpretation, on the other.
In particular, we built a multi-stage system using Bayesian networks, one type of probabilistic graphical model, which are especially promising for bridging the gap between the capabilities of human and computer-aided interpretation, as they support the explicit representation of expert knowledge, handle uncertainty and missing information, and allow multiple sources of knowledge to be combined.

The application domain of this paper is the analysis of X-ray images, or mammograms, for breast cancer detection in screening programs, which will be referred to as mammographic analysis in the remainder of the paper. In mammographic analysis, the majority of current CAD systems analyze single images only and often act as prompt systems, meant to focus the radiologist's attention on particular image regions, rather than providing an overall classification of the patient's condition. They have no or limited capability to capture the working principles employed by radiologists. One example of such a principle is multi-view mammographic analysis, where radiologists judge the presence of cancer on the basis of two projections, or views, of the same breast: mediolateral oblique (MLO), taken at a 45° angle and showing part of the pectoral muscles, and craniocaudal (CC), taken head to toe. Human readers also normally compare image parts and different images of the breasts to each other, i.e., they interpret

Medical Image Analysis 16 (2012) 865–875. doi:10.1016/j.media.2012.01.003. © 2012 Elsevier B.V. All rights reserved. * Corresponding author: M. Velikova, e-mail: marinav@cs.ru.nl.
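To give a flavour of the kind of fusion discussed above, the sketch below shows a noisy-OR combination, a standard causal independence model, of per-view suspiciousness scores for the same breast. This is a deliberately simplified illustration, not the paper's actual multi-stage model; the function name and the example scores are hypothetical.

```python
def noisy_or(view_probs):
    """Noisy-OR causal independence model: the fused probability of
    cancer is one minus the probability that every view independently
    fails to indicate cancer."""
    p_none = 1.0
    for p in view_probs:
        p_none *= (1.0 - p)
    return 1.0 - p_none

# Hypothetical per-view CAD scores for the same breast:
p_mlo = 0.30  # suspiciousness estimated from the MLO view
p_cc = 0.40   # suspiciousness estimated from the CC view

fused = noisy_or([p_mlo, p_cc])
print(round(fused, 2))  # → 0.58, i.e. 1 - (1 - 0.30) * (1 - 0.40)
```

The appeal of such causal independence models is that each view contributes evidence through its own parameter, so the joint model stays compact and computationally cheap even as the number of information sources grows.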