Multiple-instance learning improves CAD detection of masses in digital mammography

Balaji Krishnapuram 1, Jonathan Stoeckel 2, Vikas Raykar 1, Bharat Rao 1, Philippe Bamberger 2, Eli Ratner 2, Nicolas Merlet 2, Inna Stainvas 2, Menahem Abramov 2, and Alexandra Manevitch 2

1 CAD and Knowledge Solutions (IKM CKS), Siemens Medical Solutions Inc., Malvern PA 19355, USA
2 Siemens Computer Aided Diagnosis Ltd., Jerusalem, Israel

Abstract. We propose a novel multiple-instance learning (MIL) algorithm for designing classifiers for use in computer aided detection (CAD). The proposed algorithm has three advantages over classical methods. First, unlike traditional learning algorithms that minimize the candidate-level misclassification error, the proposed algorithm directly optimizes the patient-wise sensitivity. Second, this algorithm automatically selects a small subset of statistically useful features. Third, this algorithm is very fast, utilizes all of the available training data (without the need for cross-validation, etc.), and requires no human hand tuning or intervention. Experimentally, the algorithm is more accurate than a state-of-the-art support vector machine (SVM) classifier, and it substantially reduces the number of features that have to be computed.

1 Background

Traditionally, in an almost universal architecture, CAD algorithms operate in a sequence of three stages. In the first stage, a candidate generation (CG) algorithm identifies suspicious regions. In the second stage, each suspicious region is characterized by a set of features. In the third, classification stage, each region is evaluated in light of the features, and a decision is made whether the region is sufficiently suspicious that it should be highlighted to a radiologist. This paper focuses largely on the design of the classifier for the third stage of this architecture.

Many off-the-shelf classifier learning algorithms have been used during the design of CAD algorithms, e.g.
support vector machines (SVM) [1], neural networks (NN) [3], etc. However, the derivations behind most of these algorithms make unwarranted assumptions that are violated in CAD data sets. For example, most classifier-learning algorithms assume that the training samples are independent and identically distributed (i.i.d.). However, there are high levels of correlation among the suspicious locations from the same region of a breast (both within a breast image and across multiple images of the same breast), so the training samples are clearly not independent.
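
To make the distinction between candidate-level and patient-wise evaluation concrete, the following minimal Python sketch computes both sensitivities on a toy data set. It is an illustration only, not the algorithm proposed in this paper. The function names, the toy data, and the convention that a patient counts as detected when at least one of its positive candidates is flagged are all assumptions introduced here.

```python
from collections import defaultdict


def candidate_sensitivity(labels, predictions):
    """Fraction of positive candidates that the classifier flags."""
    flagged = [p for y, p in zip(labels, predictions) if y == 1]
    if not flagged:
        raise ValueError("no positive candidates")
    return sum(flagged) / len(flagged)


def patient_sensitivity(patient_ids, labels, predictions):
    """Fraction of diseased patients with at least one flagged positive candidate.

    Assumed convention: one detected positive candidate is enough to count
    the whole patient as detected.
    """
    detected = defaultdict(bool)
    for pid, y, p in zip(patient_ids, labels, predictions):
        if y == 1:
            detected[pid] = detected[pid] or (p == 1)
    if not detected:
        raise ValueError("no diseased patients")
    return sum(detected.values()) / len(detected)


# Hypothetical toy data: patient "A" has three correlated positive candidates
# (e.g. overlapping regions around one mass); patient "C" is healthy.
patient_ids = ["A", "A", "A", "B", "B", "C"]
labels      = [1, 1, 1, 1, 0, 0]
predictions = [1, 0, 0, 1, 0, 0]

cand = candidate_sensitivity(labels, predictions)
pat = patient_sensitivity(patient_ids, labels, predictions)
print(cand, pat)  # 0.5 1.0
```

In this toy case the classifier misses half of the positive candidates, yet every diseased patient is still detected, because the missed candidates are correlated duplicates of a detected one. A candidate-level error criterion would penalize these misses even though they do not change the patient-level outcome.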