Ann Oper Res (2010) 174: 169–183
DOI 10.1007/s10479-008-0506-z
Optimizing feature selection to improve medical diagnosis
Ya-Ju Fan · Wanpracha Art Chaovalitwongse
Published online: 6 January 2009
© Springer Science+Business Media, LLC 2008
Abstract In this paper, we propose a new optimization framework for improving feature
selection in medical data classification. We call this framework Support Feature Machine
(SFM). In feature selection, SFM is used to find the optimal group of features that shows
strong separability between two classes. The separability is measured in terms of inter-class
and intra-class distances. The objective of the SFM optimization model is to maximize the
number of correctly classified data samples in the training set, that is, samples whose
intra-class distances are smaller than their inter-class distances. This concept can be
combined with the modified nearest neighbor
rule for unbalanced data. In addition, a variation of SFM that provides the feature weights
(prioritization) is also presented. The proposed SFM framework and its extensions were
tested on five real medical datasets related to the diagnosis of epilepsy, breast cancer,
heart disease, diabetes, and liver disorders. The classification performance of SFM is
compared with that of support vector machine (SVM) classification and Logical Analysis of
Data (LAD), which is also an optimization-based feature selection technique. SFM gives very
good classification results, yet uses far fewer features to make the decision than SVM and
LAD. This result has significant implications for diagnostic practice. The outcome
of this study suggests that the SFM framework can be used as a quick decision-making tool
in real clinical settings.
Keywords Feature selection · Classification · Optimization · Medical diagnosis · Decision
making
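As an illustration of the separability criterion described in the abstract, the following is a minimal sketch and not the optimization model proposed in this paper. It assumes that distances are absolute differences computed per feature, that intra-class and inter-class distances are averaged over the training samples, and that the best feature subset can be found by exhaustive search on a tiny toy dataset. The function names (per_feature_distances, count_correct, best_subset) and the data are hypothetical.

```python
from itertools import combinations


def per_feature_distances(X, y):
    """For each sample and feature, return the average absolute distance
    to same-class samples (intra) and to other-class samples (inter).
    Assumes every class has at least two samples."""
    n, m = len(X), len(X[0])
    intra = [[0.0] * m for _ in range(n)]
    inter = [[0.0] * m for _ in range(n)]
    for i in range(n):
        same = [k for k in range(n) if k != i and y[k] == y[i]]
        other = [k for k in range(n) if y[k] != y[i]]
        for j in range(m):
            intra[i][j] = sum(abs(X[i][j] - X[k][j]) for k in same) / len(same)
            inter[i][j] = sum(abs(X[i][j] - X[k][j]) for k in other) / len(other)
    return intra, inter


def count_correct(subset, intra, inter):
    """Count samples whose summed intra-class distance over the selected
    features is strictly smaller than their summed inter-class distance."""
    return sum(
        1
        for d_in, d_out in zip(intra, inter)
        if sum(d_in[j] for j in subset) < sum(d_out[j] for j in subset)
    )


def best_subset(X, y, max_size=None):
    """Exhaustive search (an assumption made for illustration only) for the
    feature subset that maximizes the number of correctly classified samples.
    Ties are broken in favor of the subset enumerated first (smaller subsets)."""
    intra, inter = per_feature_distances(X, y)
    m = len(X[0])
    max_size = max_size or m
    best, best_score = (), -1
    for size in range(1, max_size + 1):
        for subset in combinations(range(m), size):
            score = count_correct(subset, intra, inter)
            if score > best_score:
                best, best_score = subset, score
    return best, best_score


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 4.0], [1.1, 8.0], [1.2, 2.0]]
y = [0, 0, 0, 1, 1, 1]
subset, score = best_subset(X, y)
print(subset, score)
```

Exhaustive enumeration is only practical for a handful of features; it is used here solely to make the selection criterion concrete.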
1 Introduction
In medical malpractice, the highest dollar payouts are often related to misdiagnosis, failure
to diagnose, or delayed diagnosis of a severe medical condition. When a decision has to be
This work is supported by the National Science Foundation under CAREER Grant No. 0546574.
Y.-J. Fan · W.A. Chaovalitwongse
Department of Industrial and Systems Engineering, Rutgers University, Piscataway, NJ 08854, USA
e-mail: wchaoval@rci.rutgers.edu