Machine Vision and Applications (2011) 22:449–460
DOI 10.1007/s00138-010-0263-2

ORIGINAL PAPER

Exploiting sparse representations in very high-dimensional feature spaces obtained from patch-based processing

J. E. Hunter · M. Tugcu · X. Wang · C. Costello · D. M. Wilkes

Received: 6 October 2008 / Revised: 29 December 2009 / Accepted: 17 March 2010 / Published online: 8 April 2010
© Springer-Verlag 2010

Abstract  The use of high-dimensional feature spaces in a system raises standard problems that must be addressed, such as high calculation costs, storage demands, and training requirements. To partially circumvent these problems, we propose the conjunction of a very high-dimensional feature space with image patches. This union allows the image patches to be efficiently represented as sparse vectors while taking advantage of the high-dimensional properties. The key to making the system perform efficiently is the use of a sparse histogram representation for the color space, which makes the calculations largely independent of the feature-space dimension. The system can operate under multiple L_p norms or mixed metrics, which allows for optimized metrics for the feature vector. An optimal tree structure is also introduced for the approximate nearest neighbor tree to aid in patch classification. It is shown that the system can be applied effectively to various applications.

Keywords  High-dimensional feature space · Object recognition · Scene recognition · Machine learning

J. E. Hunter (B) · M. Tugcu · X. Wang · C. Costello · D. M. Wilkes
Center for Intelligent Systems, Vanderbilt University, Nashville, TN 37235-0131, USA
e-mail: jonathan.e.hunter@vanderbilt.edu

M. Tugcu
e-mail: mtugcu@stm.com.tr

X. Wang
e-mail: xcawang@yahoo.com

C. Costello
e-mail: christopher.j.costello@vanderbilt.edu

D. M. Wilkes
e-mail: mitch.wilkes@vanderbilt.edu

1 Introduction

Use of high-dimensional feature spaces in a system is a concept that raises many questions.
Some of the standard problems that accompany high dimensionality are the high calculation costs, storage demands, and training requirements associated with the "curse of dimensionality". Many modern pattern recognition approaches exhibit computational costs that grow quadratically (or even faster) with the number of features, causing problems for these methods in high dimensions. However, Lee and Landgrebe [1] observed that distance measures alone fail to fully exploit the discriminating power of high-dimensional vectors because they use only first-order statistics where second-order statistics are effective, and concluded that covariance-based classifiers often work better. It should also be noted that many modern methods are very restrictive in the choice of metric or norm that may be used by the classifier.

Kalayeh and Landgrebe [3] reported that the number of training vectors necessary to train a system using linear classifiers is on the order of five times the dimensionality of the feature space, and is even greater for quadratic classifiers. Due to the large number of training vectors needed, applications using high-dimensional feature vectors may fall short of the amount of training data needed to fully train the system. In fact, a number of articles discuss high-dimensional systems operating with limited amounts of data, analyzed using covariance-based classification methods [15].

In some applications, the amount of training data available is very limited, which may cause problems for high-dimensional systems. However, in the area of vision systems, the assumption that only limited amounts of training data are available is not always appropriate. For example, systems using developmental learning techniques may actually
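The abstract's claim that a sparse histogram representation makes distance calculations largely independent of the feature-space dimension can be illustrated with a minimal sketch. The paper's exact representation and metrics are not given in this excerpt, so the dict-based histograms and the `sparse_lp_distance` helper below are illustrative assumptions: the L_p distance is computed only over the union of occupied bins, so the cost scales with the number of nonzero entries rather than with the nominal dimensionality of the color space.

```python
def sparse_lp_distance(a, b, p=2.0):
    """L_p distance between two sparse histograms.

    Each histogram is a dict mapping bin index -> count. Only the
    union of occupied bins is visited, so the cost depends on the
    number of nonzero bins, not on the full space dimensionality.
    """
    bins = set(a) | set(b)  # occupied bins only
    total = sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) ** p for k in bins)
    return total ** (1.0 / p)

# Two histograms living in a nominally huge (e.g. 2^24-bin) color
# space, but each with only a handful of occupied bins.
h1 = {101: 3.0, 2048: 1.0, 900001: 2.0}
h2 = {101: 1.0, 7777: 4.0}

d_l1 = sparse_lp_distance(h1, h2, p=1.0)  # L1 (Manhattan): 9.0
d_l2 = sparse_lp_distance(h1, h2, p=2.0)  # L2 (Euclidean): 5.0
```

Because the loop runs over only five occupied bins here, the same code handles an arbitrarily high-dimensional color space at no extra cost, which is the property the abstract exploits; supporting a family of L_p norms through a single parameter also matches the paper's stated flexibility in the choice of metric.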