Computational Statistics & Data Analysis 47 (2004) 225–236
www.elsevier.com/locate/csda

Computational aspects of algorithms for variable selection in the context of principal components

Jorge Cadima a, J. Orestes Cerdeira a,∗, Manuel Minhoto b

a Departamento de Matemática, Instituto Superior de Agronomia, Tapada da Ajuda, Lisboa 1349-017, Portugal
b Departamento de Matemática, Universidade de Evora, Colégio Luis António Vernay, Rua Romão Ramalho 59, Evora 7000, Portugal

∗ Corresponding author. Tel.: +351-2136-53467; fax: +351-2136-30723.
E-mail addresses: jcadima@isa.utl.pt (J. Cadima), orestes@isa.utl.pt (J.O. Cerdeira), minhoto@uevora.pt (M. Minhoto).

Received 4 November 2003; received in revised form 4 November 2003; accepted 8 November 2003

Abstract

Variable selection consists in identifying a k-subset of a set of original variables that is optimal for a given criterion of adequate approximation to the whole data set. Several algorithms for the optimization problems resulting from three different criteria in the context of principal components analysis are considered, and computational results are presented.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Combinatorial optimization; Heuristics; Principal components; Principal variables; Variable selection

1. Introduction

In the analysis of data sets with large numbers of variables a frequent aim is to reduce the dimensionality of the data set. A typical way of reducing the dimension of a data set is through a principal component analysis (PCA). Standard results guarantee that retaining the k principal components (PCs) with the largest associated variance produces the k-subset of linear combinations of the p original variables which, under various criteria, best approximates the original variables (see, for example, Jolliffe, 2002, Section 6.3). PCA is an appropriate tool for deriving low-dimension subspaces which capture most of the information of the whole data set.
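As an illustration of the retention of the k leading PCs described in the Introduction, the following minimal sketch computes them from the eigendecomposition of the sample covariance matrix. It is not taken from the paper and does not implement any of its variable selection algorithms; the function name `leading_pcs`, the simulated data and the choice of NumPy are assumptions made only for this example.

```python
import numpy as np


def leading_pcs(X, k):
    """Return loadings, variances and scores of the k leading PCs of X (n x p)."""
    Xc = X - X.mean(axis=0)                 # centre each variable
    S = np.cov(Xc, rowvar=False)            # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)    # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest variances
    loadings = eigvecs[:, order]            # p x k matrix of PC loadings
    scores = Xc @ loadings                  # n x k matrix of PC scores
    return loadings, eigvals[order], scores


# Hypothetical data: n = 200 observations on p = 6 correlated variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))

loadings, variances, scores = leading_pcs(X, k=2)

# Proportion of the total variance retained by the 2 leading PCs.
explained = variances.sum() / np.trace(np.cov(X, rowvar=False))
print(f"proportion of total variance retained: {explained:.3f}")
```

Each retained PC is a linear combination of all p original variables, which is why the columns of `loadings` have p entries each.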
0167-9473/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/j.csda.2003.11.001