Nonparametric Statistics
Vol. 16(3–4), June–August 2004, pp. 365–384

PRINCIPAL COMPONENT ESTIMATION OF FUNCTIONAL LOGISTIC REGRESSION: DISCUSSION OF TWO DIFFERENT APPROACHES

M. ESCABIAS*, A. M. AGUILERA and M. J. VALDERRAMA

Department of Statistics and Operation Research, University of Granada, Spain

(Received 14 December 2002; Revised 21 July 2003; In final form 10 September 2003)

Over the last few years many methods have been developed for analyzing functional data with different objectives. The purpose of this paper is to predict a binary response variable in terms of a functional variable whose sample information is given by a set of curves measured without error. In order to solve this problem we formulate a functional logistic regression model and propose its estimation by approximating the sample paths in a finite dimensional space generated by a basis. Then, the problem is reduced to a multiple logistic regression model with highly correlated covariates. In order to reduce dimension and to avoid multicollinearity, two different approaches of functional principal component analysis of the sample paths are proposed. Finally, a simulation study for evaluating the estimating performance of the proposed principal component approaches is developed.

Keywords: Functional data analysis; Logistic regression; Principal components

1 INTRODUCTION

Data in many different fields come to us through a process naturally described as functions. This is the case of the evolution of a magnitude such as, for example, temperature in time. It usually happens that we only have discrete observations of functional data in spite of its continuous nature. In order to reconstruct the true functional form of data many approximation techniques have been developed, such as interpolation or smoothing in a finite dimensional space generated by a basis. A general overview of functional data analysis (FDA) can be seen in Ramsay and Silverman (1997, 2002) and Valderrama et al. (2000).
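As an illustrative sketch of the reduction outlined in the abstract, the functional logistic model and its basis-expansion form can be written as follows. The notation here is an assumption made for illustration and is not taken from the text above: $\pi_i$ is the probability of a positive response given the curve $x_i(t)$, $\phi_1, \dots, \phi_p$ are the basis functions, $a_{ij}$ are the basis coefficients of the $i$-th sample path, and $\beta_k$ are the basis coefficients of the parameter function $\beta(t)$.

```latex
% Sketch only: all symbols are illustrative assumptions, not the paper's notation.
\begin{align*}
  % Functional logistic regression for a binary response Y_i
  \ln\frac{\pi_i}{1-\pi_i} &= \alpha + \int_T x_i(t)\,\beta(t)\,dt,
    \qquad \pi_i = P\bigl(Y_i = 1 \mid x_i(t)\bigr), \\
  % Sample paths and parameter function expanded in the same finite basis
  x_i(t) &= \sum_{j=1}^{p} a_{ij}\,\phi_j(t),
    \qquad \beta(t) = \sum_{k=1}^{p} \beta_k\,\phi_k(t), \\
  % Substituting the expansions gives a multiple logistic model
  \ln\frac{\pi_i}{1-\pi_i} &= \alpha + \sum_{j=1}^{p}\sum_{k=1}^{p} a_{ij}\,\psi_{jk}\,\beta_k,
    \qquad \psi_{jk} = \int_T \phi_j(t)\,\phi_k(t)\,dt.
\end{align*}
```

Under these assumptions the covariates of the resulting multiple logistic model are the columns of $A\Psi$, with $A = (a_{ij})$ and $\Psi = (\psi_{jk})$. These columns would typically be highly correlated, which is the multicollinearity that the principal component approaches are intended to address.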
* Corresponding author. E-mail: mescabias@ugr.es
ISSN 1048-5252 print; ISSN 1029-0311 online © 2004 Taylor & Francis Ltd
DOI: 10.1080/10485250310001624738

The great development of FDA in recent years has meant that many studies with longitudinal data, historically approached from a multivariate point of view, are now analyzed on the basis of their functional nature. Some of these works have focused on modeling the relationship between functional predictor and response variables observed together at varying times. This is the case, for example, in Liang and Zeger (1986), who proposed a set of estimating equations that take into account the correlation between discrete longitudinal observations in order to estimate a set of time-independent parameters in the generalized linear model context. With the objective of estimating a functional variable in a future period of time from its past evolution, Aguilera et al. (1997a) introduced the principal component prediction models (PCP)