Kernel-based Weighted Discriminant Analysis with QR Decomposition and Its Application to Face Recognition

JIANQIANG GAO
Liaocheng University
School of Mathematical Sciences
Hunan Road NO.1, Liaocheng
P.R. CHINA
gaojianqiang82@126.com

LIYA FAN
Liaocheng University
School of Mathematical Sciences
Hunan Road NO.1, Liaocheng
P.R. CHINA
fanliya63@126.com

Abstract: Kernel discriminant analysis (KDA) is a widely used approach to feature extraction. However, for high-dimensional multi-class tasks such as face recognition, traditional KDA algorithms have the limitation that the Fisher criterion is non-optimal with respect to classification rate; moreover, they suffer from the small sample size problem. This paper presents two variants of KDA that can effectively deal with these two problems: weighted kernel discriminant analysis based on QR decomposition (WKDA/QR) and weighted kernel discriminant analysis based on singular value decomposition (WKDA/SVD). Since the QR decomposition is applied to a small matrix, the proposed method is computationally efficient and avoids the singularity problem. In addition, we compare WKDA/QR with WKDA/SVD under different parameters of the weighting function and the kernel function. Experimental results on face recognition show that WKDA/QR and WKDA/SVD are more effective than KDA, and that WKDA/QR is more effective and feasible than WKDA/SVD.

Key–Words: QR decomposition, Kernel discriminant analysis (KDA), Feature extraction, Face recognition, Small sample size (SSS)

1 Introduction

Linear discriminant analysis (LDA), which seeks optimal linear projections such that the Fisher criterion of between-class scatter versus within-class scatter is maximized, is one of the most well-known statistical techniques for feature extraction and dimension reduction [1-4]. Recently, several extensions of LDA [5-8] have been developed to address robustness issues. Although LDA is an effective method for feature extraction, it is a linear technique in nature and hence is not sufficient for features with nonlinear relationships. To overcome this problem, the kernel trick is applied to describe nonlinear relationships of the input data effectively. Kernel-based learning methods have recently attracted much attention in the areas of pattern recognition and machine learning. Scholkopf et al. [9] applied the kernel trick to principal component analysis (KPCA), which can effectively compute principal components in a high-dimensional feature space. Mika et al. [10] proposed kernel discriminant analysis (KDA) for two-class cases. Baudat and Anouar [11] developed a generalized kernel discriminant analysis (GKDA) for multi-class problems. Because of its ability to extract nonlinear discriminant features, KDA has been used widely in many real-world applications such as document analysis, face recognition and image retrieval [12-16].

Yang et al. [16] further discussed kernel Fisher discriminant analysis and pointed out that it is equivalent to kernel principal component analysis followed by Fisher linear discriminant analysis. Therefore, for high-dimensional multi-class tasks such as face recognition, the original KDA-based algorithms usually encounter three difficulties. The first is the singularity problem caused by the small sample size (SSS) problem [11-13], in which the number of training samples is far smaller than the dimension of the samples.
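As a brief reminder of why the SSS condition causes singularity (standard LDA notation, not defined in this section): LDA maximizes the Fisher quotient

\[
J(w) = \frac{w^{\top} S_b\, w}{w^{\top} S_w\, w}, \qquad \mathrm{rank}(S_w) \le n - c,
\]

where $S_b$ and $S_w$ are the between-class and within-class scatter matrices of $n$ training samples from $c$ classes in dimension $d$. Under the SSS condition $n \ll d$, the matrix $S_w \in \mathbb{R}^{d \times d}$ is necessarily singular, so $J$ cannot be maximized through the usual generalized eigenproblem $S_b w = \lambda S_w w$.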
Moreover, because KDA uses an implicit nonlinear mapping to project low-dimensional input patterns into a high-dimensional feature space, many large sample size problems in the input space may turn into SSS problems in the feature space. The second difficulty is that the Fisher separability criterion is not directly related to classification rate: classes with larger distances to each other in the feature space are emphasized more when the Fisher criterion is optimized, so the resulting projection preserves the distances of already well-separated classes and causes a large overlap of neighboring classes [17-20]. The third difficulty is that these algorithms still face the computational cost of the eigen-decomposition of matrices in the high-dimensional feature space.
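As a rough illustration of this third difficulty, and of why the QR route announced in the abstract is cheap, the following minimal numpy sketch contrasts an economy-size QR of a thin matrix with an eigen-decomposition of a full sample-sized matrix. The sizes and the stand-in matrix H are hypothetical and do not reproduce the paper's WKDA/QR construction.

import numpy as np

# Hypothetical sizes: n training samples, c classes (illustration only).
n, c = 200, 10
rng = np.random.default_rng(0)
H = rng.standard_normal((n, c))   # stand-in for a thin n-by-c scatter factor

# Economy-size QR of the thin matrix costs O(n c^2) operations ...
Q, R = np.linalg.qr(H)            # Q: (n, c) with orthonormal columns, R: (c, c)

# ... whereas eigen-decomposing an n-by-n kernelized scatter matrix costs O(n^3).
K = H @ H.T                       # n-by-n Gram-type matrix
eigvals, eigvecs = np.linalg.eigh(K)

print(Q.shape, R.shape, K.shape)  # (200, 10) (10, 10) (200, 200)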