International Journal of Computer Applications (0975-8887), Volume 63, No. 1, February 2013

PCA plus LDA on Wavelet Co-occurrence Histogram Features for Texture Classification and its Applications

Shivashankar S.
Dept. of Computer Science, Karnatak Science College, Dharwad, Karnataka, India

Hiremath P.S.
Dept. of Computer Science, Gulbarga University, Gulbarga, Karnataka, India

ABSTRACT
In this paper, we propose a combined approach, namely PCA plus LDA on Wavelet Co-occurrence Histogram Features (WCHF), for texture classification. The texture features are extracted using the Wavelet Co-occurrence Histogram (WCH) of wavelet decomposed images, which captures the relationships between each high frequency subband and the low frequency subband of the wavelet transformed image at the corresponding level. The subbands at the same resolution are strongly correlated, indicating that this information is significant for characterizing a texture. The WCH features thus extracted form a feature vector of dimension 384 for a gray scale image, which is very high. A combination of Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) is applied to the WCH feature vector for dimensionality reduction and enhancement of class separability, respectively. The vectors obtained from LDA are representative of each image. The classification performance is tested on a set of 32 Brodatz textures, and the results are compared with the method proposed in [Hiremath and Shivashankar, 2008]. The effectiveness of the proposed method is demonstrated for two different applications, i.e., CBIR and script identification (both printed and handwritten). The classification performance is analyzed using the k-NN classifier. The experimental results show that the proposed method exhibits superior performance with the reduced feature set.
Keywords
Texture classification; PCA and LDA; Wavelet; CBIR; document image; script identification

1. INTRODUCTION
In human vision, texture analysis plays a significant role in object recognition by the human brain. Likewise, in computer vision, object recognition algorithms rely on the effectiveness of texture analysis. Textures provide important characteristics for surface and object identification in aerial or satellite photographs, biomedical images and many other types of images. Texture classification is fundamental to many applications such as automated visual inspection, biomedical image processing, content based image retrieval, script identification and remote sensing. Much research has been done on texture classification and segmentation over the last four decades. Despite these efforts, texture classification is still considered an interesting but difficult problem in image processing.

Suppose there is a finite number of texture classes C_i, i = 1, 2, 3, …, n, and a number of training samples of each class are available. Based on the information extracted from the training samples, a decision rule is designed which classifies a given sample of unknown class into one of the n classes. To design an effective algorithm for texture classification, it is essential to find a set of texture features with good discriminating power.

More recently, a number of new algorithms for extracting features from the coefficients of a wavelet transform have been proposed in the literature. In each of these feature extraction techniques, the filter coefficients for each subband are analyzed separately. Moreover, the correlation between bands at the same resolution level and across different resolution levels is ignored, even though it is well known that strong relationships exist between neighbouring bands.
Portilla and Simoncelli have shown that without knowledge of these relations, accurate reconstruction of the texture is not possible, indicating that this information is significant for characterizing a texture [14]. To exploit these relations, Hiremath and Shivashankar proposed a texture feature extraction method based on the co-occurrence histograms of wavelet decomposed images, which capture the relationships between each high frequency subband and the low frequency subband of the transformed image at the corresponding level. The extracted features form a feature vector of dimensionality 384 for a gray scale image [13].

Despite having excellent discriminative power, the feature set suffers from high dimensionality. This high dimensionality of the feature vectors creates problems in constructing efficient data structures for classification. One of the problems with high-dimensional feature sets is that, in many cases, not all the measured features are important. For this reason, there is considerable interest in reducing the dimensionality of the descriptors while preserving the original topology of the high-dimensional space. An ideal dimensionality reduction technique efficiently reduces the data to a lower-dimensional model while preserving the properties of the original data. Both traditional and current state-of-the-art dimension reduction methods have been published in the statistics, signal processing and machine learning literature.

In the last decade, Fisher linear discriminant analysis (FLDA) has been demonstrated to be a successful discriminant analysis algorithm in face recognition [3, 4, 5, 6]. It performs dimensionality reduction by finding a mapping from the original high-dimensional space to a low-dimensional space in which the most discriminant features are preserved.
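For reference, the mapping sought by FLDA is commonly written as the maximizer of the Fisher criterion. The formulation below is the standard textbook one and is not quoted from this paper: the symbols S_B (between-class scatter), S_W (within-class scatter), W (projection matrix), mu_i (mean of class C_i), mu (overall mean) and N_i (number of training samples in C_i) are notation assumed here for illustration.

```latex
S_B = \sum_{i=1}^{n} N_i \, (\mu_i - \mu)(\mu_i - \mu)^{T},
\qquad
S_W = \sum_{i=1}^{n} \sum_{x \in C_i} (x - \mu_i)(x - \mu_i)^{T},
\qquad
W^{*} = \arg\max_{W} \frac{\left| W^{T} S_B W \right|}{\left| W^{T} S_W W \right|}
```

Since S_B has rank at most n - 1, the projected space has at most n - 1 dimensions. When the number of training samples is small relative to the feature dimension, S_W becomes singular and the criterion cannot be maximized directly, which is the usual motivation for applying PCA before LDA.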
As LDA has been broadly applied and well studied in recent years, a series of LDA algorithms have been developed, the best known of which is Fisherface [5, 6]. PCA plus LDA has been found to be an effective framework for linear discriminant analysis in the high-dimensional and singular case [12].
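The PCA plus LDA framework followed by a k-NN classifier can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the implementation used in this paper: it relies on scikit-learn, replaces the WCH features with synthetic 384-dimensional Gaussian vectors, and the number of classes (8), the number of PCA components (20) and k = 3 are arbitrary choices made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Assumption: synthetic stand-ins for 384-dimensional feature vectors.
# Real WCH features would be computed from wavelet decomposed images.
rng = np.random.default_rng(0)
n_classes, dim = 8, 384
class_means = rng.normal(size=(n_classes, dim))


def sample(n_per_class):
    """Draw n_per_class noisy vectors around each class mean."""
    X = np.vstack(
        [mean + rng.normal(size=(n_per_class, dim)) for mean in class_means]
    )
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y


X_train, y_train = sample(30)
X_test, y_test = sample(10)

# PCA reduces the dimensionality first, LDA then projects onto at most
# (n_classes - 1) discriminant directions, and k-NN classifies in that space.
model = make_pipeline(
    PCA(n_components=20),  # assumed value, not taken from the paper
    LinearDiscriminantAnalysis(n_components=n_classes - 1),
    KNeighborsClassifier(n_neighbors=3),  # assumed k
)
model.fit(X_train, y_train)

# Reduced representation of the test vectors (all steps except the classifier).
reduced = model[:-1].transform(X_test)
accuracy = model.score(X_test, y_test)
print(reduced.shape, accuracy)
```

In practice the number of PCA components would be tuned on the data. Keeping it well below the number of training samples keeps the within-class scatter matrix seen by LDA non-singular.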