International Journal of Statistics and Applications 2015, 5(5): 208-212
DOI: 10.5923/j.statistics.20150505.04

Effect of Sampling Methods on Misclassification of Fisher's Linear Discriminant Analysis

Ghasem Rekabdar*, Bahare Soleymani

Department of Mathematics, Abadan Branch, Islamic Azad University, Abadan, Iran

Abstract In this study, we examine the effect of a stratified sampling design on the accuracy of Fisher's linear discriminant function, also known as Anderson's $\hat{W}$. For this purpose, we replace the simple random sampling estimators in $\hat{W}$ with weighted estimators. The results of a simulation study indicate that the performance of $\hat{W}$ is affected by the sampling method. The proposed discriminant function $\hat{W}_{st}$ performs better than the classical discriminant function, especially when the stratum means differ significantly from the overall mean of each group.

Keywords Fisher's linear discriminant function, Multivariate normal distribution, Stratified sample design

1. Introduction

Discrimination between two groups using multivariate data has long been recognized as an important problem and was first studied by Fisher (1936). The linear discriminant function (LDF) is a standard approach that yields optimal results when the two groups have multivariate normal distributions with distinct mean vectors and a common covariance matrix (Mardia et al., 1979). Computing the misclassification probabilities, or error rates, of the discriminant function is a problem of particular interest. When the parameters of the competing groups are known, the distribution of the LDF can be obtained exactly from the univariate normal distribution (Johnson & Wichern, 1992). In practice, the parameters of the LDF are unknown, so we estimate them from independent random "training samples". The sampling distribution of the LDF has been studied by several authors.
Anderson (1973) obtained the asymptotic expansion of the distribution of the sample Fisher's linear discriminant function $\hat{W}$ to terms of order $O(n^{-2})$. Atakan (2009) compared the performance of seven well-known methods from the literature for estimating the probability of misclassification, using bootstrap percentile confidence intervals. That work provides a good literature review for further study. Several studies have examined the effects of the sampling design on statistical methods. In particular, in regression analysis the effect of sampling designs on the least squares estimator has been studied by several authors (DuMuchel & Duncan, 1981; Horton & Fitzmaurice, 2004). Likewise, in analysis of variance for mean differences between groups, the effect of a cluster sampling design on the F ratio has been studied frequently in social and psychological surveys (Hegges & Rhoads, 2011). In multivariate statistical analysis, complex sampling designs lead to complicated methods. However, few studies have addressed the effect of sampling methods on the LDF, because of the analytical complexity involved. Nonetheless, some researchers have examined the effect of the sampling design on the misclassification probability of the LDF (Kao & McCabe, 1991; Leu & Tsui, 1997). Under stratified random sampling, Tsui & Leu (1998) showed that the asymptotic expansion of the LDF has an error of order $O(1)$. Therefore, using the LDF without correction can increase the probability of misclassification. Recently, Shahrokh Esfahani & Dougherty (2014) showed in a simulation study that separate sampling with an inappropriate sampling ratio can significantly reduce the classification accuracy of the LDF. The main contribution of the present paper is to approximate the misclassification probability of the LDF using weighted estimators.

* Corresponding author: ghasem_rekabdar@yahoo.com (Ghasem Rekabdar)
Published online at http://journal.sapub.org/statistics
Copyright © 2015 Scientific & Academic Publishing. All Rights Reserved
In some studies, auxiliary information about the groups is available, and it is beneficial to use it in constructing the LDF. For example, we may be able to categorize each group on the basis of a qualitative variable. In this case, a stratified sampling design can be used to draw data from each group. In this study, we substitute unbiased weighted estimators into the LDF when the sample design is stratified. We also compare the two linear discriminant functions in a simulation study.
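
To make the idea of substituting weighted estimators into the LDF concrete, the following Python sketch may be helpful. It is an illustration under stated assumptions, not the estimators derived in this paper: the function names `stratified_moments` and `anderson_w`, the per-unit weights $W_h / n_h$ (population stratum share divided by stratum sample size), the reliability-weight correction applied by `numpy.cov`, the pooling rule for the two group covariances, and all simulated means and stratum shares are hypothetical choices made for the example.

```python
import numpy as np


def stratified_moments(samples, stratum_weights):
    """Weighted mean and covariance for one group sampled by strata.

    samples: list of (n_h, p) arrays, one per stratum.
    stratum_weights: assumed known population shares W_h, summing to 1.
    Each unit in stratum h receives weight W_h / n_h (an assumption of this sketch).
    """
    X = np.vstack(samples)
    w = np.concatenate(
        [np.full(len(s), W_h / len(s)) for s, W_h in zip(samples, stratum_weights)]
    )
    mean = w @ X  # weights sum to 1, so this is the stratified mean
    cov = np.cov(X, rowvar=False, aweights=w)  # weighted covariance with ddof=1 correction
    return mean, cov


def anderson_w(x, mean1, mean2, pooled_cov):
    """Plug-in statistic W(x) = (m1 - m2)' S^{-1} (x - (m1 + m2) / 2)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    coef = np.linalg.solve(pooled_cov, mean1 - mean2)
    return (x - 0.5 * (mean1 + mean2)) @ coef


# Hypothetical simulation: two groups, two strata each, equal stratum sample
# sizes although the population shares are 0.8 and 0.2.
rng = np.random.default_rng(0)
shares = [0.8, 0.2]
n_h = 30
group1 = [rng.normal([0.0, 0.0], 1.0, size=(n_h, 2)),
          rng.normal([4.0, 4.0], 1.0, size=(n_h, 2))]
group2 = [rng.normal([3.0, 0.0], 1.0, size=(n_h, 2)),
          rng.normal([7.0, 4.0], 1.0, size=(n_h, 2))]

m1, S1 = stratified_moments(group1, shares)
m2, S2 = stratified_moments(group2, shares)

# Pooling rule assumed for illustration: the usual degrees-of-freedom weighting.
n1 = n2 = 2 * n_h
S_pooled = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)

# Assign a new observation to group 1 when W > 0 (equal priors and costs assumed).
score = anderson_w([1.0, 0.5], m1, m2, S_pooled)
print("W =", score[0], "-> group", 1 if score[0] > 0 else 2)
```

When the stratum sample sizes are proportional to the stratum shares, all unit weights are equal and the sketch reduces to the ordinary sample mean and unbiased sample covariance, which matches the classical $\hat{W}$. The two versions differ only when strata are over- or under-sampled relative to their population shares.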