SHORT COMMUNICATION

doi:10.1111/j.1365-2052.2006.01455.x

Estimation of the number of genetic markers required for individual animal identification accounting for genotyping errors

J. I. Weller, E. Seroussi and M. Ron

Institute of Animal Sciences, A.R.O., The Volcani Center, Bet Dagan 50250, Israel

Summary

Nearly all studies that consider the power of exclusion for individual identification using genetic markers ignore the possibility of erroneous genotypes, although individual genotype error rates are approximately 1% for microsatellites. Single nucleotide polymorphisms (SNPs) have lower error rates, but because of their lower information content, more SNPs than microsatellites will be required to obtain the same power of exclusion for traceability. In this study, we accounted for genotyping mistakes by requiring at least two discrepancies to reject a match. Exclusion probabilities were computed analytically and by simulation. A microsatellite with five alleles was approximately comparable in exclusion power to 2–2.25 SNPs. At least eight SNPs were required to achieve a 99% probability of rejection for a match between two individuals, while with 25 SNPs there was a <1% chance for a match between any of five million individuals.

Keywords: genotyping errors, individual identification, microsatellites, single nucleotide polymorphisms.

Individual identification and parentage identification by genetic markers are usually based on the ‘exclusion principle’. Nearly all studies that consider the power of exclusion ignore genotyping errors. Bonin et al. (2004) found error rates in the range of 1%, but wrote that ‘genotyping errors remain a taboo subject’. Estimation of error rates varies among microsatellites, samples analysed and analysis methods.
Single nucleotide polymorphisms (SNPs) are rapidly replacing DNA microsatellites as the genetic marker of choice for most objectives, despite the fact that SNPs are nearly always biallelic, while many microsatellites are multiallelic. Advantages of SNPs are summarized by Werner et al. (2004). Genotyping error rates tend to be lower for SNPs (Kennedy et al. 2003; Bonin et al. 2004), but because of their lower information content, more SNPs are required than microsatellites to obtain the same power of exclusion. Although various studies have computed exclusion power for a particular sample of SNPs or microsatellites, only Heaton et al. (2002) estimated the maximum population for which all individuals could be identified as a function of SNP number, but they assumed a zero error rate. Bowling et al. (1997) proposed that paternity should be rejected only if a discrepancy between the putative parent and progeny was found for at least two markers.

The objective of this study was to determine by analytical formula and simulation the number of SNPs and microsatellites required to obtain a 99% probability that none of the samples would be erroneously matched, as a function of the number of individuals, accounting for genotyping mistakes.

The probability that individual A should have the same genotype as individual B by chance, $P_{0,N}$, for markers 1 to N is:

$$P_{0,N} = \prod_{i=1}^{N} \left( \sum_{j=1}^{r_i} q_{ij}^{4} + 4 \sum_{j=1}^{r_i - 1} \sum_{k=j+1}^{r_i} q_{ij}^{2} q_{ik}^{2} \right) \qquad (1)$$

where $r_i$ = the number of alleles for marker i, and $q_{ij}$ = the frequency of allele j of marker i.
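
As a rough illustration of Equation (1), the following minimal Python sketch evaluates the per-marker term and the product over markers. The function names and the equal-frequency allele values are hypothetical choices made for illustration and are not taken from the study. The sketch also assumes, as the form of Equation (1) implies, random mating within each marker and independence between markers.

```python
from itertools import combinations
from math import prod


def marker_match_probability(freqs):
    """Per-marker term of Equation (1): the chance that two random
    individuals carry the same genotype at one marker.

    freqs: allele frequencies q_ij for the marker (assumed to sum to 1).
    """
    # Both individuals homozygous for the same allele j: (q_j^2)^2 = q_j^4
    homozygous = sum(q ** 4 for q in freqs)
    # Both heterozygous for the same pair j,k: (2 q_j q_k)^2 = 4 q_j^2 q_k^2
    heterozygous = 4 * sum((qj * qk) ** 2 for qj, qk in combinations(freqs, 2))
    return homozygous + heterozygous


def match_probability(markers):
    """P_{0,N}: product of the per-marker terms over all N markers."""
    return prod(marker_match_probability(freqs) for freqs in markers)


# Illustrative inputs only (not data from the paper)
snp = [0.5, 0.5]            # biallelic SNP, equal allele frequencies
microsatellite = [0.2] * 5  # five equally frequent alleles

print(marker_match_probability(snp))             # 0.375
print(marker_match_probability(microsatellite))  # about 0.072
print(match_probability([snp] * 8))              # eight such SNPs
```

With equal allele frequencies the per-marker match probability is at its minimum for a given number of alleles, so these inputs give a best case. Markers with skewed frequencies would give larger values of $P_{0,N}$.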
The probability that individuals A and B, genotyped for N markers, should have the same genotype for all but one marker by chance, $P_{1,N}$, is:

$$P_{1,N} = \prod_{l=1}^{N} \left[ \sum_{j=1}^{r_l} q_{lj}^{2} \left( 1 - q_{lj}^{2} \right) + 2 \sum_{j=1}^{r_l - 1} \sum_{k=j+1}^{r_l} q_{lj} q_{lk} \left( 1 - 2 q_{lj} q_{lk} \right) \right] P_{0,l-1} \, P_{0,N-l+1} \qquad (2)$$

where the lth marker is different for individuals A and B, and $P_{0,l-1}$ and $P_{0,N-l+1}$ are the probabilities that individuals A and B have identical genotypes for markers 1 to l − 1, and l + 1 to N, respectively.

Address for correspondence: J. I. Weller, Institute of Animal Sciences, A.R.O., The Volcani Center, Bet Dagan 50250, Israel. E-mail: weller@agri.huji.ac.il
Accepted for publication 22 March 2006
© 2006 The Authors, Journal compilation © 2006 International Society for Animal Genetics, Animal Genetics, 37, 387–389

Equation (2) was solved for