Haplotypic relationship between SNP and microsatellite markers at the NOS2A locus in two populations

D Burgner 1,3, K Rockett 1, H Ackerman 1, J Hull 1, S Usen 2, M Pinder 2 and DP Kwiatkowski 1

1 Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
2 Medical Research Council Laboratories, Fajara, The Gambia
3 School of Paediatrics and Child Health, University of Western Australia, Perth, Australia

The density of genetic markers required for successful association mapping of complex diseases depends on linkage disequilibrium (LD) between non-functional markers and functional variants. The haplotypic relationship between stable markers and potentially unstable but highly informative markers (e.g. microsatellites) indicates that LD might be maintained over considerable genetic distance in non-African populations, supporting the use of such ‘mixed marker haplotypes’ in LD-based mapping, and allowing inferences to be drawn about human origins. We investigated sequence variation in the proximal 2.6 kb of the inducible nitric oxide synthase (NOS2A) promoter and the relationship between SNP haplotypes and a pentanucleotide microsatellite (the ‘NOS2A 2.6 microsatellite’) in Gambians and UK Caucasians. UK Caucasians exhibited a subset of the sequence diversity observed in Gambians, sharing four of 11 SNPs and a similar haplotypic structure. Five SNPs were found in the sequence of interspersed repetitive DNA elements. In both populations, there was dramatic loss of LD between SNP haplotypes and microsatellite alleles across a very short physical distance, suggesting that the NOS2A 2.6 microsatellite has a high intrinsic mutation rate, that the SNP haplotypes are relatively ancient, or that this is a region of frequent recombination. Understanding locus- and population-specific LD is essential when designing and interpreting genetic association studies.

Genes and Immunity (2003) 4, 506–514.
doi:10.1038/sj.gene.6364022

Keywords: linkage disequilibrium; single nucleotide polymorphism; microsatellite; inducible nitric oxide synthase

Introduction

Association analysis exploits linkage disequilibrium (LD) between genetic markers to identify functional genetic variants underlying human diseases. Mapping of disease loci by association has much higher sensitivity than traditional linkage analysis and is therefore potentially valuable for locating the many subtle genetic determinants that are thought to underlie most complex diseases. 1 To date, its use has largely been limited to fine-mapping of loci identified either through linkage analysis or by analysis of candidate loci, but association mapping has also been successfully employed for genome-wide association studies to confirm susceptibility loci for complex diseases originally identified by linkage analysis 2 or to identify novel susceptibility loci. 3

The feasibility of genome-wide LD-based mapping is currently the subject of much interest and debate, 4–6 as several fundamental issues remain unresolved. Crucially, it is unclear how densely genetic markers should be spaced 7 and which types of markers should be employed. 8 Underlying this uncertainty is the issue of the extent of LD between markers and disease-modifying functional variants, which will largely determine the chances of identifying functional disease-causing variants. Increasingly, regions of low haplotype diversity and high LD (‘haplotype blocks’) have been described in the human genome 9 and the influence of local patterns of LD in determining the likely success of association mapping has been highlighted. 10,11

Single nucleotide polymorphisms (SNPs) are generally considered the ideal genetic marker, as they are common, stable and increasingly amenable to automated high-throughput genotyping methods.
Estimates of the density of the SNP map required for mapping disease loci have varied with corresponding estimates of the extent of LD across the human genome. Although the number of reported SNPs is increasing rapidly, 12 the density of markers may not be sufficient to allow association mapping of some loci. Moreover, it is clear that LD varies markedly across genetic regions 10,13,14 and between populations, 15,16 so that an LD map of the human genome 8,17 may need to be constructed for many different populations. 18 If the more pessimistic estimates of LD in human populations are borne out, 5 then sufficiently informative SNPs may not occur often enough in the genome to allow their exclusive use in association mapping and other markers will need to be incorporated into fine-mapping studies of disease loci. 19

Received 07 April 2003; revised 06 June 2003; accepted 02 July 2003
Correspondence: Dr D Burgner, School of Paediatrics and Child Health, University of Western Australia, Princess Margaret Medical Centre, GPO Box D184, Perth, WA 6840, Australia. E-mail: dburgner@paed.uwa.edu.au
© 2003 Nature Publishing Group. All rights reserved.

Polymorphic repetitive sequence motifs, particularly microsatellites, may serve as useful additional markers for mapping studies. Their frequency in the genome and high heterozygosity potentially make them informative and attractive candidates for such studies, and