A QTL influencing F cell production maps to a gene encoding a zinc-finger protein on chromosome 2p15

Stephan Menzel 1, Chad Garner 2, Ivo Gut 3, Fumihiko Matsuda 3, Masao Yamaguchi 3, Simon Heath 3, Mario Foglio 3, Diana Zelenika 3, Anne Boland 3, Helen Rooks 1, Steve Best 1, Tim D Spector 4, Martin Farrall 5, Mark Lathrop 3 & Swee Lay Thein 1,6

F cells measure the presence of fetal hemoglobin, a heritable quantitative trait in adults that accounts for substantial phenotypic diversity of sickle cell disease and β thalassemia. We applied a genome-wide association mapping strategy to individuals with contrasting extreme trait values and mapped a new F cell quantitative trait locus to BCL11A, which encodes a zinc-finger protein, on chromosome 2p15. The 2p15 BCL11A quantitative trait locus accounts for 15.1% of the trait variance.

Genome-wide association methodology has recently identified susceptibility loci for several diseases, but it has a relatively high per-sample cost and requires large samples to detect modest risk effects. Strategies to increase power include selecting subjects with increased genetic load through early onset or identifying familial clustering of disease. Here, we apply a powerful alternative approach that uses a comparatively small number of study subjects taken from the extremes of a quantitative distribution.

In healthy adults, fetal hemoglobin (HbF; also known as α2γ2) is present at residual levels (<0.6% of total hemoglobin) with over twenty-fold variation. Ten to fifteen percent of adults in the upper tail of the distribution have HbF levels between 0.8% and 5.0%, a condition referred to as heterocellular hereditary persistence of fetal hemoglobin (hHPFH) 1. Although these HbF levels are modest in otherwise healthy individuals, interaction of hHPFH with β thalassemia or sickle cell disease (SCD) can increase HbF output in these individuals to levels that are clinically beneficial 2.
The ameliorating effect of HbF on SCD and β thalassemia has prompted numerous genetic and pharmacological approaches to reactivation of HbF synthesis 3, but the molecular mechanisms are not fully understood. Current pharmacological agents, such as hydroxycarbamide and butyrate analogs, show that it is possible to augment HbF production therapeutically, but these agents are limited by toxic effects and variable patient response.

HbF in the normal range (including hHPFH) is most sensitively measured by the proportion of F cells (that is, the proportion of erythrocytes containing measurable amounts of HbF) 1. The majority of the quantitative variation is highly heritable (h² = 0.89) 4, but the genetic etiology is complex, with several contributing quantitative trait loci (QTLs). To date, major QTLs have been identified with strong and reproducible statistical support at XmnI-Gγ in the β globin locus on chromosome 11p15 (ref. 5) and in the HBS1L-MYB intergenic region on chromosome 6q23 (ref. 6).

To map additional QTLs, we selected a panel of 179 unrelated individuals from the extreme upper and lower tails (above the 95th or below the 5th percentile points (that is, >P95 or <P5)) of the F cell distribution, drawn from a database of 5,184 phenotyped individuals from the St. Thomas Adult Twin Registry (http://www.twinsuk.ac.uk) 7, and genotyped them using the Illumina Sentrix HumanHap300 BeadChip (Supplementary Methods online). The study was approved by the local ethics committee of St. Thomas' and King's College Hospitals, London (LREC number 00-245), and all participants gave informed written consent.

For the 308,015 markers retained after quality control, we assessed association using a Fisher exact χ² statistic for the allele counts in the high or low trait categories along with a linear regression analysis of the continuous trait against genotype (additive effects), with age and sex included as covariates.
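As an illustration of the allele-count comparison described above, the following minimal sketch tests a single marker on a 2 × 2 table of allele counts (high versus low trait category). It is not the analysis pipeline used in the study: the function names and the example counts are hypothetical, and both a Pearson χ² test (1 degree of freedom) and a two-sided Fisher exact test are shown because the text does not specify the exact variant used.

```python
import math


def allelic_chi2(high_counts, low_counts):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    high_counts / low_counts are (count of allele A, count of allele B)
    in the high-trait and low-trait groups. Returns (chi2, p_value).
    """
    a, b = high_counts
    c, d = low_counts
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("degenerate table: a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def fisher_exact_two_sided(high_counts, low_counts):
    """Two-sided Fisher exact test on the same 2x2 allele-count table.

    Sums the hypergeometric probabilities of all tables (with the
    observed margins) that are no more probable than the observed one.
    """
    a, b = high_counts
    c, d = low_counts
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical allele counts for one marker (illustration only,
    # not data from the study).
    high = (120, 60)
    low = (60, 118)
    chi2, p_chi2 = allelic_chi2(high, low)
    print(f"chi2 = {chi2:.2f}, p = {p_chi2:.3g}")
    print(f"Fisher exact p = {fisher_exact_two_sided(high, low):.3g}")
```

Each individual contributes two alleles, so a panel of 179 individuals yields 358 allele observations per marker. A genome-wide scan would repeat such a test for every retained marker and would require a correspondingly stringent significance threshold.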
The two analyses gave similar results, and P values from the allele count test are presented in the text. Tests of non-additivity in the linear regression led to identical conclusions. Although extreme discordant sampling designs violate the usual normality assumption of linear regression, this does not inflate the type 1 error rate 8, which we confirmed by simulations and inspection of the Q-Q plot (Supplementary Fig. 1 online). The genomic control parameter was 1.01, indicating that there was minimal admixture or cryptic relatedness in this sample 9. Principal components analysis 10 confirmed this.

We identified major QTLs on chromosomes 2p15 (P = 4.0 × 10^-16), 6q23 (P = 8.8 × 10^-25) and 11p15 (P = 1.7 × 10^-26) (Fig. 1a). The 6q23 QTL was first localized through linkage analysis in a large Asian-Indian family with beta thalassemia 11, then validated and fine-mapped in northern Europeans 6. The association signal on 11p15 maps to the beta globin cluster, where the functional variant is thought to be the XmnI-Gγ variant at position -158 upstream of the Gγ globin gene 5.

Markers within a 126-kb segment on chromosome 2p15 (nucleotides 60456396 to 60582798) identified a third, previously unreported QTL close to the oncogene BCL11A 12. We genotyped an additional

Received 15 March; accepted 2 July; published online 2 September 2007; doi:10.1038/ng2108

1 King's College London School of Medicine, Division of Gene and Cell Based Therapy, King's Denmark Hill Campus, London SE5 9PJ, UK.
2 University of California at Irvine, Epidemiology Division, Department of Medicine, Irvine, California 92697-7550, USA.
3 Centre National de Génotypage, Institut Génomique, Commissariat à l'Energie Atomique, 91006 Evry, France.
4 King's College London School of Medicine, Division of Genetics and Molecular Medicine, St. Thomas' Hospital, London SE1 7EH, UK.
5 The Wellcome Trust Centre for Human Genetics, Department of Cardiovascular Medicine, University of Oxford, Headington, Oxford OX3 7BN, UK.
6 King's College Hospital, Department of Haematological Medicine, Denmark Hill, London SE5 9RS, UK.
Correspondence should be addressed to S.L.T. (sl.thein@kcl.ac.uk).

BRIEF COMMUNICATIONS
NATURE GENETICS | VOLUME 39 | NUMBER 10 | OCTOBER 2007 | 1197
© 2007 Nature Publishing Group http://www.nature.com/naturegenetics