ORIGINAL INVESTIGATION

Mark C. Hirst · Tadao Arinami · Charles D. Laird

Sequence analysis of long FMR1 arrays in the Japanese population: insights into the generation of long (CGG)n tracts

Hum Genet (1997) 101:214–218 © Springer-Verlag 1997
Received: 18 March 1997 / Accepted: 18 July 1997

Abstract The human fragile-X syndrome is associated with expansions of a (CGG)n triplet repeat within the FMR1 gene. Whilst normal FMR1 arrays consist of variable numbers of (CGG)7–13 blocks punctuated with single AGG triplets, unstable arrays contain longer blocks of uninterrupted (CGG)n. The degree of instability, and the subsequent risk of expansion to the fragile-X mutation, depends upon the length of this uninterrupted repeat. Detailed analyses of normal FMR1 array structures suggest that longer uninterrupted blocks of repeat could arise either through gradual slippage or through a more dramatic loss of an intervening AGG triplet. Up to 15% of Japanese and Chinese individuals have FMR1 triplet arrays centred on 36 repeats in length, a modal group not found in Caucasians. As longer FMR1 arrays have been associated with high-risk fragile-X haplotypes in some populations, we investigated the nature of these larger arrays. Sequence analysis revealed that the unusual length is due to the presence of a novel (CGG)6 block within the array. Several haplotypically related arrays contain blocks of (CGG)16 or (CGG)15, consistent with the fusion of adjacent (CGG)9 and (CGG)6 blocks after loss of the intervening AGG triplet. This is compatible with inferences from the Caucasian population that AGG loss is a mechanism by which long blocks of identical repeats are generated.

Introduction

Fragile-X syndrome is the most common form of inherited mental handicap after Down syndrome. It is caused by a dramatic expansion of an unstable triplet repeat (Fu et al. 1991; Oberlé et al. 1991; Verkerk et al. 1991; Yu et al. 1991), which results in loss of gene transcription (Pieretti et al. 1991). The triplet array lies within the 5′-untranslated portion of the FMR1 gene (Verkerk et al. 1991). Its length varies within the normal population from 6 to 52 copies, and the distribution of array lengths is multi-modal (Fu et al. 1991), a feature caused by the underlying compound nature of the array. In most arrays, two or three smaller (CGG)7–13 blocks are interspersed with single AGG triplets, giving a symmetrical and highly ordered modular structure with major modal group lengths exhibiting a ten-repeat periodicity (Hirst et al. 1994; Kunst and Warren 1994; Snow et al. 1994; Zhong et al. 1995). Allelic diversity results from the variable number and length of these (CGG)7–13 blocks.

In contrast to normal, stable arrays, fragile-X premutation chromosomes carry arrays longer than 54 repeats that are either entirely uninterrupted or have long portions of (CGG)n at their 3′ end (Eichler et al. 1994; Hirst et al. 1994; Snow et al. 1994; Zhong et al. 1995). Expansion to fragile-X mutation length (> 200 repeats) appears to occur exclusively within the uninterrupted repeat, and the degree of array instability is related to its length (Eichler et al. 1994; Snow et al. 1994). Several unstable arrays, not known to be associated with fragile-X syndrome, have 34 and 31 perfect CGG repeats, demonstrating that a low level of instability exists below the length normally considered as a premutation (Eichler et al. 1994; Snow et al. 1994). Many other arrays within the normal size range carry long portions of (CGG)n at their 3′ end, with over 10% having (CGG)>17 (Hirst 1995). Their structural similarity to unstable and premutation arrays, and their association with certain high-risk fragile-X haplotypes, have led to the suggestion that some of these may be precursors for recurrent expansion into the premutation range

M. C. Hirst (✉)
Institute of Molecular Medicine, The John Radcliffe, Headley Way, Headington, Oxford, OX3 9DS, UK

M. C. Hirst · C. D. Laird
Program in Molecular Medicine, Fred Hutchinson Cancer Research Center, 1124 Columbia Street, Seattle, WA 98104, USA

T. Arinami
Department of Medical Genetics, Institute of Basic Sciences, University of Tsukuba, Tsukuba, Ibaraki, 305 Japan
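
The block arithmetic stated in the Abstract, in which adjacent (CGG)9 and (CGG)6 blocks fuse into (CGG)16 or (CGG)15 after loss of the intervening AGG, can be illustrated with a minimal Python sketch. Several things here are assumptions and not taken from the text above: the shorthand string notation (digits for CGG block lengths, `A` for a single AGG), the function names, and the example array `9A9A6A9`, which was chosen only because it contains a (CGG)6 block and sums to 36 triplets. The two loss modes are one possible reading of the 16 versus 15 outcome (AGG converted to CGG, or AGG removed outright); the text does not specify the mechanism.

```python
from typing import List


def parse_blocks(array: str) -> List[int]:
    """Split a shorthand array such as '9A9A6A9' into CGG block lengths.

    Hypothetical notation: each number is a run of CGG triplets and each
    'A' is one interrupting AGG triplet.
    """
    return [int(block) for block in array.split("A")]


def total_triplets(blocks: List[int]) -> int:
    """Total array length in triplets: all CGGs plus one AGG between blocks."""
    return sum(blocks) + (len(blocks) - 1)


def lose_agg(blocks: List[int], index: int, mode: str = "convert") -> List[int]:
    """Fuse blocks[index] and blocks[index + 1] after loss of the AGG between them.

    mode='convert': the AGG is assumed to become a CGG (fused block gains 1).
    mode='delete':  the AGG is assumed to be removed (fused block gains 0).
    """
    if not 0 <= index < len(blocks) - 1:
        raise IndexError("no AGG interruption at this position")
    extra = {"convert": 1, "delete": 0}[mode]
    fused = blocks[index] + blocks[index + 1] + extra
    return blocks[:index] + [fused] + blocks[index + 2:]


if __name__ == "__main__":
    blocks = parse_blocks("9A9A6A9")            # illustrative array only
    print(total_triplets(blocks))               # 36
    print(lose_agg(blocks, 1, mode="convert"))  # [9, 16, 9]
    print(lose_agg(blocks, 1, mode="delete"))   # [9, 15, 9]
```

In this sketch the longest uninterrupted block grows from 9 to 16 or 15 repeats in a single step, whereas the total array length changes by at most one triplet. That is the sense in which AGG loss is a more dramatic route to long (CGG)n tracts than gradual slippage.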