DATABASES
OFFICIAL JOURNAL www.hgvs.org

Spinocerebellar Ataxias: An Example of the Challenges Associated with Genetic Databases for Dynamic Mutations

Joanne E. Martindale,1,2 Sara Seneca,1,3 Stefan Wieczorek,1,4 and Jorge Sequeiros1,5∗

1 European Molecular Genetics Quality Network (EMQN), spinocerebellar ataxias (SCAs) External Quality Assessment (EQA) Scheme organizer (JS) and assessors (JEM, SS, SW, JS), Manchester, UK; 2 Sheffield Diagnostic Genetics Service, Sheffield Children’s NHS Foundation Trust, Western Bank, Sheffield, UK; 3 UZ Brussel, Center for Medical Genetics, Research Group Reproduction and Genetics, Vrije Universiteit Brussel (VUB), Brussels, Belgium; 4 Human Genetics, Ruhr-University, Bochum, Germany; 5 IBMC – Institute for Molecular and Cell Biology; and ICBAS, Universidade do Porto, Porto, Portugal

For the Databases in Neurogenetics Special Issue

Received 7 March 2012; accepted revised manuscript 21 June 2012.
Published online 2 July 2012 in Wiley Online Library (www.wiley.com/humanmutation). DOI: 10.1002/humu.22156

ABSTRACT: Locus-specific databases are an important source of information for diagnostic laboratories and a valued means of improving the quality of genetic testing. Although increasingly frequent, databases for oligonucleotide repeat expansions are still scarce, owing to factors that set these mutations apart and make building such databases much more difficult. Defining what constitutes “the repeat” to be measured is not a simple matter, and correct sizing is not always straightforward. Reference ranges and penetrance classes are not easy to establish. Acceptable margins of error depend on the disease and allele-size distribution, and vary according to size range and pathogenic significance. Inter- and intralaboratory variance is well documented, and allele distribution may vary among populations.
The spinocerebellar ataxias, used here only as an example of those difficulties, are also a highly heterogeneous group, which includes loci with both pathogenic repeat expansions and point mutations or insertions/deletions. They display a variable, but often overlapping, phenotype, where genotype–phenotype correlation is difficult or nonexistent. Standard (Human Genome Variation Society) nomenclature is not appropriate for oligonucleotide repeats, as established during harmonization among all EMQN (European Molecular Genetics Quality Network) external quality assessment (EQA) schemes for “repeat disorders.” Curation of such databases is a difficult task, but one that needs to be addressed adequately and without much delay.
Hum Mutat 33:1359–1365, 2012. © 2012 Wiley Periodicals, Inc.

KEY WORDS: dominant ataxia; SCA; triplet repeats; oligonucleotide; database; LSDB; population diversity

This work was funded in part by FEDER through the Operational Competitiveness Programme – COMPETE and by national funds through FCT – Fundação para a Ciência e a Tecnologia under the project FCOMP-01-0124-“FEDER-022718 (PEst-C/SAU/LA0002/2011)”.
∗ Correspondence to: Jorge Sequeiros, UnIGENe, IBMC, R. Campo Alegre 823, 4150-180 Porto, Portugal. E-mail: jorge.sequeiros@ibmc.up.pt

Introduction

Databases of information about genes and mutations causing genetic diseases are an invaluable resource for any center offering genetic testing, as they provide important technical details and a means to update scientific knowledge, and thus constitute a straightforward way to ascertain whether a variant has been identified previously in association with a particular disorder. They may also indicate whether there is any experimental evidence supporting a pathogenic role and give useful information about specific features of a phenotype, genotype–phenotype correlations (if any), and whether there is evidence for founder effects in certain populations.
A number of such databases are widely available and may be locus specific, such as the X-linked Adrenoleukodystrophy Database (www.x-ald.nl/) and the Cystic Fibrosis Mutation Database (www.genet.sickkids.on.ca/cftr/app). Other databases contain data pertaining to a wide variety of genes, examples being the Leiden Open Variation Database (LOVD; www.lovd.nl/2.0), which provides access to data from over 4,000 individual genes, and the Diagnostic Mutation Database (DMuDB) in the United Kingdom, which was established in 2005 by the National Genetics Reference Laboratory in Manchester, as “a repository of diagnostic variant data, to support the diagnostic process in UK genetic testing laboratories” (www.ngrl.org.uk/Manchester/projects/informatics/dmudb).

Not many databases exist for diseases caused by dynamic mutations, despite the increasing number of neurodegenerative disorders (NDDs) identified as being caused by expansion of oligonucleotide repeat sequences. For example, trinucleotide repeat expansions, such as those associated with Huntington’s disease (HD) and a number of the autosomal dominant spinocerebellar ataxias (SCAs), are individually rare, but collectively they contribute significantly to this class of genetic disorders.

Although for HD there is a general consensus as to what constitutes normal, large normal unstable, reduced penetrance, and disease alleles, this is not so for the SCAs, which present more of a challenge to laboratories offering testing. These disorders are also clinically and genetically heterogeneous, exhibiting significant differences in prevalence and in the type of mutations causing ataxia among different populations. For some genes, disease may result from point mutations as well as repeat expansions, although the phenotype may be very different; for example, episodic ataxia type 2 and familial hemiplegic migraine type 1 are due to point mutations in the CACNA1A gene, expansions in which cause SCA6 [Barros et al., 2012]. We thus believe that, in addition to posing some challenges and problems specific to them, the SCAs are a good paradigm for the great difficulties associated with building mutation databases for “repeat disorders.”