Complete sequencing of expanded SAMD12 repeats by long-read sequencing and Cas9-mediated enrichment

Takeshi Mizuguchi,1 Tomoko Toyota,2 Satoko Miyatake,1,3 Satomi Mitsuhashi,4 Hiroshi Doi,5 Yosuke Kudo,6 Hitaru Kishida,7 Noriko Hayashi,8 Rie S. Tsuburaya,9 Masako Kinoshita,10 Tetsuhiro Fukuyama,11 Hiromi Fukuda,1,5 Eriko Koshimizu,1 Naomi Tsuchida,1 Yuri Uchiyama,1 Atsushi Fujita,1 Atsushi Takata,1 Noriko Miyake,1 Mitsuhiro Kato,12 Fumiaki Tanaka,5 Hiroaki Adachi2 and Naomichi Matsumoto1

A pentanucleotide TTTCA repeat insertion into a polymorphic TTTTA repeat element in SAMD12 causes benign adult familial myoclonic epilepsy. Although the precise determination of the entire SAMD12 repeat sequence is important for molecular diagnosis and research, obtaining this sequence remains challenging when using conventional genomic/genetic methods, and even short-read and long-read next-generation sequencing technologies have been insufficient. Incomplete information regarding expanded repeat sequences may hamper our understanding of the pathogenic roles played by varying numbers of repeat units, genotype–phenotype correlations, and mutational mechanisms. Here, we report a new approach for the precise determination of the entire expanded repeat sequence and present a workflow designed to improve the diagnostic rates in various repeat expansion diseases. We examined 34 clinically diagnosed benign adult familial myoclonic epilepsy patients from 29 families using repeat-primed PCR, Southern blot, and long-read sequencing with Cas9-mediated enrichment. Two cases with questionable results from repeat-primed PCR and/or Southern blot were confirmed as pathogenic using long-read sequencing with Cas9-mediated enrichment, resulting in the identification of pathogenic SAMD12 repeat expansions in 76% of examined families (22/29).
Importantly, long-read sequencing with Cas9-mediated enrichment was able to provide detailed information regarding the sizes, configurations, and compositions of the expanded repeats. The inserted TTTCA repeat size and the proportion of TTTCA sequences among the overall repeat sequences were highly variable, and a novel repeat configuration was identified. A genotype–phenotype correlation study suggested that the insertion of even short (TTTCA)14 repeats contributed to the development of benign adult familial myoclonic epilepsy. However, the sizes of the overall TTTTA and TTTCA repeat units are also likely to be involved in the pathology of benign adult familial myoclonic epilepsy. Seven unsolved SAMD12-negative cases were investigated using whole-genome long-read sequencing, and infrequent, disease-associated, repeat expansions were identified in two cases. The strategic workflow resolved two questionable SAMD12-positive cases and two previously SAMD12-negative cases, increasing the diagnostic yield from 69% (20/29 families) to 83% (24/29 families). This study indicates the significant utility of long-read sequencing technologies to explore the pathogenic contributions made by various repeat units in complex repeat expansions and to improve the overall diagnostic rate.

1 Department of Human Genetics, Yokohama City University Graduate School of Medicine, Yokohama 236-0004, Japan
2 Department of Neurology, University of Occupational and Environmental Health School of Medicine, Kitakyushu 807-8555, Japan
3 Clinical Genetics Department, Yokohama City University Hospital, Yokohama 236-0004, Japan
4 Department of Genomic Function and Diversity, Medical Research Institute Tokyo Medical and Dental University, Tokyo 113-8510, Japan

Received July 09, 2020. Revised November 02, 2020. Accepted November 17, 2020. Advance access publication April 1, 2021
© The Author(s) (2021). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved.
For permissions, please email: journals.permissions@oup.com
doi:10.1093/brain/awab021
BRAIN 2021: 144; 1103–1117
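
The diagnostic yields quoted in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the authors' workflow; the function name `yield_percent` and the dictionary labels are hypothetical, and only the family counts (20, 22 and 24 of 29) come from the abstract.

```python
def yield_percent(solved_families: int, total_families: int) -> int:
    """Return the diagnostic yield as a percentage rounded to the nearest integer."""
    return round(100 * solved_families / total_families)


TOTAL_FAMILIES = 29  # families examined, as stated in the abstract

# Family counts taken from the abstract; the labels are illustrative only.
solved_counts = {
    "before the workflow": 20,
    "SAMD12 expansions confirmed": 22,
    "after whole-genome long-read sequencing": 24,
}

for label, solved in solved_counts.items():
    percent = yield_percent(solved, TOTAL_FAMILIES)
    print(f"{label}: {solved}/{TOTAL_FAMILIES} = {percent}%")
```

Under these counts, the rounded values agree with the 69%, 76% and 83% reported in the abstract.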