Sallah SR, et al. J Med Genet 2022;59:385–392. doi:10.1136/jmedgenet-2020-107404

Original research

Improving the clinical interpretation of missense variants in X linked genes using structural analysis

Shalaw Rassul Sallah, 1,2 Jamie M Ellingford, 1,2 Panagiotis I Sergouniotis, 2 Simon C Ramsden, 2 Nicholas Lench, 3 Simon C Lovell, 1 Graeme C Black 1,2

Diagnostics

To cite: Sallah SR, Ellingford JM, Sergouniotis PI, et al. J Med Genet 2022;59:385–392.

Additional material is published online only. To view please visit the journal online (http://dx.doi.org/10.1136/jmedgenet-2020-107404).

1 Division of Evolution and Genomic Sciences, The University of Manchester Faculty of Biology, Medicine and Health, Manchester, UK
2 Manchester Centre for Genomic Medicine, St Mary’s Hospital, Manchester Academic Health Sciences Centre, Manchester, UK
3 Congenica Ltd, Biodata Innovation Centre, Wellcome Genome Campus, Hinxton, London, UK

Correspondence to Professor Graeme C Black; graeme.black@manchester.ac.uk

SCL and GCB are joint senior authors.

Received 19 August 2020
Revised 18 January 2021
Accepted 21 January 2021
Published Online First 25 March 2021

© Author(s) (or their employer(s)) 2022. Re-use permitted under CC BY. Published by BMJ.

ABSTRACT

Background Improving the clinical interpretation of missense variants can increase the diagnostic yield of genomic testing and lead to personalised management strategies. Currently, due to the imprecision of bioinformatic tools that aim to predict variant pathogenicity, their role in clinical guidelines remains limited. There is a clear need for more accurate prediction algorithms and this study aims to improve performance by harnessing structural biology insights. The focus of this work is missense variants in a subset of genes associated with X linked disorders.

Methods We have developed a protein-specific variant interpreter (ProSper) that combines genetic and protein structural data.
This algorithm predicts missense variant pathogenicity by applying machine learning approaches to the sequence and structural characteristics of variants.

Results ProSper outperformed seven previously described tools, including meta-predictors, in correctly evaluating whether or not variants are pathogenic; this was the case for 11 of the 21 genes associated with X linked disorders that met the inclusion criteria for this study. We also determined gene-specific pathogenicity thresholds that improved the performance of VEST4, REVEL and ClinPred, the three best-performing tools out of the seven that were evaluated; this was the case in 11, 11 and 12 different genes, respectively.

Conclusion ProSper can form the basis of a molecule-specific prediction tool that can be implemented into diagnostic strategies. It can allow the accurate prioritisation of missense variants associated with X linked disorders, aiding precise and timely diagnosis. In addition, we demonstrate that gene-specific pathogenicity thresholds for a range of missense prioritisation tools can lead to an increase in prediction accuracy.

INTRODUCTION

Advances in high-throughput DNA sequencing technologies have transformed how clinical diagnoses are made in individuals and families with Mendelian disorders. Genetic tests using these approaches are now widely used in the clinical setting, reducing diagnostic uncertainty and improving patient management. 1 Notably, results of these tests are often ambiguous and it is common for these investigations to yield a number of variants of uncertain significance (VUS). Interpreting these VUS is not a trivial task and numerous in silico prediction tools have been developed to filter and prioritise such changes for further analysis. However, these tools lack robustness and are commonly inconsistent in their predictions 2 3 and their performance.
4 Taking this into account, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology guidelines for variant interpretation 5 have concluded that bioinformatics tools can provide only supporting evidence for pathogenicity. Improving the performance of these algorithms is expected to have significant implications for variant interpretation and ultimately for clinical decision making.

In a previous study, we integrated genetic and structural biology data to predict variant–disease association with high accuracy in the X linked gene CACNA1F (MIM: 300110); the area under the receiver operating characteristic (ROC) and precision–recall (PR) curves was 0.84; Matthews correlation coefficient (MCC) was 0.52. 6 Here, we replicate the accuracy and robustness of this approach in several other disease-implicated X linked genes. Furthermore, we evaluate seven prediction tools and show that the meta-predictors REVEL (rare exome variant ensemble learner), 7 VEST4 (variant effect scoring tool 4.0), 8 and ClinPred 9 are generally the most accurate in predicting the impact of missense variants in this group of disorders. We also show that applying a gene-specific pathogenicity threshold when using these tools can improve their performance at least for some genes. More importantly, we demonstrate that the protein-specific variant interpreter (ProSper) that we developed as part of this study performs better than REVEL, VEST4 and ClinPred in 11 of the 21 studied genes. These insights can help clinicians and diagnostic laboratories better prioritise missense changes in these molecules.

METHODS

Missense variant data sets

The Human Gene Mutation Database (HGMD V.2019.4) 10 was used to retrieve missense variants that have been associated with disease (marked ‘DM’), that is, presumably pathogenic. The Genome Aggregation Database (gnomAD V.2.1.1) 11 was used to retrieve benign/likely benign missense variants reported in males.
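The labelling scheme for these data sets, including the cross-database filter applied in this subsection (gnomAD variants that HGMD lists as ‘DM’ or ‘DM?’ are dropped from the benign set), might be sketched as follows. This is a minimal illustration and not the study's pipeline: the `Variant` type, the `build_labelled_set` function, and the gene and variant names are hypothetical placeholders, and keeping a ‘DM’ variant in the pathogenic set when it also appears in gnomAD is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str            # gene symbol (placeholder values below)
    protein_change: str  # protein-level change (placeholder values below)


def build_labelled_set(hgmd_tags, gnomad_male_variants):
    """Return (pathogenic, benign) sets of variants.

    hgmd_tags: dict mapping Variant -> HGMD tag, either 'DM' or 'DM?'.
    gnomad_male_variants: iterable of Variant observed in gnomAD males.
    """
    # Any variant HGMD lists as 'DM' or 'DM?' is excluded from the benign set.
    flagged = {v for v, tag in hgmd_tags.items() if tag in ("DM", "DM?")}
    # Only 'DM' entries are treated as presumably pathogenic.
    # Assumption: a 'DM' variant also seen in gnomAD stays in this set.
    pathogenic = {v for v, tag in hgmd_tags.items() if tag == "DM"}
    benign = {v for v in gnomad_male_variants if v not in flagged}
    return pathogenic, benign


# Toy input with made-up variants, for illustration only.
hgmd_tags = {
    Variant("GENE_A", "p.Ala10Val"): "DM",
    Variant("GENE_A", "p.Gly20Ser"): "DM?",
    Variant("GENE_A", "p.Arg30Cys"): "DM",
}
gnomad_male_variants = [
    Variant("GENE_A", "p.Gly20Ser"),  # dropped: HGMD 'DM?'
    Variant("GENE_A", "p.Arg30Cys"),  # dropped: HGMD 'DM'
    Variant("GENE_A", "p.Leu40Phe"),  # kept as benign/likely benign
]
pathogenic, benign = build_labelled_set(hgmd_tags, gnomad_male_variants)
```

In this toy input only `p.Leu40Phe` remains in the benign set, while the ‘DM?’ variant is left out of both sets.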
The variants present in gnomAD which were also present in HGMD as ‘DM?’, that is, disease association is dubious, or as ‘DM’ were filtered out to minimise the inclusion of possible misannotated variants.

J Med Genet: first published as 10.1136/jmedgenet-2020-107404 on 25 March 2021. Downloaded from http://jmg.bmj.com/ on June 9, 2022 by guest. Protected by copyright.

Missense changes reported in patients tested at the Manchester Genomic Diagnostic Laboratory (MGDL), a UK