385 Sallah SR, et al. J Med Genet 2022;59:385–392. doi:10.1136/jmedgenet-2020-107404
Original research
Improving the clinical interpretation of missense
variants in X linked genes using structural analysis
Shalaw Rassul Sallah,1,2 Jamie M Ellingford,1,2 Panagiotis I Sergouniotis,2
Simon C Ramsden,2 Nicholas Lench,3 Simon C Lovell,1 Graeme C Black1,2
Diagnostics
To cite: Sallah SR,
Ellingford JM, Sergouniotis PI,
et al. J Med Genet
2022;59:385–392.
► Additional material is
published online only. To view
please visit the journal online
(http://dx.doi.org/10.1136/jmedgenet-2020-107404).
1 Division of Evolution and Genomic Sciences, The University of Manchester Faculty of Biology, Medicine and Health, Manchester, UK
2 Manchester Centre for Genomic Medicine, St Mary’s Hospital, Manchester Academic Health Sciences Centre, Manchester, UK
3 Congenica Ltd, Biodata Innovation Centre, Wellcome Genome Campus, Hinxton, London, UK
Correspondence to
Professor Graeme C Black;
graeme.black@manchester.ac.uk
SCL and GCB are joint senior
authors.
Received 19 August 2020
Revised 18 January 2021
Accepted 21 January 2021
Published Online First 25 March
2021
© Author(s) (or their
employer(s)) 2022. Re-use
permitted under CC BY.
Published by BMJ.
ABSTRACT
Background Improving the clinical interpretation
of missense variants can increase the diagnostic
yield of genomic testing and lead to personalised
management strategies. Currently, due to the imprecision
of bioinformatic tools that aim to predict variant
pathogenicity, their role in clinical guidelines remains
limited. There is a clear need for more accurate prediction
algorithms and this study aims to improve performance
by harnessing structural biology insights. The focus of this
work is missense variants in a subset of genes associated
with X linked disorders.
Methods We have developed a protein-specific variant
interpreter (ProSper) that combines genetic and protein
structural data. This algorithm predicts missense variant
pathogenicity by applying machine learning approaches
to the sequence and structural characteristics of variants.
Results ProSper outperformed seven previously
described tools, including meta-predictors, in correctly
evaluating whether or not variants are pathogenic; this
was the case for 11 of the 21 genes associated with X
linked disorders that met the inclusion criteria for this
study. We also determined gene-specific pathogenicity
thresholds that improved the performance of VEST4,
REVEL and ClinPred, the three best-performing tools out
of the seven that were evaluated; this was the case in
11, 11 and 12 different genes, respectively.
Conclusion ProSper can form the basis of a
molecule-specific prediction tool that can be implemented
into diagnostic strategies. It can allow the accurate
prioritisation of missense variants associated with X
linked disorders, aiding precise and timely diagnosis.
In addition, we demonstrate that gene-specific
pathogenicity thresholds for a range of missense
prioritisation tools can lead to an increase in prediction
accuracy.
INTRODUCTION
Advances in high-throughput DNA sequencing tech-
nologies have transformed how clinical diagnoses
are made in individuals and families with Mendelian
disorders. Genetic tests using these approaches
are now widely used in the clinical setting, reducing
diagnostic uncertainty and improving patient
management.
1
Notably, results of these tests are
often ambiguous and it is common for these inves-
tigations to yield a number of variants of uncertain
significance (VUS). Interpreting these VUS is not a
trivial task and numerous in silico prediction tools
have been developed to filter and prioritise such
changes for further analysis. However, these tools
lack robustness and are commonly inconsistent in
their predictions
2 3
and their performance.
4
Taking
this into account, the American College of Medical
Genetics and Genomics (ACMG) and the Associa-
tion for Molecular Pathology guidelines for variant
interpretation
5
have concluded that bioinformatics
tools can provide only supporting evidence for
pathogenicity. Improving the performance of these
algorithms is expected to have significant implica-
tions for variant interpretation and ultimately for
clinical decision making.
In a previous study, we integrated genetic and
structural biology data to predict variant–disease
association with high accuracy in the X linked
gene CACNA1F (MIM: 300110); the area under
the receiver operating characteristic (ROC) and
precision–recall (PR) curves was 0.84; Matthews
correlation coefficient (MCC) was 0.52.
6
Here,
we replicate the accuracy and robustness of this
approach in several other disease-implicated X
linked genes. Furthermore, we evaluate seven
prediction tools and show that the meta-predictors
REVEL (rare exome variant ensemble learner),
7
VEST4 (variant effect scoring tool 4.0),
8
and Clin-
Pred
9
are generally the most accurate in predicting
the impact of missense variants in this group of
disorders. We also show that applying a gene-
specific pathogenicity threshold when using these
tools can improve their performance at least for
some genes. More importantly, we demonstrate that
the protein-specific variant interpreter (ProSper)
that we developed as part of this study performs
better than REVEL, VEST4 and ClinPred in 11 of
the 21 studied genes. These insights can help clini-
cians and diagnostic laboratories better prioritise
missense changes in these molecules.
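The two ideas in the paragraph above, scoring a predictor with the Matthews correlation coefficient (MCC) and choosing a gene-specific pathogenicity threshold, can be illustrated with a minimal sketch. It is not the authors' pipeline: the selection rule (sweep every observed score and keep the cut-off with the highest MCC), the function names and the toy scores and labels are all assumptions made for illustration, and the default cut-off of 0.5 is a placeholder rather than a value recommended by any of the tools.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: MCC is reported as 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def mcc_at_threshold(scores, labels, threshold):
    """Call a variant pathogenic when score >= threshold, then compute MCC."""
    tp = tn = fp = fn = 0
    for score, is_pathogenic in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_pathogenic:
            tp += 1
        elif predicted and not is_pathogenic:
            fp += 1
        elif not predicted and is_pathogenic:
            fn += 1
        else:
            tn += 1
    return mcc(tp, tn, fp, fn)


def best_threshold(scores, labels):
    """Sweep every observed score as a candidate cut-off (assumed rule).

    Returns (threshold, mcc) for the candidate with the highest MCC;
    ties go to the lowest threshold.
    """
    candidates = sorted(set(scores))
    scored = ((t, mcc_at_threshold(scores, labels, t)) for t in candidates)
    return max(scored, key=lambda pair: pair[1])


# Hypothetical predictor scores for one gene (True = pathogenic).
toy_scores = [0.2, 0.35, 0.4, 0.6, 0.7, 0.9]
toy_labels = [False, False, True, True, True, True]

default_mcc = mcc_at_threshold(toy_scores, toy_labels, 0.5)
gene_threshold, gene_mcc = best_threshold(toy_scores, toy_labels)
print(f"MCC at generic cut-off 0.5: {default_mcc:.3f}")
print(f"Gene-specific cut-off {gene_threshold}: MCC {gene_mcc:.3f}")
```

In this toy gene the pathogenic variants score lower than a generic cut-off would expect, so lowering the threshold recovers the missed variant. In practice a threshold chosen this way should be assessed on variants held out from the selection step, otherwise the reported improvement is optimistic.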
METHODS
Missense variant data sets
The Human Gene Mutation Database (HGMD
V.2019.4)
10
was used to retrieve missense variants
that have been associated with disease (marked
‘DM’), that is, presumably pathogenic. The
Genome Aggregation Database (gnomAD V.2.1.1)
11
was used to retrieve benign/likely benign missense
variants reported in males. The variants present
in gnomAD which were also present in HGMD as
‘DM?’, that is, disease association is dubious, or as
‘DM’ were filtered out to minimise the inclusion of
possible misannotated variants. Missense changes
reported in patients tested at the Manchester
Genomic Diagnostic Laboratory (MGDL), a UK
J Med Genet: first published as 10.1136/jmedgenet-2020-107404 on 25 March 2021. Downloaded from http://jmg.bmj.com/ on June 9, 2022 by guest. Protected by copyright.