Journal of Proteomics
journal homepage: www.elsevier.com/locate/jprot

Proteogenomic analysis of Mycobacterium tuberculosis Beijing B0/W148 cluster strains

Julia Bespyatykh a,*, Alexander Smolyakov b,c, Andrei Guliaev a, Egor Shitikov a, Georgij Arapidi a,b,c, Ivan Butenko a, Marine Dogonadze d, Olga Manicheva d, Elena Ilina a, Victor Zgoda e, Vadim Govorun a,b,c

a Federal Research and Clinical Centre of Physical-Chemical Medicine, Malaya Pirogovskaya 1a, Moscow 119435, Russian Federation
b Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow, Russian Federation
c Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russian Federation
d Research Institute of Phtisiopulmonology, St. Petersburg, Russian Federation
e Orekhovich Research Institute of Biomedical Chemistry, Moscow, Russian Federation

ARTICLE INFO

Keywords: Mycobacterium tuberculosis; Beijing B0/W148; Proteome; Proteogenomic; Label-free proteomic analysis; Tuberculosis

ABSTRACT

Nowadays, proteomics is one of the major instruments for editing and correcting the annotation of genomic information. Correct genome annotation is necessary for omics studies of clinically relevant pathogens like Mycobacterium tuberculosis, as well as for progress in drug design and in silico biology. Here, we focused on the proteogenomic analysis of the W-148 strain, which belongs to the Beijing B0/W148 cluster. This cluster, also known as a "successful" clone, possesses unique pathogenic properties and has a unique genome organization. Taking into account the high similarity of cluster strains at the genomic level, we analyzed an MS/MS dataset obtained for 63 clinical isolates of Beijing B0/W148. Based on the H37Rv and W-148 annotations, we identified 2546 proteins representing more than 60% of the total proteome. A set of peptides (n = 404) specific for W-148 was found when compared with H37Rv.
Start sites for 32 genes were corrected based on the combination of LC-MS/MS proteomic data with genomic six-frame translation. Additionally, we have shown the presence of peptides related to 10 genes previously annotated as pseudogenes.

Significance: Mycobacterium tuberculosis is one of the most dangerous pathogens. Phylogenetically, it may be divided into major lineages, and among them lineage 2 (predominantly the Beijing genotype) is one of the most successful, with an increasing prevalence in the global population. At the same time, strains of the Beijing B0/W148 cluster, a "successful" clone of Mycobacterium tuberculosis, possess even more interesting features. Only one complete genome of this cluster, W-148, is present in the NCBI database (CP012090.1), and it demonstrates a number of significant differences from the well-known reference genome H37Rv. For the W-148 strain, many genes are annotated as "pseudo", and no attempts have been made to correct this. Therefore, in this study we conducted a proteomic analysis of the cluster strains and corrected the current genome annotation. We hope that the data obtained will help to increase the quality of identifications in proteomic and transcriptomic analyses of M. tuberculosis Beijing B0/W148 cluster strains in subsequent studies.

1. Introduction

Mycobacterium tuberculosis, the causative agent of tuberculosis, is one of the most studied bacterial pathogens. Nevertheless, in 2016, 10.4 million people were infected and more than 1.5 million deaths were due to tuberculosis [1]. Despite a decrease in new cases of active TB, the situation remains extremely tense due to the frequent detection of multidrug- and extensively drug-resistant strains, which leads to expensive treatment and temporary disability. A large number of such resistant strains belong to the Beijing family. Among them, Beijing B0/W148 cluster (also named CC2 [2], East European 2 [3] and ECDC0002 [4]) strains are widespread in Russia and former Soviet Union countries [5].
Isolates of this cluster have a number of features that distinguish them from other genotypes: increased virulence, a strong association with drug resistance, increased transmissibility and fitness [5]. For these reasons, the cluster was called a "successful" clone of M. tuberculosis [6].

Corresponding author. E-mail address: JuliaBespyatykh@gmail.com (J. Bespyatykh).
https://doi.org/10.1016/j.jprot.2018.07.002
Received 13 April 2018; Received in revised form 29 June 2018; Accepted 10 July 2018
Journal of Proteomics xxx (xxxx) xxx–xxx
1874-3919/ © 2018 Elsevier B.V. All rights reserved.
Please cite this article as: Bespyatykh, J., Journal of Proteomics (2018), https://doi.org/10.1016/j.jprot.2018.07.002

Today whole genome sequences for more than 200 Beijing B0/