REGULAR ARTICLE

Functional subgenomics of Clostridium thermocellum cellulosomal genes: Identification of the major catalytic components in the extracellular complex and detection of three new enzymes

Vladimir V. Zverlov¹,², Josef Kellermann³ and Wolfgang H. Schwarz¹

¹ Institute for Microbiology, Technische Universität München, Freising-Weihenstephan, Germany
² Institute of Molecular Genetics, Russian Academy of Science, Moscow, Russia
³ Max Planck Institute for Biochemistry, Martinsried, Germany

Clostridium thermocellum produces the most efficient enzyme-complex for the degradation of polysaccharides in biomass, the large extracellular cellulosome. The draft complete genomic sequence of Clostridium thermocellum was screened for open reading frames (ORF) containing cellulosomal dockerin sequences. Seventy-one putative cellulosomal genes were detected. One third of these ORFs may be involved in cellulose hydrolysis. Most of the others showed homology to hemicellulases, pectinases, chitinases, glycosidases or esterases potentially involved in the unwrapping of cellulose fibers. To identify the predominant catalytic components, cellulosomes were purified and the components were separated by an adapted two-dimensional gel electrophoresis technique. The apparent major spots were identified by MALDI-TOF/TOF. Ten of the components were previously known: the structural protein CipA, the endo-glucanases Cel8A, Cel5G, Cel9N, the cellobiohydrolases Cbh9A, Cel9K, Cel48S, the xylanases Xyn10C, Xyn10Z, and the chitinase Chi18A. In addition, three hitherto unknown major components were detected, Cel9R, Xyn10D and Xgh74A. These major components in the cellulosomal particles most probably constitute the essential enzymes for crystalline cellulose hydrolysis.
Received: August 5, 2004
Revised: November 5, 2004
Accepted: December 14, 2004

Keywords: Cellulase / Protein pattern / Two-dimensional gel electrophoresis / Xylanase / Xyloglucanase

Proteomics 2005, 5, 3646–3653

1 Introduction

The strictly anaerobic, thermophilic bacterium Clostridium thermocellum is the microorganism with the fastest growth rate on the recalcitrant substrate crystalline cellulose [1]. The higher efficiency of its extracellular hydrolytic machinery over that of other microorganisms is due to the formation of a huge enzyme complex, the cellulosome, which has a size of ~18 nm diameter and a mass in excess of 2 × 10³ kDa [2]. About 30 cellulosome-related genes were hitherto isolated by screening of genomic libraries [3, 4]. They include the scaffoldin protein CipA containing nine type I cohesin modules to which enzymes and other protein components specifically dock by virtue of their type I dockerin modules [5]. Type II cohesin-dockerin modules bind the CipA protein to the cell wall anchoring proteins OlpB or SdbA and possibly others [6, 7]. The non-enzymatic component CseP is presumably involved in structure formation of the huge complex [8]. The presence of a dockerin module in a protein can be used as a marker sequence for cellulosomal genes. However, the exact composition of the cellulosomal particles has never been investigated.

Correspondence: Dr. Wolfgang H. Schwarz, Research Group Microbial Biotechnology, Institute for Microbiology, Technische Universität München, Am Hochanger 4, D-85350 Freising-Weihenstephan, Germany
E-mail: wschwarz@wzw.tum.de
Fax: +49-8161-71-5475

Abbreviations: CBM, carbohydrate-binding module; GH/GHF, glycosyl hydrolase/glycosyl hydrolase family

© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
www.proteomics-journal.de
DOI 10.1002/pmic.200401199