Proteomics 2013, 13, 663–673
DOI 10.1002/pmic.201200312

REVIEW

Proteomics of nonmodel plant species

Antoine Champagne and Marc Boutry

Institut des Sciences de la Vie, Université catholique de Louvain, Louvain-la-Neuve, Belgium

Until recently, large-scale proteomic investigations in the plant field have only been possible for a few model species for which the whole genome sequence had been fully determined. In contrast, for many other species with a strong economic interest as sources of human food and animal feed, as well as industrial and pharmacological molecules, little was known about their genome sequence and identifying the proteome in these species was still considered challenging. However, progress has been made as a result of several recent advances in proteomics tools, e.g. in MS technology and data search programs, and the increasing availability of genomic and cDNA sequences from various species. Moreover, next-generation sequencing technologies now make it possible to rapidly determine, at a reasonable cost, the genome or RNA sequence of species not currently considered as models, thus considerably expanding the plant sequence databases. This review will show how these advances make it possible to identify a large set of proteins, even for species for which few sequences are currently available.

Keywords: Complex sample fractionation / Next generation sequencing / Nonmodel plants / Plant proteomics / Proteogenomics

Received: July 23, 2012
Revised: October 17, 2012
Accepted: October 22, 2012

Additional supporting information may be found in the online version of this article at the publisher’s web-site

1 Which plant model(s)?

There is probably no ideal plant model in the sense that all the required experimental tools and knowledge are available to answer the biological questions in which the research community might be interested. However, various plant species have served as models for different purposes.
Of the important crop species, maize (Zea mays) has long been a genetic model, notably because it is a cross pollinator with separate female and male flowers, which makes genetic analysis easier. Although not a crop, thale cress or Arabidopsis (Arabidopsis thaliana) has also been used as a model in classical genetics because it is small and has a short life cycle. However, it was its small genome, fully sequenced in 2000 [1], that made Arabidopsis the plant paradigm for molecular biology and various “omics” approaches. In the last 10 years, research on this species has resulted in tremendous progress in various fields, such as growth and development, primary metabolism, bioenergetics, and nutrition.

Correspondence: Dr. Marc Boutry, Institut des Sciences de la vie, Université Catholique de Louvain, Croix du Sud, 4-15, Box 7.07.14, 1348 Louvain-la-Neuve, Belgium
E-mail: marc.boutry@uclouvain.be
Fax: +32-10-473872

Abbreviations: MudPIT, multidimensional protein identification technology; NGS, next generation sequencing

Not surprisingly, a survey of the literature indicates that the number of publications related to plant proteomics is increasing each year (Fig. 1A) and that Arabidopsis is the species most used for proteomics studies (Fig. 1B). Next comes rice (Oryza sativa), a cereal model for which the genome was sequenced in 2002, followed by species such as wheat, maize, soybean, and potato, for which the genome sequence is not available or was not available at the time the vast majority of these studies were performed. Species other than Arabidopsis and rice account for 58% of the publications, indicating that a large percentage of proteomics studies have been performed on nonmodel species (defined here as species for which the genome sequence is not available or was not available when the proteomics studies were performed).

As mentioned above, no single plant species can be used for all purposes.
For instance, some biological or physiological processes in crops of interest might not occur in Arabidopsis. One example is nitrogen fixation by legumes. Here, the model used has been Medicago truncatula, for which genetic
© 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.com