Molecular cloning of the complete 11S seed storage protein gene of Coffea arabica and promoter analysis in transgenic tobacco plants

Pierre Marraccini*, Alain Deshayes, Vincent Pétiard, William John Rogers

Nestlé Research Center Tours, Plant Science and Technology, 101, avenue Gustave-Eiffel, B.P. 9716, 37097 Tours cedex 2, France

* Author to whom correspondence should be addressed (fax +33 2 47 49 14 14; e-mail pierre.marraccini@rdto.nestle.com)

(Received December 10, 1998; accepted February 4, 1999)

Abstract: In this paper, we present the complete nucleotide sequence of the csp1 gene from Coffea arabica coding for the 11S-globulin seed storage protein. To investigate the sequences responsible for the regulated expression of this seed-specific coffee storage protein gene, about 1 kb of the 5’-upstream region from the csp1 gene was isolated using inverse polymerase chain reaction (IPCR) and then sequenced. Several DNA boxes were found in this coffee sequence that had similarity to those previously identified as being essential for grain (endosperm) specific expression in other plants. To study the ability of this sequence to direct grain-specific expression, the whole fragment, as well as a series of 5’ deletions, was fused to the reporter gene β-glucuronidase (uidA) and analysed in transgenic Nicotiana tabacum plants. GUS measurements showed that all the deletions of the csp1 promoter directed the expression of the reporter gene in tobacco grain but not in the other tissues examined. GUS activities also revealed that the csp1 promoter constructs function as very strong promoters by comparison to the strength of the cauliflower mosaic virus (CaMV) 35S promoter. Therefore, this 11S promoter could represent a useful tool to change the expression of targeted genes in the grain of transgenic coffee plants.
Plant Physiol. Biochem., 1999, 37 (4), 273-282. 0981-9428/99/4/ © Elsevier, Paris

11S storage protein / Coffea arabica / endosperm-specific promoter / coffee genetic engineering

CaMV, cauliflower mosaic virus / csp1, coffee storage protein gene / CSPD, disodium 3-(4-methoxyspiro{1,2-dioxetane-3,2’-(5’-chloro)tricyclo[3.3.1.1^{3,7}]decan}-4-yl)phenyl phosphate / GUS, β-glucuronidase / IPCR, inverse polymerase chain reaction / Ta, annealing temperature / TMAC, tetramethyl ammonium chloride / UTR, untranslated region / WAF, weeks after flowering

1. INTRODUCTION

The seed storage proteins constitute a major fraction of the proteins found in the mature seed. The expression of these proteins is temporally regulated during embryogenesis and is restricted to seed tissues such as cotyledons or endosperm [34]. For example, globulin storage proteins found in dicot embryos provide an excellent model for the study of plant gene regulatory mechanisms [11, 27, 36]. Their corresponding mRNAs accumulate to high levels during the maturation phase and are mainly under transcriptional regulation [41]. Several 5’-flanking DNA sequences from globulin protein genes were characterized by their ability to direct gene expression in seeds of transgenic plants [35, 36]. They appeared to include many regulatory DNA sequences involved in endosperm-specific expression and are therefore considered to be essential molecular tools to modify the composition of seed protein in transgenic plants [11, 41].

In order to find an endosperm-specific promoter for coffee that could facilitate the expression of engineered genes in this plant, we analysed seed coffee storage proteins by 2D-gel electrophoresis and N-terminal sequencing [31]. We found that the major seed storage proteins are members of the 11S-legumin family. This result was confirmed after cloning and sequencing of a full-length cDNA coding for the precursor of one coffee legumin. Northern analysis also showed that the pronounced peak of 11S mRNAs