A new frontier in synthetic biology: automated design of small RNA devices in bacteria

Guillermo Rodrigo 1,2, Thomas E. Landrain 1,2, Shensi Shen 1,2, and Alfonso Jaramillo 1,2

1 Institute of Systems and Synthetic Biology (iSSB), CNRS, F-91000 Évry, France
2 Université d'Évry Val d'Essonne, iSSB, F-91000 Évry, France

RNA devices provide synthetic biologists with tools for manipulating post-transcriptional regulation and conditional detection of cellular biomolecules. The use of computational methods to design RNA devices has improved to the stage where it is now possible to automate the entire design process. These methods utilize structure prediction tools that optimize nucleotide sequences, together with fragments of known independent functionalities. Recently, this approach has been used to create an automated method for the de novo design of riboregulators. Here, we describe how it is possible to obtain riboregulatory circuits in prokaryotes by capturing the relevant interactions of RNAs inside the cytoplasm using a physicochemical model. We focus on the regulation of protein expression mediated by intra- or intermolecular interactions of small RNAs (sRNAs), and discuss the design of riboregulators for other functions. The automated design of RNA devices opens new possibilities for engineering fully synthetic regulatory systems that program new functions or reprogram dysfunctions in living cells.

RNA design automation

The design of synthetic regulatory circuits with defined behaviors enables the reprogramming of cells for particular applications [1]. However, the path between sequence and function remains challenging to map due to the unpredictability of protein function, preventing the general application of design methods from quantitative models to engineer gene circuits in living cells [2].
Due to this challenge, synthetic biologists have typically relied on time-consuming experimental screening techniques to obtain the desired functionality of the engineered circuit. The predictability of gene regulation circuits can be increased by avoiding the translation layer as much as possible. The engineering of devices made of sRNAs, noncoding RNAs with regulatory functions [3,4], would enable higher functional predictability from physicochemical models. Recent studies have exploited these properties, using computa-

Review

Glossary

Activation energy ($\Delta G_{\mathrm{act}}$): height of the free energy barrier according to a reaction coordinate that has to be overcome to produce a chemical reaction. For RNA–RNA interactions, the reaction is mediated by the seed region, the short sequence that initiates the intermolecular pairing between the two species [6,7]. The activation energy can be calculated as $\Delta G_{\mathrm{act}} = C + \Delta G_{\mathrm{seed}} > 0$, where $\Delta G_{\mathrm{seed}}$ is the free energy of formation of the seed-based hybridizing complex and $C$ is an entropic constant that accounts for the difference in conformational flexibility between the intramolecular and the high-energy intermediate states [56]. Thus, minimizing $\Delta G_{\mathrm{act}}$ is equivalent to minimizing $\Delta G_{\mathrm{seed}}$.

Energy landscape: representation of the effective free energy for every possible conformational state (intra- or intermolecular) of a given genotype [6,17]. The effective free energy ($G_{\mathrm{eff}}$) at one conformational state is the addition of the free energy of that conformation ($G$) and a term of constraints related to function ($G_{\mathrm{constr}}$), $G_{\mathrm{eff}} = G + G_{\mathrm{constr}}$ (Figure 1B,C, main text). We distinguish the energy landscape from the fitness landscape, where, for each genotype, we represent the value of the objective function (Figure 1C, main text) [55].
The fitness landscape is more appropriate to represent the evolution of sequences, and the optimization algorithms, such as Monte Carlo simulated annealing [57], should be adapted to its shape to avoid local traps.

Forward folding problem: formulation to obtain the molecular structure of minimum free energy (2D or 3D) for a given nucleotide sequence.

Interaction energy ($\Delta G_{\mathrm{inter}}$): free energy of formation of the intermolecular complex, defined as the free energy release (products versus reactants) that results from an RNA–RNA interaction.

Inverse folding problem: formulation to find the nucleotide sequence that will fold (with a minimum free energy) into a specified conformation [20]. This problem can be solved with evolutionary techniques even if the forward problem (folding) is not feasible computationally. This can be generalized to multispecies ensembles involving proteins and/or RNAs [6,17]. It has an exponentially large number of solutions.

Logic gate: ideal device that implements a Boolean function. It gives a single logical output that is the result of a logical operation on one or more logical inputs (this can be represented by a truth table). In genetics, it is implemented by a regulatory circuit that expresses a defined gene in the presence of an appropriate combination of biomolecules [1].

Mutation operator: function used to generate genetic diversity during the optimization process. It usually involves nucleotide substitutions, deletions, and additions, but it could be as complex as desired.

Objective function ($F_{\mathrm{obj}}$): real-valued function to be optimized for obtaining the desired behavior. In the context of Figure 1C (main text), we minimize the objective function defined by the energy landscape, associated with chemical reactions and structural constraints. It is usually linked to a prediction of gene expression, in appropriate conditions, from the stable conformations.
In a multiobjective problem (as is often the case), all objectives and constraints are added into a single equation to enforce the simultaneous rewarding of all objectives [5–8].

Pseudoknot: secondary structure of an RNA (or an RNA complex) formed by the interaction of a loop with another region [38]. Pseudoknots have important roles in controlling biological functions, but are challenging to predict computationally.

Riboregulator: small noncoding RNA that can regulate gene expression by interacting with an mRNA. It can bind to the 5′ UTR for repression (e.g., by blocking the RBS) or for activation (e.g., by inducing a conformational change to release the RBS) [9].

0168-9525/$ – see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tig.2013.06.005
Corresponding author: Jaramillo, A. (alfonso.jaramillo@issb.genopole.fr).
Keywords: computational design; post-transcriptional regulation; RNA folding; synthetic biology.
Trends in Genetics xx (2013) 1–8
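As an illustration of how the glossary terms fit together, the following minimal Python sketch combines a mutation operator, a single-equation objective function, and Monte Carlo simulated annealing. It is not the method used in the cited studies. Everything specific in it is an assumption made for the example: the pair energies are toy placeholders rather than measured thermodynamic parameters, the seed is taken to be the first nucleotides of the sRNA, the constraint term is a hypothetical cap on GC content, and all function and parameter names are invented here. A realistic objective would call an RNA structure prediction tool instead of `seed_energy`.

```python
import math
import random

# Toy pairing energies (kcal/mol): illustrative placeholders only,
# NOT measured nearest-neighbor parameters.
PAIR_ENERGY = {
    ("G", "C"): -3.0, ("C", "G"): -3.0,
    ("A", "U"): -2.0, ("U", "A"): -2.0,
    ("G", "U"): -1.0, ("U", "G"): -1.0,
}
NUCLEOTIDES = "ACGU"


def seed_energy(srna_seed, target_site):
    """Toy stand-in for Delta G_seed: antiparallel pairing, summed per pair."""
    return sum(
        PAIR_ENERGY.get((a, b), 0.0)
        for a, b in zip(srna_seed, reversed(target_site))
    )


def objective(srna, target_site, max_gc=0.6, penalty=5.0):
    """F_obj = toy Delta G_seed + a constraint term, added in one equation.

    The GC-content cap is a hypothetical constraint chosen for illustration.
    """
    seed = srna[: len(target_site)]  # assumption: seed sits at the 5' end
    gc_fraction = sum(n in "GC" for n in srna) / len(srna)
    constraint = penalty * max(0.0, gc_fraction - max_gc) * len(srna)
    return seed_energy(seed, target_site) + constraint


def mutate(srna, rng):
    """Mutation operator: a single random nucleotide substitution."""
    pos = rng.randrange(len(srna))
    new = rng.choice([n for n in NUCLEOTIDES if n != srna[pos]])
    return srna[:pos] + new + srna[pos + 1:]


def anneal(target_site, length=12, steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Minimize the objective with Monte Carlo simulated annealing."""
    rng = random.Random(seed)
    current = "".join(rng.choice(NUCLEOTIDES) for _ in range(length))
    f_current = objective(current, target_site)
    best, f_best = current, f_current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        candidate = mutate(current, rng)
        f_candidate = objective(candidate, target_site)
        delta = f_candidate - f_current
        # Metropolis criterion: always accept improvements, sometimes accept
        # worse sequences so the search can leave local traps.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, f_current = candidate, f_candidate
            if f_current < f_best:
                best, f_best = current, f_current
    return best, f_best


if __name__ == "__main__":
    designed, score = anneal("AUGGCA")
    print(designed, score)
```

Because the entropic constant C does not depend on the sequence, the sketch minimizes the seed term directly, in line with the glossary remark that minimizing the activation energy is equivalent to minimizing the seed energy. The mutation operator is restricted to substitutions to keep the sequence length fixed; deletions and additions could be added in the same function.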