Computational tools for enzyme improvement: why everyone can – and should – use them

Maximilian CCJC Ebert 1,2 and Joelle N Pelletier 1,2,3

This review presents computational methods that experimentalists can readily use to create smart libraries for enzyme engineering and to obtain insights into protein–substrate complexes. Computational tools have the reputation of being hard to use and inaccurate compared to experimental methods in enzyme engineering, yet they are essential to probe datasets of ever-increasing size and complexity. In recent years, bioinformatics groups have made a huge leap forward in providing user-friendly interfaces and accurate algorithms for experimentalists. These methods guide efficient experimental planning and allow the enzyme engineer to rationalize time and resources. Computational tools nevertheless face challenges in the realm of transient modern technology.

Addresses
1 Département de biochimie and Center for Green Chemistry and Catalysis (CGCC), Université de Montréal, Montréal, QC H3T 1J4, Canada
2 PROTEO, The Québec Network for Research on Protein Function, Engineering and Applications, Québec, QC G1V 0A6, Canada
3 Département de chimie, Université de Montréal, Montréal, QC H3T 1J4, Canada

Corresponding author: Pelletier, Joelle N (joelle.pelletier@umontreal.ca)

Current Opinion in Chemical Biology 2017, 37:89–96
This review comes from a themed issue on Biocatalysis and biotransformation
Edited by Bernhard Hauer and Stefan Lutz
http://dx.doi.org/10.1016/j.cbpa.2017.01.021
1367-5931/© 2017 Elsevier Ltd. All rights reserved.

Introduction

The evolution of environmental regulations to reward eco-responsible practices is rapidly shifting the focus of the chemical and pharmaceutical industries toward applying enzymes as ‘greener’ catalysts in complex organic syntheses.
Early successes in the biocatalytic production of fine chemicals have shown that investments in developing such environmentally-friendly process chemistry can be more than offset by increased cost-effectiveness [1]. Target reactions range from those that benefit from the exquisite regioselectivity, enantioselectivity and chemoselectivity of enzymes to less complex reactions where the high reactivity or eco-friendliness of enzymes is advantageous relative to conventional catalysis.

Enzyme engineering is required because the high specificity of enzymes comes with a trade-off. For instance, while possessing the chemical reactivity of interest, the active site of a natural biocatalyst may be unable to bind and transform a desired industrial substrate [2]. In those cases, extensive active-site engineering is necessary to achieve product formation. In other cases, low substrate loads, moderate catalytic efficiency under process conditions or poor operational stability can require extensive, labor-intensive and expensive engineering to tailor biocatalysts to industrial processes. In contrast to conventional catalysis, the lack of a toolbox, or standard workflow, facilitating the development of biocatalyzed processes is delaying the development and adoption of industrial biocatalysts. Indeed, biocatalysts must be optimized for genuine industrial reaction conditions so as to not overtax downstream processing and thus maintain their advantage over traditional synthesis [3].

Early directed evolution experiments on the basis of random mutagenesis gave promising results [4,5]. However, attempts to sample a meaningful fraction of the vast number of possible mutations soon gave way to the recognition that the chemical space covered was limited: the substitution of two nucleotides within one codon is seldom achieved [6]. For this reason, random mutagenesis can only reach three to seven amino acid substitutions per residue [7].
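The codon-level limitation described above can be illustrated with a short sketch. It assumes the standard genetic code and considers only single-nucleotide substitutions within a codon, each treated as equally accessible; the function name `reachable_by_single_substitution` is hypothetical and introduced here for illustration only. Exact counts depend on the counting convention (here, synonymous and stop codons are excluded) and are not meant to reproduce the figures cited in [7].

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reachable_by_single_substitution(codon):
    """Return the amino acids reachable from `codon` by changing one nucleotide.

    Synonymous changes and stop codons are excluded (an illustrative convention).
    """
    original = CODON_TABLE[codon]
    reachable = set()
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue  # not a substitution
            mutant = codon[:position] + base + codon[position + 1:]
            amino_acid = CODON_TABLE[mutant]
            if amino_acid != "*" and amino_acid != original:
                reachable.add(amino_acid)
    return reachable


if __name__ == "__main__":
    # Summarize over the 61 sense codons.
    counts = [
        len(reachable_by_single_substitution(codon))
        for codon, aa in CODON_TABLE.items()
        if aa != "*"
    ]
    print("min:", min(counts), "max:", max(counts),
          "mean:", round(sum(counts) / len(counts), 2))
```

Each codon has only nine single-nucleotide neighbors, several of which are synonymous or stop codons, so most of the 19 possible replacements at a given residue are out of reach unless two or three nucleotides of the same codon change together.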
Even if high-quality, unbiased random libraries are obtained, there remains the often insurmountable problem of screening: tailoring high-throughput screening methods to specific catalytic reactions is not trivial, and is often limited to model systems or to low numbers [8]. The generation of ‘smart libraries’ that are of smaller size because they target specific residues – typically, those involved in function – has been shown to cover functional space more effectively [9–11]. However, generating smart libraries typically requires structural and functional knowledge including information on ligand binding, such that for many systems the choice of residues to include in such a library might not be evident.

Here, we present recent advances in computational tools for enzyme engineering with a focus on the generation of substrate-specific smart libraries and finding the