Demo: I-TASSER Gateway for Protein Structure Prediction and Structure-based Function Annotation

Chengxin Zhang, S. M. Mortuza, Yang Zhang*

*Corresponding author address: Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA; email: zhng@umich.edu

Abstract: I-TASSER (Iterative Threading ASSembly Refinement) is a composite pipeline for protein structure prediction and structure-based protein function annotation. Starting from the sequence of a target protein, structure templates are identified from the PDB by threading. The full-length target structure is then constructed by fragment reassembly simulations. The final structure model is then compared to entries in the BioLiP structure-function database to infer biological function. I-TASSER was recently implemented as an XSEDE science gateway, which has helped >14,000 users decipher the structure and function of >38,000 proteins in the last 12 months. The I-TASSER gateway is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER/.

1. Introduction

The gap between the overwhelming number of protein sequences and the slow accumulation of experimentally characterized protein structures is increasing. As of May 2017, for example, >85 million protein sequences had been deposited in the UniProt database, while only >41,000 of them have experimentally characterized structures in the PDB. The lack of structures for the vast majority of protein sequences significantly hinders our understanding of their biological functions. To address this issue, I-TASSER (Iterative Threading ASSembly Refinement) [1] was developed for automated protein structure prediction. It has been consistently ranked as one of the best servers in the most recent CASP community-wide protein structure prediction experiments [2].
To help understand the biological function of proteins, including their Gene Ontology (GO) terms, Enzyme Commission (EC) numbers and ligand binding sites, two algorithms, COFACTOR [3] and COACH [4], were developed and integrated into the I-TASSER protocol for function annotation. The two algorithms were ranked as the top webservers in CASP9 and in the CAMEO community-wide protein function prediction experiments [5], respectively.

To make these state-of-the-art structure and function prediction algorithms accessible to the biological community, the I-TASSER science gateway was developed as an integrated platform for biologists. The I-TASSER gateway is unique among webservers for protein structure prediction [6-8] and function annotation [9-11] for its tight integration of structure and function modeling and for its feature-rich result webpage, which facilitates biological interpretation. Since its integration with the XSEDE-Comet computer cluster in October 2016, the gateway has been used by 14,503 researchers around the world (Fig. 1).

Fig. 1. Distribution of I-TASSER science gateway users among 132 countries.

Presented at Gateways 2017, University of Michigan, Ann Arbor, MI; October 23-25, 2017. https://gateways2017.figshare.com/.