Bioinformatics Corner
J Mol Microbiol Biotechnol 2003;5:7–10
DOI: 10.1159/000068719

An Automated Program to Screen Databases for Members of Protein Families

Xiaofeng Zhou, Rikki N. Hvorup, Milton H. Saier, Jr.
Division of Biology, University of California at San Diego, La Jolla, Calif., USA

Milton H. Saier, Jr., Division of Biology, University of California at San Diego, La Jolla, CA 92093-0116 (USA), Tel. +1 858 534 4084, Fax +1 858 534 7108, E-Mail msaier@ucsd.edu

© 2003 S. Karger AG, Basel

Key Words
Protein family · Database searches · Automated program · ScreenTransporter · Homology

Abstract
We have developed a program, ScreenTransporter (ST), to screen for potential members of recognized transporter families. This program uses Blastpgp as the engine to search a nonredundant database, NRDB90, based on an adjustable E-value cut-off as well as adjustable protein size criteria. Additional parameters can be integrated in later versions. ST is convenient for easily obtaining nonredundant members of transporter families starting from several homologous query sequences. The program can be applied to any protein family.

Introduction
Transport systems serve the cell in a number of capacities [Saier, 2000]. They play central roles in the uptake of essential nutrients, in the maintenance of ionic homeostasis, in the excretion of end products of metabolism and harmful substances, and in allowing communication between cells and between cells and the external environment [Saier, 1994, 1998]. They are also involved in energy-producing and energy-consuming processes [Mitchell, 1967a, b].

We have classified and tabulated transport proteins of known function and sequence in the transporter classification database, TCDB.
Descriptions of the families and the protein members of these families are provided. Lists of primary and secondary references are also available. This resource allows easy access to a huge body of experimental and bioinformatic data.

In bioinformatic analyses of transport proteins, it is often difficult to make predictions from individual sequences; more reliable predictions can result from analyses of the sequences of several homologous proteins. For example, it is useful to identify the available members of a family for analyses of average hydropathy, amphipathicity and similarity, for estimation of domain structure and for establishing evolutionary relationships. Availability of many family members and the presence of conserved motifs also allow one to search for more distantly related homologues. Our lab has developed programs for characterizing integral membrane proteins with emphasis on transporters [Zhai and Saier, 2001a, b, 2002; Zhai et al., 2002].
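
The screening criteria stated in the abstract (an adjustable E-value cut-off, adjustable protein size limits, and a nonredundant result assembled from several query sequences) can be illustrated with a minimal Python sketch. This is not the ScreenTransporter code and does not parse Blastpgp output: the Hit record, the function screen_hits, its parameter names, the default thresholds and the example accessions are all hypothetical, chosen only to show how such a filter might be combined.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Hit:
    """One database hit (hypothetical record, not a Blastpgp output format)."""
    accession: str
    e_value: float
    length: int  # protein size in residues


def screen_hits(hits_per_query: Iterable[Iterable[Hit]],
                max_e_value: float = 1e-4,   # assumed default, adjustable
                min_length: int = 200,       # assumed default, adjustable
                max_length: int = 800) -> List[Hit]:
    """Keep hits that pass the E-value and size criteria.

    Hits found by more than one query are reported once, keeping the
    entry with the lowest E-value.
    """
    best: Dict[str, Hit] = {}
    for hits in hits_per_query:
        for hit in hits:
            if hit.e_value > max_e_value:
                continue  # fails the E-value cut-off
            if not (min_length <= hit.length <= max_length):
                continue  # fails the size criteria
            previous = best.get(hit.accession)
            if previous is None or hit.e_value < previous.e_value:
                best[hit.accession] = hit
    # Most significant hits first; accession breaks ties deterministically.
    return sorted(best.values(), key=lambda h: (h.e_value, h.accession))


# Made-up hits from two homologous query sequences.
query_a = [Hit("P1", 1e-50, 450), Hit("P2", 1e-3, 430), Hit("P3", 1e-20, 90)]
query_b = [Hit("P1", 1e-30, 450), Hit("P4", 1e-10, 500)]
members = screen_hits([query_a, query_b])
print([h.accession for h in members])
```

In this made-up example, P2 is rejected by the E-value cut-off, P3 by the size window, and P1, which is found by both queries, is listed once.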