Defining a similarity threshold for a functional protein sequence pattern: The signal peptide cleavage site

Henrik Nielsen, Jacob Engelbrecht, Gunnar von Heijne, and Søren Brunak

Center for Biological Sequence Analysis
Department of Physical Chemistry
The Technical University of Denmark
DK-2800 Lyngby, Denmark

Department of Biochemistry
Arrhenius Laboratory
Stockholm University
S-106 91 Stockholm, Sweden

Corresponding author:
Søren Brunak
Center for Biological Sequence Analysis
Department of Physical Chemistry
The Technical University of Denmark
DK-2800 Lyngby, Denmark
phone: +45 4525 2477
fax: +45 4593 4808
e-mail:

Short title: Similarity screening of functional sites

Keywords: Sequence data sets, Similarity screening, Redundancy reduction, Signal peptides, Database errors.