Motif Extraction from Promoter Regions of S. cerevisiae

Zach Solan, David Horn*, Eytan Ruppin
Raymond and Beverly Sackler Faculty of Exact Sciences
Tel Aviv University, Tel Aviv 69978, Israel
zsolan,horn,ruppin@post.tau.ac.il

and

Shimon Edelman
Department of Psychology
Cornell University, Ithaca, NY 14853, USA
se37@cornell.edu

and

Michal Lapidot, Shai Kaplan, Yael Garten and Yitzhak Pilpel
Department of Molecular Genetics
Weizmann Institute of Science, Rehovot 76100, Israel
michal.lapidot,shai.kaplan,yael.kfir,pilpel@weizmann.ac.il

March 22, 2004

* phone ++972-3-6429305, fax ++972-3-6407932

Abstract

We present a novel method of de novo motif extraction (MEX) from biological sequence data. MEX is an unsupervised, data-driven method. It extracts motifs from the data in a context-sensitive fashion, without relying on over-representation in the data set. We run the algorithm on all S. cerevisiae promoters and extract thousands of motifs. We then assess which of the motifs are likely to be biologically functional and, where possible, ascribe functional meaning to them. To this end we use the expression coherence (EC) formalism, which assesses the effect of regulatory promoter elements on mRNA expression profiles. This is done both for single motifs and for pairs of motifs. The EC score is calculated on 40 experiments. We find 1311 sequence-motifs that score