Exploring Transcription Factor Binding Properties of Several Non-coding DNA Sequence Elements in the Human NF-IL6 Gene

Elsie I. Pares-Matos, Jason S. Milligan and Minou Bina*

Department of Chemistry, Purdue University, West Lafayette, IN 47907, USA

We examined several DNA segments upstream of the transcription start site of the human NF-IL6 gene to evaluate the predictions of two computational models developed to identify potential regulatory elements in the non-coding regions of genes. One model, comparative genomics, is based on the hypothesis that functional regulatory sequences can be localized in alignments of genomic DNA from several species. The other model is based on the hypothesis that protein-binding sites in genomic DNA may include sequence elements that occur frequently in proximal promoters of genes. The segments selected for DNA binding and functional evaluations included: (1) two conserved regions identified in multi-species sequence alignments; (2) a region containing several localized hits with 9-mers that ranked highly in studies of proximal promoters of human genes; and (3) two regions that were GC-rich and/or contained tracts of G. The assays were done under nearly identical experimental conditions, using a cell line (U937) representing human monocytes/macrophages. The experiments also aimed at evaluating what effect, if any, cellular stimulation could have on the interactions of nuclear proteins with naturally occurring GC-rich elements in human genomic DNA. In DNA binding assays, several complexes were formed with the conserved regions identified in multi-species sequence alignments. Furthermore, these regions were active in functional assays. The region containing several matches with 9-mers derived from proximal promoters of human genes was not conserved but formed several complexes with nuclear proteins including Sp1, Egr-1, and an unidentified protein.
In addition, this region was active in functional assays and responded to cellular stimulations. Overall, the results of the assays suggest an important role for the sequence context of genomic DNA in protein binding and selection.

© 2005 Elsevier Ltd. All rights reserved.

Keywords: human genome; transcription factor binding sites; sequence context of protein binding sites; conserved non-coding regions; prediction of protein binding elements

*Corresponding author
E-mail address of the corresponding author: bina@purdue.edu

Abbreviations used: LPS, lipopolysaccharide; PMA, phorbol 12-myristate 13-acetate; EMSA, electrophoretic mobility shift assay.

doi:10.1016/j.jmb.2005.12.071
J. Mol. Biol. (2006) 357, 732–747

Introduction

Central to a complete annotation of the human genome is localization of the DNA segments and the sequence elements that control the expression of protein coding genes. A key characteristic of regulatory segments is that they include binding sites for the transcription factors that control gene expression through interactions with DNA.¹ Based on this characteristic, chromatin immunoprecipitation assay has provided a powerful strategy for localizing and thus mapping the genomic DNA regions associated with regulators of transcription.² Another mapping strategy exploits the observed DNase I hypersensitivity of functionally active regulatory regions in chromosomes.³,⁴ However, while powerful, these mapping strategies do not identify the actual DNA sequence elements with which transcription factors interact to regulate the expression of genes.
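
The second model described in the abstract treats short sequence elements that recur in proximal promoters as candidate protein-binding sites. As a minimal sketch of that idea, and not the procedure used in this study, the Python below slides a 9-nucleotide window along a DNA segment and reports windows that match a supplied list of 9-mers on either strand. It also computes GC content, since GC-rich segments were among those examined. The function names, the example segment, and the single 9-mer are hypothetical placeholders chosen for illustration; they are not taken from the NF-IL6 upstream region or from the ranked 9-mer list.

```python
# Illustrative sketch only: names, sequence, and 9-mer list are assumptions,
# not data or methods from the study.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement; unknown symbols become 'N'."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))


def find_kmer_hits(sequence, ranked_kmers, k=9):
    """Report every k-wide window matching a listed k-mer on either strand.

    Returns a list of (0-based start, matched k-mer, strand) tuples.
    """
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        if window in ranked_kmers:
            hits.append((start, window, "+"))
        else:
            rc = reverse_complement(window)
            if rc in ranked_kmers:
                hits.append((start, rc, "-"))
    return hits


def gc_fraction(sequence):
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return sum(base in "GC" for base in sequence) / len(sequence)


if __name__ == "__main__":
    # Hypothetical GC-box-like 9-mer and a made-up segment containing it.
    ranked_kmers = {"GGGGCGGGG"}
    segment = "TTGGGGCGGGGCCAAT"
    for start, kmer, strand in find_kmer_hits(segment, ranked_kmers):
        print(start, kmer, strand)
    print(round(gc_fraction(segment), 2))
```

A scan of this kind only flags candidate elements by sequence match. Whether a flagged region actually binds nuclear proteins still has to be tested experimentally, which is the purpose of the DNA binding and functional assays described in the abstract.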