Computational and in Vitro Analysis of Destabilized DNA Regions in the Interferon Gene Cluster: Potential of Predicting Functional Gene Domains†

S. Goetze, A. Gluch, C. Benham,§ and J. Bode*,‡

German Research Center for Biotechnology/Epigenetic Regulation, Mascheroder Weg 1, D-38124 Braunschweig, Germany, and University of California Davis Genome Center, Davis, California 95616-8536

Received July 23, 2002; Revised Manuscript Received September 10, 2002

ABSTRACT: Recent evidence adds support to a traditional concept according to which the eukaryotic nucleus is organized into functional domains by scaffold or matrix attachment regions (S/MARs). These regions have previously been predicted to have a high potential for stress-induced duplex destabilization (SIDD). Here we report the parallel results of binding (reassociation) and computational SIDD analyses for regions within the human interferon gene cluster on the short arm of chromosome 9 (9p22). To verify and further refine the biomathematical methods, we focus on a 10 kb region in the cluster with the pseudogene IFNWP18 and the interferon α genes IFNA10 and IFNA7. In a series of S/MAR binding assays, we investigate the promoter and termination regions and additional attachment sequences that were detected in the SIDD profile. The promoters of the IFNA10 and the IFNA7 genes have a moderate 20% binding affinity to the nuclear matrix; the termination sequences show stronger association (70-80%) under our standardized conditions. No comparable destabilized elements were detected flanking the IFNWP18 pseudogene, suggesting that selective pressure acts on the physicochemical properties detected here. In extended, noncoding regions a striking periodicity is found of rather restricted SIDD minima with scaffold binding potential. By various criteria, the underlying sequences represent a new class of S/MARs, thought to be involved in a higher level organization of the genome.
Together, these data emphasize the relevance of SIDD calculations as a valid approach for the localization of structural, regulatory, and coding regions in the eukaryotic genome.

While there is increasing awareness that the eukaryotic nucleus is a highly structured organelle, its functional architecture has remained a largely unresolved enigma of molecular biology. According to recent publications the nucleus is organized into three major compartments: an open euchromatic compartment containing active genes, a heterochromatic compartment containing inactive genes, and an interchromatin compartment mostly consisting of proteins (1), which is otherwise referred to as the in vivo nuclear matrix. Following its discovery in 1974 (2), the nuclear matrix has been shown to accommodate the replication and transcription machineries and, accordingly, the genes that are being actively transcribed. The DNA sequences thought to be responsible for mediating such effects, by serving as an anchor to the nuclear matrix, are the scaffold/matrix attachment regions (S/MARs),¹ which are recognized according to topological features that become reinforced by topological stress as it arises during replication and transcription (3, 4). According to a popular model a group of extended, tightly matrix-attached constitutive S/MARs serves as a coordinate system, which enables the formation of independently regulated chromatin loops ranging in size between 5 and 200 kb. A tendency for active genes to be organized into small loops has been noted (5). S/MARs were discovered almost two decades ago and have been defined as the DNA elements that either stay at the nuclear skeleton after the extraction of the histones and soluble factors during a halo-mapping procedure (6) or that reassociate with a scaffold or matrix preparation with high affinity in vitro (7-9).
For obvious reasons, only the latter property (reassociation strength rather than the actual status in vivo) lends itself to computerization. While S/MARs do not conform to any obvious consensus sequence, their most consistent feature appears to be the propensity to expose single strands under negative superhelical tension (10) in addition to their intrinsic potential to form secondary structures for which strand separation is a prerequisite (4). Following this kind of reasoning, the prediction of S/MARs has required entirely new biomathematical concepts. The development of dedicated algorithms is considered important since S/MARs are commonly found at the boundaries of transcription units, typically in association with DNase I hypersensitive sites (11), where they may function as genomic insulators (12), in the vicinity of enhancers (13), or origins of replication (14). Thereby this class of elements emerges as a valuable marker enabling the localization of independently regulated genomic units, the so-called chro-

† This work was supported by grants from Deutsche Forschungsgemeinschaft (Bo 419/6-1/-2) and BMBF (01 KW 0003).
* Corresponding author: e-mail jbo@gbf.de; telephone +49 531 6181 251; fax +49 531 6181 262.
‡ German Research Center for Biotechnology/Epigenetic Regulation.
§ UC Davis Genome Center; e-mail cjbenham@ucdavis.edu.
¹ Abbreviations: BUR, base-unpairing region; CUE, core-unpairing element; HS, DNase I hypersensitive site; SIDD, stress-induced duplex destabilization; S/MAR, scaffold/matrix attachment region; UE, unpairing element.

Biochemistry 2003, 42, 154-166. 10.1021/bi026496+ CCC: $25.00 © 2003 American Chemical Society. Published on Web 12/13/2002