Cloning, Characterization, and the Complete 56.8-Kilobase DNA
Sequence of the Human NOTCH4 Gene
Linheng Li,*,†,1 Guyang M. Huang,†,1,2 Amy B. Banta,*,† Yu Deng,†,‡ Todd Smith,§
Penny Dong,† Cynthia Friedman,† Lei Chen,† Barbara J. Trask,† Thomas Spies,¶
Lee Rowen,† and Leroy Hood*,†,3
* Stowers Institute for Medical Research; †Department of Molecular Biotechnology, University of Washington, Seattle,
Washington 98195; ‡Institute of Genetics, Fudan University, Shanghai, 200433 People’s Republic of China; §Geospiza, Inc.,
Bioinformatics Consulting and Contracting, 2442 NW Market Street, Suite 344, Seattle, Washington 98107;
and ¶Fred Hutchinson Cancer Research Center, Clinical Research Division, Seattle, Washington 98104
Received January 9, 1998; accepted April 6, 1998
The human NOTCH4 gene, including a putative promoter region and
30 exons spanning 56.8 kb of DNA, was sequenced; this is the first
complete mammalian genomic sequence reported thus far in the Notch
gene family. The NOTCH4 locus contains a TATA-less promoter
with two putative transcription initiation sites
(Inr), three RBP-J sites, and two GATA recognition
sites. Two cDNA isoforms, NOTCH4(L) and NOTCH4(S),
were identified. Whereas the NOTCH4(S) isoform con-
tains the entire coding sequence, the NOTCH4(L) iso-
form has two unspliced intronic sequences between
exons 11 and 12 and exons 20 and 21 and a misspliced
exon 6. Consistent with these results, two alterna-
tively spliced isoforms of transcripts of approximately
9.3 and 6.7 kb were detected by Northern blot analysis.
The predicted amino acid sequence of the NOTCH4
protein based on the NOTCH4(S) cDNA sequence con-
tains 2003 amino acids and includes the predominant
motifs of the Notch family: 29 epidermal growth factor
(EGF)-like repeats, 3 Notch/lin-12 repeats, a trans-
membrane region, 6 cdc10/Ankyrin repeats, and a
PEST domain. © 1998 Academic Press
INTRODUCTION
The human major histocompatibility complex (MHC
or HLA) locus is located on the short arm of chromosome 6
and spans approximately 3.5–4.0 Mb (Milner
and Campbell, 1992). This locus includes three distinct
regions: MHC classes I, II, and III. The class I and II
regions encode highly polymorphic MHC proteins that
are involved in antigen presentation during immune
responses, and they also contain genes encoding a variety of other
products (Beck et al., 1992; Campbell and Trowsdale,
1993; Spies et al., 1989). The class III region, which is
known to be extremely gene-dense, spans about 1.1 Mb
and is flanked by the class I HLA-B and class II HLA-
DRA loci (Milner and Campbell, 1992). To elucidate the
structural organization of the class III region, we have
undertaken its genomic sequence analysis. A novel
Notch gene near the centromeric end of the class III
locus was identified during the course of this sequence
analysis. Sugaya and co-workers reported the identifi-
cation of a Notch gene at this locus, but did not com-
plete the analysis (Sugaya et al., 1994).
Notch was first identified by its role in regulating the
segregation of neuroblasts from ectodermal cells in
Drosophila (Artavanis-Tsakonas and Simpson, 1991;
Artavanis-Tsakonas et al., 1995; Fortini et al., 1993a;
Kimble and Simpson, 1997). Notch is also essential for
eye, wing, bristle, egg chamber, and mesoderm devel-
opment (Cagan and Ready, 1989; Fehon et al., 1991;
Fortini et al., 1993b; Hartenstein and Posakony, 1990;
Heitzler and Simpson, 1991; Palka et al., 1990; Ruo-
hola et al., 1991; Xu et al., 1992). Significant progress
has been made in the identification of homologues
of Drosophila Notch in a variety of organisms, including
Caenorhabditis elegans [Lin-12 and Glp-1 (Yochem and Greenwald,
1989)], Xenopus (Coffman et al., 1990), mouse (Franco
del Amo et al., 1992; Kopan and Weintraub, 1993;
Lardelli et al., 1994; Reaume et al., 1992), rat (Weinmaster
et al., 1991, 1992), and human (Ellisen et al.,
1991; Blaumueller et al., 1997). Most vertebrates have
multiple Notch genes, with each of the products carry-
ing out related but distinct cell surface receptor func-
Sequence data described in this paper have been deposited with
the EMBL/GenBank Data Libraries under Accession No. U89335 for
the genomic sequence and Accession No. U95299 for the cDNA se-
quence of human NOTCH4.
1 L. Li and G. M. Huang made equal contributions to this work.
2 Present address: Pangea Systems Inc., 1999 Harrison Street,
Suite 1100, Oakland, CA 94612.
3 To whom correspondence should be addressed at Department of
Molecular Biotechnology, University of Washington, Box 357730,
Seattle, WA 98195. Telephone: (206) 616-5104. Fax: (206) 685-7301.
GENOMICS 51, 45–58 (1998)
ARTICLE NO. GE985330