Characterization of a highly conserved gene (OS4) amplified with CDK4 in human sarcomas

Yan A Su, Margo M Lee, Carolyn M Hutter and Paul S Meltzer*

Laboratory of Cancer Genetics, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA

Amplification and overexpression of genes involved in cellular growth control occur frequently in human cancers. Here, we report characterization of the full length OS4 cDNA derived from 12q13-q15 (Su et al., Proc. Natl. Acad. Sci. USA, 91: 9121–9125, 1994), a region frequently amplified in sarcomas and brain tumors. This cDNA consists of 4833 base pairs (bp) encoding an open reading frame (ORF) of 283 amino acids. The ORF predicts a water-soluble acidic (pI 5.50) polypeptide with a molecular weight of 31 759. Database searches revealed highly significant similarity between OS4 and eight proteins predicted from genomic sequences of Caenorhabditis elegans, Schizosaccharomyces pombe, and Saccharomyces cerevisiae. Thus, OS4 defines a novel evolutionarily conserved gene superfamily. Northern and database analyses revealed OS4 transcripts in numerous human tissues, demonstrating its ubiquitous expression. We also observed overexpression of OS4 in three cancer cell lines with amplification of this gene. Furthermore, we detected OS4 amplification in 5/5 primary sarcomas with known amplification of the closely linked marker CDK4. These results demonstrate that the highly conserved OS4 gene is frequently included in the 12q13-q15 amplicon and may contribute to the development of a subset of sarcomas.

Keywords: neoplasm; genetics; chromosome 12q; homogeneously staining region

Introduction

In human cancers, gene amplification is a common mechanism by which increased dosage of a gene leads to its overexpression.
Amplification of cellular oncogenes has been observed in tumor cell lines and primary tumor tissues, suggesting that overexpression of these genes provides tumor cells with a selective growth advantage in vitro and in vivo. The chromosome 12q13-q15 region is frequently amplified in human sarcomas and brain tumors (Smith et al., 1992; Reifenberger et al., 1994). Several genes have been previously mapped to this region, including a zinc finger protein (GLI [Roberts et al., 1989]), a member of the transmembrane four superfamily (SAS [Meltzer et al., 1991; Jankowski et al., 1994]), a cyclin dependent kinase (CDK4 [Khatib et al., 1993]), a transcription factor (CHOP [Aman et al., 1992]), a modulator of p53 (MDM2 [Oliner et al., 1992; Leach et al., 1993]), and two novel genes (OS9 and OS4) (Su et al., 1994, 1996). Recently, we have demonstrated that this amplicon is derived from a gene-rich locus encoding several additional amplified cDNAs (Gracia et al., 1996).

Full characterization of amplified DNA in tumors requires mapping of the involved genomic segments to identify the genes contained in the core region that is consistently amplified in multiple tumors. Expression and functional studies are then required to interpret the relative importance of those genes that consistently contribute to the amplicon. Physical mapping studies have demonstrated that 12q13-q15 amplicons contain two separate core regions, a telomeric region containing MDM2 and a centromeric region containing CDK4 (Elkahloun et al., 1996). These regions are separated by over 1 Mb, and markers in variable portions of the intervening segment are frequently not amplified. Currently, because of their roles in regulating the cell cycle (MDM2 through p53, and CDK4 through pRb), these genes are considered the most probable target genes for 12q13-q15 amplification. However, the CDK4 core region is quite gene dense, and several closely linked genes are typically co-amplified (Berner et al., 1996).
We have been pursuing the complete characterization of the CDK4 core region in order to evaluate the potential impact of the genes in this region on tumor phenotype. OS4, which we originally identified by chromosome microdissection and hybrid selection, falls in the CDK4 core region, and we therefore pursued characterization of a full length cDNA clone. The 4833-bp OS4 cDNA encodes an ORF of 283 amino acid residues. Our analysis revealed OS4 expression in multiple different human tissues and its overexpression in three cancer cell lines with OS4 amplification. By Southern hybridization, we demonstrated amplification of OS4 in 5/5 primary sarcomas with known CDK4 amplification. Finally, database searches demonstrate highly significant similarity of OS4 with eight genes in organisms as divergent as S. cerevisiae, strongly suggesting that these genes belong to an evolutionarily conserved superfamily.

Results

OS4 cDNA sequence

The 4833-bp sequence of the full length OS4 cDNA was determined from the consensus sequence of both strands of 57 plasmids containing overlapping OS4 cDNA inserts. The full length cDNA consists of three segments: a 305-bp 5'-untranslated region (5'-UTR: 1–305), an 849-bp coding sequence (306–1154), and a 3679-bp 3'-untranslated region (3'-UTR: 1155–4833).

Correspondence: PS Meltzer
Received 14 March 1997; revised 20 May 1997; accepted 20 May 1997
Oncogene (1997) 15, 1289–1294
© 1997 Stockton Press. All rights reserved 0950–9232/97 $12.00
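
The segment coordinates reported for the OS4 cDNA can be checked with a short calculation. The Python sketch below is only an illustrative consistency check and is not part of the original analysis. The names `SEGMENTS`, `segment_length` and `check_segments` are hypothetical, and the coordinates are assumed to be 1-based and inclusive, as the reported segment lengths imply.

```python
# Consistency check of the reported OS4 cDNA segments.
# Assumption: coordinates are 1-based and inclusive (as the reported lengths imply).
# All names here are hypothetical helpers, not taken from the paper.

CDNA_LENGTH = 4833   # full length cDNA, in bp (from the text)
ORF_RESIDUES = 283   # amino acids encoded by the ORF (from the text)

# Segment name -> (start, end), listed in 5' to 3' order.
SEGMENTS = {
    "5'-UTR": (1, 305),
    "CDS": (306, 1154),
    "3'-UTR": (1155, 4833),
}


def segment_length(start, end):
    """Length in bp of a 1-based, inclusive interval."""
    return end - start + 1


def check_segments(segments, total_length):
    """Return per-segment lengths after verifying that the segments
    tile positions 1..total_length with no gaps or overlaps."""
    lengths = {}
    expected_start = 1
    for name, (start, end) in segments.items():
        if start != expected_start:
            raise ValueError(f"{name} starts at {start}, expected {expected_start}")
        lengths[name] = segment_length(start, end)
        expected_start = end + 1
    if expected_start - 1 != total_length:
        raise ValueError("segments do not cover the full cDNA")
    return lengths


lengths = check_segments(SEGMENTS, CDNA_LENGTH)
for name, bp in lengths.items():
    print(f"{name}: {bp} bp")

# The coding interval should correspond to a whole number of codons.
codons, remainder = divmod(lengths["CDS"], 3)
print(f"CDS codons: {codons}, leftover bases: {remainder}")
```

Under these assumptions the three segments tile the cDNA without gaps, and the 849-bp coding interval corresponds to exactly 283 codons, matching the reported ORF length.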