[CANCER RESEARCH 61, 4320 – 4324, June 1, 2001]
Advances in Brief
Discovery of New Markers of Cancer through Serial Analysis of Gene Expression:
Prostate Stem Cell Antigen Is Overexpressed in Pancreatic Adenocarcinoma¹
Pedram Argani,² Christophe Rosty, Robert E. Reiter, Robb E. Wilentz, Selva R. Murugesan, Steven D. Leach,
Byungwoo Ryu, Halcyon G. Skinner, Michael Goggins, Elizabeth M. Jaffee, Charles J. Yeo, John L. Cameron,
Scott E. Kern, and Ralph H. Hruban
Departments of Pathology [P. A., C. R., R. E. W., S. R. M., H. G. S., M. G., S. E. K., R. H. H.], Surgery [S. D. L., C. J. Y., J. L. C.], Oncology [S. D. L., B. R., M. G., E. M. J.,
C. J. Y., S. E. K., R. H. H.], and Medicine [M. G.], The Johns Hopkins Medical Institutions, Baltimore, Maryland 21287; Department of Epidemiology, The Johns Hopkins School
of Public Health, Baltimore, Maryland 21287 [H. G. S.]; and Department of Urology, University of California, Los Angeles, California 90095 [R. E. R.]
Abstract
Serial analysis of gene expression (SAGE) can be used to quantify gene
expression in human tissues. Comparison of gene expression levels in
neoplastic tissues with those seen in nonneoplastic tissues can, in turn,
identify novel tumor markers. Such markers are urgently needed for
highly lethal cancers like pancreatic adenocarcinoma, which typically
presents at an incurable, advanced stage. The results of SAGE analyses of
a large number of neoplastic and nonneoplastic tissues are now available
online, facilitating the rapid identification of novel tumor markers. We
searched an online SAGE database to identify genes preferentially expressed
in pancreatic cancers as compared with normal tissues. SAGE
libraries derived from pancreatic adenocarcinomas were compared with
SAGE libraries derived from nonneoplastic tissues. Three promising tags
were identified. Two of these tags corresponded to genes (lipocalin and
trefoil factor 2) previously shown to be overexpressed in pancreatic carcinoma,
whereas the third tag corresponded to prostate stem cell antigen
(PSCA), a recently discovered gene thought to be largely restricted to
prostatic basal cells and prostatic adenocarcinomas. PSCA was expressed
in four of the six pancreatic cancer SAGE libraries, but not in the libraries
derived from normal pancreatic ductal cells. We confirmed the overexpression
of the PSCA mRNA transcript in 14 of 19 pancreatic cancer cell
lines by reverse transcription-PCR, and using immunohistochemistry, we
demonstrated PSCA protein overexpression in 36 of 60 (60%) primary
pancreatic adenocarcinomas. In 59 of 60 cases, the adjacent nonneoplastic
pancreas did not label for PSCA. PSCA is a novel tumor marker for
pancreatic carcinoma that has potential diagnostic and therapeutic implications.
These results establish the validity of analyses of SAGE databases
to identify novel tumor markers.
Introduction
SAGE³ is a recently described technique that allows one to obtain
a quantitative and comprehensive profile of cellular gene expression
(1, 2). Briefly, in this procedure, cellular mRNA transcripts are
converted to cDNA and then cleaved at specific sites by restriction
enzymes into small (10–14 bp) fragments, also known as tags. These
tags are ligated together into difragments, amplified by PCR, and then
concatenated and sequenced as one long fragment of DNA. Each
10–14-bp fragment (tag) should uniquely identify a specific gene
transcript because it corresponds to a defined sequence near the
transcript’s 3' terminus, as dictated by the tagging restriction enzyme
used (1). The abundance of each tag provides a quantitative measure
of the transcript level present within the mRNA sample analyzed,
which therefore allows expression levels of specific transcripts to be
compared between two samples (2). This ability to quantitate gene
expression represents a major advantage of SAGE over other methods
of screening cDNA libraries for differentially expressed genes.
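
The tag-counting idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the anchoring-site sequence (CATG, the NlaIII site used in the classic SAGE protocol), the fixed 10-bp tag length, the function names, and the toy sequences are all assumptions made for illustration. The sketch takes the tag immediately 3' of the 3'-most anchor site in each cDNA, counts tags, and scales the counts to tags per million so that libraries of different sizes can be compared.

```python
from collections import Counter

# Assumptions for illustration only (not stated in the text):
ANCHOR_SITE = "CATG"  # anchoring-enzyme recognition site (NlaIII in the classic protocol)
TAG_LENGTH = 10       # the text gives a 10-14 bp range; 10 is used here


def extract_tag(cdna, anchor=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Return the tag just 3' of the 3'-most anchor site, or None."""
    seq = cdna.upper()
    site = seq.rfind(anchor)  # 3'-most occurrence of the anchor site
    if site == -1:
        return None  # transcript contains no anchor site
    start = site + len(anchor)
    tag = seq[start:start + tag_length]
    # Discard tags that would run past the 3' end of the sequence.
    return tag if len(tag) == tag_length else None


def count_tags(transcripts):
    """Count how often each tag occurs in a collection of cDNA sequences."""
    counts = Counter()
    for cdna in transcripts:
        tag = extract_tag(cdna)
        if tag is not None:
            counts[tag] += 1
    return counts


def tags_per_million(counts):
    """Scale raw tag counts to tags per million sequenced tags."""
    total = sum(counts.values())
    return {tag: n * 1_000_000 / total for tag, n in counts.items()}


if __name__ == "__main__":
    # Toy, made-up sequences standing in for two mRNA samples.
    tumor = ["TTTCATGACGTACGTACAAAA"] * 3 + ["CATGGGGGCATGTTTTTTTTTTGG"]
    normal = ["CATGGGGGCATGTTTTTTTTTTGG"] * 2
    for name, library in (("tumor", tumor), ("normal", normal)):
        print(name, tags_per_million(count_tags(library)))
```

Normalizing to the library total is one simple way to make tag abundances comparable between two samples; real SAGE analyses also involve ditag handling and tag-to-gene mapping, which this sketch omits.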
In the initial demonstration of the SAGE technique, a gene expression
profile of the normal pancreas was constructed and validated by
Northern blotting (1). Subsequently, Zhang et al. (2) used SAGE to
demonstrate differences in expression patterns between colonic and
pancreatic adenocarcinomas and normal colonic epithelium. Such
applications of SAGE hold tremendous promise for the identification
of diagnostic and/or prognostic markers of malignancy. Indeed, the
above-referenced analyses identified several promising serum markers
for pancreatic carcinoma, such as tissue inhibitor of metalloproteinase 1 (3).
Three recent advances have made analyses of SAGE libraries for
differentially expressed genes more feasible. First, rapid progress in
the Human Genome Project has facilitated the mapping of specific
genes to individual tags specified by SAGE (4). Fewer tags now
correspond to ESTs of unknown origin, and more can be assigned to
known genes. Second, a large number of normal and neoplastic tissues
have now been analyzed by SAGE, creating extremely large databases
for study. Third, much of this database is now online and available to
the general public (5, 6).⁴ As of February 1, 2001, this online database
included 88 SAGE libraries and 3,632,974 tags.
Armed with these tools, we searched an online SAGE database to
identify novel markers of pancreatic adenocarcinoma.
Materials and Methods
Based on the identification of differentially expressed genes in our ongoing
SAGE investigation of pancreatic cancer,⁵ the xProfiler program available
online⁴ was used to compare gene expression patterns in pancreatic cancer with
those in nonneoplastic tissues. In this program, one can select SAGE libraries
for analysis and then compare the tags in one group of online SAGE libraries
with the tags in another group. We used two queries to identify differentially
expressed genes. In the first query, we chose a pancreatic adenocarcinoma
group composed of the SAGE libraries of four pancreatic cancer cell lines that
yielded 96,494 total tags (CAPAN1, 37,926 tags; CAPAN2, 23,222 tags;
HS766T, 10,467 tags; and Panc1, 24,879 tags). The nonneoplastic comparison
group in this analysis was composed of the SAGE libraries of two short-term
cultures of normal pancreatic duct epithelial cells that yielded 64,577 tags (HX,
32,157 tags; and H126, 32,420 tags). In the second query, we expanded both
Received 2/22/01; accepted 4/12/01.
The costs of publication of this article were defrayed in part by the payment of page
charges. This article must therefore be hereby marked advertisement in accordance with
18 U.S.C. Section 1734 solely to indicate this fact.
¹ Supported by the Specialized Program of Research Excellence (SPORE) in
Gastrointestinal Cancer p50-CA62924, The National Pancreas Foundation, and The Michael
Rolfe Fund for pancreatic cancer research.
² To whom requests for reprints should be addressed, at The Johns Hopkins Hospital-Surgical
Pathology, The Harry and Jeanette Weinberg Building, 401 North Broadway,
Room 2242, Baltimore, MD 21231-2410. Phone: (410) 614-2428; Fax: (410) 955-0115;
E-mail: pargani@jhmi.edu.
³ The abbreviations used are: SAGE, serial analysis of gene expression; PanIN,
pancreatic intraepithelial neoplasia; PSCA, prostate stem cell antigen; TFF2, trefoil factor
2; RT-PCR, reverse transcription-PCR; EST, expressed sequence tag.
⁴ http://www.ncbi.nlm.nih.gov/SAGE.
⁵ B. Ryu, J. Jones, M. A. Hollingsworth, R. H. Hruban, and S. E. Kern. Identification
of differentially expressed genes by serial analysis of gene expression profiling in
pancreatic cancer, manuscript in preparation.
© 2001 American Association for Cancer Research. Downloaded from cancerres.aacrjournals.org on November 14, 2015.