[CANCER RESEARCH 61, 5979–5984, August 15, 2001]
Advances in Brief
Estrogen Receptor Status in Breast Cancer Is Associated with Remarkably Distinct
Gene Expression Patterns¹
Sofia Gruvberger, Markus Ringnér, Yidong Chen, Sujatha Panavally, Lao H. Saal, Åke Borg, Mårten Fernö,
Carsten Peterson, and Paul S. Meltzer²
Department of Oncology [S. G., Å. B., M. F.] and Complex Systems Division, Department of Theoretical Physics [M. R., C. P.], Lund University, SE-221 00 Lund, Sweden, and
Cancer Genetics Branch, National Human Genome Research Institute [S. G., M. R., Y. C., S. P., L. H. S., P. S. M.], NIH, Bethesda, Maryland 20892
Abstract
To investigate the phenotype associated with estrogen receptor (ER)
expression in breast carcinoma, gene expression profiles of 58 node-
negative breast carcinomas discordant for ER status were determined
using DNA microarray technology. Using artificial neural networks as
well as standard hierarchical clustering techniques, the tumors could be
classified according to ER status, and a list of genes which discriminate
tumors according to ER status was generated. The artificial neural net-
works could accurately predict ER status even when excluding top dis-
criminator genes, including ER itself. By reference to the serial analysis of
gene expression database, we found that only a small proportion of the 100
most important ER discriminator genes were also regulated by estradiol
in MCF-7 cells. The results provide evidence that ER+ and ER- tumors
display remarkably different gene-expression phenotypes not solely ex-
plained by differences in estrogen responsiveness.
Introduction
Estrogens are important regulators of growth and differentiation in
the normal mammary gland and are also important in the development
and progression of breast carcinoma. Estrogens regulate gene expression
via ER;³ however, the details of the estrogen effect on downstream
gene targets, the role of cofactors, and cross-talk between other
signaling pathways are far from fully understood. As approximately
two-thirds of all breast cancers are ER+ at the time of diagnosis, the
expression of the receptor has important implications for their biology
and therapy (1). Opinions differ as to whether those breast cancers
which lack ER expression at diagnosis arise from an ER- compart-
ment within the mammary epithelium or represent evolution from an
ER+ to an ER- state (2).
The cDNA microarray technology allows for parallel analysis of
the expression of thousands of genes (3) to address complex questions
in tumor biology. Statistical tools are required to analyze the large
amount of expression data generated by this methodology. ANNs are
computer-based algorithms for pattern recognition that are capable of
learning from experience (4). The diagnosis of myocardial infarcts (5)
and heart arrhythmias from electrocardiograms (6) are examples of
applications of ANNs in medicine. We have recently demonstrated the
utility of ANNs for the diagnostic classification of tumors using
cDNA microarray data (7). In this study, we have applied ANNs as
well as conventional methods to analyze cDNA microarray data from
a selected group of node-negative breast cancers that differ with
respect to their ER status. Here we report that ER+ and ER- tumors
display remarkably different phenotypes, which may be attributable to
their evolution from distinct cell lineages.
Materials and Methods
Tissues and Cells. Fifty-eight grossly dissected primary tumors from node-negative
breast cancer patients, tumor size 20–50 mm, were collected at the
University Hospital, Lund, Sweden. Microscopic examination of touch prep-
arations verified the presence of cancer cells in all samples. To train the
classifier described below, 47 tumors, all from two previous randomized
studies (Ref. 8),⁴ were selected so that roughly half, 23, were ER+ (range,
50–1900 fmol/mg protein; median, 160), whereas the remaining 24 were ER-
(range, 0–9 fmol/mg protein; median, 0.7). In addition, 14 of the patients were
premenopausal (5 ER+ and 9 ER-) and 33 were postmenopausal (18 ER+
and 15 ER-). To obtain an independent test set, the remaining 11 of the 58
tumors were selected from an ongoing clinical trial and used here as a blinded
test set. Of the 11 blinded samples, 5 were ER+ (range, 40–120 fmol/mg
protein; median, 60), 6 were ER- (range, 0–3 fmol/mg protein; median, 1.5),
and all were premenopausal. ER protein determinations were performed using
standard methods in the routine clinical laboratory (9). BT-474 cells, obtained
from American Type Culture Collection, were maintained in RPMI 1640
supplemented by 10% fetal bovine serum, penicillin, and streptomycin. Cells
were harvested at 60–80% confluency and used as a reference in all
hybridizations.
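As a quick consistency check on the sample counts reported above, the following Python sketch tallies the training and blinded test groups by ER and menopausal status. The `Group` class and the variable names are illustrative only and are not part of the study; the counts are taken directly from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """Tumor counts for one patient group (hypothetical helper)."""
    er_pos: int
    er_neg: int

    @property
    def total(self) -> int:
        return self.er_pos + self.er_neg


# Training set, split by menopausal status (counts from the text).
training_pre = Group(er_pos=5, er_neg=9)
training_post = Group(er_pos=18, er_neg=15)

# Blinded test set: all patients premenopausal.
test_pre = Group(er_pos=5, er_neg=6)

# Combine the two training subgroups.
training = Group(
    er_pos=training_pre.er_pos + training_post.er_pos,
    er_neg=training_pre.er_neg + training_post.er_neg,
)

print("training:", training, "total", training.total)
print("test:", test_pre, "total", test_pre.total)
print("all samples:", training.total + test_pre.total)
```

The totals agree with the figures stated in the text: 23 ER+ and 24 ER- tumors in the training set of 47, plus 11 blinded samples, for 58 tumors overall.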
RNA Isolation and cDNA Microarrays. Total RNA was isolated from
cell lines using the RNeasy kit (Qiagen, Valencia, CA) with subsequent Trizol
(Life Technologies, Inc., Rockville, MD) purification. Total RNA from tumors
was isolated using two successive rounds of Trizol. Microarrays were prepared
and hybridized as described previously (3, 10, 11) and according to standard
protocols.⁵ Briefly, the arrays were spotted with 6,728 sequence-verified
cDNA clones, of which 4,000 were named human genes and the remaining
clones were expressed sequence tags. BT-474 RNA (200 μg) and 65–100 μg
of tumor RNA were used to produce labeled cDNA by anchored oligo(dT)-
primed reverse transcription using SuperScript II reverse transcriptase (Life
Technologies, Inc.) in the presence of either Cy5-dUTP or Cy3-dUTP (Amersham
Pharmacia, Piscataway, NJ), respectively. Fluorescence scanning and
image analysis with DeArray software were performed as described previously
(12, 13).
Data Analysis. For each gene, the fluorescence intensity of the most intense
channel [red (Cy5) or green (Cy3)] for each sample was averaged over all
samples. All genes for which this average exceeded 2,000 fluorescence units
(scale 0–65,535 units) were included in the analysis. In addition, we required,
for all samples, that the red and green intensities both exceeded 20 fluorescence
units and that the spot area (the union of the two channels) exceeded 30
pixels. For the 58 (47 + 11) measured samples, these requirements left us with
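
The gene-level quality filter described in this paragraph can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name, argument layout, and toy values are assumptions, and the first criterion is read here as "take the brighter of the two channels in each sample, then average that value over samples".

```python
from statistics import mean


def passes_filter(red, green, area,
                  min_mean=2000, min_intensity=20, min_area=30):
    """Return True if one gene meets all three criteria.

    red, green, area: equal-length lists with one value per sample
    (hypothetical layout; intensities in fluorescence units, area in pixels).
    """
    # Criterion 1: mean over samples of the brighter channel exceeds 2,000.
    brightest = [max(r, g) for r, g in zip(red, green)]
    if mean(brightest) <= min_mean:
        return False
    # Criterion 2: both channels exceed 20 units in every sample.
    if any(r <= min_intensity or g <= min_intensity
           for r, g in zip(red, green)):
        return False
    # Criterion 3: spot area exceeds 30 pixels in every sample.
    return all(a > min_area for a in area)


# Toy data for three made-up genes measured in two samples.
genes = {
    "gene_a": ([3000, 2500], [500, 400], [50, 60]),   # passes all criteria
    "gene_b": ([3000, 2500], [10, 400], [50, 60]),    # green too weak once
    "gene_c": ([100, 150], [120, 90], [50, 60]),      # too dim on average
}

kept = [name for name, (r, g, a) in genes.items() if passes_filter(r, g, a)]
print(kept)
```

Only `gene_a` survives in this toy example. Because the per-sample criteria are applied with `any` and `all`, a single failing sample is enough to exclude a gene, which matches the "for all samples" wording above.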
Received 4/26/01; accepted 6/25/01.
The costs of publication of this article were defrayed in part by the payment of page
charges. This article must therefore be hereby marked advertisement in accordance with
18 U.S.C. Section 1734 solely to indicate this fact.
¹ Supported in part by the Swedish Research Council and the Knut and Alice Wallenberg
Foundation through the SWEGENE consortium (to M. R.) and the Swedish
Foundation for Strategic Research (to C. P.). This work was partly supported by grants
from the Lund University Medical Faculty, the Swedish Cancer Society, Berta Kamprad’s
Foundation, the Gunnar Arvid and Elisabeth Nilsson Foundation, the Hospital of Lund
Foundations, the E and F Bergqvist Foundation, and King Gustav V’s Jubilee Foundation.
² To whom requests for reprints should be addressed, at National Human Genome
Research Institute, NIH, 49 Convent Drive, Bethesda, MD 20892-4470. Phone: (301) 594-5283;
Fax: (301) 402-3281; E-mail: pmeltzer@nhgri.nih.gov.
³ The abbreviations used are: ER, estrogen receptor; ANN, artificial neural network;
E2, estradiol; PCA, principal component analysis; ROC, receiver operating characteristic;
MDS, multidimensional scaling; WGA, weighted gene analysis; SAGE, serial analysis of
gene expression; GATA3, GATA-binding protein 3; TFF3, trefoil factor 3.
⁴ Å. Borg, M. Fernö, unpublished results.
⁵ Internet address: http://www.nhgri.nih.gov/DIR/LCG/15K/HTML/protocol.html.
Downloaded from cancerres.aacrjournals.org on January 10, 2022. © 2001 American Association for Cancer Research.