Analyzing Patterns of Staining in Immunohistochemical
Studies: Application to a Study of Prostate
Cancer Recurrence
Ruth Etzioni,¹ Sarah Hawley,¹ Dean Billheimer,³ Lawrence D. True,² and Beatrice Knudsen¹
¹Fred Hutchinson Cancer Research Center; ²Department of Pathology, University of Washington, Seattle, Washington; and ³Division of Biostatistics, Vanderbilt-Ingram Cancer Center, Nashville, Tennessee
Abstract
Background: Immunohistochemical studies use antibodies to
stain tissues with the goal of quantifying protein expression.
However, protein expression is often heterogeneous, resulting
in variable degrees and patterns of staining. This problem is
particularly acute in prostate cancer, where tumors are
infiltrative and heterogeneous in nature. In this article, we
introduce analytic approaches that explicitly consider both
the frequency and intensity of tissue staining.
Methods: Compositional data analysis is a technique used to
analyze vectors of unit-sum proportions, such as those
obtained from soil sample studies or species abundance
surveys. We summarized specimen staining patterns by the
proportion of cells staining at mild, moderate, and intense
levels and used compositional data analysis to summarize
and compare the resulting staining profiles.
Results: In a study of Syndecan-1 staining patterns among
44 localized prostate cancer cases with Gleason score 7
disease, compositional data analysis did not detect a
statistically significant difference between the staining
patterns in recurrent (n = 22) versus nonrecurrent (n = 22)
patients. Results indicated only modest increases in the
proportion of cells staining at a moderate intensity in the
recurrent group. In contrast, an analysis that compared
quantitative scores across groups indicated a (borderline)
significant increase in staining in the recurrent group (P =
0.05, t test).
Conclusions: Compositional data analysis offers a novel
analytic approach for immunohistochemical studies, providing
greater insight into differences in staining patterns
between groups, but possibly lower statistical power than
existing, score-based methods. When appropriate, we recommend
conducting a compositional data analysis in
addition to a standard score-based analysis. (Cancer Epidemiol
Biomarkers Prev 2005;14(5):1040–6)
Background
Immunohistochemical studies are designed to detect the
expression of proteins at the cellular level. In such studies,
an antibody is applied to a tissue specimen, and antibody
binding is detected using a chromogenic substrate that results
in a color change where the protein is localized. The specimen
is then examined under a microscope and the extent and
intensity of color staining is assessed by an observer.
Immunohistochemical studies have become an integral part
of the process of biomarker development, where they are used
to determine whether overexpression or underexpression of
potential markers predicts disease status. For example, several
novel proteins have been shown to be markedly overexpressed
in prostate cancer tissue, including α-methylacyl-CoA racemase
(1), EZH2 (2), and pim-1 kinase and hepsin (3). cDNA
microarray studies typically yield many candidate genes that
seem to differentiate between disease groups at the RNA level.
However, only a small portion of these findings will ultimately
be confirmed by immunohistochemistry at the protein level.
Immunohistochemical studies of prostate cancer reveal
tremendous within-specimen heterogeneity. Not only are
normal cells typically present in cancer specimens, but
staining among tumor cells can be highly variable. Figure 1
shows a section of formalin-fixed, paraffin-embedded prostate
tissue stained with the syndecan-1 antibody (4). Both
normal and cancer cells are visible; moreover, some areas of
the tumor show mild to moderate staining, whereas others
show very intense staining. Because of variability like that
illustrated in Fig. 1, specimens are often not coded as
positive or negative for a particular antibody; rather, a
summary score is calculated (1, 4). First, the percent of cells
staining at each of a fixed number of intensity levels (e.g., none, faint,
moderate, strong) is determined. We refer to this vector of
percents as the staining profile associated with a specific
specimen. Then the score is given as the sum over intensity
levels of the percent staining multiplied by an ordinal value
corresponding to the intensity level (0 = no staining, 1 = mild
staining, etc.). With four intensity levels, the resulting score
ranges from 0 (no staining in the entire specimen) to 300
(intense staining uniformly within the specimen). Variants of
this system have been proposed; for example, instead of the
percent staining at each intensity level, an integer may be
recorded with values (0, 1, 2, ...) indicating whether the
percent staining at a given level falls within a specified range
(0-20%, 20-40%, 40-60%, ..., ref. 5). Qualitative approaches,
which classify specimens as positive/negative or weak-
strong, have also been used (2, 3).
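The score calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the study: the function name `staining_score`, the constant `INTENSITY_WEIGHTS`, and the input checks are assumptions introduced here, and the four levels are taken to be none, mild, moderate, and intense with ordinal values 0 to 3.

```python
from typing import Sequence

# Assumed ordinal values for four intensity levels:
# none = 0, mild = 1, moderate = 2, intense = 3.
INTENSITY_WEIGHTS = (0, 1, 2, 3)


def staining_score(profile: Sequence[float]) -> float:
    """Return the sum over intensity levels of percent staining times ordinal value.

    `profile` holds the percent of cells at each level, ordered from
    no staining to intense staining, and is expected to sum to 100.
    """
    if len(profile) != len(INTENSITY_WEIGHTS):
        raise ValueError("expected one percentage per intensity level")
    if any(p < 0 for p in profile):
        raise ValueError("percentages must be non-negative")
    if abs(sum(profile) - 100.0) > 1e-6:
        raise ValueError("percentages must sum to 100")
    return sum(w * p for w, p in zip(INTENSITY_WEIGHTS, profile))


# The two extremes of the scale with four intensity levels.
no_staining = staining_score([100, 0, 0, 0])      # lowest possible score
uniform_intense = staining_score([0, 0, 0, 100])  # highest possible score
```

Under these assumptions the two extremes evaluate to 0 and 300, matching the range stated in the text.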
The transformation from staining profile to score is many
to one. Consider, for example, the two specimen profiles in
Table 1. The first profile shows increased mild to moderate
staining relative to the second profile, which has more intense
staining. However, the two sample specimens in Table 1 yield
the same score (80), because the increased frequency of cells
staining moderately in the first sample compensates for the
small absolute increase in intense expression in the second
sample. Therefore, if differences such as those represented in
Table 1 are of interest, a score-based analysis may not be
adequate.
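To make the many-to-one property concrete, the sketch below scores two hypothetical staining profiles. The percentages are invented for illustration and are not the values in Table 1; they were chosen so that one profile has more mild to moderate staining and the other has more intense staining. The helper `score` and the weights 0 to 3 are likewise assumptions.

```python
# Assumed ordinal values for none, mild, moderate, and intense staining.
WEIGHTS = (0, 1, 2, 3)


def score(profile):
    """Sum of percent staining times ordinal intensity value."""
    return sum(w * p for w, p in zip(WEIGHTS, profile))


# Hypothetical profiles (percent none, mild, moderate, intense).
profile_a = (45, 30, 25, 0)   # more mild to moderate staining, no intense staining
profile_b = (60, 10, 20, 10)  # less mild staining, some intense staining

score_a = score(profile_a)  # 0*45 + 1*30 + 2*25 + 3*0
score_b = score(profile_b)  # 0*60 + 1*10 + 2*20 + 3*10

same_score = score_a == score_b        # the scores coincide
same_profile = profile_a == profile_b  # the profiles do not
```

Both hypothetical profiles score 80 even though their distributions across intensity levels differ, so an analysis based on the score alone cannot tell them apart.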
This article presents an approach for directly analyzing data
on the staining profiles recorded in immunohistochemical
Received 8/4/04; revised 2/9/05; accepted 3/2/05.
Grant support: Grant P50 CA97186.
The costs of publication of this article were defrayed in part by the payment of page charges.
This article must therefore be hereby marked advertisement in accordance with 18 U.S.C.
Section 1734 solely to indicate this fact.
Requests for reprints: Ruth Etzioni, Translational and Outcomes Research, Fred Hutchinson
Cancer Research Center, Mailstop M2-B230, 1100 Fairview Avenue North, P.O. Box 19024,
Seattle, WA 98109. Phone: 206-667-6561; Fax: 206-667-7264. E-mail: retzioni@fhcrc.org
Copyright © 2005 American Association for Cancer Research.
Cancer Epidemiology, Biomarkers & Prevention 1040
Cancer Epidemiol Biomarkers Prev 2005;14(5). May 2005