SEER Cancer Statistics Review 1973-1999, National Cancer Institute
Complete and Limited Duration Cancer Prevalence Estimates
Angela Mariotto 1, Anna Gigli 2, Riccardo Capocaccia 3, Andrea Tavilla 3, Limin X. Clegg 1, Michael Depry 4, Steve Scoppa 4, Lynn A. G. Ries 1, Julia H. Rowland 1, Gina Tesauro 1, Eric J. Feuer 1
1 National Cancer Institute, National Institutes of Health, Bethesda, USA
2 Institute for Population Research and Social Policies, National Research Council, Rome, Italy
3 Istituto Superiore di Sanità, Rome, Italy
4 Information Management Services, Inc., Silver Spring, USA
Introduction
Prevalence is an indicator of primary interest in public health because it measures the burden of cancer in a population and on the health care system. Prevalence is defined as the number or percent of people alive on a certain date in a population who previously had a diagnosis of the disease. It includes new (incident) and pre-existing cases and is a function of both past incidence and survival. Information on prevalence can be used for health planning, for resource allocation, and as an estimate of cancer survivorship.
In past reports of the Cancer Statistics Review, US cancer prevalence was estimated by applying the Connecticut cancer prevalence proportions to the US population. This year, US cancer prevalence is estimated by applying SEER-9 and SEER-11 prevalence proportions to the US population. SEER prevalence proportions are more representative of the US and permit estimation of prevalence by racial/ethnic group. Other changes with respect to previous reports are in the methods for tumor inclusion and for calculating complete prevalence.
The counting method
We used the counting method (Byrne et al., 1992) to estimate prevalence from incidence and follow-up data from the SEER cancer registries. Variance estimators for these prevalence estimates are proposed and evaluated by Gail et al. (1999) and Clegg et al. (2001).
The counting method estimates prevalence by dividing the estimated number of diagnosed persons in the prevalence cohort who are alive at the prevalence date by the size of the study population at that date, taking into account loss to follow-up. For those in the prevalence cohort who are lost to follow-up, the following procedure is used to estimate the probability that each individual is alive as of the prevalence date. First, survival functions, stratified by age at diagnosis and year of diagnosis, are estimated from the prevalence cohort. Then, for each individual lost to follow-up, his or her probability of being alive at the prevalence date is estimated from the appropriate (age and year at diagnosis) survival function, conditional on having survived up to the time of loss to follow-up.
Tumor inclusion criteria
Different methods can be used to determine which tumors are included in the prevalence statistics. For the results presented here, only each person's first malignant tumor is counted. Thus, if a woman had a melanoma prior to a breast cancer diagnosis, her melanoma would contribute to the prevalence of melanoma and to the prevalence of all sites, but the breast cancer would not contribute to the prevalence of breast cancer. Counting only one cancer per individual avoids some ambiguity in prevalence counts and allows the counts for individual sites to sum to the all-sites total. Other selection criteria are possible, and different criteria have been used in the past. For more information, and to generate statistics using other tumor selection criteria, refer to http://srab.cancer.gov/prevalence.
Complete Prevalence and Prevalence by Years Since Diagnosis
Complete prevalence (i.e., the proportion of persons alive who have ever had a diagnosis of the disease) can be estimated using the counting method from registries of long duration. In the US, only the Connecticut Tumor Registry has information on cancer cases from 1940 onward, and it may be used to approximate complete prevalence.
Limited duration prevalence, representing the proportion of people alive at the prevalence date who were diagnosed with cancer in the previous L years, can be calculated from registries of shorter duration. For example, SEER incidence and follow-up data from 1975 through 1998 can provide estimates of prevalent cases diagnosed up to 24 years prior to January 1, 1999, the most recent date for which we can estimate prevalence.
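The counting method with loss to follow-up, restricted to cases diagnosed in the previous L years, can be illustrated with a minimal Python sketch. It is not the SEER implementation. The `Case` record, the stratum key, the function names, and the exponential survival curve are all hypothetical and chosen only for illustration; in practice the survival functions would be estimated from the prevalence cohort itself, stratified by age and year of diagnosis. A person known to be alive counts as 1, a person known to have died counts as 0, and a person lost to follow-up counts as the conditional probability S(t_prev) / S(t_lost), where both times are measured from diagnosis.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Iterable

# A survival function maps years since diagnosis to the probability of surviving that long.
SurvivalFn = Callable[[float], float]


@dataclass
class Case:
    stratum: str                  # hypothetical key for the age/year-of-diagnosis stratum
    years_since_diagnosis: float  # time from diagnosis to the prevalence date
    status: str                   # "alive", "dead", or "lost" as of the prevalence date
    years_followed: float = 0.0   # diagnosis to last contact; used only when status == "lost"


def prob_alive(case: Case, survival: Dict[str, SurvivalFn]) -> float:
    """Probability that a case is alive at the prevalence date."""
    if case.status == "alive":
        return 1.0
    if case.status == "dead":
        return 0.0
    if case.status != "lost":
        raise ValueError(f"unknown status: {case.status!r}")
    s = survival[case.stratum]
    s_at_loss = s(case.years_followed)
    if s_at_loss <= 0.0:
        return 0.0
    # Conditional on having survived until the time of loss to follow-up.
    return s(case.years_since_diagnosis) / s_at_loss


def limited_duration_prevalence(
    cases: Iterable[Case],
    survival: Dict[str, SurvivalFn],
    population: float,
    max_years: float,
) -> float:
    """Expected number alive among cases diagnosed within max_years, per person in the population."""
    expected_alive = sum(
        prob_alive(c, survival) for c in cases if c.years_since_diagnosis <= max_years
    )
    return expected_alive / population


# Illustrative inputs only: an assumed exponential survival curve and four made-up cases.
survival = {"s": lambda t: math.exp(-0.1 * t)}
cases = [
    Case("s", 5.0, "alive"),
    Case("s", 3.0, "dead"),
    Case("s", 10.0, "lost", years_followed=2.0),
    Case("s", 30.0, "alive"),  # diagnosed outside a 24-year window, so excluded
]
prevalence_24y = limited_duration_prevalence(cases, survival, population=1000, max_years=24.0)
print(prevalence_24y)
```

Each case here is assumed to be a person's first malignant tumor, so that a person contributes at most once. Setting `max_years` large enough to cover the whole history of a long-running registry would approximate complete prevalence in the same way.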