A Training-Testing Approach to the Molecular Classification of Resected Non-Small Cell Lung Cancer

Noboru Yamagata, Yu Shyr, Kiyoshi Yanagisawa, Mary Edgerton, Thao P. Dang, Adriana Gonzalez, Sorena Nadaf, Paul Larsen, John R. Roberts, Jonathan C. Nesbitt, Roy Jensen, Shawn Levy, Jason H. Moore, John D. Minna, and David P. Carbone (footnote 1)

Vanderbilt-Ingram Cancer Center and Department of Medicine [N. Y., K. Y., T. P. D., S. N., D. P. C.], Department of Preventive Medicine [P. L., Y. S.], Department of Pathology [M. E., A. G., R. J.], Department of Cardiac and Thoracic Surgery [J. R. R.], and Department of Molecular Physiology and Biophysics [S. L., J. H. M.], Vanderbilt University School of Medicine, Nashville, Tennessee 37232-6838; Cardiovascular Surgical Associates, Saint Thomas Hospital, Nashville, Tennessee 37205 [J. C. N.]; and Hamon Center for Therapeutic Oncology Research, University of Texas Southwestern Medical Center, Dallas, Texas 75235 [J. D. M.]

ABSTRACT

Purpose: RNA expression patterns associated with non-small cell lung cancer subclassification have been reported, but there are substantial differences in the key genes and clinical features of these subsets, casting doubt on their biological significance.
Experimental Design: In this study, we used a training-testing approach to test the reliability of cDNA microarray-based classifications of resected human non-small cell lung cancers (NSCLCs).
Results: Groups of genes were identified that were able to differentiate primary tumors from normal lung and lung metastases, as well as identify known histological subgroups of NSCLCs. Groups of genes that discriminate sample clusters were also identified. A blinded confirmatory set of tumors was correctly classified by using these patterns.
Some histologically diagnosed large cell tumors were clearly classified by expression profile analysis as being either adenocarcinoma or squamous cell carcinoma, indicating that this group of tumors may not be genetically homogeneous. High α-actinin-4 expression was identified as highly correlated with poor prognosis.
Conclusions: These results demonstrate that gene expression profiling can identify molecular classes of resected NSCLCs that correctly classify a blinded test cohort, and that it correlates with and supplements standard histological evaluation.

INTRODUCTION

Lung cancer represents a challenging clinical problem in most developed countries. The number of deaths from lung cancer in the United States is more than that from the next four most common cancers combined. Despite the best current treatment, the overall 5-year survival after diagnosis is only 10–15%. Improvements in prevention, early detection, prognosis, and therapy have been difficult to achieve. Clinically, lung cancers display a broad range of behaviors, from slowly progressing to rapidly fatal; they can be highly metastatic or only locally invasive, and they may display responsiveness or resistance to therapy (1). The molecular basis of these variations in behavior is completely unknown.

The classification of lung cancers has traditionally been based primarily on light microscopic morphological findings. According to the current histological lung cancer classification proposed by the WHO in 1981, lung cancers can be divided into two broad groups: small cell lung cancer, accounting for 20–25% of bronchogenic carcinomas, and NSCLC (footnote 2), accounting for almost all of the remaining cases. NSCLC has three major subgroups: adenocarcinoma, squamous cell carcinoma, and large cell carcinoma (2).
Even within NSCLC there is a great degree of heterogeneity in behavior, yet despite decades of research the histological subclassifications have no predictive use and all subtypes are treated identically.

It is clear that each tumor has unique genetic differences, and it is hypothesized that these differences determine its biological behavior. Many laboratories have made a large effort to study individual candidate genetic abnormalities in an attempt to develop molecular markers for lung cancer classification and prognosis, but after hundreds of such studies, none of these single markers is of any real clinical utility. Even today, all NSCLCs are usually treated identically, stage for stage, and no molecular marker is used for routine therapeutic decisions. Thus, it is becoming clear that complex biological behaviors of tumors will only be explainable by complex patterns of multiple markers.

Microarray technology has enabled expression analysis of thousands of genes at one time, allowing insight into complex gene expression patterns and perturbations (3). To date, microarray technology has been successfully applied to a wide

Received 1/14/03; revised 6/29/03; accepted 7/3/03.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Supported by Lung Cancer Special Program of Research Excellence P50CA90949, P50CA70907, Mathers Foundation, and the Robert A. and Helen C. Kleberg Foundation.
1 To whom requests for reprints should be addressed, at Division of Hematology and Oncology, Vanderbilt-Ingram Cancer Center, 685 Preston Research Building, Nashville, TN 37232-6838. Phone: (615) 936-3321; Fax: (615) 936-3322; E-mail: d.carbone@vanderbilt.edu.
2 The abbreviations used are: NSCLC, non-small cell lung cancer; WFCCM, Weighted Flexible Compound Covariate Method; SAM, Significance Analysis of Microarrays; ACTN4, α-actinin-4.
Clinical Cancer Research, Vol. 9, 4695–4704, October 15, 2003. © 2003 American Association for Cancer Research.