ORIGINAL ARTICLE

Diagnosis of Ulcerative Colitis Before Onset of Inflammation by Multivariate Modeling of Genome-wide Gene Expression Data

Jørgen Olsen, DMSc,* Thomas A. Gerds, PhD, Jakob B. Seidelin, PhD, Claudio Csillag, PhD, Jacob T. Bjerrum, MD, Jesper T. Troelsen, PhD,* and Ole Haagen Nielsen, DMSc

Background: Endoscopically obtained mucosal biopsies play an important role in the differential diagnosis between ulcerative colitis (UC) and Crohn’s disease (CD), but in some cases where neither macroscopic nor microscopic signs of inflammation are present, the biopsies provide only inconclusive information. Previous studies indicate that CD cannot be diagnosed by molecular and histological diagnostic tools using colonic biopsies without microscopic signs of inflammation, but it is unknown if this is also the case for UC.

Methods: The aim of the present study was to apply multivariate modeling of genome-wide gene expression to investigate if a diagnosable preinflammatory state exists in biopsies of noninflamed UC colon, and to exploit such information to build a diagnostic tool.

Results: Genome-wide gene expression data were obtained from control subjects and UC and CD patients. In total, 89 biopsies from 78 patients were included. A diagnostic model was derived with the random forest method based on 71 biopsies from 60 patients. The model-internal out-of-bag performance measure yielded perfect classification. Furthermore, the model was validated in 18 independent noninflamed biopsies from 18 patients (7 UC, 7 CD, 4 control), where the model achieved 100% sensitivity (95% confidence limits: 60.0–100) and 100% specificity (95% confidence limits: 71.5–100).

Conclusions: The present study demonstrates a preinflammatory state in patients diagnosed with UC. In addition, we demonstrate the usefulness of random forest modeling of genome-wide gene expression data for distinguishing quiescent and active UC colonic mucosa versus control and CD colonic mucosa.
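
As an aside on the confidence limits quoted in the Results: with only 7 UC and 11 non-UC validation biopsies, a 100% point estimate still leaves a wide interval. The following is a minimal sketch of how such limits can be computed with an exact (Clopper-Pearson) binomial interval. The abstract does not state which interval method was used, so the choice of method is an assumption, as is taking the 7 CD and 4 control biopsies together as the denominator for specificity. The function names are illustrative only.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_decreasing(f, lo: float = 0.0, hi: float = 1.0, iters: int = 100) -> float:
    """Bisection for the root of a function that decreases on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(successes: int, n: int, alpha: float = 0.05):
    """Two-sided Clopper-Pearson interval (assumed method) for a proportion."""
    # Lower limit: the p at which P(X >= successes) equals alpha/2.
    if successes == 0:
        lower = 0.0
    else:
        lower = _root_of_decreasing(
            lambda p: alpha / 2 - (1 - binom_cdf(successes - 1, n, p))
        )
    # Upper limit: the p at which P(X <= successes) equals alpha/2.
    if successes == n:
        upper = 1.0
    else:
        upper = _root_of_decreasing(
            lambda p: binom_cdf(successes, n, p) - alpha / 2
        )
    return lower, upper


if __name__ == "__main__":
    print("sensitivity, 7 of 7:  ", exact_ci(7, 7))
    print("specificity, 11 of 11:", exact_ci(11, 11))
```

Under these assumptions, 11 of 11 gives a lower limit of about 0.715, in line with the specificity limits quoted above, and 7 of 7 gives about 0.590, close to the 60.0 reported for sensitivity; the small difference may reflect rounding or a different interval method.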
(Inflamm Bowel Dis 2009;15:1032–1038)

Key Words: random forest, PCA, microarray, gene expression, preinflammation

Ulcerative colitis (UC) and Crohn’s disease (CD) are 2 common inflammatory bowel diseases (IBDs) with multifactorial etiologies (for reviews, see Refs. 1, 2). UC and CD occur in the intestines of genetically susceptible individuals under the combined effects of commensal enteric microflora, mucosal immunity, and environmental factors. In every individual the enteric microflora affects the activities of the mucosal immune system cells, but in contrast to healthy individuals, IBD patients develop exaggerated effector T-cell activity that leads to a state of chronic intestinal inflammation (for reviews, see Refs. 3, 4). Whereas CD might affect any part of the gastrointestinal tract, UC is confined to the colon. The diagnostic distinction between UC and CD is especially important because the surgical treatment options for the 2 diseases are different (for a review, see Ref. 5). Neither UC nor CD has absolute defining diagnostic markers. Endoscopically obtained mucosal biopsies play an important role in the diagnosis of IBD, but in a nontrivial fraction of cases such biopsies provide only inconclusive information (for a review, see Ref. 6). This fraction of IBD cases, the so-called indeterminate colitis (Refs. 7, 8), may initially lead to an incorrect diagnosis and poses a clinically relevant and serious problem in 5%–10% of IBD patients.

Gene expression analysis by DNA-microarray technology (Refs. 9, 10) and metabolite profiling by 1H NMR spectroscopy (Ref. 11) have been applied to address the diagnostic difficulties encountered in inconclusive mucosal biopsies from IBD patients. The strategies hitherto applied, however, only allow distinction between UC and CD if biopsies with macroscopic signs of inflammation are used.

Received for publication December 2, 2008; Accepted December 16, 2008.
From the *Department of Cellular and Molecular Medicine, Department of Biostatistics, Department of Gastroenterology C, Herlev Hospital, University of Copenhagen, Copenhagen, Denmark.
Supported by grants from the Danish Research Council, the Augustinus Foundation, Aase and Ejnar Danielsen’s Foundation, and Director Emil C. Hertz and spouse Inger Hertz Foundation.
Reprints: Jørgen Olsen, Department of Cellular and Molecular Medicine, University of Copenhagen, Panum Institute Bldg. 6.4., Blegdamsvej 3, DK-2200 Copenhagen N, Denmark (e-mail: jolsen@sund.ku.dk).
Copyright © 2009 Crohn’s & Colitis Foundation of America, Inc.
DOI 10.1002/ibd.20879
Published online 28 January 2009 in Wiley InterScience (www.interscience.wiley.com).
1032 Inflamm Bowel Dis Volume 15, Number 7, July 2009

Such samples may not, however, be available for CD patients with sparing of the colon, in children with UC, or in cases of UC with a segmental distribution (Ref. 6). Moreover, antiinflammatory therapy by com-