A Novel Unsupervised Method to Identify Genes Important in the Anti-viral Response: Application to Interferon/Ribavirin in Hepatitis C Patients

Leonid I. Brodsky 1, Abdus S. Wahed 3, Jia Li 3, John E. Tavis 4, Takuma Tsukahara 2, Milton W. Taylor 2 *

1 Institute of Evolution, University of Haifa, Haifa, Israel, 2 Department of Biology, Indiana University, Bloomington, Indiana, United States of America, 3 Epidemiology Data Center, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America, 4 Molecular Microbiology and Immunology, St. Louis University, St Louis, Missouri, United States of America

Background. Treating hepatitis C with interferon/ribavirin results in a varied response in terms of decrease in viral titer and ultimate outcome. Marked responders have a sharp decline in viral titer within a few days of treatment initiation, whereas in other patients there is no effect on the virus (poor responders). Previous studies have shown that combination therapy modifies expression of hundreds of genes in vitro and in vivo. However, identifying which, if any, of these genes have a role in viral clearance remains challenging. Aims. The goal of this paper is to link viral levels with gene expression and thereby identify genes that may be responsible for early decrease in viral titer. Methods. Microarrays were performed on RNA isolated from PBMC of patients undergoing interferon/ribavirin therapy. Samples were collected at pre-treatment (day 0), and 1, 2, 7, 14 and 28 days after initiating treatment. A novel method was applied to identify genes that are linked to a decrease in viral titer during interferon/ribavirin treatment. The method uses the relationship between inter-patient gene expression based proximities and inter-patient viral titer based proximities to define the association between microarray gene expression measurements of each gene and viral-titer measurements. Results.
We detected 36 unique genes whose expression provides a clustering of patients that resembles the viral-titer-based clustering of patients. These genes include IRF7, MX1, OASL and OAS2, viperin and many ISGs of unknown function. Conclusion. The genes identified by this method appear to play a major role in the reduction of hepatitis C virus during the early phase of treatment. The method has broad utility and can be used to analyze response to any group of factors influencing biological outcome, such as antiviral drugs or anti-cancer agents, where microarray data are available.

Citation: Brodsky LI, Wahed AS, Li J, Tavis JE, Tsukahara T, et al (2007) A Novel Unsupervised Method to Identify Genes Important in the Anti-viral Response: Application to Interferon/Ribavirin in Hepatitis C Patients. PLoS ONE 2(7): e584. doi:10.1371/journal.pone.0000584

INTRODUCTION

Treating patients who have chronic hepatitis C virus (HCV) infection with peginterferon/ribavirin combination therapy results in a varied response in terms of outcome and decrease in viral titer [1–4]. For patients who respond well, there is a sharp decrease in viral titer within 24–48 hours after treatment initiation, whereas in other patients there is little or no effect on the viral titer and only temporary, or no, clearance of the virus over a long period [5,6]. Previous in vitro studies have shown that combination interferon treatment induces or decreases expression of hundreds of genes [7–10]. One of the major problems, however, is to identify which of these genes are linked to viral clearance in vivo. In this paper we report a novel mathematical method to explore the association between decrease in viral titer and changes in gene expression in hepatitis C patients following combination treatment with pegylated interferon and ribavirin.
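The Methods summary above describes relating inter-patient proximities computed from gene expression to inter-patient proximities computed from viral titer. The Python below is a minimal sketch of that idea and not the authors' actual statistic, which this section does not specify. The function names, the choice of Euclidean distance and Pearson correlation, and all toy values are illustrative assumptions.

```python
import numpy as np


def pairwise_distances(profiles):
    """Euclidean distances between all pairs of patient time courses.

    profiles: array-like of shape (n_patients, n_timepoints).
    Returns a 1-D vector with one distance per unordered patient pair.
    """
    profiles = np.asarray(profiles, dtype=float)
    diff = profiles[:, None, :] - profiles[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(profiles.shape[0], k=1)
    return dist[upper]


def proximity_association(expression, viral_titer):
    """Pearson correlation between expression-based and titer-based distances.

    This is a stand-in association score (an assumption), chosen only to
    illustrate comparing two sets of inter-patient proximities.
    """
    d_expr = pairwise_distances(expression)
    d_titer = pairwise_distances(viral_titer)
    return float(np.corrcoef(d_expr, d_titer)[0, 1])


# Hypothetical toy data: 4 patients, 6 time points (days 0, 1, 2, 7, 14, 28).
# Patients 0 and 1 mimic marked responders, patients 2 and 3 poor responders.
titer = [
    [6.0, 5.0, 4.2, 3.5, 3.0, 2.5],
    [6.2, 5.1, 4.4, 3.6, 3.1, 2.7],
    [6.1, 6.0, 6.0, 5.9, 5.9, 5.8],
    [5.9, 5.9, 5.8, 5.8, 5.7, 5.7],
]
# A gene induced mainly in the two responders (made-up fold changes).
linked_gene = [
    [0.0, 2.0, 2.5, 2.2, 2.0, 1.8],
    [0.0, 1.9, 2.4, 2.1, 2.0, 1.7],
    [0.0, 0.3, 0.4, 0.3, 0.2, 0.2],
    [0.0, 0.2, 0.3, 0.3, 0.2, 0.1],
]
# A gene whose expression groups patients differently from the titer.
unrelated_gene = [
    [0.0, 1.0, 0.2, 0.9, 0.1, 0.8],
    [0.0, 0.1, 0.9, 0.2, 1.0, 0.1],
    [0.0, 0.9, 0.1, 1.0, 0.2, 0.9],
    [0.0, 0.2, 1.0, 0.1, 0.9, 0.2],
]

score_linked = proximity_association(linked_gene, titer)
score_unrelated = proximity_association(unrelated_gene, titer)
print("linked gene score:", round(score_linked, 3))
print("unrelated gene score:", round(score_unrelated, 3))
```

In this toy setting, a gene whose expression separates patients in the same way as the viral titer scores higher than a gene that groups patients differently, even though no direct correlation between expression and titer time courses is computed.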
The viral clearance time course profile will not necessarily correlate directly with the gene expression time course profile, even if the gene is an active participant in the interferon treatment response, because the decrease in viral levels depends on the interplay of many genes and gene products. Therefore, an indirect approach was used in which the relationship between gene expression across days and viral decrease was examined using inter-patient distances (proximity) according to both characteristics. Using this approach we selected thirty-seven gene probes that were linked with the anti-HCV response during the first 28 days of treatment. The association of the detected genes with the viral decrease can be shown visually by comparing patient clusterings. Indeed, the inter-patient proximities according to the pattern of decrease in virus titer provide an unsupervised clustering of patients based on changes in viral levels. Similarly, the inter-patient proximities according to expression of the specified genes across time provide another unsupervised clustering of patients. Visual inspection of the clusterings of patients based on viral titer and on expression of the selected genes indicates that they are closely related. Since the unsupervised clustering of patients according to the pattern of viral clearance is in good correspondence with an a priori biological categorization of patients into marked, slow

Academic Editor: Sebastian Fugmann, National Institute on Aging, United States of America

Received May 8, 2007; Accepted June 3, 2007; Published July 4, 2007

Copyright: © 2007 Brodsky et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This study was funded as a cooperative agreement by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) with co-support from the Intramural Research Program of the National Cancer Institute (NCI) with further support under a Cooperative Research and Development Agreement (CRADA) with Roche Laboratories, Grant numbers: U01 DK60329, U01 DK60340, U01 DK60324, U01 DK60344, U01 DK60327, U01 DK60335, U01 DK60352, U01 DK60342, U01 DK60345, U01 DK60309, U01 DK60346, U01 DK60349, U01 DK60341. Other support: National Center for Research Resources (NCRR) General Clinical Research Centers Program grants: M01 RR00645 (New York Presbyterian), M02 RR000079 (University of California, San Francisco), M01 RR16500 (University of Maryland), M01 RR000042 (University of Michigan), M01 RR00046 (University of North Carolina).

Competing Interests: The authors have declared that no competing interests exist.

* To whom correspondence should be addressed. E-mail: taylor@indiana.edu

PLoS ONE | www.plosone.org 1 July 2007 | Issue 7 | e584