A combined sequence–structure approach for predicting resistance to the non-nucleoside HIV-1 reverse transcriptase inhibitor Nevirapine

Vadim L. Ravich, Majid Masso, Iosif I. Vaisman

Laboratory for Structural Bioinformatics, Department of Bioinformatics and Computational Biology, George Mason University, 10900 University Blvd., MSN 5B3, Manassas, VA 20110, United States

Article info
Article history: Received 22 September 2010; received in revised form 5 November 2010; accepted 12 November 2010; available online 23 November 2010
Keywords: Delaunay tessellation; Knowledge-based statistical potential; Computational mutagenesis; Machine learning; HIV-1 drug resistance; Prediction

Abstract
The development of drug resistance to antiretroviral medications used to treat infection with HIV-1 is a major concern. Given the cost and time constraints associated with phenotypic resistance testing, computational approaches leading to accurate predictive models of resistance based on a patient's mutational patterns in the target protein would provide a welcome alternative. A combined sequence–structure computational mutagenesis procedure is used to generate attribute vectors for each of 222 mutational patterns of HIV-1 reverse transcriptase that were isolated and sequenced from patients. Phenotypic fold-levels of resistance to the non-nucleoside inhibitor Nevirapine are known for over 25% of these mutants, whose values are used to assign each assayed mutant to a drug susceptibility class, either sensitive or resistant. Support vector machine and random forest supervised learning algorithms applied to this subset respectively classify mutants based on drug susceptibility with 85% and 92% cross-validation accuracy. The trained models are used to predict susceptibility to Nevirapine for all remaining mutant isolates, and predictions are in agreement for 90% of the test cases.

© 2010 Elsevier B.V. All rights reserved.

1.
Introduction

The HIV-1 reverse transcriptase (RT) is an important target enzyme for nearly all combination antiretroviral therapies that are currently available to treat patients [1]. In the earliest stages following infection of a host cell, RT is responsible for converting the RNA viral genome of HIV-1 into DNA for subsequent integration into the host genome. In addition to RT, the pol gene of HIV-1 encodes the protease and integrase enzymes, which are also crucial for viral replication and targets for pharmaceutical inhibitor drugs [2]. The functional RT enzyme is a heterodimer consisting of a p66 subunit that is enzymatically active and a p51 subunit that provides structural stability (Fig. 1A [3]). The larger chain contains both an N-terminal polymerase domain comprising 440 amino acid residues as well as a C-terminal RNase H domain that spans 120 residues [4]. The palm of the p66 subunit includes the polymerase active site, characterized by the catalytic aspartic triad formed by Asp110, Asp185, and Asp186 [5], where the latter two residues participate in a highly conserved YXDD motif across retroviral RTs [6,7].

Commercially available non-nucleoside reverse transcriptase inhibitor (NNRTI) drugs bind to a hydrophobic region located in the palm subdomain of the p66 subunit, approximately 10 Å away from the polymerase active site [8]. In particular, the drug Nevirapine (NVP) makes a total of 38 atomic contacts with residues in the palm and thumb subdomains. A beta-sheet within the palm is shifted as a result of NNRTI binding, which alters the geometry of the active site and deactivates polymerase activity [9]. The majority of mutations in RT associated with NNRTI resistance occur at residue positions making direct contact with the particular drug, including Leu100, Lys103, Val106, Val108, Tyr181, Tyr188, Gly190, Pro225, Met230, and Pro236 [10,11].
Amino acid replacements at these positions interfere with NNRTI binding by eliminating atomic contacts as well as by altering the size and shape of the hydrophobic region. Analysis of crystallographic structures has revealed that drug resistance mutations do not substantially change protein conformation but introduce local geometric variations around mutation sites, inducing a change in local van der Waals forces and hydrogen bonding patterns [12].

Given the clinical imperative for prescribing to HIV-1 infected patients an effective cocktail of antiretroviral medications to which they are susceptible, genotypic and phenotypic assays are now available to assess the degree to which RT enzymes harboring single or multiple amino acid substitutions are susceptible to inhibitor drugs [13]. Genotyping consists of sequencing patient RT isolates in order to determine if there are mutations present that are already known to be associated with resistance, while phenotyping involves directly measuring and comparing the susceptibility of an RT mutant to an inhibitor relative to a drug-sensitive RT control. Since phenotypic assays are expensive and can take up to two weeks to complete, reports detailing computational techniques for rapidly predicting phenotype from genotype have started to appear in the literature [14–22].

Biophysical Chemistry 153 (2011) 168–172
Corresponding author. Tel.: +1 703 993 8431; fax: +1 703 993 8401. E-mail address: ivaisman@gmu.edu (I.I. Vaisman).
0301-4622/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.bpc.2010.11.004
Journal homepage: http://www.elsevier.com/locate/biophyschem

Additionally,