Volume 4, Issue 3, September – October 2010; Article 023 ISSN 0976 – 044X
International Journal of Pharmaceutical Sciences Review and Research Page 136
Available online at www.globalresearchonline.net

IN SILICO COMPARATIVE GENOME ANALYSIS OF HEPATITIS B AND HEPATITIS C VIRUS

Budhayash Gautam*, Shashi Rani, Satendra Singh and Rohit Farmer
Department of Computational Biology and Bioinformatics, Sam Higginbottom Institute of Agricultural, Technology and Sciences, Allahabad-211007, U.P., INDIA
*Email: budhayashgautam@gmail.com

ABSTRACT

In the present study, a comparative genome analysis of the Hepatitis B and Hepatitis C viruses was carried out. Sequence similarity and conservation were analyzed at the genome level by in silico approaches. The study revealed that the two sequences show identical conservation at the sequence level. Both genomes contain the same number of genes, and the sizes of the genes are almost similar. Most of the sequence patterns of the two strains are identical. Thus, although the viruses have genomes of different sizes and slightly different positions and numbers of repeats, they contain almost similar information at the genome level. It is also possible that hepatitis C has added some genetic information to its viral genome and may have evolved from hepatitis B.

Keywords: Comparative Genomics, Hepatitis, Patterns, Tandem Repeats.

INTRODUCTION

Genome analysis entails the prediction of genes in uncharacterized genomic sequences. The objective is to take a newly sequenced, uncharacterized genome and break it up into introns, exons, repetitive DNA sequences, transposons and other elements. Several genetic disorders, such as Huntington’s disease, Parkinson’s disease and sickle cell anemia, are caused by mutations in a gene or a set of genes inherited from one generation to the next. There is a need to understand the cause of such disorders.
An understanding of genome organization can lead to concomitant progress in drug target identification. For these reasons, comparative genomics has become an important emerging branch with tremendous scope. If the genomes of humans and of a pathogen, such as a harmful virus, are known, comparative genomics can predict possible drug targets in the invader without causing side effects in humans 1.

Comparative genomics is an exciting new field of biological research in which the genome sequences of different species, including human, mouse and a wide variety of other organisms from yeast to chimpanzees, are compared. By comparing the finished reference sequence of the human genome with the genomes of other organisms, researchers can identify regions of similarity and difference. This information can help scientists better understand the structure and function of human genes and thereby develop new strategies to combat human disease. Comparative genomics also provides a powerful tool for studying evolutionary changes among organisms, helping to identify genes that are conserved among species as well as genes that give each organism its unique characteristics 2.

The main objectives of the present study were to determine the sequence similarity and sequence conservation between Hepatitis B and Hepatitis C.

MATERIALS AND METHODS

Sequence retrieval

The genomic sequences of hepatitis B and hepatitis C were retrieved in FASTA format from the genome database of the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov), using "hepatitis B" and "hepatitis C" as keywords. Their accession IDs are NC_003977 and NC_004102, respectively. The Hepatitis B virus sequence is a complete genome sequence: dsDNA, circular, 3,215 nucleotides long, with replicon type "viral segment". The Hepatitis C virus sequence is a complete genome sequence: ssRNA, linear, 9,646 nucleotides long, with replicon type "viral segment".
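The retrieval step can also be scripted. The following is a minimal sketch, not the procedure used in this study (the sequences were obtained through the NCBI web interface by keyword search). It assumes the NCBI E-utilities `efetch` service with the `nuccore` database; the helper names `efetch_fasta_url`, `parse_fasta` and `fetch_genome` are hypothetical, and the toy record is made-up data used only for an offline check.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession):
    """Build an efetch URL that returns one nucleotide record as FASTA text."""
    query = urlencode({"db": "nuccore", "id": accession,
                       "rettype": "fasta", "retmode": "text"})
    return f"{EFETCH}?{query}"


def parse_fasta(text):
    """Return {header: sequence} for every record in a FASTA-formatted string."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def fetch_genome(accession):
    """Download one record (requires network access) and return its sequence."""
    with urlopen(efetch_fasta_url(accession)) as response:
        records = parse_fasta(response.read().decode("ascii"))
    return next(iter(records.values()))


# Offline check on a made-up record (not real viral sequence).
toy = ">toy_record example\nACGTAC\nggta\n"
print(parse_fasta(toy))
print(efetch_fasta_url("NC_003977"))
```

With network access, `len(fetch_genome("NC_003977"))` and `len(fetch_genome("NC_004102"))` could be compared against the genome lengths reported above; the current database records may differ from the versions used at the time of the study.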
Sequence alignment

Pairwise sequence alignment of the Hepatitis B and Hepatitis C genomic sequences was performed using ClustalW 3.

Gene and protein prediction

Genes were predicted in both hepatitis B and hepatitis C using the FGENESV tool (http://linux1.softberry.com/berry.phtml). The hypothetical proteins encoded by these genes were also predicted in both hepatitis B and hepatitis C using the same tool.

Tandem repeats identification

Tandem repeats were identified within the genomic sequences of hepatitis B and hepatitis C with the help of the Tandem Repeats Finder tool 4.

Pattern identification

Conserved sequences or patterns were predicted in the hypothetical proteins of both hepatitis B and hepatitis C using the PROSCAN tool (http://npsapbil.ibcp.fr/cgibin/npsa_automat) and the Pfam Search tool (http://pfam.sanger.ac.uk/search).
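To illustrate what scanning a protein for a PROSITE-style pattern involves, the sketch below translates a pattern written in PROSITE syntax into a regular expression and reports exact matches. It is a simplified illustration, not the PROSCAN implementation: it handles only the basic syntax elements, and it does not reproduce any tolerance for mismatches. The function names `prosite_to_regex` and `scan` are hypothetical, and the peptide is a made-up example rather than a predicted hepatitis protein. The pattern `N-{P}-[ST]-{P}` is the PROSITE N-glycosylation site pattern.

```python
import re


def prosite_to_regex(pattern):
    """Translate a basic PROSITE-syntax pattern into a Python regex string.

    Handles x (any residue), [..] (allowed set), {..} (forbidden set),
    (n) / (n,m) repeats, and the < and > terminal anchors.
    """
    pattern = pattern.rstrip(".")                 # patterns end with a period
    parts = []
    for element in pattern.split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P} -> [^P]
        element = element.replace("(", "{").replace(")", "}")   # x(2,4) -> x{2,4}
        element = element.replace("x", ".")                     # x -> any residue
        element = element.replace("<", "^").replace(">", "$")   # terminal anchors
        parts.append(element)
    return "".join(parts)


def scan(sequence, pattern):
    """Return (0-based start, matched text) for every hit, overlaps included."""
    # A lookahead keeps each match zero-width so overlapping hits are found.
    regex = re.compile(f"(?=({prosite_to_regex(pattern)}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Made-up peptide for illustration only.
peptide = "MKNGSAANPTL"
print(prosite_to_regex("N-{P}-[ST]-{P}."))
print(scan(peptide, "N-{P}-[ST]-{P}."))
```

Matching each hypothetical protein from both genomes against the same pattern list and comparing the resulting hit sets is one way the pattern comparison described above could be tabulated.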