IJSRSET1622104 | Received: 20 March 2016 | Accepted: 264 March 2016 | March-April 2016 [(2)2 337-343]
© 2016 IJSRSET | Volume 2 | Issue 2 | Print ISSN : 2395-1990 | Online ISSN : 2394-4099
Themed Section: Science and Technology

Computational Techniques for the Functional Annotation of Hypothetical ORFs in Human Chromosome 3

Sivashankari Selvarajan 1, Piramanayagam Shanmughavel 2
1 Assistant Professor in UGC, Innovative Programme, Department of Bioinformatics, Nirmala College for Women, Coimbatore, Tamil Nadu, India
2 Associate Professor, Department of Bioinformatics, Bharathiar University, Coimbatore, Tamil Nadu, India

ABSTRACT
In biochemistry, a hypothetical protein, encoded by a hypothetical gene, is a protein whose existence has been predicted but for which there is no experimental evidence of expression in vivo. As a result, the function of such genes is not known. Despite several efforts, only 50-60% of the genes in most completely sequenced genomes have been annotated with a known function. The functions of the remaining 40-50% of the genes in such genomes are unknown. As of September 2010, around 637 genes were annotated as hypothetical in NCBI. The present investigation therefore focuses on the functional annotation of hypothetical genes on chromosome 3 of the human genome.

Keywords: Annotation, Chromosome 3, Function, Hypothetical Genes

I. INTRODUCTION
The Human Genome Project revealed the three billion base pairs encoded within the twenty-three pairs of chromosomes in the human genome. The human genome contains about 30,000 genes, whose exons account for only about 1% of the ~3 billion base pairs of total human DNA. Among these are genes (called hypothetical ORFs) that code for so-called "hypothetical proteins", whose existence has been either validated experimentally or predicted computationally but whose function has not yet been reported.
Hence, now that the genome sequences are complete, the challenge for biologists is to use the data to interpret the function of the protein, the cell, and the organism. This can be achieved through annotation, which involves identifying the genes within a chromosome and their fine structure, determining the protein products encoded by each gene, and understanding their function (Venter et al., 2001). Some of these genes may be involved in pathological disorders and are hence of pharmaceutical significance. Annotation is thus essential to understanding the mechanisms behind the cellular processes and molecular functions encoded by a genome. In its initial stages genome annotation suffered from inconsistent accuracy, but this has since been largely addressed by advances in computational algorithms and bioinformatics. After annotation of the human genome, 59% of the genes reported by the project were hypothetical genes or annotated genes with unknown function (Venter et al., 2001) (Table 1).

Table 1. The Human Genome Statistics

| S.No | Topic | Statistic |
|------|-------|-----------|
| 1 | Total size of the genome | approximately 3,200,000,000 bp |
| 2 | Percentage of DNA spanned by genes | between 25% and 38% |
| 3 | Percentage of exons | 1.1% to 1.4% |
| 4 | Percentage of introns | 24% to 37% |
| 5 | Occurrence rate of genes | about 12 per 1,000,000 bp |
| 6 | Percentage of hypothetical genes and annotated genes with unknown function in the genome | 59% |

Source: Venter et al., 2001
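The figures in the table can be combined into a rough consistency check. The sketch below is illustrative only and is not a calculation taken from the source. It assumes that genes are spread uniformly along the genome at the tabulated density, and all function and constant names are hypothetical. Under those assumptions the table implies roughly 38,400 genes, of which about 22,656 would lack a known function. This total is higher than the figure of about 30,000 genes quoted in the Introduction, so the result should be read only as an order-of-magnitude estimate.

```python
# Back-of-the-envelope check using the approximate values listed in the table.
# All names below are hypothetical and exist only for this illustration.

GENOME_SIZE_BP = 3_200_000_000   # row 1: total genome size (approximate)
GENES_PER_MBP = 12               # row 5: genes per 1,000,000 bp (approximate)
UNKNOWN_FUNCTION_SHARE = 0.59    # row 6: hypothetical or unknown-function genes


def implied_gene_count(genome_size_bp: int, genes_per_mbp: int) -> int:
    """Gene count implied by a uniform density of genes per 1,000,000 bp."""
    return genome_size_bp * genes_per_mbp // 1_000_000


def implied_unknown_function_genes(gene_count: int, share: float) -> int:
    """Number of genes falling in the hypothetical or unknown-function share."""
    return round(gene_count * share)


if __name__ == "__main__":
    total = implied_gene_count(GENOME_SIZE_BP, GENES_PER_MBP)
    unknown = implied_unknown_function_genes(total, UNKNOWN_FUNCTION_SHARE)
    print(f"Implied gene count: {total}")
    print(f"Implied genes without a known function: {unknown}")
```

Integer arithmetic is used for the gene count so that the result does not depend on floating-point rounding; only the percentage step is rounded.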