GOmir: a stand-alone application for human microRNA target analysis and gene ontology clustering

Pantelis Zotos*, Georgios Papachristoudis*, Maria G. Roubelakis, Ioannis Michalopoulos, Kalliopi I. Pappa, Nikolaos P. Anagnou and Sophia Kossida.

*The authors contributed equally to the work.

Abstract- MicroRNAs (miRNAs) are single-stranded RNA molecules of about 20-23 nucleotides in length found in a wide variety of organisms. MiRNAs regulate gene expression by interacting with target mRNAs at specific sites in order to induce cleavage of the message or inhibit translation. Predicting or verifying the mRNA targets of specific miRNAs is a difficult process of great importance. GOmir is a novel stand-alone application consisting of two separate tools: JTarget and TAGGO. JTarget integrates miRNA target prediction and functional analysis by combining the predicted target genes from the TargetScan, miRanda, RNAhybrid and PicTar computational tools and by providing a full gene description and functional analysis for each target gene. TAGGO, on the other hand, is designed to automatically group gene ontology annotations, taking advantage of the Gene Ontology (GO), in order to extract the main attributes of sets of proteins. GOmir thus integrates two separate Java applications into one stand-alone Java application. By using up to four different databases, GOmir introduces, for the first time, predicted miRNA targets accompanied by (a) a full gene description, (b) functional analysis and (c) detailed gene ontology clustering. Additionally, a reverse search initiated by a potential target can be conducted. GOmir can be freely downloaded from http://bioacademy.gr/bioinformatics/projects/GOmir.

Manuscript received June 12, 2008. P.Z. is with Bioinformatics & Medical Informatics Team, Biomedical Research Foundation of the Academy of Athens, Soranou Efesiou 4, 11527, Athens, Greece (e-mail: e103620@mail.ntua.gr). G.P.
is with MIT Computer Science and Artificial Intelligence Laboratory, The Stata Center, Building 32, 32 Vassar Street, Cambridge, MA 02139, USA (e-mail: geopapa@mit.edu). M.G.R. is with Cell & Gene Therapy Laboratory, Biomedical Research Foundation of the Academy of Athens, Soranou Efesiou 4, 11527, Athens, Greece (e-mail: mroubelaki@bioacademy.gr). I.M. is with Bioinformatics & Medical Informatics Team, Biomedical Research Foundation of the Academy of Athens, Soranou Efesiou 4, 11527, Athens, Greece (e-mail: imichalop@bioacademy.gr). K.I.P. is with Cell & Gene Therapy Laboratory, Biomedical Research Foundation of the Academy of Athens, Soranou Efesiou 4, 11527, Athens, Greece and First Department of Obstetrics & Gynecology, University of Athens School of Medicine, Athens, Greece (e-mail: kpappa@imbb.forth.gr). N.P.A. is with Cell & Gene Therapy Laboratory, Biomedical Research Foundation of the Academy of Athens, Soranou Efesiou 4, 11527, Athens, Greece and Laboratory of Biology, University of Athens School of Medicine, Athens, Greece (e-mail: anagnou@med.uoa.gr). S.K. is with Bioinformatics & Medical Informatics Team, Biomedical Research Foundation of the Academy of Athens, Soranou Efesiou 4, 11527, Athens, Greece (corresponding author, phone: +30 210 6597199, fax: +30 210 6597545, e-mail: skossida@bioacademy.gr).

I. INTRODUCTION

MicroRNAs (miRNAs) are 20- to 23-nucleotide long single-stranded RNAs that post-transcriptionally regulate gene expression [1, 2]. MiRNAs inhibit the translation of mRNA into protein and promote mRNA degradation. In this way, miRNAs play an important role in various cellular processes such as proliferation, differentiation, apoptosis and development, as well as in cancer and various other diseases [1, 2], and thus represent potential targets for therapeutic applications.

The biogenesis of miRNAs is a complicated process involving two different cellular compartments [3].
First, in the nucleus, a primary miRNA (pri-miRNA) is transcribed from the genomic DNA by RNA polymerase II. The size of this primary product varies from 100 to 1000 nucleotides. Then, the pri-miRNA is cleaved by Drosha and DGCR8 to form a hairpin loop precursor called pre-miRNA [3]. The 60-70 nucleotide long pre-miRNA is loaded onto Exportin 5 and Ran-GTP in order to be exported into the cytoplasm. A mature miRNA (20-23 nucleotides) is then released by the RNase III endonuclease complex, which includes Dicer and the trans-activator RNA (tar)-binding protein TRBP [3]. The mature miRNA then inhibits translation of an mRNA into a protein by imperfect base pairing to one or more mRNA sequences [1, 4].

The identification of human miRNAs and their respective targets is of great importance and involves both computational and experimental approaches. Prediction servers such as TargetScan [5], miRanda [6], RNAhybrid [7] and PicTar [8] provide information on miRNA-target interactions. Recent reports have described computationally correlated expression of miRNAs and their target mRNAs and proteins, giving a detailed functional description of the latter [4, 9].

Herein, we describe GOmir, a new stand-alone application for human miRNA target prediction and ontology clustering, consisting of two different components, JTarget and TAGGO. JTarget combines the data from four different databases (TargetScan, miRanda, RNAhybrid and PicTar), whereas TAGGO assigns detailed annotations from Gene Ontology (GO) resources to gene products. TAGGO uses one of the most reliable biological ontologies, the Gene Ontology, the main goal of which is to provide a well-structured, precisely defined and controlled vocabulary for describing the roles of genes and gene products in any organism.
Thus, GOmir serves as a reliable tool for miRNA target prediction and, more interestingly, provides assignments from GO resources for these gene products, exploring in this way the functional aspects of miRNAs in more detail.
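
The idea of combining predicted targets from several prediction tools can be illustrated with a minimal sketch. It is not GOmir's actual implementation (GOmir is a Java application whose internals are not described here); the function names `combine_predictions` and `targets_with_min_support` are hypothetical, and the gene identifiers are placeholders rather than real predictions. Under these assumptions, the sketch records which tools predict each gene and then filters by the number of supporting tools:

```python
from collections import defaultdict


def combine_predictions(predictions):
    """Map each predicted target gene to the sorted list of tools predicting it.

    `predictions` is a dict of tool name -> set of gene identifiers
    (a hypothetical input format chosen for this illustration).
    """
    support = defaultdict(set)
    for tool, genes in predictions.items():
        for gene in genes:
            support[gene].add(tool)
    return {gene: sorted(tools) for gene, tools in support.items()}


def targets_with_min_support(combined, min_tools):
    """Return genes predicted by at least `min_tools` tools, sorted by name."""
    return sorted(g for g, tools in combined.items() if len(tools) >= min_tools)


# Placeholder data: GENE_A..GENE_D are NOT real predicted targets.
example = {
    "TargetScan": {"GENE_A", "GENE_B", "GENE_C"},
    "miRanda": {"GENE_A", "GENE_C"},
    "RNAhybrid": {"GENE_A", "GENE_D"},
    "PicTar": {"GENE_A", "GENE_B"},
}

combined = combine_predictions(example)
union = targets_with_min_support(combined, 1)      # predicted by any tool
consensus = targets_with_min_support(combined, 4)  # predicted by all four tools
```

Setting `min_tools` to 1 yields the union of all predictions, while setting it to the number of tools yields their intersection; intermediate values trade sensitivity against agreement between tools.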