Journal of Proteomics & Bioinformatics - Open Access
www.omicsonline.com
Research Article, JPB/Vol.2/August 2009
J Proteomics Bioinform Volume 2(8): 336-343 (2009)
ISSN: 0974-276X, JPB, an open access journal

Performance and Evaluation of MicroRNA Gene Identification Tools

Swati Sinha 1,*, T.S. Vasulu 1, and Rajat K. De 2,*

1 Biological Anthropology Unit, Indian Statistical Institute, Kolkata, India
2 Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India

*Corresponding authors:
Swati Sinha, Biological Anthropology Unit, Indian Statistical Institute, Kolkata-108, India, Tel: 91-33-25753215; E-mail: swati.6783@gmail.com
Rajat K De, Machine Intelligence Unit, Indian Statistical Institute, Kolkata-108, India, Tel: 91-33-25753105, Fax (O): +91-33-25753026, E-mail: rajat@isical.ac.in

Received July 02, 2009; Accepted August 11, 2009; Published August 12, 2009

Citation: Sinha S, Vasulu TS, De RK (2009) Performance and Evaluation of MicroRNA Gene Identification Tools. J Proteomics Bioinform 2: 336-343. doi:10.4172/jpb.1000093

Copyright: © 2009 Sinha S, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract
MicroRNAs are small single-stranded RNA molecules of ~22 nt in length that play an important role in post-transcriptional gene regulation, either by translational repression of mRNA or by mRNA cleavage. Since their discovery, continuous efforts to identify miRNA genes have led to the discovery of several miRNAs in plants as well as animals. Owing to the limitations of molecular genetic techniques for miRNA identification, computational approaches were introduced for better and more affordable in silico miRNA prediction.
Here, we compared several miRNA gene identification tools, namely ‘MiPred’, ‘Triplet-SVM’, ‘BayesMiRNAfind’, ‘OneClassmiRNAfind’ and ‘BayesSVMmiRNAfind’, to evaluate their predictive performance on real and pseudo precursor miRNA datasets. For real/pseudo miRNA classification, MiPred is more sensitive (96%) than Triplet-SVM in identifying pseudo miRNAs. For mature miRNA prediction, the ‘one-class’ SVM classifier shows the best specificity (96%), while BayesSVMmiRNAfind shows the lowest (8%).

Keywords: MiPred; Triplet-SVM; BayesMiRNAfind; OneClassmiRNAfind; BayesSVMmiRNAfind; Sensitivity; Specificity; Accuracy; Matthews Correlation Coefficient; Positive Predictive Value

Abbreviations: miRNA: MicroRNA; pre-miRNA: Precursor MicroRNA; HMM: Hidden Markov Model; SVM: Support Vector Machine; PCA: Principal Component Analysis; K-NN: K-Nearest Neighbor; MCC: Matthews Correlation Coefficient; PPV: Positive Predictive Value

Introduction
Interest in miRNAs and their role as regulators of gene expression has grown immensely (Clop et al., 2006; Feng et al., 2009). The first such small regulator, the lin-4 RNA in C. elegans, was identified by Victor Ambros and his colleagues Rosalind Lee and Rhonda Feinbaum (Bartel, 2004). They showed that the 21 nt lin-4 RNA represses mRNA and controls part of C. elegans larval development. The next small regulatory RNA to be discovered was let-7, which controls a later developmental stage of C. elegans (Lee et al., 1993; Wightman et al., 1993). These RNAs were previously known as small temporal RNAs (stRNAs), but are today recognized as the first members of a large class of small regulatory non-coding RNA molecules, the ‘microRNAs’. This class of molecules is now believed not to be limited to development: it also plays a very important role in the regulation of a wide range of biological processes (Gard et al., 2006; Feng et al., 2009).
MicroRNAs are small non-coding RNAs of approximately 22 nt (range 19-25 nt) known to be involved in