http://www.iaeme.com/IJAERT/index.asp 384 editor@iaeme.com
International Journal of Advanced Research in Engineering and Technology (IJARET)
Volume 11, Issue 12, December 2020, pp.384-394, Article ID: IJARET_11_12_041
Available online at http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=12
ISSN Print: 0976-6480 and ISSN Online: 0976-6499
DOI: 10.34218/IJARET.11.12.2020.041
© IAEME Publication Scopus Indexed
IDENTIFICATION OF BEST FEATURES OF
SMALL PEPTIDES USING VARIOUS FEATURE
SELECTION METHODS
Ankita Tripathi
Amity Institute of Biotechnology, Amity University, Gurgaon, India
Tapas Goswami
The University of Petroleum and Energy Studies, Dehradun, India
Shrawan Kumar Trivedi
Indian Institute of Technology (ISM) Dhanbad, India
Ravi Datta Sharma
Amity Institute of Biotechnology, Amity University, Gurgaon, India
ABSTRACT
Classifying small peptides into their different categories is a challenging research
problem in bioinformatics. However, machine learning based approaches have been widely
tried in the literature with great success. For classifiers to learn well, a small number
of informative features is important. This study compares several supervised feature
selection methods: Document Frequency (DF), Chi-Squared (χ²), Information Gain (IG),
Gain Ratio (GR), Relief F (RF), and One R (OR). The corpus of small peptide data is taken
from the ARA-PEP repository. A Bayesian classifier is used to classify the categories of
this corpus using the features selected by each of these feature selection techniques.
The results show that RF is the best of the compared feature selection techniques in
terms of classification accuracy and false positive rate, whereas DF and χ² were less
effective. The Bayesian classifier performed well in this study, with good accuracy and
a low false positive rate.
Keywords: Small Peptides Identification, Machine Learning Classifiers, Pattern Recognition
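As a rough illustration of how filter-style feature selection of the kind compared here scores a feature, the following minimal Python sketch computes Information Gain and the Chi-Squared statistic for discrete features against class labels and ranks the features by score. It is not the implementation used in this study: the function names, the toy feature vectors, and the class labels are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature, labels):
    """IG = H(class) - H(class | feature) for a discrete feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def chi_squared(feature, labels):
    """Pearson chi-squared statistic of the feature/class contingency table."""
    total = len(labels)
    observed = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    score = 0.0
    for x, nx in feature_counts.items():
        for y, ny in label_counts.items():
            expected = nx * ny / total  # count expected under independence
            score += (observed[(x, y)] - expected) ** 2 / expected
    return score


def rank_features(features, labels, score_fn):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, score_fn(values, labels))
              for name, values in features.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data: two binary features over eight peptides.
    toy_labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
    toy_features = {
        "informative": [1, 1, 1, 1, 0, 0, 0, 0],    # separates the classes
        "uninformative": [1, 1, 0, 0, 1, 1, 0, 0],  # independent of class
    }
    print(rank_features(toy_features, toy_labels, information_gain))
    print(rank_features(toy_features, toy_labels, chi_squared))
```

In a pipeline like the one described, only the top-ranked features from such a ranking would be passed to the classifier. The cutoff is a design choice and is not specified here.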