News Topic Classification Using Machine
Learning Techniques
Pramod Sunagar, Anita Kanavalli, Sushmitha S. Nayak, Shriya Raj Mahan,
Saurabh Prasad, and Shiv Prasad
Abstract News topic classification assigns news articles, available as text data,
to predefined classes or labels. It is one of the applications of text
classification, which is also applied to spam filtering, language recognition,
segmenting customer feedback, segregating technical documents, etc. This paper
discusses news topic classification on AG’s News Topic Classification Dataset
using machine learning algorithms such as linear support vector machine,
multinomial Naïve Bayes classifier, K-Nearest Neighbor, Rocchio, bagging, and
boosting. The classification proceeds in three steps: pre-processing the text,
applying feature extraction techniques, and finally implementing the machine
learning algorithms. The algorithms are compared using evaluation metrics such as
Accuracy, Recall, Precision, and F1 Score.
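The three steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the stopword list, the helper names (`preprocess`, `fit_idf`, `tfidf`, `fit_rocchio`, `predict`, `evaluate`), the smoothed IDF formula, and the four toy headlines are all assumptions made for illustration, and the headlines are invented rather than taken from the AG's News dataset. Rocchio is used as the classifier here only because it is the simplest of the listed algorithms to write out in full.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not the paper's list).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "for", "is"}

def preprocess(text):
    """Step 1: lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def fit_idf(token_docs):
    """Step 2a: smoothed inverse document frequency, ln((1+n)/(1+df)) + 1."""
    n = len(token_docs)
    df = Counter(term for doc in token_docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}

def tfidf(tokens, idf):
    """Step 2b: L2-normalised TF-IDF vector stored as a sparse dict."""
    tf = Counter(t for t in tokens if t in idf)  # unseen terms are ignored
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def fit_rocchio(vectors, labels):
    """Step 3a: Rocchio training, one centroid (mean vector) per class."""
    counts = Counter(labels)
    sums = defaultdict(lambda: defaultdict(float))
    for vec, y in zip(vectors, labels):
        for t, v in vec.items():
            sums[y][t] += v
    return {y: {t: v / counts[y] for t, v in s.items()} for y, s in sums.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(x * v.get(t, 0.0) for t, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def predict(vec, centroids):
    """Step 3b: assign the class whose centroid is most similar to vec."""
    return max(sorted(centroids), key=lambda y: cosine(vec, centroids[y]))

def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))
    precs, recs, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        precs.append(prec)
        recs.append(rec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    k = len(classes)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precs) / k,
        "recall": sum(recs) / k,
        "f1": sum(f1s) / k,
    }

# Hypothetical toy headlines, invented for this sketch.
train = [
    ("Team wins the championship game in overtime", "Sports"),
    ("Striker scores twice as team wins league match", "Sports"),
    ("Stocks fall as the market reacts to bank earnings", "Business"),
    ("Bank profits rise and shares climb in the market", "Business"),
]
test = [
    ("Coach praises team after game", "Sports"),
    ("Market shares drop after bank report", "Business"),
]

train_tokens = [preprocess(text) for text, _ in train]
idf = fit_idf(train_tokens)
centroids = fit_rocchio([tfidf(toks, idf) for toks in train_tokens],
                        [label for _, label in train])
predictions = [predict(tfidf(preprocess(text), idf), centroids) for text, _ in test]
metrics = evaluate([label for _, label in test], predictions)
print(predictions, metrics)
```

Swapping `fit_rocchio` and `predict` for another classifier leaves the pre-processing, feature extraction, and evaluation code unchanged, which is what makes a like-for-like comparison of algorithms on the same metrics possible.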
Keywords Text classification · Natural language processing (NLP) · Term
frequency–inverse document frequency (TF-IDF) · Support vector machine
(SVM) · K-nearest neighbours (KNN) · Naïve Bayes · Rocchio
P. Sunagar (✉) · A. Kanavalli · S. S. Nayak · S. R. Mahan · S. Prasad · S. Prasad
Department of Computer Science & Engineering, Ramaiah Institute of Technology, Bangalore
560054, India
e-mail: pramods@msrit.edu
Visvesvaraya Technological University, Belagavi, Karnataka, India
A. Kanavalli
e-mail: anithak@msrit.edu
S. S. Nayak
e-mail: nayaksushmitha90@gmail.com
S. R. Mahan
e-mail: shriyarajmahan@gmail.com
S. Prasad
e-mail: saurabhprasad12@gmail.com
S. Prasad
e-mail: shivpsmy227721@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
V. Bindhu et al. (eds.), International Conference on Communication, Computing and
Electronics Systems, Lecture Notes in Electrical Engineering 733,
https://doi.org/10.1007/978-981-33-4909-4_35