Research Article
Linguistic Analysis of Hindi-English Mixed Tweets for
Depression Detection
Carmel Mary Belinda M J,1 Ravikumar S,1 Muhammad Arif,2 Dhilip Kumar V,1 Antony Kumar K,1 and Arulkumaran G3
1 Department of Computer Science & Engineering, Vel Tech Rangarajan Dr Sagunthala R and D Institute of Science and Technology, Chennai, India
2 Department of Computer Science and Information Technology, University of Lahore, Lahore, Pakistan
3 Department of Electrical and Computer Engineering, Bule Hora University, Bule Hora, Ethiopia
Correspondence should be addressed to Arulkumaran G; erarulkumaran@gmail.com
Received 31 January 2022; Revised 15 February 2022; Accepted 21 February 2022; Published 12 April 2022
Academic Editor: Naeem Jan
Copyright © 2022 Carmel Mary Belinda M J et al. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
According to recent studies, young adults in India faced mental health issues, including low self-esteem and distress, due to closures of universities and loss of income, and reported symptoms of anxiety and/or depressive disorder (43%). This makes it high time to come up with a solution. A new classifier is proposed to identify individuals who might have depression based on their tweets on the social media platform Twitter. The proposed model is based on linguistic analysis and text classification, calculating probabilities using TF-IDF (term frequency-inverse document frequency). Indians tend to tweet predominantly in English, Hindi, or a mix of the two languages (colloquially known as Hinglish). In the proposed approach, data are collected from Twitter and screened by passing them through a classifier built using the multinomial Naive Bayes algorithm and grid search, the latter being used for hyperparameter optimization. Each tweet is classified as depressed or not depressed. The entire architecture works over the English and Hindi languages, which should help in implementation globally and across multiple platforms and help curb the ever-increasing depression rates in a methodical and automated manner. In the proposed model pipeline, a combination of techniques is used to obtain better results: an accuracy of 96.15% and an F1 score of 0.914 have been attained.
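The pipeline summarized above (TF-IDF weighting, a multinomial Naive Bayes classifier, and a grid search over a hyperparameter) can be illustrated with a minimal, standard-library sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the choice of the smoothing constant alpha as the tuned hyperparameter, the leave-one-out scoring, and the toy tweets are all assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real tweets need more cleaning.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def tfidf(doc, idf):
    # Term frequency times IDF; tokens unseen during fitting are dropped.
    tf = Counter(tokenize(doc))
    return {tok: cnt * idf[tok] for tok, cnt in tf.items() if tok in idf}

def train_nb(docs, labels, alpha):
    # Multinomial Naive Bayes with TF-IDF weights used in place of raw counts.
    idf = fit_idf(docs)
    class_docs = Counter(labels)
    weight = {label: Counter() for label in class_docs}
    for doc, label in zip(docs, labels):
        for tok, w in tfidf(doc, idf).items():
            weight[label][tok] += w
    model = {"idf": idf, "prior": {}, "loglik": {}}
    for label, n_docs in class_docs.items():
        total = sum(weight[label].values()) + alpha * len(idf)
        model["prior"][label] = math.log(n_docs / len(docs))
        model["loglik"][label] = {
            tok: math.log((weight[label][tok] + alpha) / total) for tok in idf
        }
    return model

def predict(model, doc):
    # Pick the class with the highest log prior plus weighted log likelihood.
    feats = tfidf(doc, model["idf"])
    scores = {
        label: model["prior"][label]
        + sum(w * model["loglik"][label][tok] for tok, w in feats.items())
        for label in model["prior"]
    }
    return max(scores, key=scores.get)

def grid_search_alpha(docs, labels, alphas):
    # Exhaustive search over alpha, scored by leave-one-out accuracy.
    best_alpha, best_acc = None, -1.0
    for alpha in alphas:
        hits = 0
        for i in range(len(docs)):
            model = train_nb(docs[:i] + docs[i + 1:], labels[:i] + labels[i + 1:], alpha)
            hits += predict(model, docs[i]) == labels[i]
        acc = hits / len(docs)
        if acc > best_acc:
            best_alpha, best_acc = alpha, acc
    return best_alpha, best_acc

# Invented toy data (not from the paper's dataset), mixing English and Hinglish.
docs = [
    "i feel so hopeless and alone",
    "zindagi se thak gaya hoon feel empty",
    "nobody cares i am worthless",
    "bahut akela feel kar raha hoon",
    "had a great day with friends",
    "aaj ka match bahut mast tha",
    "loving the weather today",
    "khush hoon exam clear ho gaya",
]
labels = ["depressed"] * 4 + ["not_depressed"] * 4

alphas = [0.1, 0.5, 1.0]
best_alpha, best_acc = grid_search_alpha(docs, labels, alphas)
model = train_nb(docs, labels, best_alpha)
print(best_alpha, best_acc, predict(model, "i feel hopeless"))
```

On a corpus this small the leave-one-out accuracy is not meaningful; the sketch only shows how the three components fit together. A realistic setup would use a much larger labeled corpus and k-fold cross-validation.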
1. Introduction
Recent studies by the World Health Organization (WHO)
[1] have revealed that 56 million Indians suffer from depression
and another 38 million Indians suffer from anxiety
disorders. Even though these disorders are highly treatable,
only a fraction of those suffering receive adequate treatment, due to
the societal stigma associated with mental health. Diagnosis
and subsequent treatment for depression are often delayed,
imprecise, and/or missed entirely. The social media activity
of individuals presents a revolutionary approach to
transforming early depression intervention services,
especially for young adults [2, 3]. Many depressed individuals
choose not to discuss their mental health with their
family and friends because the taboo surrounding depression
is still strong, especially in India. Such individuals,
when they tweet, consciously or subconsciously use words
that indicate their mental health. The advent of social media
platforms has made it relatively easier to find these
individuals [4, 5]. Since it is nearly impossible for a human
being, or even a team of them, to check the posts of each
user across all platforms for such hints, automating the entire process
becomes the need of the hour. One such approach accepted
globally is sentiment analysis [6, 7]. It is a cross-platform
machine learning (ML) approach that can be used to single out a
particular user based on the pattern of their social media
posts.
Hindawi
Journal of Mathematics
Volume 2022, Article ID 3225920, 7 pages
https://doi.org/10.1155/2022/3225920