International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 05 | May 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 4329
SPEECH BASED EMOTION DETECTION SYSTEM USING MFCC
Vemula Yakub Reddy¹, Mangipudi Pavan Kumar², Mankala Sushma³, Gurindagunta Kiran⁴, Vijaya Kumar Gurrala⁵
¹⁻⁴Dept. of Electronics and Communication Engineering, VNR Vignana Jyothi Institute of Engineering and
Technology, Hyderabad, Telangana, India.
⁵Assistant Professor, Dept. of Electronics and Communication Engineering, VNR Vignana Jyothi Institute of
Engineering and Technology, Hyderabad, Telangana, India.
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - People express their emotions through speech,
facial expressions and body pose, and speech is a major
channel for communicating emotion. Estimating a speaker's
emotional state is often easier from speech alone than from
the other cues. Recent studies report that harmony features
of the speech signal help in recognizing emotions. Here we
develop a speech based emotion detection system using a
Neural Network approach on the German emotional corpus
database (EMODB). The database comprises 10 sentences
covering 7 classes of emotion from everyday communication.
The system uses Fourier parameters of the speech signal:
once the signal is Fourier transformed, harmony features
can be calculated by extracting Mel Frequency Cepstral
Coefficients (MFCC). Extracting the emotional state of the
speaker from speech in this way can improve the performance
of speech based emotion detection systems, which are
valuable for criminal investigations, smart assistance,
surveillance and the detection of dangerous events in
health care systems.
Key Words: Mel-Frequency Cepstral Coefficients (MFCC),
Speech Recognition, Cepstrum, Speech analysis, Neural
Networks.
1. INTRODUCTION
Speech recognition [1] is the process of converting an
acoustic signal, captured by a microphone, into a sequence
of words. These words can be used for applications such as
command and control, data entry, and document preparation.
Speech is an acoustic signal that carries information about
the views of the speaker and the thoughts in the speaker's
mind. Automatic Speech Recognition (ASR) [2] relies only on
the acoustic information in the audio signal, so its
accuracy drops in a noisy environment. In such cases,
instead of audio-only speech recognition [3], we can use
Audio-Visual Speech Recognition (AVSR) [4], which exploits
both speech and visual information. Speech is one of the
oldest means of communication, and today speech signals are
also used in man-machine communication. When examined over
a sufficiently short interval (5-100 ms), the
characteristics of a speech signal are approximately
stationary. Speech based emotion detection plays a major
role in machine learning by improving man-machine
interaction. Emotions play a major role in human life; we
can infer the emotion of a person from his/her facial
expressions or actions. The system described here detects
the emotion of a person from his/her speech: the speech is
recorded, features are extracted from it, and these
features are processed to identify the emotion behind the
speech.
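The short-time analysis mentioned above can be illustrated with a minimal sketch. It splits a signal into overlapping frames and computes the mean energy of each frame. The 25 ms frame length and 10 ms hop are assumed values chosen from within the 5-100 ms range quoted above, and the function names `frame_signal` and `short_time_energy` are hypothetical; the synthetic tone stands in for recorded speech and is not taken from EMODB.

```python
import math


def frame_signal(samples, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Split a list of samples into overlapping fixed-length frames.

    frame_ms and hop_ms are illustrative assumptions, not values
    stated in the paper.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop durations must be positive")
    frames = []
    # Keep only frames that fit entirely inside the signal.
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


# Synthetic stand-in for speech: one second of a 200 Hz tone at 16 kHz.
sample_rate = 16000
signal = [math.sin(2 * math.pi * 200 * n / sample_rate)
          for n in range(sample_rate)]

frames = frame_signal(signal, sample_rate)
energies = [short_time_energy(f) for f in frames]
```

Each entry of `energies` is one per-frame feature value. In a full system, further per-frame features such as pitch and MFCC values would be computed from the same frames before classification.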
Beyond improving the man-machine interface, a speech based
emotion detection system has several applications. It can
be used in aircraft cockpits to assess the mental state of
the pilot and so help avoid accidents. It can also be used
to recognize stress in speech for better lie detection, in
call centre conversations to analyze the behaviour of
customers and so improve the quality of call service, and
in the medical field for mental health diagnosis. In
criminal investigations, emotion detection helps in finding
criminals who hide their emotions behind their facial
expressions. If machines were able to understand human
emotion, conversation with robotic toys would be more
realistic and enjoyable. In in-vehicle systems, information
about the mental state of the driver can be supplied to the
system to help ensure his/her safety.
2. SPEECH EMOTION DETECTION
Generally, speech emotion [7] can be recognized by analyzing
the speech signal in detail. The speech signal is divided
into frames, each frame is analyzed separately, and features
such as pitch (fundamental frequency), energy and MFCC
values are obtained; a neural network then classifies the
emotion. The performance of a speech emotion detection [8]
system depends on the quality of the speech/audio. If
poor-quality speech is used as input to the system, it may
reach wrong conclusions. The audio signal given as input to
the emotion recognition [9] system may contain real-world
emotions. The main aim of this model is to detect the
emotion of the speaker from his/her voice using the popular
MFCC feature extraction technique and a neural network
classifier to improve emotion detection accuracy. The speech
emotion detection