(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 9, No. 8, 2018
www.ijacsa.thesai.org

A Review on Scream Classification for Situation Understanding

Saba Nazir, Muhammad Awais
Dept. of Software Engineering, Govt. College University Faisalabad, Pakistan

Sheraz Malik
Dept. of Information Technology, Govt. College University Faisalabad, Pakistan

Fatima Nazir
Dept. of Software Engineering, Govt. College University Faisalabad, Pakistan

Abstract—In our living environment, non-speech audio signals provide significant evidence for situation awareness. They also complement the information obtained from a video signal. Among non-speech audio events, screaming is of particular interest to people such as security guards, caretakers, and family members for care and surveillance, because screams are automatically considered a sign of danger. Contrary to this assumption, this review targets automated acoustic systems that handle the non-speech class of screams, on the premise that screams can be further classified into classes such as happiness, sadness, fear, and danger. Inspired by the prevalent field of scream audio detection and classification, a taxonomy is presented to highlight the target applications, significant sound features, classification techniques, and their impact on classification problems over the last few decades. This review will help researchers identify the most appropriate scream detection and classification techniques and acoustic parameters for understanding the vocalization condition of the speaker.

Keywords—Scream classification; scream detection; acoustic parameters; surveillance; security

I. INTRODUCTION

In the past few decades, there have been several efforts to classify acoustic data into classes.
Audio data is highly informative and a rich source for content-based classification of acoustic signals. Human beings use the vocal tract to produce sounds such as talking, singing, crying, and laughing. These sounds are further classified as speech or non-speech vocalizations. Speech consists of vocalizations in the form of sentences that can be interpreted using Natural Language Processing (NLP) techniques. Non-speech sounds include laughs, sneezes, coughs, snores, and screams. These non-speech vocalizations are sometimes segregated from speech signals to extract additional information about the context, situation, or emotional state of the speaker.

A scream is a non-speech signal produced by a loud vocalization in which air passes through the vocal folds with greater force than in regular vocalizations. Most often, a scream is a reflex action or a response to an unexpected situation, and it is strongly associated with the emotional behavior of the speaker. It can take many forms, such as a scream of joy, danger, pain, or surprise.

Scream sound event classification and detection has a wide range of applications, which is why it has gained significant attention in the literature. Many real-life acoustic systems use scream detection in areas such as speaker identification [1], audio-surveillance systems [2], and home applications [3]. These systems use the knowledge extracted from scream detection and classification for further processing. In this field, the combination of time-frequency features with machine learning classifiers has driven recent progress. Different techniques and methodologies have been established to differentiate speech and non-speech sounds. These include Support Vector Machines [3], band-limited spectral entropy [4], Deep Neural Networks (DNN) [5], Hidden Markov Models (HMM), sound event partitioning [6], and modulation power spectrum [7].
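To make one of the features named above concrete, the following is a minimal sketch of a band-limited spectral entropy computed on a single audio frame. It uses one common formulation (normalized Shannon entropy of the power spectrum inside a frequency band) and is not necessarily the definition used in [4]. The function name, the Hann window, and the default band edges `f_lo` and `f_hi` are assumptions chosen for illustration only.

```python
import numpy as np

def band_limited_spectral_entropy(frame, sample_rate, f_lo=500.0, f_hi=5000.0):
    """Normalized Shannon entropy of the power spectrum within [f_lo, f_hi] Hz.

    Returns a value in [0, 1]: low when the band energy is concentrated in
    a few frequency bins, high when it is spread evenly across the band.
    Band edges are illustrative assumptions, not values from the reviewed work.
    """
    frame = np.asarray(frame, dtype=float)
    # Taper the frame to reduce spectral leakage before the FFT.
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Keep only the bins that fall inside the band of interest.
    band = power[(freqs >= f_lo) & (freqs <= f_hi)]
    if band.size < 2 or band.sum() <= 0.0:
        return 0.0  # Entropy is undefined for an empty or silent band.

    # Treat the normalized band power as a probability distribution.
    p = band / band.sum()
    p = p[p > 0]
    entropy = -np.sum(p * np.log(p))
    # Divide by the maximum possible entropy so the result lies in [0, 1].
    return float(entropy / np.log(band.size))
```

Applied frame by frame, a feature of this kind yields low values for strongly tonal frames and high values for noise-like frames, and the resulting sequence can be passed to any of the classifiers listed above.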
Most works on scream detection and classification emphasize a few crucial acoustic events; none covers the overall state of the art for scream classification and detection. The current work differs from preceding efforts in emphasis, correctness, and suitability. The aim of this review is to highlight the concerns and challenges of scream classification and to analyze and classify screams from a variety of perspectives. Additionally, a comparative study is presented that is based on the problem domain, sound features, and classification techniques. Using this review, one can determine the problem domains in which to focus scream-related efforts, along with the most suitable sound parameters and scream classification techniques for situation understanding.

This review is organized as follows. Section 2 covers the data collection techniques and research methodology. Section 3 contains an overview of different classes of problem domains, sound features, and classification techniques. Section 4 evaluates the various data classes and discusses the comparison and accuracy rates. Finally, Section 5 concludes with the key points of this review.

II. DATA COLLECTION

A review of 30 research articles associated with scream detection and classification in various environments is presented. Highly cited and credible publications from different digital libraries serve as the source material. A thorough analysis was performed on all the articles to make sure that their content is pertinent to the research interests. The classification problems that have hindered further development and exploration in screaming environments are discussed.