Posted on 31 Jan 2023. The copyright holder is the author/funder. All rights reserved. No reuse without permission.
https://doi.org/10.22541/au.167517066.63670454/v1
This is a preprint and has not been peer reviewed. Data may be preliminary.

REVIEW PAPER

SEMG Approach For Speech Recognition

Siddesh Shisode*, Bhavesh Mahtre*, Jeet Sikligar*, Sushil Vishwakarma*, Supriya Tupe*, Prof. Sheetal Jagtap+, and Prof. Milind Nemade+

Electronics Engineering, K J Somaiya Institute of Engineering and Information Technology (KJSIT), Mumbai, Maharashtra, 400022, India

* Equally contributing authors
+ Guide

Correspondence: Siddesh Shisode, Electronics Engineering, KJSIT, Mumbai, Maharashtra, 400022, India. Email: siddesh.shisode@somaiya.edu

Present address: Hirkani CHS, Sector - 15, Nerul, Maharashtra, India - 400706

Abbreviations: Convolutional Neural Network - CNN, Gaussian Mixture Model - GMM, Motion History Index - MHI, Principal Component Analysis - PCA, Audio Speech Recognition - ASR, Multiscale Spatial Analysis - MSA, 2-Dimensional Linear Discriminant Analysis - 2DLDA, Bidirectional Long Short-Term Memory - BLSTM

Abstract

Speech is the most familiar and habitual means of communication for most of us. People with speech disabilities often find it difficult to voice their views and are therefore at a disadvantage. This research addresses the lack of audible speech from a speech-impaired user by recognizing the intended speech with ML models such as the Gaussian Mixture Model (GMM) and the Convolutional Neural Network (CNN). With properly recorded and cleaned muscle activity from the facial muscles, it is possible to predict the words being uttered or whispered with a certain accuracy. The intended system will also include a visual aid system, which can provide better accuracy when used together with the system based on facial muscle activity. Neuromuscular signals from the speech-articulating muscles are recorded using Surface Electromyography (SEMG) sensors, and these recordings are used to train the machine learning models. In this paper we demonstrate various signals obtained through the electromyography system and show how they can be classified using machine learning models such as the Gaussian Mixture Model and the Convolutional Neural Network for the visual lip-reading system.

INTRODUCTION

This research addresses the lack of audible speech from a speech-impaired user by recognizing the intended speech with ML models. With properly recorded and cleaned muscle activity from the facial muscles, it is possible to predict the words being uttered or whispered with a certain accuracy. The neuromuscular activity is recorded with surface electrodes. The method is non-invasive, meaning that no electrodes are inserted into the body. Because the recording is non-invasive, the signal may also contain noise from other sources. This noise can be removed, but to ensure higher accuracy, a visual method that tracks what the user intends to say can also be used. The visual system will record the lip movement of the user with the help of a camera and output a list of probable