iJournals: International Journal of Software & Hardware Research in Engineering (IJSHRE), ISSN-2347-4890, Volume 10 Issue 12, pp 15-23, December 2022

Early detection of vocal disorders such as laryngeal cancer and dysphonia using voice analysis and machine learning

Authors: Arhaan Garg 1 ; Reetu Jain 2 ; Syed Abou Iltaf Hussain 3
Affiliation: Grade 12, Step By Step School, Noida 1 ; Supervisor, On My Own Technology Pvt. Ltd., Mumbai 2,3
Email: gargarhaan13@gmail.com 1 ; reetu.jain@onmyowntechnology.com 2 ; syed.hussai@onmyowntechnology.com 3

Abstract - Many serious throat disorders, such as laryngeal cancer, laryngitis, muscle tension dysphonia, and vocal cord paralysis, are detected only after the patient has become critically ill. These disorders can be life-threatening, as I witnessed with my uncle. Seeing the effects of laryngeal cancer, one of the most painful cancers, directed my attention to this field of study and motivated me to work on a remedy. Most of these disorders can be discovered early, because the voice begins to change at an early stage as the vocal cords become deformed. Smoking, drinking, poor eating habits, occupation, and other factors are key contributors to these problems. A change in voice is typically the first sign of all of these disorders. People, however, tend to disregard this first symptom, which allows the condition to progress. Voice irregularities, such as variations in frequency, may also be too subtle for the human ear to notice and take seriously. Voice disorders such as dysphonia and laryngeal cancer can be detected early using artificial intelligence and machine learning. I worked with Santosh Hospital to collect data and carry out background research on vocal problems and irregularities.
Throughout the process, I collected more than 100 minutes of audio data from individuals with laryngeal cancer, while also researching existing approaches for detecting voice problems, such as laryngoscopy. The project's goal is to distinguish between the voice of a healthy individual and that of a patient with a vocal cord disorder. To this end, voice recordings from healthy individuals and from patients with a vocal disorder were analysed and compared. Forty voice parameters, such as frequency, pitch, and zero crossing rate, were extracted using mel-frequency cepstral coefficients (MFCCs) and related methods such as the mel filter bank and the discrete cosine transform. A wrapper method was used to select the features most important in determining whether the patient has a vocal problem. A logistic regression model was then trained to classify each audio sample as disordered or healthy. The tool achieved an accuracy of 88%. It is intended to assist patients at an early stage, so that therapy stays simple and cancer and other critical conditions can be cured faster. It also lessens the burden on doctors and lowers medical costs, while reducing the effect of sedatives on patients. The technique is simple to use and can be made available in places that lack competent doctors and the proper equipment to detect such major voice problems.

Keywords - Voice Disorder, MFCC, Voice analysis, Machine learning, Dysphonia Detection, Laryngeal Cancer Detection

I. INTRODUCTION

The voice box (larynx), which is made up of cartilage, muscle, and mucous membranes, is located at the top of the trachea, close to the base of the tongue. The vocal cords are two pliable bands of muscle tissue at the entrance of the windpipe. Sound is produced when the vocal cords vibrate. This vibration occurs when the vocal cords are brought close together and air moves through the larynx.
The vocal cords also help close the voice box during swallowing, preventing food or liquid from being inhaled. A person may have a vocal disorder if there is a problem with the pitch, volume, tone, or other characteristics of their voice. One will not