International Journal of Science and Research (IJSR)
ISSN (Online): 2319-7064
Index Copernicus Value (2015): 78.96 | Impact Factor (2015): 6.391
Volume 6 Issue 2, February 2017
www.ijsr.net
Licensed Under Creative Commons Attribution CC BY

Text-Dependent Speaker Identification and Verification Using Hindi Database in Adverse Acoustic Condition

Shrikant Upadhyay 1, Sudhir Kumar Sharma 2, Pawan Kumar 3, Aditi Upadhyay 4

1,3 Department of Electronics & Communication Engineering, Cambridge Institute of Technology, Ranchi, Jharkhand, India
2,4 Department of Electronics & Communication Engineering, Jaipur National University, Jaipur, Rajasthan, India

Abstract: The human voice is one of the media through which a person communicates with the real world. A text-dependent voice database can be used to protect assets or privacy in many respects. Speaker identification and verification must be tested and verified so that information can be shared from remote areas to distant locations and used in applications such as account access, password verification, and PIN access. Here, a text-dependent Hindi database of the digits shunya (zero) to nau (nine) is used, and we evaluate the efficiency and error rate in adverse acoustic conditions. The original speech may be corrupted or disturbed, and a person sitting at the access point may be unable to identify the authentic user in such a situation. This paper therefore aims to identify and verify the speaker in adverse acoustic conditions for real-time applications. Different combinations of feature extraction methods are used to compute performance on the database.

Keywords: Hindi Database, Adverse Acoustic Condition, Feature Extraction, Text Dependent

1. Introduction

Speaker identification and verification is one of the major challenges in the speech domain. Solving it would help address many real-time application issues. Text-dependent samples are comparatively easy to identify and to collect into a database.
A text-dependent task involves some form of pre-determined or prompted password in order to obtain the required text. It can be used for applications such as voice mode password or signature verification. In such applications the password needs to be changed frequently, which is easily done by changing the pre-determined text. In text-dependent speaker verification, only a limited number of utterances of the fixed text are collected during the enrolment phase. Therefore, approaches based on template matching are used for pattern comparison instead of approaches based on statistics or artificial neural networks, which need a large amount of training data.

Research in speech processing and communication was, for the most part, motivated by people's desire to build mechanical models that emulate human verbal communication. Research interest in speech processing today has gone well beyond the notion of mimicking the human vocal apparatus [1]. People can reliably identify familiar voices, and about 2-3 seconds of speech is enough to identify a voice, although performance decreases for unfamiliar voices [2]. Even when the duration of the utterance is increased, accuracy decreases drastically if the utterance is played backward (which distorts timing and articulatory cues). Widely varying performance on this backward task suggested that cues to voice recognition vary from voice to voice, and that voice patterns may consist of a set of acoustic cues from which listeners select a subset to use in identifying individual voices.

Recognition often falls sharply when speakers attempt to disguise their voices [3]. This is reflected in machines, where accuracy decreases when mimics act as impostors. Humans appear to handle mimics much better than machines do, easily perceiving when a voice is being mimicked [4]. If the target (intended) voice is familiar to the listener, he or she often associates the mimic voice with it.
Certain voices are more easily mimicked than others, which lends evidence to the theory that different acoustic cues are used to distinguish different voices in real applications. Human performance in adverse conditions was also reviewed in [4], where it was reported that human listeners are adept at using various cues to verify speakers in the presence of acoustic mismatch. Speaker recognition is one area of artificial intelligence where machine performance can exceed human performance: with short test utterances and a set of N speakers, machine accuracy often exceeds that of humans [4].

2. Speaker Identification and Verification

The objective of speaker identification is to classify an unlabeled utterance as belonging to one of the N reference speakers [5]. It can be closed-set identification or open-set identification, as shown in Figure 1. In other words, the identity of the speaker is decided from the speaker's voice among a set of N speakers, i.e., one-to-many matching.

Paper ID: ART2017954
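
The one-to-many matching described in Section 2 can be illustrated with a minimal sketch. It is not the method used in this paper: it assumes each utterance has already been reduced to a fixed-length feature vector and that templates are compared by Euclidean distance, neither of which is specified in the text. The names `identify`, `verify`, `references`, and `threshold` are hypothetical. Closed-set identification always returns the nearest of the N reference speakers; the open-set variant additionally rejects the utterance when even the best match is too far away. Verification is shown as the usual one-to-one counterpart.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(test_vector, references, threshold=None):
    """One-to-many matching against N reference speakers.

    references: dict mapping speaker label -> reference feature vector.
    threshold:  None for closed-set identification; a float for open-set
                identification, where None is returned if the best match
                is farther away than the threshold.
    """
    best_label, best_dist = None, math.inf
    for label, ref in references.items():
        d = distance(test_vector, ref)
        if d < best_dist:
            best_label, best_dist = label, d
    if threshold is not None and best_dist > threshold:
        return None  # open set: utterance rejected as an unknown speaker
    return best_label


def verify(test_vector, claimed_ref, threshold):
    """One-to-one matching: accept the claim if the distance is within the threshold."""
    return distance(test_vector, claimed_ref) <= threshold


# Toy, made-up reference templates for two hypothetical speakers.
references = {"spk_a": [1.0, 2.0], "spk_b": [4.0, 6.0]}

closed = identify([10.0, 10.0], references)                    # nearest speaker, however far
open_set = identify([10.0, 10.0], references, threshold=1.0)   # rejected as unknown
accepted = verify([1.1, 2.1], references["spk_a"], threshold=0.5)
```

In this toy example the closed-set call still names the nearest speaker even though the test vector is far from both templates, which is why an open-set system needs a rejection threshold.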