Speaker Recognition Using Sparse Representation via Superimposed Features

Yashesh Gaur, Maulik C. Madhavi, and Hemant A. Patil

Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT), Gandhinagar, Gujarat, India
{yashesh_gaur,madhavi_maulik,hemant_patil}@daiict.ac.in

Abstract. In this paper, we demonstrate the effectiveness of superimposed features for template matching-based speaker recognition using sparse representations. The principle behind our hypothesis is that if a test template lies approximately in the linear span of the training templates of the genuine class, then so does any linear combination of test templates. We introduce the notion of superimposed features for the first time. In initial trials on the TIMIT database, superimposed features reduced the complexity cost by 80 %, with a minor decrease in identification rate of 0.67 % and a minor increase in EER of 0.85 %.

Keywords: Superimposed features, sparse representations, orthogonal matching pursuit, template matching, speaker recognition.

1 Introduction

Speaker recognition is the task of recognizing a person from his or her voice with the help of machines. Depending on how feature matching is done, systems can be classified into template matching systems and probabilistic modelling systems [1]. Probabilistic modelling systems model the feature vectors with probability density functions (pdfs). The probability of a test utterance, given the speaker model, is evaluated to obtain the confidence scores [1]-[2]. Template matching techniques, on the other hand, do not involve any probabilistic measures. The features from the test utterances are considered to be some variation of the training features [1]. Template matching-based techniques are usually faster, as no probabilistic modelling is required prior to matching.
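As a rough numerical illustration of the principle stated in the abstract, the following sketch builds synthetic "templates" and checks that a linear combination of test templates lying in the span of one class's training templates also lies in that span. Everything here is an assumption made for illustration: the dimensions, the random data, the use of the mean as the linear combination, and the helper `span_residual`. Plain least squares is used in place of orthogonal matching pursuit to keep the example short, so this is not the method evaluated in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only (not taken from the paper).
dim, n_train, n_test = 20, 5, 4

# Columns are training templates of a genuine and an impostor speaker.
A_genuine = rng.standard_normal((dim, n_train))
A_impostor = rng.standard_normal((dim, n_train))

# Test templates constructed to lie exactly in the span of the genuine class.
coeffs = rng.standard_normal((n_train, n_test))
Y_test = A_genuine @ coeffs

# One linear combination of the test templates (here, simply their mean).
y_super = Y_test.mean(axis=1)

def span_residual(A, y):
    """Norm of the part of y that the columns of A cannot explain."""
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.linalg.norm(y - A @ x)

res_genuine = span_residual(A_genuine, y_super)
res_impostor = span_residual(A_impostor, y_super)
print("residual w.r.t. genuine class :", res_genuine)
print("residual w.r.t. impostor class:", res_impostor)
```

In this idealized, noise-free setting the residual against the genuine class is zero up to rounding, while the residual against the unrelated class is not. With real speech features the test templates lie in the span only approximately, so both residuals would be nonzero and only their relative size would be informative.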

Sparse representations have been used for pattern classification in [3]-[5]. A sparse representation model for speaker recognition was used in [6]. That technique was probabilistic in nature, as it used Gaussian Mixture Model (GMM) mean supervectors to model speaker characteristics. Sparse representations were invoked after the speaker characteristics had been modelled using the GMM. Recently, sparse representations were applied within a template matching technique in [7]. The benefit of this kind of technique is the reduction in complexity, as sparse representations can be used directly on the features, without any prior

P. Maji et al. (Eds.): PReMI 2013, LNCS 8251, pp. 140–147, 2013. © Springer-Verlag Berlin Heidelberg 2013