International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February 2014
ISSN 2229-5518
IJSER © 2014, http://www.ijser.org

Audio Classification and Retrieval by Using Vector Quantization

Shruti Vaidya, Dr. Kamal Shah

Abstract: In today's world, information and its processing have become a critical aspect of how everything functions. In the early days, information was generally obtained and processed in the form of text. Today information is available in many forms, namely text, music, graphics, etc., which are easily understandable and represent information accurately. Information is first captured, and the captured information is then retrieved and analyzed as required. In this paper, the information under consideration is in audio form. We have studied feature vector extraction methods and similarity measurement techniques, and have also measured the performance parameters. We observed that the use of multiple feature vectors provides better and more accurate classification and retrieval of audio from a large database.

Index Terms: Audio, Audio Retrieval, Audio Vector Quantization, Data Compression, k-Nearest Neighbor, Precision Recall, Vector Quantization

1 INTRODUCTION

Vector Quantization (VQ) is an efficient and simple approach to data compression. Because it is simple and easy to implement, it is widely used in applications such as pattern recognition, face detection, image segmentation, speech data compression, Content Based Image Retrieval, and tumor detection. Vector quantization is a lossy compression technique. It involves three major procedures: codebook generation, encoding, and decoding. In the codebook generation process, the audio is divided into several k-dimensional training vectors. A representative codebook is then generated from these training vectors by clustering techniques.
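
As an illustration of the codebook generation step described above, the following is a minimal sketch only. The paper says only that "clustering techniques" are used, so the choice of a plain k-means style iteration here is an assumption, as are the function names `split_into_vectors` and `train_codebook`, the parameters `codebook_size`, `iterations`, and `seed`, and the synthetic sine signal used as input.

```python
import math
import random


def split_into_vectors(samples, k):
    """Cut a 1-D sample sequence into consecutive k-dimensional vectors.

    Trailing samples that do not fill a whole vector are dropped.
    """
    return [tuple(samples[i:i + k]) for i in range(0, len(samples) - k + 1, k)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, codebook_size, iterations=20, seed=0):
    """Build a codebook with a k-means style iteration (assumed technique).

    Requires codebook_size <= len(vectors).
    """
    rng = random.Random(seed)
    # Start from randomly chosen training vectors.
    codebook = rng.sample(vectors, codebook_size)
    for _ in range(iterations):
        # Assign every training vector to its nearest codeword.
        clusters = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(len(codebook)),
                          key=lambda j: squared_distance(v, codebook[j]))
            clusters[nearest].append(v)
        # Move each codeword to the centroid of its cluster;
        # an empty cluster keeps its previous codeword.
        codebook = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else old
            for members, old in zip(clusters, codebook)
        ]
    return codebook


# Hypothetical input: a synthetic sine wave standing in for audio samples.
signal = [math.sin(0.05 * n) for n in range(400)]
training_vectors = split_into_vectors(signal, k=4)
codebook = train_codebook(training_vectors, codebook_size=8)
```

Each entry of `codebook` is one k-dimensional codeword. A larger `codebook_size` generally lowers distortion at the cost of a larger index range and a slower search.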
In the encoding procedure, the original audio is divided into numerous k-dimensional vectors, and each vector is encoded as the index of a codeword using a look-up table. The encoded result is called an index table. In the decoding procedure, the receiver uses the same codebook to translate each index back into its corresponding codeword in order to rebuild the audio. A key point of vector quantization is to generate a good codebook, such that the distortion between the original and the reconstructed audio is minimal. To find the best-matched codeword in the encoder, various codebook full-search algorithms can be used [1].

2 OVERVIEW OF SYSTEM

Research to date in audio classification has tended to focus on assigning test sounds to a limited number of predefined categories such as music, applause, and speech, whereas the proposed approach describes each sound by its feature vectors. Furthermore, the proposed system allows intelligent interpretation of unseen examples, e.g. describing a door closing based on its similarity to previously seen events. A new signal can be easily classified, and related sounds can be retrieved in relation to the other sounds, as shown in Fig. 1. For instance, given an input sound of a door closing, the system would return the label "background sound" and retrieve from a database the samples most similar to it.

Fig. 1: System Overview

Shruti Vaidya is currently pursuing a Master of Engineering degree program in Information Technology at TCET, Mumbai University, India. E-mail: shrutiv01@gmail.com
Dr. Kamal Shah is currently a professor in the Master of Engineering Information Technology department at TCET, Mumbai University, India. E-mail: kamal.shah@thakureducation.org
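
The encoding and decoding procedures described in the Introduction can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names `encode`, `decode`, and `mean_squared_distortion`, the use of squared Euclidean distance as the distortion measure, and the small hand-written codebook and input vectors are all assumptions made for the example. The encoder performs a plain full search over the codebook.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(vectors, codebook):
    """Full search: map each vector to the index of its nearest codeword."""
    return [
        min(range(len(codebook)), key=lambda j: squared_distance(v, codebook[j]))
        for v in vectors
    ]


def decode(index_table, codebook):
    """Look each index up in the shared codebook to rebuild the vectors."""
    return [codebook[i] for i in index_table]


def mean_squared_distortion(original, reconstructed):
    """Average squared distance between original and rebuilt vectors."""
    total = sum(squared_distance(a, b) for a, b in zip(original, reconstructed))
    return total / len(original)


# Hypothetical 2-dimensional codebook shared by sender and receiver.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)]
# Hypothetical input vectors cut from an audio signal.
vectors = [(0.1, -0.1), (0.9, 1.2), (-1.1, 0.8)]

index_table = encode(vectors, codebook)        # [0, 1, 2]
reconstructed = decode(index_table, codebook)  # codewords, not the exact input
distortion = mean_squared_distortion(vectors, reconstructed)
```

Only `index_table` needs to be stored or transmitted. Because the reconstruction consists of codewords rather than the original vectors, `distortion` is nonzero, which reflects the lossy nature of the technique.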