ASIAN JOURNAL OF ENGINEERING, SCIENCES & TECHNOLOGY, VOL. 2, ISSUE 1, SEPTEMBER 2017

Semantic Feature Extraction using Feed-Forward Neural Network for Music Genre Classification

Danyal Imran, Hina Wadiwala, Muhammad Atif Tahir, Muhammad Rafi

Abstract—Music genre is a conventional category that identifies a piece of music as belonging to a shared tradition or set of conventions characterized by similarities in form, style or subject matter. Traditional genre classification methods extract features and use them to predict labels. These features are independent of one another and carry no semantic meaning for the genre classification process. To obtain semantically meaningful features, this paper investigates a feed-forward neural network trained with stochastic gradient descent and back-propagation using the categorical cross-entropy loss function. The main objective is to identify complex patterns that can aid music genre classification. Experiments are performed on the AMG1608 dataset, and the results indicate significant performance gains over existing approaches.

I. INTRODUCTION

Music is a language that speaks to everyone in their own way. It varies from culture to culture and is rapidly filling online databases; it is likely that every piece of music recorded in history will eventually be available online [1]. As these databases grow enormously, considerable effort is required to retrieve music from them accurately, and so Music Information Retrieval (MIR) systems are in heavy demand. Research in MIR began around 2002, and since then many systems with different strategies have been evaluated, yet high accuracy remains hard to achieve in live systems. Current systems have not adopted an automated strategy and remain in a manual phase, where the user searches for music by title, lyrics, singer name, etc.
Since repositories are marked manually, there is much room for error that the system cannot handle, yielding negative results for the user. Development in MIR systems has targeted objectives such as classification, clustering, tagging, ranking or annotation using a mainstream strategy: extract low-level descriptors, then generate mid-level, high-level or hand-crafted descriptors, which are summarized as feature vectors to perform the task. Music is a mixture of art, concepts, traditions, instruments and melody, and therefore poses a challenge for automatic genre classification [2]. Low-level and generated descriptors alone are not enough to classify music accurately, since systems based on them have reached a performance plateau. The transition in MIR systems came with advances in deep architecture models that superseded this naive three-step strategy [3]. The advantage deep architectures offer is that semantic features can be generated and learned from a combination of many descriptors to make better predictions. Recent studies on deep architectures [4], [5], [6], [7] have shown advantages over other MIR systems.

In this paper, we present a feed-forward neural network that learns semantic features to predict music genre. The aim of the model is to analyze semantic features to improve music genre classification while minimizing performance loss on other datasets, using stochastic gradient descent and the back-propagation algorithm. Music data is passed to an extractor that produces low-level descriptors, which are then dimensionally reduced using a univariate feature selection algorithm: an F-test measures the degree of linear dependency between each descriptor and the genre label, capturing the statistical dependency between them.
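The univariate F-test selection step described above can be sketched with scikit-learn's `SelectKBest` and `f_classif`; the feature matrix, labels, and the choice of 25 retained descriptors here are illustrative assumptions, not the paper's actual configuration.

```python
# Sketch of univariate feature selection with an ANOVA F-test (scikit-learn).
# X and y are synthetic stand-ins for low-level descriptors and genre labels.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 40))      # 100 tracks, 40 low-level descriptors
y = rng.integers(0, 10, size=100)   # labels from 10 genres

# Score each descriptor independently against the genre label and keep
# the 25 most discriminative ones (k=25 is an illustrative choice).
selector = SelectKBest(score_func=f_classif, k=25)
X_reduced = selector.fit_transform(X, y)
print(X_reduced.shape)  # (100, 25)
```

Because each descriptor is scored independently of the others, this step is cheap, but it cannot capture interactions between descriptors; those are left to the network that follows.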
Once input music is transformed into semantic features through the network, these are mapped to the output layer of the model, which gives the probability of each genre; the output neuron with the maximum probability determines the genre of the music at hand. The idea of the back-propagation and stochastic gradient descent algorithms is to reduce the error/loss of the classification result by exploiting the model architecture, moving similar genres closer together while moving dissimilar genres farther apart. Experiments are performed on the AMG1608 dataset, which consists of 1000 songs, each belonging to one of 10 distinct genres.

The rest of the paper is organized as follows. Section II reviews the related work. Section III presents the feed-forward neural network model for learning semantic features, followed by experiments and results in Section IV. Section V concludes the paper.

II. LITERATURE REVIEW

Music genre classification dates back to 2002, when G. Tzanetakis and P. Cook open-sourced their work and the MARSYAS framework online [1]. The MARSYAS framework is a low-level feature extractor for audio files that has aided researchers in working with actual music files. Since then, many researchers have tried different approaches to this problem, yet there is still no deployed system that implements search by music genre, which is highly in demand among music enthusiasts. After the MARSYAS framework was open-sourced, Shen et al. [8] used a combination of acoustic features, extracted with the aid of MARSYAS, to generate a 25-dimensional reduced feature vector, which was then fed to a neural network to perform a nonlinear dimensionality reduction. A single node in the last layer of the network predicted the genre that the music corresponded to. However, several problems were faced while performing this task, such as accuracy and taxonomy [9].
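The mechanics described at the start of this section, a feed-forward pass producing softmax genre probabilities, an argmax decision, and cross-entropy loss minimized by stochastic gradient descent with back-propagation, can be sketched in plain NumPy. The layer sizes, learning rate, and synthetic data below are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal one-hidden-layer feed-forward classifier: softmax output,
# categorical cross-entropy, SGD with manual back-propagation.
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden, n_genres = 25, 16, 10
X = rng.normal(size=(200, n_features))      # synthetic reduced descriptors
y = rng.integers(0, n_genres, size=200)     # synthetic genre labels

W1 = rng.normal(scale=0.1, size=(n_features, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_genres));   b2 = np.zeros(n_genres)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)    # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

lr = 0.1
for epoch in range(50):
    for i in rng.permutation(len(X)):       # stochastic: one sample per update
        x = X[i:i + 1]
        h = np.maximum(0.0, x @ W1 + b1)    # hidden "semantic" features (ReLU)
        p = softmax(h @ W2 + b2)            # genre probabilities
        t = np.zeros((1, n_genres)); t[0, y[i]] = 1.0
        # Back-propagated gradient of the cross-entropy loss.
        d2 = p - t
        d1 = (d2 @ W2.T) * (h > 0)
        W2 -= lr * (h.T @ d2); b2 -= lr * d2[0]
        W1 -= lr * (x.T @ d1); b1 -= lr * d1[0]

# The predicted genre is the output neuron with maximum probability.
probs = softmax(np.maximum(0.0, X @ W1 + b1) @ W2 + b2)
pred = probs.argmax(axis=1)
print("training accuracy:", (pred == y).mean())
```

The `p - t` gradient at the output is the standard simplification of softmax combined with cross-entropy; each SGD step nudges the weights so the probability mass of the correct genre grows at the expense of the others.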
Approaches focused on extracting acoustic features, generating hand-crafted features,

Extended Paper (ICEEST 17)