Automatic Music Transcription Using Fourier Transform for Monophonic and Polyphonic Audio File

Kelvin A. Minor 1*, Iman H. Kartowisastro 2,3

1 Computer Science Department, BINUS Graduate Program - Master of Computer Science, Bina Nusantara University, Jl. Raya Kebon Jeruk No. 27, Jakarta Barat 11530, Indonesia
2 Computer Science Department, BINUS Graduate Program - Doctor of Computer Science, Bina Nusantara University, Jl. Raya Kebon Jeruk No. 27, Jakarta Barat 11530, Indonesia
3 Computer Engineering Department, Faculty of Engineering, Bina Nusantara University, Jl. Raya Kebon Jeruk No. 27, Jakarta Barat 11530, Indonesia

Corresponding Author Email: kelvin.minor@binus.ac.id

https://doi.org/10.18280/isi.270413

Received: 6 June 2022
Accepted: 6 August 2022

ABSTRACT

A musical sheet is an important tool that enables musicians to communicate with each other and helps them learn the composition of a song. Musicians sometimes face an obstacle when they cannot find the musical sheet for a new song, or when the sheet is only available for a fee. One solution is to work out the composition of the song through music transcription. Music Transcription is the process of music information retrieval to produce musical notation. Music Transcription performed by a computational method is often called Automatic Music Transcription: an audio file is uploaded as input and a musical sheet is generated as output. The proposed method uses Note Value Detection to split the audio into windows according to the detected note values, and the Fourier Transform to identify the frequency in each window. The system is evaluated on three variables: note value, pitch accuracy, and extra notes. The study shows that note value and pitch detection give a relatively small percentage error, while extra note detection gives a relatively moderate percentage error in every polyphonic file.
Keywords: musical notation, music transcription, note value detection, Fourier Transform

1. INTRODUCTION

The music industry has become popular and continues to evolve across the world. Even in difficult situations (such as the Covid-19 pandemic), the industry has adopted different approaches, using the available technology to interact with consumers [1]. With digitalization, the distribution of music has become even broader [2]. Besides being enjoyed as audio, music can be represented on a sheet of paper. A musical sheet is an important tool that enables musicians to communicate with each other [3]. With a musical sheet, anyone can learn the composition of a song, but not every song has a musical sheet, and obtaining one may require payment.

Music Transcription is the process of music information retrieval to produce a musical sheet. By definition, Music Transcription is listening to a song and writing it down in musical notation [4]. Performing Music Transcription manually requires specific knowledge and a lot of time to translate a song into notation. Whereas conventional Music Transcription is done manually, Automatic Music Transcription (AMT) performs the same process using computational techniques.

The method proposed in this study uses the Fourier Transform for pitch detection, systematically separating the audio file into windows according to the duration of the notes. This paper describes each step of the method and the experiments in detail. The system is based on Note Value Detection and Pitch Detection. In this study, Note Value Detection uses the perceived pulse of the signal to estimate the duration of each note (its note value, or note length) in the audio file. Note Value Detection is needed because its result is used to split the signal into multiple windows, giving the system a time-ordered sequence of windows, each containing a segment of the signal.
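As an illustration of the idea outlined above, the following is a minimal sketch and not the authors' implementation. It assumes the note durations (in seconds) are already available, whereas the paper obtains them from Note Value Detection, which is not reproduced here. It also assumes a monophonic signal and simply takes the strongest FFT bin of each window as the pitch. The function names (split_by_durations, dominant_frequency, note_name, transcribe) and the A4 = 440 Hz reference are hypothetical choices made for this example.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def split_by_durations(signal, sample_rate, durations_sec):
    """Cut the signal into consecutive windows, one per note duration.

    Assumption: durations_sec is given; in the paper it would come from
    Note Value Detection.
    """
    windows, start = [], 0
    for duration in durations_sec:
        end = start + int(round(duration * sample_rate))
        windows.append(signal[start:end])
        start = end
    return windows


def dominant_frequency(window, sample_rate):
    """Return the frequency (Hz) of the strongest FFT bin in the window."""
    # A Hann taper reduces spectral leakage at the window edges.
    spectrum = np.abs(np.fft.rfft(window * np.hanning(len(window))))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / sample_rate)
    spectrum[0] = 0.0  # ignore the DC component
    return float(freqs[np.argmax(spectrum)])


def note_name(freq_hz, a4=440.0):
    """Map a frequency to the nearest equal-tempered note name (e.g. 'A4')."""
    midi = int(round(69 + 12 * np.log2(freq_hz / a4)))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def transcribe(signal, sample_rate, durations_sec):
    """Return a time-ordered list of (note name, duration in seconds)."""
    windows = split_by_durations(signal, sample_rate, durations_sec)
    return [
        (note_name(dominant_frequency(w, sample_rate)), d)
        for w, d in zip(windows, durations_sec)
    ]
```

Because the frequency resolution of each window is the sample rate divided by the window length, shorter notes are resolved more coarsely. Polyphonic input would need more than a single peak per window, which this sketch does not attempt.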
Pitch Detection in this research applies the Fast Fourier Transform to each window to obtain its frequencies, which are stored in a data collection. The final step of the method translates the data collection into musical notation for presentation.

The system is evaluated using pairs of a musical sheet and an audio file. The musical sheet, represented as an array, serves as the ground truth. The audio file serves as the input to the system, which produces a list of note names and note values. The evaluation value is obtained by comparing the ground truth with the result.

2. RELATED WORKS

In the academic field, several researchers have studied Music Transcription using well-known methods such as the Fourier Transform. The Fourier Transform can be used for melody transcription, audio remixing, karaoke, and instrument identification [5]. Seetharman et al. use the Fourier Transform to analyze the audio and apply techniques from image processing to extract the singing voice from an audio file. Similar research was performed to transform the musical signal from Gamelan into notation as a guide for playing the instrument [6].

Ingénierie des Systèmes d’Information, Vol. 27, No. 4, August, 2022, pp. 629-635
Journal homepage: http://iieta.org/journals/isi

Fitria, Suprapto, and Purnomo