Computers and the Humanities 35: 95–121, 2001.
© 2001 Kluwer Academic Publishers. Printed in the Netherlands.

The Challenge of Optical Music Recognition

DAVID BAINBRIDGE¹ and TIM BELL²
¹ Department of Computer Science, University of Waikato, Hamilton, New Zealand (E-mail: d.bainbridge@cs.waikato.ac.nz)
² Department of Computer Science, University of Canterbury, Christchurch, New Zealand (E-mail: t.bell@cosc.canterbury.ac.nz)

Abstract. This article describes the challenges posed by optical music recognition – a topic in computer science that aims to convert scanned pages of music into an on-line format. First, the problem is described; then a generalised framework for software is presented that emphasises key stages that must be solved: staff line identification, musical object location, musical feature classification, and musical semantics. Next, significant research projects in the area are reviewed, showing how each fits the generalised framework. The article concludes by discussing perhaps the most open question in the field: how to compare the accuracy and success of rival systems, highlighting certain steps that help ease the task.

Key words: optical music recognition, musical data acquisition, document image analysis, pattern recognition

1. Introduction

Optical Music Recognition (OMR) – a computer system that can “read” printed music – has much promise: a clarinetist could scan a tune and have it transposed automatically; a soloist could have the computer play an accompaniment for rehearsal; a music editor could make corrections to an old edition using a music notation program; or a publisher could convert a piece to Braille with very little work. OMR has been the focus of international research for over three decades, and while numerous achievements have been made, there are still many challenges to be faced before it reaches its full potential.

OMR addresses the problem of musical data acquisition, the key impediment to many computer music applications. It is not, however, the only data entry method for music. The most common method for music data entry in current use combines synthesiser keyboard entry and computer keyboard entry. The musical keyboard is typically used to enter the notes by playing each voice in isolation, either in time with a metronome or using the computer keyboard to enter rhythmic information. The computer keyboard and mouse are then used to correct any mistakes and to add other notation such as lyrics, slurs, and dynamics. Music data entry in this form demands a level of skill from the keyboard player, and adding the remaining notation is time-consuming.