Available online at www.sciencedirect.com
Medical Engineering & Physics 30 (2008) 419–425

Development of a (silent) speech recognition system for patients following laryngectomy

M.J. Fagan a,*, S.R. Ell b, J.M. Gilbert a, E. Sarrazin a, P.M. Chapman c

a Department of Engineering, University of Hull, UK
b Department of Otolaryngology, Hull Royal Infirmary, Hull and East Yorkshire Hospitals NHS Trust, UK
c Department of Computer Science, University of Hull, UK

Received 20 November 2006; received in revised form 2 May 2007; accepted 3 May 2007

* Corresponding author at: Centre for Medical Engineering and Technology, Department of Engineering, University of Hull, Hull HU6 7RX, UK. Tel.: +44 1482 465058; fax: +44 1482 466664. E-mail address: m.j.fagan@hull.ac.uk (M.J. Fagan).

1350-4533/$ – see front matter. doi:10.1016/j.medengphy.2007.05.003

Abstract

Surgical voice restoration post-laryngectomy has a number of limitations and drawbacks. The present gold standard involves the use of a tracheo-oesophageal fistula (TOF) valve to divert air from the lungs into the throat, which vibrates, and from this, speech can be formed. Not all patients can use these valves and those who do are susceptible to complications associated with valve failure. Thus there is still a place for other voice restoration options. With advances in electronic miniaturization and portable computing power a computing-intensive solution has been investigated. Magnets were placed on the lips, teeth and tongue of a volunteer causing a change in the surrounding magnetic field when the individual mouthed words. These changes were detected by 6 dual axis magnetic sensors, which were incorporated into a pair of special glasses. The resulting signals were compared to training data recorded previously by means of a dynamic time warping algorithm using dynamic programming. When compared to a small vocabulary database, the patterns were found to be recognised with an accuracy of 97% for words and 94% for phonemes. On this basis we plan to develop a speech system for patients who have lost laryngeal function.

© 2007 IPEM. Published by Elsevier Ltd. All rights reserved.

Keywords: Speech recognition; Rehabilitation; Laryngectomy; Magnetic sensor; Speech system

1. Introduction

Patients with laryngeal cancer, whose larynx must be removed, inevitably lose their voice. Also, as a result of surgery, the viscera involved in swallowing and breathing are separated so that the patient must breathe through their neck via a permanent tracheostomy. The three main methods used currently to restore vocal function may encounter a number of problems and limitations. Sound can be created by swallowing air and belching, forming the sound into words. This is known as ‘oesophageal speech’ and is difficult to learn, and fluent speech is impossible. Vibrating the soft tissues of the throat by an electrolarynx creates sound, which can be articulated into speech, but the voice is monotonic, ‘Dalek-like’, and can be difficult to understand. The current ‘gold-standard’ method is to use a small silicone tracheo-oesophageal fistula speech valve that connects the trachea and the oesophagus [1]. Air, powered by the lungs, is diverted through the fistula into the throat which vibrates, and this is formed into speech. However, although these valves work very well initially, they rapidly become colonised by biofilm in many patients and fail after an average of only 3–4 months [2–5]. Various modifications have been tried over the years to discourage biofilm growth (e.g. [6–8]), but to date none of these approaches appears to provide a long-term solution to this problem.

Thus there is a need for a fundamental improvement in the current methods for the restoration of speech after laryngectomy.

Digital (voiced) speech recognition systems have been the subject of research for a number of years, based on measurement of sound emitted by the speaker [9] and a variety of