ELSEVIER
Electroencephalography and clinical Neurophysiology 101 (1996) 129-144

Evaluation of the diagnostic performance of the expert EMG assistant MUNIN

Steen Andreassen a,*, Annelise Rosenfalck a, Bjørn Falck b, Kristian G. Olesen a, Stig Kjær Andersen a

a Institute for Electronic Systems, Department of Medical Informatics and Image Analysis, Aalborg University, Fredrik Bajers Vej 7D, DK-9220 Aalborg Ø, Denmark
b Department of Clinical Neurophysiology, University Hospital, Turku, Finland

Accepted for publication: 20 November 1995

Abstract

The diagnostic performance of the medical expert system MUNIN for diagnosis of neuromuscular disorders was evaluated on a set of 30 test cases. The cases were provided by 7 experienced electromyographers who were subsequently invited to participate in the evaluation. To reasonably cover the range of disorders, the electromyographers were asked to provide cases from patients with different types of muscular dystrophy, with neuromuscular transmission disorders, with motor neurone disease, and with different types of polyneuropathies. In addition, patients with a range of local neuropathies were provided. Out of the 30 cases, 11 cases were evaluated by an "almost peer review" method and the remaining 19 cases were evaluated by a "silver standard" method. The number of cases evaluated by "almost peer review" was limited to 11 due to time constraints on the evaluation procedure. During the "almost peer review," each electromyographer was asked to diagnose patients, using a vocabulary that closely resembled MUNIN's vocabulary. Subsequently, we attempted to provide a consensus diagnosis for the patients based on discussion among the participating electromyographers. The electromyographers were also asked to assess how well MUNIN had performed in each case. The remaining 19 cases were evaluated by a "silver standard" procedure, where MUNIN's diagnosis was compared to the diagnosis of the expert who provided the case.
The results indicated that MUNIN performed well, and the electromyographers considered "that MUNIN performed at the same level as an experienced neurophysiologist." In particular, it was noted that MUNIN handled cases with conflicting findings well, and that it was able to diagnose patients with multiple diseases.

Keywords: Electromyography; Computer-assisted diagnosis; Evaluation

* Corresponding author. Tel.: +45 98 158522, Ext. 4951; Fax: +45 98 154008; E-mail: sa@miba.auc.dk.

0924-980X/96/$15.00 © 1996 Elsevier Science Ireland Ltd. All rights reserved. SSDI 0013-4694(95)00252-9. EEM 92573

1. Introduction

An evaluation of the Microhuman prototype of the MUNIN expert system for electromyography (EMG) (Andreassen et al., 1992) has been carried out. The evaluation marked the end of a 5 year project partially sponsored by the Commission of the European Community through the ESPRIT programme.

We wanted to get an outside evaluation of what had been achieved during the project, and were particularly interested in obtaining an answer to the question: Does MUNIN match the diagnostic performance of an EMG expert?

Other EMG expert systems have also been evaluated. Jamieson (1990) reported that his system reached an agreement with experts in 87% of the presented cases. This performance seems comparable to the performance obtained in the evaluation of a number of other medical expert systems (Miller, 1986). In a field evaluation, PC-KANDID, a rule-based EMG expert system (Fuglsang-Frederiksen et al., 1990) obtained a 53% agreement with the experts. Interestingly, experts of different nationalities differed widely in their agreement rates. The Danish and