MEDINFO 2001 V. Patel et al. (Eds) Amsterdam: IOS Press © 2001 IMIA. All rights reserved 474

On Classification Capability of Neural Networks: A Case Study with Otoneurological Data

Martti Juhola a, Kati Viikki a, Jorma Laurikkala a, Ilmari Pyykkö b, Erna Kentala c

a Department of Computer and Information Sciences, 33014 University of Tampere, Finland
b Department of Otorhinolaryngology, Karolinska Institute, 17176 Stockholm, Sweden
c Department of Otorhinolaryngology, 00029 Helsinki University Central Hospital, Finland, and Vestibular Laboratory, Massachusetts Eye & Ear Infirmary, Boston, USA

Abstract

We investigated the capability of multilayer perceptron neural networks and Kohonen neural networks to distinguish difficult otoneurological diseases from one another. We found that they are efficient methods, but the distribution of a learning set should be rather uniform. It is also important that the number of learning cases is sufficient. If these two conditions are satisfied, the neural networks are about as efficient as some other machine learning methods. The conditions are known in the theory of neural networks [1,2], but are often not taken seriously in practice. Both networks performed equally well, except in the case with several input variables, where the Kohonen neural networks surpassed the perceptron.

Keywords: Machine learning; Neural networks; Perceptron networks; Kohonen networks; Classification; Otoneurology

Introduction

In medical classification with neural networks, it seems common that little attention is paid to whether the requirements imposed by the applied neural networks are actually satisfied. In the literature [3-5], such issues as a sufficiently large learning set compared to the size of the network topology and the type of data distribution were seldom considered. Problems may arise if such premises are not met.
The objective of the present study was to outline the possibilities that neural networks offer in a difficult but evidently typical classification problem, in which otoneurological (ear medicine) cases are to be assigned to the correct classes. The investigation considered a common situation in which the scarcity of data relative to the size of the neural network topology is a clear difficulty. Data collection is typically slow, not necessarily because of a shortage of patients but because of a strongly biased distribution of cases among disease classes. When some classes are relatively large and others are very small, this is a crucial problem for any machine learning method and especially for neural networks. Ideally, they require that the distribution of a learning set be uniform [1]. Further, otoneurological diseases are exceptionally challenging for medical diagnostics. Their disease profiles may resemble each other extensively, and it can be a hard diagnostic task even for an experienced otologist to differentiate between the various disease cases.

We explored what feedforward multilayer perceptron neural networks with the backpropagation learning algorithm [1] and Kohonen neural networks (or self-organising maps) [2] can achieve in these extreme circumstances with the problematic otoneurological diseases. The former is the most frequently employed type of neural network based on supervised learning, whereas the latter follows the unsupervised learning paradigm. We briefly compare the results obtained with our earlier tests with perceptron neural networks and other methods, such as decision trees, genetic algorithms and nearest neighbour searching.

Materials and Methods

Previously, we collected an otoneurological patient database [6,7], which consisted of 564 cases with confirmed diagnoses. The expert otologists of our research group inferred the diagnoses independently of any machine learning or statistical techniques.
The database was extended to incorporate 883 cases for the present investigation. The current database includes nine diseases, as listed in Table 1, from which it is seen that there is one large class, three medium-sized classes and five small ones. The database was collected at the vestibular laboratory of the Department of Otorhinolaryngology, Helsinki University Central Hospital. It incorporates 170 possible attributes. Nevertheless, only some of them are filled in for a given patient, depending on which tests were made, which issues were investigated and which symptoms were present.
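
To make the concern about a small learning set relative to the network topology concrete, the following rough sketch counts the trainable weights of a perceptron with one hidden layer and relates that count to the 883 available cases. The figures 883 cases, 170 attributes and nine classes come from the text above. The function names, the single hidden layer and the hidden layer sizes are assumptions made only for illustration, and using all 170 attributes as inputs is a hypothetical upper bound, not the configuration reported by the authors.

```python
def mlp_weight_count(n_inputs: int, n_hidden: int, n_outputs: int) -> int:
    """Count trainable weights (biases included) of a one-hidden-layer perceptron."""
    hidden_layer = (n_inputs + 1) * n_hidden    # +1 for the bias of each hidden node
    output_layer = (n_hidden + 1) * n_outputs   # +1 for the bias of each output node
    return hidden_layer + output_layer


def cases_per_weight(n_cases: int, n_inputs: int, n_hidden: int, n_outputs: int) -> float:
    """Ratio of learning cases to trainable weights; small values signal scarce data."""
    return n_cases / mlp_weight_count(n_inputs, n_hidden, n_outputs)


N_CASES = 883        # size of the extended database (from the text)
N_ATTRIBUTES = 170   # possible attributes (from the text); hypothetical upper bound on inputs
N_CLASSES = 9        # diseases in the database (from the text)

# Hidden layer sizes are hypothetical; the text does not state the topology used.
for n_hidden in (5, 10, 20):
    weights = mlp_weight_count(N_ATTRIBUTES, n_hidden, N_CLASSES)
    ratio = cases_per_weight(N_CASES, N_ATTRIBUTES, n_hidden, N_CLASSES)
    print(f"hidden={n_hidden:2d}  weights={weights:5d}  cases per weight={ratio:.2f}")
```

Under these assumptions, even a small hidden layer yields more weights than there are learning cases, which illustrates why reducing the number of input variables or enlarging the learning set matters for this kind of data.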