Abstract – The purpose of this study is the efficient classification of the Cardiotocography (CTG) Data Set from the UCI Machine Learning Repository with the Extreme Learning Machine (ELM) method. The CTG Data Set contains 2126 fetal CTG signal recordings, each described by 23 real-valued features. The data carry two target class descriptions, one based on fetal heart rate and one on morphologic pattern. The classification criterion based on morphologic pattern (A-SUSP) is used in this study to offer operators better decision options. The accuracy of the ELM method is compared with previous works in the literature.

Keywords – Cardiotocography, Extreme Learning Machine, Machine Learning, Classification

I. INTRODUCTION

Fetal distress is a common term, generally used in the third trimester of pregnancy, for a condition caused by oxygen insufficiency. Fetal distress can lead to a decision for caesarean delivery to protect the fetus from fetal hypoxia and metabolic acidosis [1, 2]. Cardiotocography (CTG) is a vital monitoring tool for gynecologists in the early detection of fetal distress. CTG provides graphical recordings of the fetal heart rate and the contractions of the uterus using two transducers [3]. In recordings, the accepted baseline fetal heart rate is between 110 and 160 beats per minute according to experts [4]. Because timely interpretation of CTG recordings is critical for fetal health, the CTG data set is widely used in machine learning and classification papers [5].

Most works in the literature aim to classify fetal well-being and mostly use a two-class (pathologic – healthy) or three-class (normal – suspect – pathologic) methodology. The literature review shows that the 10-class (FHR) pattern code is rarely preferred because it is harder to classify. Huang et al. applied Discriminant Analysis (DA), Decision Tree (DT) and Artificial Neural Network (ANN) to classify the CTG data set in 2012 [6]; the reported accuracies are 82.1%, 86.36% and 97.78% respectively.
In 2012, Sundar et al. applied an ANN model to the 3-class CTG data set and reported an accuracy of 80% [7]. Yılmaz and Kılıkçıer (2013) used a Least Squares Support Vector Machine (LS-SVM) with a Binary Decision Tree (BDT) for 3-class FHR and improved the accuracy to 91.62% [8]. Ocak and Ertunç (2013) used the two-class CTG data with the Adaptive Neuro-Fuzzy Inference System (ANFIS) method; the prediction accuracies for the pathological and normal classes are 96.6% and 97.2% [9]. In 2014, Karabulut et al. compared the machine learning methods Naïve Bayes (NB), Radial Basis Function Network (RBFN), Bayesian Network (BN), SVM, ANN and DT, both without and with the AdaBoost ensemble method [10]. In that study, the accuracies of the NB, RBFN, BN and DT methods improved to 87.39%, 87.67%, 92.61% and 95.01% respectively. Using K-Nearest Neighbors (k-NN) and Random Forest (RF), Şahin and Subasi improved the prediction accuracy to 98.4% and 99.18% respectively [4]. Cömert et al. compared the performance of the ANN method with the ELM method and stated that ELM is faster and more accurate than ANN; the classification accuracy is stated as 93.42% for ELM [11]. Arif used a Random Forest classifier for the 3-class CTG data set and improved the accuracy to 93.6% [12]. Kamath and Kamat proposed applying a Random Forest model to the 10-class CTG data set; they claimed an accuracy of over 87% with 600 trees in the forest and 5 partitions [13].

II. FEATURE SELECTION

High-dimensional data can provide detailed information that eases the classification of targets, but it can also contain correlated or redundant features that overemphasize specific features and complicate classification. Therefore, the analyst should be aware of the correlation between features before applying classification methods to the dataset, to avoid the “curse of dimensionality” [14]. If dependent or irrelevant features exist, a suitable subset of features should be selected by feature selection methods to guard against over-fitting [3].
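The correlation check described above can be sketched as follows. This is a minimal illustration, not the procedure used in this study: it assumes the features are held in a NumPy array with one row per recording, and the function name `correlated_pairs`, the 0.9 threshold and the synthetic demo data are all hypothetical choices made for the example.

```python
import numpy as np

def correlated_pairs(X, threshold=0.9):
    """Return (i, j, r) for feature pairs whose |Pearson r| >= threshold.

    X has shape (n_samples, n_features). The threshold is an
    illustrative assumption, not a value taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    # Feature-by-feature Pearson correlation matrix (columns are features).
    corr = np.corrcoef(X, rowvar=False)
    n_features = corr.shape[0]
    pairs = []
    for i in range(n_features):
        for j in range(i + 1, n_features):
            r = corr[i, j]
            # Constant features yield NaN correlations; skip them.
            if np.isfinite(r) and abs(r) >= threshold:
                pairs.append((i, j, float(r)))
    return pairs

# Synthetic stand-in data (NOT the CTG data set): feature 1 is almost
# a scaled copy of feature 0, feature 2 is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X_demo = np.column_stack([a, 2.0 * a + 0.01 * rng.normal(size=200), b])

demo_pairs = correlated_pairs(X_demo)
for i, j, r in demo_pairs:
    print(f"features {i} and {j} are strongly correlated (r = {r:.3f})")
```

Pairs flagged in this way are candidates for removal or for merging through a feature selection or projection step; the threshold would need to be tuned to the data at hand.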
In the preprocessing stage of this work, the dataset was examined through the distribution and correlation of its features. As a result, two main feature selection methods are used: Principal Component Analysis (PCA) and Fisher Score (FS).

A. Fisher Score (FS)

Fisher Score is a feature selection method suited to selecting the most discriminative subset of features [15]. FS is a scoring algorithm that assigns a score to each feature and picks the Z features with the largest scores as the subset [15]. For a dataset with K samples, F features and C class labels, $(x_k, y_k)_{k=1}^{K}$ with input samples $x_k \in \mathcal{R}^F$ and class targets $y_k \in \{1, 2, \ldots, C\}$, the score assignment can be calculated with equation (1):

$$FS(x^i) = \frac{\sum_{m=1}^{C} n_m \left(\mu_m^i - \mu^i\right)^2}{\sum_{m=1}^{C} n_m \left(\sigma_m^i\right)^2} \qquad (1)$$

for $i = \{1, 2, \ldots, F\}$, where $n_m$ is the number of instances in class $m$, $\mu^i$ is the $i$-th feature's mean,

Cardiotocography Data Set Classification with Extreme Learning Machine
A. UZUN 1, E. CAPA KIZILTAS 1, E. YILMAZ 1
1 Uludağ University, Bursa/Turkey, aysenuruzun@uludag.edu.tr
1 Uludağ University, Borçelik Celik Sanayii Tic. A.S.; Bursa/Turkey, capa.eda@gmail.com
1 Uludağ University, Bursa/Turkey, ersen@uludag.edu.tr
International Conference on Advanced Technologies, Computer Engineering and Science (ICATCES’18), May 11-13, 2018 Safranbolu, Turkey