ORIGINAL ARTICLE

Machine learning approaches for sex estimation using cranial measurements

Diana Toneva 1 · Silviya Nikolova 1 · Gennady Agre 2 · Dora Zlatareva 3 · Vassil Hadjidekov 3 · Nikolai Lazarov 4,5

Received: 12 August 2020 / Accepted: 5 November 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract
The aim of the present study is to apply support vector machines (SVM) and artificial neural networks (ANN) as sex classifiers and to generate useful classification models for sex estimation based on cranial measurements. In addition, the performance of the generated sub-symbolic machine learning models is compared with that of models developed through logistic regression (LR). The study was carried out on computed tomography images of 393 Bulgarian adults (169 males and 224 females). The three-dimensional coordinates of 47 landmarks were acquired and used for calculation of the cranial measurements. A total of 64 measurements (linear distances, angles, triangle areas and heights) and 22 indices were calculated. Two datasets were assembled, the first including the linear measurements only and the second including all measurements and indices. An additional third dataset comprising all possible interlandmark distances between the landmarks was constructed. Two machine learning algorithms (SVM and ANN) and a traditional statistical analysis (LR) were applied to generate models for sex estimation. In addition, two advanced attribute selection techniques (Weka BestFirst and Weka GeneticSearch) were used. The classification accuracy of the models was evaluated by means of a 10 × 10-fold cross-validation procedure. All three methods achieved accuracy results higher than 95%. The best accuracy (96.1 ± 0.5%) was obtained by SVM and it was statistically significantly higher than the best results achieved by ANN and LR.
SVM and ANN reached higher accuracy by training on the full datasets than on the selection datasets, except for the sample described by the interlandmark distances, where the reduction of attributes by the GeneticSearch algorithm improved the accuracy.

Keywords Machine learning · Artificial neural network · Support vector machine · Sex estimation · Cranial measurements · Computed tomography

Introduction

Sex estimation is one of the main steps in the forensic analysis of unknown human bone remains. The accurate estimation of sex is very important, since it is related to the estimation of other characteristics of the biological profile such as age and stature [1]. Thus, accurate sex estimation provides a basis to build an accurate biological profile; otherwise it results in an inaccurate profile which would misdirect the forensic investigation. DNA analysis is extremely valuable for positive personal identification [2]. However, an identification based on bone morphology examination is needed in cases when the DNA analysis is obstructed because of contamination, insufficient probes, scarce possibility for amplification due to temporal or environmental conditions, lack of a DNA profile for comparison either from relatives or from biological material of the potential person, etc.

* Diana Toneva (ditoneva@abv.bg)

1 Department of Anthropology and Anatomy, Institute of Experimental Morphology, Pathology and Anthropology with Museum, Bulgarian Academy of Sciences, Acad. G. Bonchev Str., Bl. 25, 1113 Sofia, Bulgaria
2 Department of Linguistic Modelling and Knowledge Processing, Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, 1113 Sofia, Bulgaria
3 Department of Diagnostic Imaging, Medical University of Sofia, 1431 Sofia, Bulgaria
4 Department of Anatomy and Histology, Medical University of Sofia, 1431 Sofia, Bulgaria
5 Department of Synaptic Signaling and Communications, Institute of Neurobiology, Bulgarian Academy of Sciences, 1113 Sofia, Bulgaria

International Journal of Legal Medicine
https://doi.org/10.1007/s00414-020-02460-4
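
As a minimal sketch of how the third dataset mentioned in the abstract (all possible interlandmark distances) could be assembled from three-dimensional landmark coordinates, the Python snippet below computes the Euclidean distance for every unordered pair of landmarks. The function names, landmark labels and coordinates are illustrative assumptions and not the authors' implementation. If all 47 landmarks are paired, the same procedure would give 47 × 46 / 2 = 1081 distances.

```python
from itertools import combinations
from math import dist


def pair_count(n_landmarks):
    """Number of unordered landmark pairs, n * (n - 1) / 2."""
    return n_landmarks * (n_landmarks - 1) // 2


def interlandmark_distances(landmarks):
    """Map each unordered pair of landmark names to its Euclidean distance.

    `landmarks` maps a landmark name to its (x, y, z) coordinates.
    Names are sorted so that each pair appears once, in a stable order.
    """
    names = sorted(landmarks)
    return {
        (a, b): dist(landmarks[a], landmarks[b])
        for a, b in combinations(names, 2)
    }


# Hypothetical landmark labels and coordinates (in mm), for illustration only.
example_landmarks = {
    "A": (0.0, 0.0, 0.0),
    "B": (3.0, 4.0, 0.0),
    "C": (0.0, 0.0, 12.0),
}

ild = interlandmark_distances(example_landmarks)
for pair, value in ild.items():
    print(pair, round(value, 3))
```

Each resulting distance would serve as one attribute of a skull in the classification dataset, which is why attribute selection becomes relevant when the number of pairs is large.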