Classifying Common Vulnerabilities and Exposures Database Using Text Mining and Graph Theoretical Analysis

Ferda Özdemir Sönmez

F. Ö. Sönmez, Informatics Institute, Middle East Technical University, Ankara, Turkey
e-mail: ferdaozdemir@gmail.com

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
Y. Maleh et al. (eds.), Machine Intelligence and Big Data Analytics for Cybersecurity Applications, Studies in Computational Intelligence 919, https://doi.org/10.1007/978-3-030-57024-8_14

Abstract Although the Common Vulnerabilities and Exposures (CVE) data is widely known and used to store vulnerability descriptions, it lacks enough classifiers to make it readily usable. As a result, security tests tend to focus on a few well-known vulnerabilities and leave others out. Better classification of this dataset would help in finding solutions to a larger set of vulnerabilities and exposures. In this research, the CVE data is examined in detail using both manual and computerized content analysis techniques. Graph theoretical techniques are then used to scrutinize the CVE data further. The computerized content analysis identified 94 concepts associated with the CVE records, which the author was able to relate to 11 logical groups. Using the network of relationships among these 94 concepts in the graph theoretical analysis made it possible to discover groups of content and thus groups of similar CVE items. Moreover, the absence of some expected concepts pointed to problems with CVE, such as delays in the CVE review process or CVE not being preferred by some user groups.

Keywords Content analysis · Text mining · Graph theoretical analysis · Leximancer · Pajek · CVE · Common vulnerabilities and exposures

1 Introduction

The Common Vulnerabilities and Exposures (CVE) dictionary [1], also called a dataset or database in some sources, is a large set of vulnerability and exposure data. It is considered the naming standard for vulnerabilities and exposures in numerous security-related studies, books, and articles, and by the vendors of security-related products, including Microsoft, Oracle, Apple, IBM, and many others. Despite