Discovering Names in Linked Data Datasets

Bianca Pereira 1, João C. P. da Silva 2, and Adriana S. Vivacqua 1,2

1 Programa de Pós-Graduação em Informática, 2 Departamento de Ciência da Computação
Instituto de Matemática, Universidade Federal do Rio de Janeiro, Brazil
bianca.pereira@ppgi.ufrj.br, {jcps,avivacqua}@dcc.ufrj.br

Abstract. Named Entity Recognition is one of the most common steps in natural language applications. Linked Data datasets have been presented as promising background knowledge for Named Entity Recognition algorithms due to the amount of data available and the wide variety of knowledge domains they cover. However, discovering names in Linked Data datasets is still a costly task, given the number of available datasets and the heterogeneity of the vocabularies used to describe them. In this work, we evaluate the usage of rdfs:label as a property referring to entities’ names and we describe a set of heuristics created to discover properties that identify names of named entities in Linked Data datasets.

Keywords: Named Entity, Named Entity Recognition, Linked Data

1 Introduction

Named Entity Recognition (NER) in natural language texts is one of the most common tasks in Natural Language Processing. Since the sixth Message Understanding Conference (MUC), which saw the emergence of the term "named entity" and the formalization of the NER task, the techniques for recognizing names in texts have greatly evolved. Additionally, better knowledge bases have been developed, not only for the recognition of names but also for their disambiguation.

A named entity (NE) is an entity that can be identified by a proper name [2]. Originally, NEs were instances of the person, organization, or location classes, as well as dates and numeric values. Nowadays there are many other classes that identify NEs [3] [4].

Techniques for NER range from dictionary-based approaches to rule-based or machine learning ones [10].
Over time, different knowledge bases have been used as background knowledge for the NER task: from manually created lists to datasets built from knowledge available on the Web [4]. Recently, with the emergence of databases in Linked Data format, Linked Data datasets have been presented as promising sources of NEs. The Linked Open Data cloud (LOD cloud) provides knowledge in diverse domains, covering not only the most common NE types mentioned previously but also entities in fields such as music, video, and biology, among many others.
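
To make the idea of using a Linked Data dataset as background knowledge for a dictionary-based recognizer more concrete, the following minimal sketch collects the values of rdfs:label from a handful of triples and uses them as a gazetteer. It is an illustration only, not the method evaluated in this work: the example.org URIs, the sample triples, and the helper names build_gazetteer and find_mentions are all hypothetical, and the simple case-insensitive matching is an assumption chosen for brevity.

```python
import re

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical (subject, predicate, object) triples standing in for a dataset.
TRIPLES = [
    ("http://example.org/resource/Rio_de_Janeiro", RDFS_LABEL, "Rio de Janeiro"),
    ("http://example.org/resource/Rio_de_Janeiro",
     "http://example.org/ontology/kind", "City"),
    ("http://example.org/resource/Tom_Jobim", RDFS_LABEL, "Tom Jobim"),
]


def build_gazetteer(triples, name_property=RDFS_LABEL):
    """Map each lower-cased name to the set of entity URIs that carry it."""
    gazetteer = {}
    for subject, predicate, obj in triples:
        if predicate == name_property:
            gazetteer.setdefault(obj.lower(), set()).add(subject)
    return gazetteer


def find_mentions(text, gazetteer):
    """Return (offset, surface form, candidate URIs) for each name found in text."""
    mentions = []
    for name, uris in gazetteer.items():
        # Match the name as a whole phrase, ignoring case.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            mentions.append((match.start(), match.group(0), sorted(uris)))
    return sorted(mentions)


if __name__ == "__main__":
    gazetteer = build_gazetteer(TRIPLES)
    for mention in find_mentions("Tom Jobim was born in Rio de Janeiro.", gazetteer):
        print(mention)
```

The name_property parameter is kept explicit because a dataset may describe names with a property other than rdfs:label; in that case the same sketch would need a different property URI to build a useful gazetteer.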