Semantic Question Answering System over Linked Data using Relational Patterns

Sherzod Hakimov, Hakan Tunc, Marlen Akimaliev, Erdogan Dogdu
TOBB University of Economics and Technology, Ankara, Turkey
{hakimov, hakantunc, makimaliev, edogdu}@etu.edu.tr

ABSTRACT
Question answering is the task of answering questions posed in natural language. The Linked Data project and the Semantic Web community have made it possible to query structured knowledge bases such as DBpedia and YAGO. However, only expert users with knowledge of RDF and ontology definitions can build correct SPARQL queries to query these knowledge bases formally. In this paper, we present a method for mapping natural language questions to ontology-based structured queries in order to retrieve direct answers from open knowledge bases (linked data). Our tool translates natural language questions into RDF triple patterns using the dependency tree of the question text. In addition, our method uses relational patterns extracted from the Web. We tested our tool using questions from QALD-2, the Question Answering over Linked Data challenge track, and obtained promising preliminary results.

Categories and Subject Descriptors
H.5.2 [Information Systems]: User interfaces - natural language, Theory and methods; I.2.1 [Artificial Intelligence]: Natural language interfaces.

General Terms
Algorithms, Performance, Design, Experimentation.

Keywords
Question Answering, Pattern Extraction, Linked Data, Semantic Web.

1. INTRODUCTION
Recently, question answering on the Web has gained momentum due to large structured knowledge bases such as DBpedia, Freebase, and YAGO, which regularly collect information from open and ever-expanding knowledge resources such as Wikipedia.

Web of Data and Linked Data
Linked data refers to the Web of data, in contrast to the Web of documents. Linked data extends the current Web, which consists of documents and the links between documents.
In the Web of data, data elements are connected by meaningful, typed links, unlike the Web of documents, where links are only untyped references in the form of hyperlinks. Linked data is therefore more structured and machine processable; applications can traverse this Web of data, easily find useful data, and pinpoint the right information [12]. In the Web of documents, that is, the current Web, searching for information means parsing documents and looking for useful content by matching keywords and terms or by applying natural language processing techniques; thus it is a naive search. Instead, users can query structured databases such as DBpedia¹, YAGO², or Freebase³ to find exact information, such as who the president of the USA is or what the population of Italy is. However, querying these knowledge bases requires knowledge of the RDF data model and the related ontologies. For example, the above facts are represented in DBpedia as RDF triples as follows: <United_States, leaderName, Barack_Obama> and <Italy, populationCensus, “59.464.644”>. To find this information, however, a user has to know the leaderName and populationCensus properties and the corresponding entities in order to formulate the queries. If search engines or question answering systems can make use of linked data and translate natural language questions to linked data queries automatically, users can find the intended information much faster.

DBpedia
DBpedia is one of the central linked data datasets in the Linked Open Data⁴ project [13]. It is created by converting the infobox information of Wikipedia articles to the RDF data model. The latest version of DBpedia contains more than 3.77 million things, including 764,000 persons, 573,000 places, 112,000 music albums, 72,000 films, 18,000 video games, and 192,000 organizations, in 111 different languages.

In this paper, we present a method to translate natural language questions to linked data queries.
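To illustrate the kind of structured query a user would otherwise have to write by hand, the following is a minimal sketch that turns a single triple pattern with one unknown slot into a SPARQL SELECT query. It is not the tool described in this paper: the function name `triple_pattern_to_sparql` is hypothetical, and the prefix IRIs (`http://dbpedia.org/resource/` for entities, `http://dbpedia.org/property/` for properties) are assumptions about where the entity and property from the example above live in DBpedia.

```python
# Assumed DBpedia namespaces; the paper does not state which ones it uses.
PREFIXES = {
    "res": "http://dbpedia.org/resource/",
    "dbp": "http://dbpedia.org/property/",
}


def triple_pattern_to_sparql(subject, predicate, obj=None):
    """Build a SELECT query for one triple pattern.

    A slot given as None becomes a query variable (?s, ?p or ?o).
    Hypothetical helper for illustration only.
    """
    def term(value, prefix, var):
        # Unknown slots become variables, known slots become prefixed names.
        return f"?{var}" if value is None else f"{prefix}:{value}"

    s = term(subject, "res", "s")
    p = term(predicate, "dbp", "p")
    o = term(obj, "res", "o")

    variables = [t for t in (s, p, o) if t.startswith("?")]
    if not variables:
        raise ValueError("at least one slot must be unknown")

    header = "\n".join(f"PREFIX {k}: <{v}>" for k, v in PREFIXES.items())
    return f"{header}\nSELECT {' '.join(variables)} WHERE {{ {s} {p} {o} . }}"


# "Who is the leader of the United States?" as a triple pattern
# with the object left unknown.
query = triple_pattern_to_sparql("United_States", "leaderName")
print(query)
```

The resulting string could be sent to a SPARQL endpoint; the hard part, which the sketch takes as given, is knowing that the question maps to the entity United_States and the property leaderName in the first place.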
We developed a tool that takes natural language questions, converts them to SPARQL⁵ queries for DBpedia, and answers the questions automatically. We explain our approach in detail in Section 2. In Section 3 we present the evaluation results and show that the method is capable of translating natural language questions to SPARQL queries and providing an answer.

¹ http://dbpedia.org
² http://yago-knowledge.org
³ http://freebase.com
⁴ http://linkeddata.org
⁵ http://www.w3.org/TR/rdf-sparql-query

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
EDBT/ICDT '13, March 18-22, 2013, Genoa, Italy.
Copyright 2013 ACM 978-1-4503-1599-9/13/03…$15.00.

We present the related work in Section 4 and