BERT-based Semantic Query Graph Extraction for Knowledge Graph Question Answering

Zhicheng Liang*,†,1, Zixuan Peng†,2, Xuefeng Yang2, Fubang Zhao2, Yunfeng Liu2, and Deborah L. McGuinness1

1 Department of Computer Science, Rensselaer Polytechnic Institute, USA
2 Zhuiyi Technology, China

Abstract. Answering complex questions involving multiple entities and relations remains a challenging Knowledge Graph Question Answering (KGQA) task. To extract a Semantic Query Graph (SQG), we propose a BERT-based decoder that jointly performs multiple tasks for SQG construction: entity detection, relation prediction, output variable selection, query type classification, and ordinal constraint detection. The outputs of our model can be seamlessly integrated with downstream components (e.g., entity linking) of a KGQA pipeline to construct a formal query. Our experiments show that the proposed BERT-based semantic query graph extractor achieves better performance than traditional recurrent neural network based extractors. Meanwhile, the KGQA pipeline built on our model outperforms baseline approaches on two benchmark datasets (LC-QuAD, WebQSP) containing complex questions.

§ 1 Introduction

Semantic parsing (SP) based approaches to knowledge graph question answering (KGQA) aim at building a semantic parser that first converts natural language questions into logical forms, and then into formal queries such as SPARQL that can be executed on the underlying KG to retrieve answers. For these approaches, constructing the semantic query graph (SQG) plays a vital role. For example, the SQG of the natural language query (NLQ) "What awards have been won by the executive producer of Fraggle Rock?" involves three nodes and two labeled edges, i.e., {(Fraggle Rock, dbo:executiveProducer, ?x), (?x, dbo:award, ?uri)} if represented using triples, where ?x and ?uri are free variables.
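Once such an SQG is extracted and its entity mentions are linked to KG URIs, turning it into an executable query is a mechanical serialization step. The sketch below illustrates this for the example above; the helper name `sqg_to_sparql` and the linked URI `dbr:Fraggle_Rock` are illustrative assumptions, not part of the paper's pipeline:

```python
def sqg_to_sparql(triples, output_var):
    """Serialize a semantic query graph (a list of triple patterns)
    into a SPARQL SELECT query string. Entities are assumed to be
    already linked to KG URIs (e.g., by a downstream entity linker)."""
    patterns = " . ".join(f"{s} {p} {o}" for s, p, o in triples)
    return f"SELECT DISTINCT {output_var} WHERE {{ {patterns} }}"

# SQG for "What awards have been won by the executive producer
# of Fraggle Rock?" after entity linking (dbr:Fraggle_Rock assumed).
sqg = [
    ("dbr:Fraggle_Rock", "dbo:executiveProducer", "?x"),
    ("?x", "dbo:award", "?uri"),
]
query = sqg_to_sparql(sqg, "?uri")
```

A real pipeline would also emit prefix declarations and handle the query type and ordinal constraints mentioned above; this sketch covers only the basic graph pattern.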
The answers to this query should be the grounded KG nodes bound to the output variable ?uri. Despite some work on abstract query graph prediction [1, 8], there is yet to be an end-to-end model that jointly performs query graph identification along with entity mention detection and relation prediction. To this end, we propose a novel BERT-based neural network that extracts the SQG in an end-to-end manner for answering complex questions with multiple triple patterns. We evaluate our approach on two KGQA benchmark datasets containing complex questions. The experimental results demonstrate that a simple pipeline built on top of our proposed SQG extractor improves overall KGQA performance, outperforming the baseline approaches.

* Work partially done during an internship at Zhuiyi Technology.
† Equal contribution.
Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
§ Our code and data are available at: https://github.com/gychant/BERT-NL2SPARQL