Journal of Advances in Health 2019;1(2):70–84
doi:10.3724/SP.J.2640-8686.2019.0063
© 2019 Journal of Advances in Health | Published by WEDA

GenomicsKG: A Knowledge Graph to Visualize Poly-Omics Data

Alokkumar Jha 1*, Yasar Khan 1, Ghanshyam Verma 1, Durre Zehra 1, Qaiser Mehmood 1, Ratnesh Sahay 1, Dietrich Rebholz-Schuhmann 2, Seema Dangwal 3, Mathieu d'Aquin 1

1 Insight Centre for Data Analytics, National University of Ireland, Galway, DERI Building, Galway, Ireland; 2 Information Center for Life Sciences, University of Cologne, ZB MED, Cologne, Germany; 3 School of Medicine, Stanford University, Stanford Cardiovascular Institute, Palo Alto, CA, USA

*Address for correspondence: Alokkumar Jha, Insight Centre for Data Analytics, National University of Ireland, IDA Business Park, Lower Dangan, Galway, Ireland. Email: alokkumar.jha@insight-centre.org

Received February 28, 2019; Accepted March 17, 2019

Multi-omics data is a driving force for precision medicine: it improves patient prognosis by exposing the underlying molecular mechanisms, variations and mutations, and by supporting disease cataloging. However, the exponentially increasing volume of genomics data and its multi-dimensionality require data mining, knowledge extraction and knowledge enrichment to achieve clinically translational visualization and reporting. The critical challenge in the post-genomics era is to visualize distributed, multidimensional data and to use it for clinical decision support. GenomicsKG is a linked-data-driven integration approach in which siloed data sets are linked together. This version of GenomicsKG uses the molecular subtypes Gene Expression (GE), Copy Number Variation (CNV) and Methylation. Here, the integrated multi-omics data is built as a knowledge graph, which provides an essential tool for functional annotation.
GenomicsKG is built as a multi-layered, multi-mode, interactive visualization to handle this dimensionality. The knowledge graph models and visualizes genomics data for physicians and medical professionals, using data reduction and simplification, so that clinically actionable events can be annotated. We developed GenomicsKG over large, multi-dimensional, patient-driven cancer data cohorts, currently limited to TCGA-OV, UCS, UCEC and COSMIC. Phenotypic validation was evaluated using the Illumina Body Map and GTEx, which provide data sets covering normal tissue across various tissue types. We further provide four use cases to understand tumour classification and progression, and demonstrate significant data enrichment and simple visualization. We achieved this through knowledge enrichment on genes, which we then visualized and reported using our visualization and reporting tool.

Key Words: Cancer Genomics, Multi-Omics data, Clinical Visualization and Reporting, Linked data, Semantic web

INTRODUCTION

Cancer Genomics Molecular Subtyping

Cancer is a somatic mutation disease, and the multi-dimensional, high-performance data generated through Next Generation Sequencing (NGS) technologies has added significant distributed knowledge bases for understanding the molecular changes that explain cancer therapeutics and phenotype. Genetic signatures such as gene expression (GE), copy number variation (CNV) and DNA methylation are the essential molecular types for explaining cancer systems biology. Further, mining clinical knowledge requires an intuitive, less complex visualization method that logically presents gene signatures as a clinical decision matrix.

Clinical Genomics and Visualization

Usually, clinical visualization requires the effect of actionable events, such as mutations.1 Thus, selecting actionable events and monitoring how such events change through disease progression requires large translational and clinical cohorts, such as TCGA and ICGC.
Original article

This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work noncommercially, as long as appropriate credit is given and the new creations are licensed under the identical terms. For reprints contact: weda-h@weda-h.org

How to cite this article: Jha A, Khan Y, Verma G, Zehra D, Mehmood Q, Sahay R, Rebholz-Schuhmann D, Dangwal S, d'Aquin M. GenomicsKG: A Knowledge Graph to Visualize Poly-Omics Data. J ADV HEALTH 2019; 1(2): 70-84.

The challenge here is to represent extensive, multidimensional knowledge in a comprehensive, understandable format for physicians with a minimum number of