Towards Automated Explanation of Gene-Gene Relationships

Waclaw Kuśnierczyk,¹ Astrid Lægreid,² Agnar Aamodt¹

Keywords: microarrays, gene-gene relationships, knowledge-intensive problem solving, iterative search, public databases and tools.

1 Introduction. In recent decades, research in molecular biology has experienced several paradigm shifts that changed researchers' approach to solving particular problems in the field. The invention of microarray technology in the 1990s can certainly be seen as one of those paradigm shifts [3]. However, although the first reports have appeared of efforts to combine various sources of genomic data instead of investigating just one (e.g., [2, 4]), the understanding and interpretation of the results, the key issue in any attempt at discovery, is still left entirely to the human researcher. Microarray data represent a reasonable source of new hypotheses on gene function and between-gene relationships. To fully understand and explain these hypotheses, biological background information from many resources has to be explored and combined.

We propose a novel method intended to aid a researcher in understanding hypothetical relationships between genes, e.g., genes not previously known to be related. To justify a tentative link between genes, their sequences, promoter regions, protein structure, function, and other properties may have to be investigated, possibly along with those of other related genes. Although a manual search for information that would link two genes is theoretically possible, in practice it can be a very tedious task. Our approach is an attempt to design and implement a high-level wrapper for existing databases and tools, providing an automated process of forming relevant human-readable explanations.
The proposed solution draws on the achievements of research in artificial intelligence: knowledge representation and knowledge-intensive reasoning, non-deductive inference mechanisms, and machine learning [1].

¹ Department of Information and Computer Science, Norwegian University of Science and Technology, Sem Sælandsv. 7, 7491 Trondheim, Norway. E-mail: {waku,agnar}@idi.ntnu.no
² Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Olav Kyrresg. 3, 7489 Trondheim, Norway. E-mail: astrid.lagreid@medisin.ntnu.no

2 Methods. The proposed system is an intelligent interface between the user on one side and remote databases and publicly available tools on the other, enhanced by background knowledge in molecular biology. Its modular architecture is illustrated in Fig. 1. A typical question that can arise from a microarray experiment and be posed to the system is of the form "How are genes g1 and g2 related?" or "What might be the causal relationship between genes g1 and g2?" The exact syntax of the query depends on the actual implementation of the query interface module (QI). The query, translated into the internal representation language, is interpreted by the core reasoner (CR), which uses general domain knowledge (GDK) to construct an explanation chain that links the two investigated genes. The GDK is modelled as a multi-relational semantic network, a kind of ontology, where each concept and relation is represented as a