Learning Semi-Markovian Causal Models using Experiments

Stijn Meganck¹, Sam Maes², Philippe Leray² and Bernard Manderick¹

¹ CoMo, Vrije Universiteit Brussel, Brussels, Belgium
² LITIS, INSA Rouen, St. Etienne du Rouvray, France

Abstract

Semi-Markovian causal models (SMCMs) are an extension of causal Bayesian networks for modeling problems with latent variables. However, there is a big gap between the SMCMs used in theoretical studies and the models that can be learned from observational data alone. The result of standard algorithms for learning from observations is a complete partially ancestral graph (CPAG), representing the Markov equivalence class of maximal ancestral graphs (MAGs). In MAGs not all edges can be interpreted as immediate causal relationships. In order to apply state-of-the-art causal inference techniques we need to completely orient the learned CPAG and to transform the result into an SMCM by removing non-causal edges. In this paper we combine recent work on MAG structure learning from observational data with causal learning from experiments in order to achieve that goal. More specifically, we provide a set of rules that indicate which experiments are needed in order to transform a CPAG into a completely oriented SMCM and how the results of these experiments have to be processed. We also propose an alternative representation for SMCMs that can easily be parametrised and whose parameters can be learned with classical methods. Finally, we show how this parametrisation can be used to develop methods to efficiently perform both probabilistic and causal inference.

1 Introduction

This paper discusses graphical models that can handle latent variables without explicitly modeling them quantitatively. For such problem domains, several paradigms exist, such as semi-Markovian causal models or maximal ancestral graphs.
Applying these techniques to a problem domain consists of several steps, typically: structure learning from observational and experimental data, parameter learning, probabilistic inference, and quantitative causal inference.

A problem is that each existing approach focuses on only one or a few of the steps involved in modeling a problem that includes latent variables. The goal of this paper is to investigate the integral process, from learning from observational and experimental data up to different types of efficient inference.

Semi-Markovian causal models (SMCMs) (Pearl, 2000; Tian and Pearl, 2002) are specifically suited for performing quantitative causal inference in the presence of latent variables. However, at this time no efficient parametrisation of such models is provided and there are no techniques for performing efficient probabilistic inference. Furthermore, there are no techniques for learning these models from data issued from observations, experiments or both.

Maximal ancestral graphs (MAGs), developed in (Richardson and Spirtes, 2002), are specifically suited for structure learning from observational data. In MAGs every edge depicts an ancestral relationship. However, the techniques only learn up to Markov equivalence