Performance Analysis of an Acyclic Genetic Approach to Learn Bayesian Network Structure (Student Paper)

Pankaj B. Gupta¹ and Vicki H. Allan²

¹ Microsoft Corporation, One Microsoft Way, Redmond, WA 98052, USA, pagupta@microsoft.com
² Computer Science Department, Utah State University, Logan, UT 84322, allanv@cs.usu.edu

Abstract. We introduce a new genetic algorithm approach for learning a Bayesian network structure from data. Our method is capable of learning over all node orderings and structures. Our encoding scheme is inherently acyclic and is capable of performing crossover on chromosomes with different node orders. We present an analysis of this approach using different Bayesian networks such as ASIA and ALARM. Results suggest that the method is effective. The tests we perform include varying the population size of the genetic algorithm, restricting the maximum number of parents a node can have, and learning with a fixed node order.

Keywords: Structure Learning, Bayesian Networks, Genetic Algorithms.

1 Introduction

Bayesian networks are probabilistic networks capable of representing causal relationships [6]. They are directed acyclic graphs with nodes representing the variables of a problem and edges representing the causal relations between the variables. Each node has a conditional probability distribution in which the parents of the node are the conditioning variables. Learning the structure and learning the probabilities of the nodes are treated as two separate problems, of which the former is considered to be much more challenging. In this research, we consider the problem of learning the structure of a Bayesian network.

Genetic algorithms are evolutionary algorithms for solving problems with a large solution space [14]. A possible solution of the problem is termed a chromosome. An initial population of chromosomes is generated randomly. Each chromosome is evaluated according to a fitness function.
Two chromosomes are chosen from the population at random, and they are crossed over to produce two daughter chromosomes. The population is then truncated to its original size by discarding the lowest-quality chromosomes. This process is repeated until a high-quality chromosome is generated.

In the past few years, Bayesian networks have become important in the field of artificial intelligence. Learning the structure of a Bayesian network from data is a challenging problem, known to be NP-hard [2, 7]. Since the solution space