Model–Building Algorithms for Multiobjective EDAs: Directions for Improvement

Luis Martí, Student Member, IEEE, Jesús García, Member, IEEE, Antonio Berlanga, Member, IEEE, José M. Molina, Member, IEEE

Abstract—In order to comprehend the advantages and shortcomings of each model–building algorithm, they should be tested under similar conditions and isolated from the MOEDA they are part of. In this work we assess some of the main machine learning algorithms used for, or suitable for, model–building in a controlled environment and under equal conditions. They are analyzed in terms of solution accuracy and computational complexity. To the best of our knowledge, a study like this has not been put forward before, and it is essential for understanding the nature of the model–building problem of MOEDAs and how they should be improved to achieve a quantum leap in their problem-solving capacity.

I. INTRODUCTION

Multiobjective optimization problems (MOPs) have been successfully addressed using evolutionary approaches. The resulting multiobjective optimization evolutionary algorithms (MOEAs) [?] have attracted a great deal of attention because of their practical impact and their interesting theoretical aspects. A MOP is an optimization problem with two or more functions (known as objective functions) that must be simultaneously optimized. Correspondingly, the solution is a set of trade–off points that represent equally good combinations of the values of the objective functions. MOEAs are particularly suitable for this class of problems, as they do not assume any particular form of the underlying optimal landscape, and their parallel search allows them to produce an adequate sample of the solution set. In spite of the favorable results obtained with MOEAs, some issues remain to be properly handled. One of these issues is scalability with respect to the number of objective functions.
There is some experimental evidence showing an exponential relationship between the number of objectives and the amount of resources required to correctly solve the problem (see [?] pp. 414–419). This problem does not concern MOEAs alone, as it is consistent with the curse of dimensionality [?], which affects heuristic and machine learning methods in general. Nevertheless, the sensitivity of MOEAs to this inconvenience hinders the tackling of many real–life engineering problems. There have been a number of works [?], [?], [?] directed towards reducing the number of objective functions to a minimum and, therefore, mitigating the complexity of a problem. Although these works provide a most useful tool for alleviating the burden of a given problem, they do not ultimately address the essential issue; instead, they just postpone it.

Another viable approach is to employ cutting–edge evolutionary algorithms that deal more efficiently with high–dimensional problems. Estimation of distribution algorithms (EDAs) [?] are good candidates for such a task. EDAs have been claimed to be a paradigm shift in the evolutionary computation field. They replace the application of evolutionary operators with the creation of a statistical model of the fittest elements of the population. This model is then sampled to produce new elements. The extension of EDAs to the multiobjective domain has led to what can be denominated multiobjective optimization EDAs (MOEDAs) [?]. However, although MOEDAs have yielded encouraging results, their introduction has not lived up to a priori expectations.

The authors are with the Group of Applied Artificial Intelligence, Department of Informatics, Universidad Carlos III de Madrid, Av. de la Universidad Carlos III, 22, Colmenarejo 28270, Madrid, Spain. http://www.giaa.inf.uc3m.es.
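As a concrete illustration of the model–building and sampling loop just described, the following minimal sketch implements a univariate Gaussian EDA step (in the style of continuous UMDA) on a single-objective toy problem. The function names, parameter values, and the sphere test function are our own illustrative choices, not taken from any particular MOEDA in the literature.

```python
import random

def eda_step(population, fitness, n_select, n_sample):
    """One generation of a simple univariate Gaussian EDA:
    fit a Gaussian per variable to the fittest individuals (model building),
    then draw new individuals from that model (model sampling)."""
    # Truncation selection: keep the n_select fittest individuals (minimization).
    elite = sorted(population, key=fitness)[:n_select]
    dim = len(elite[0])
    # Model building: per-variable mean and standard deviation of the elite.
    means = [sum(x[i] for x in elite) / len(elite) for i in range(dim)]
    stds = [(sum((x[i] - means[i]) ** 2 for x in elite) / len(elite)) ** 0.5
            for i in range(dim)]
    # Model sampling: new candidates replace the evolutionary operators.
    return [[random.gauss(means[i], stds[i]) for i in range(dim)]
            for _ in range(n_sample)]

# Toy run: minimize the 2-D sphere function; the model contracts toward the optimum.
random.seed(0)
pop = [[random.uniform(-5, 5) for _ in range(2)] for _ in range(50)]
sphere = lambda x: sum(v * v for v in x)
for _ in range(30):
    pop = eda_step(pop, sphere, n_select=15, n_sample=50)
best = min(sphere(x) for x in pop)
print(best)
```

A MOEDA follows the same build–sample cycle, but the model must capture the elite of a population spread along a Pareto-optimal front rather than concentrated around a single optimum, which is precisely where the choice of model–building algorithm becomes critical.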
The fact that MOEDAs have not lived up to expectations can be attributed to most of these approaches relying on features extracted from existing MOEAs, in particular the fitness assignment strategy, and on off–the–shelf machine learning methods used for model–building. Therefore, in order to achieve a substantial improvement of their results, both issues should be dissected and properly explored.

The improvement of the model–building algorithm used by MOEDAs is, in our opinion, a fertile line of research. Current MOEDAs employ machine learning algorithms that were not specifically meant for the model–building task and that, therefore, do not fully address its requirements. Reaching a rigorous understanding of the state–of–the–art in MOEDAs' model–building is hard, since each model builder is embedded in a different EDA framework. In order to comprehend the advantages and shortcomings of each algorithm, they should be tested under similar conditions and isolated from the MOEDA they are part of. In this work we assess some of the main machine learning algorithms used for, or suitable for, model–building in a controlled environment and under equal conditions. To the best of our knowledge, a study like this one has not been put forward before. The resulting knowledge will help to understand which approaches are better in each situation and might open the path to further developments in this area. In particular, we gauge the randomized leader algorithm [?], the k–means algorithm [?], the expectation–maximization algorithm [?], Bayesian networks [?] and the