Pareto-Based Multi-output Metamodeling with Active Learning

Dirk Gorissen, Ivo Couckuyt, Eric Laermans, and Tom Dhaene

Ghent University-IBBT, Dept. of Information Technology (INTEC), Gaston Crommenlaan 8, 9050 Ghent, Belgium

Abstract. When dealing with computationally expensive simulation codes or process measurement data, global surrogate modeling methods are firmly established as facilitators for design space exploration, sensitivity analysis, visualization and optimization. Popular surrogate model types include neural networks, support vector machines, and splines. In addition, the cost of each simulation mandates the use of active learning strategies where data points (simulations) are selected intelligently and incrementally. When applying surrogate models to multi-output systems, the hyperparameter optimization problem is typically formulated in a single-objective way. The different response outputs are modeled separately by independent models. Instead, a multi-objective approach would benefit the domain expert by giving information about output correlation, facilitating the generation of diverse ensembles, and enabling automatic model type selection for each output on the fly. This paper outlines a multi-objective approach to surrogate model generation, including its application to two problems.

1 Introduction

Despite the rapid advances in High Performance Computing and multi-core architectures, it is rarely feasible to explore a design space using high-fidelity computer simulations. As a result, data-based surrogate models (otherwise known as metamodels or response surface models) have become a standard technique to reduce this computational burden and enable routine tasks such as visualization, design space exploration, prototyping, sensitivity analysis, and optimization. It is important to stress first that this paper is concerned with fully reproducing the simulator behavior with a global model.
The use of metamodels to approximate the costly function for optimization (Metamodel Assisted Optimization) is not our goal. Our objective is to construct a high-fidelity approximation model that is as accurate as possible over the complete design space of interest using as few simulation points as possible (i.e., active learning). This model can then be reused in other stages of the engineering design pipeline, for example as a cheap, accurate replacement model in design software packages (e.g., ADS Momentum).

In engineering design, simulators are typically modeled on a per-output basis. Each output is modeled independently using separate models (though possibly sharing the same data). Instead, the system may be modeled directly using multi-objective algorithms while maintaining the tie-in with active learning (classically a fixed data set is

D. Palmer-Brown et al. (Eds.): EANN 2009, CCIS 43, pp. 389–400, 2009. © Springer-Verlag Berlin Heidelberg 2009