A Clustering-Based Surrogate Model Updating Approach to Simulation-Based Engineering Design

Tiefu Shao
Graduate Research Assistant

Sundar Krishnamurty
Associate Professor
e-mail: skrishna@ecs.umass.edu

Department of Mechanical and Industrial Engineering, University of Massachusetts Amherst, Amherst, MA 01003

This paper addresses the critical issue of effectiveness and efficiency in simulation-based optimization using surrogate models as predictive models in engineering design. Specifically, it presents a novel clustering-based multilocation search (CMLS) procedure to iteratively improve the fidelity and efficacy of Kriging models in the context of design decisions. The approach is intended to overcome a potential drawback of surrogate-model-based design optimization, namely, that the use of surrogate models may result in suboptimal solutions because the global optimal point can be smoothed out if the sampling scheme fails to capture the critical points of interest with enough fidelity or clarity. The paper details how the problem of smoothing out the best (SOB) can remain unsolved in multimodal systems, even if a sequential model updating strategy has been employed, and can lead to erroneous outcomes. To overcome the SOB defect, this paper presents the CMLS method, which uses a novel clustering-based methodical procedure to screen out distinct potential optimal points for subsequent model validation and updating from a design decision perspective. It is embedded within a genetic algorithm setup to capture the buried, transient, yet inherent data pattern in the design evolution based on the principles of data mining, which is then used to improve the overall performance and effectiveness of surrogate-model-based design optimization. Four illustrative case studies, including a 21-bar truss problem, are detailed to demonstrate the application of the CMLS methodology, and the results are discussed.
DOI: 10.1115/1.2838329

Keywords: simulation-based design, optimization, surrogate model, Kriging, sequential updating, data mining, single-linkage clustering, genetic algorithm

Introduction

In simulation-based engineering design, physics-based high-fidelity numerical models that are safe to operate, easy to modify, and can be automated for design optimization purposes are typically employed [1]. The use of such numerical models, however, can be computationally intensive [2–5]. As countermeasures, cost-effective surrogate models or metamodels are usually employed to speed up simulation-based design optimization [5–19]. The challenge is then how to design, develop, and implement a surrogate model that is robust, reliable, and accurate.

The quality or fidelity of a surrogate model is greatly influenced by the sampling scheme employed. If the scheme is not adequate for capturing the germane features of the unknown system information, there can be serious distortion in the resulting surrogate models, leading to erroneous design outcomes. In fact, the well-known noise suppression associated with regression models, which is typically exploited in building surrogate models for stochastic systems, amounts to smoothing out data features. This may be a desired property in certain situations, but when the global optimal design point is accidentally treated as noise and smoothed out, the subsequent surrogate-model-based design optimization (SMBDO) will inevitably result in an erroneous optimal design solution [20]. Ideally, sampling schemes should be designed such that they can capture critical system features with minimum sample size. Yet, in reality, the optimality features of the studied unknown system are not known a priori.
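The smoothing-out effect described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-dimensional test function `true_response` (a broad bowl with a narrow, deep dip at x = 0.2), the 11-point uniform sample, and the use of a quadratic least-squares regression as the surrogate (the paper itself works with Kriging models). The sample does hit the dip, but the regression treats that single low value as noise, so the surrogate's minimizer lands in the broad bowl instead.

```python
import math

def true_response(x):
    """Hypothetical test function: broad bowl plus a narrow, deep dip at x = 0.2."""
    return (x - 0.7) ** 2 - math.exp(-((x - 0.2) / 0.02) ** 2)

def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 via the 3x3 normal equations."""
    A = [[sum(x ** (i + j) for x in xs) for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    # Forward elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, 3):
            factor = A[row][col] / A[col][col]
            for k in range(col, 3):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, 3))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs

def argmin_on_grid(func, n=2001):
    """Brute-force minimizer on a uniform grid over [0, 1]."""
    grid = [i / (n - 1) for i in range(n)]
    return min(grid, key=func)

# Assumed sampling plan: 11 equally spaced points, one of which hits the dip.
sample_x = [i / 10 for i in range(11)]
sample_y = [true_response(x) for x in sample_x]

c0, c1, c2 = fit_quadratic(sample_x, sample_y)

def surrogate(x):
    """Smoothing regression surrogate built from the sample."""
    return c0 + c1 * x + c2 * x * x

x_true = argmin_on_grid(true_response)
x_sur = argmin_on_grid(surrogate)
print("true optimum near x =", x_true, "with response", true_response(x_true))
print("surrogate optimum near x =", x_sur, "with response", true_response(x_sur))
```

Evaluating the true response at the surrogate's minimizer, as the last line does, is the kind of validation check that exposes the discrepancy; without it, an optimizer working only on the surrogate would report the wrong design.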
Therefore, in the absence of such system information, the most efficient sampling scheme that will lead to a perfect surrogate model cannot be known a priori, and the surrogate modeling process for SMBDO has to be carried out iteratively, with a built-in mechanism to overcome the problem of smoothing out the best (SOB).

Sampling Methods in Surrogate-Model-Based Design Optimization. The fundamental challenge to efficacious surrogate modeling for design optimization is how to construct a sufficiently high-fidelity model for finding true optimal design solutions using the least number of sample points. Built upon the classic design of experiment (DOE) sampling methods, there are two groups of methods designed for building surrogate models for deterministic numerical models [5,21–24]: one is space filling sampling (SFS) methods, or single-stage methods [21]; the other is sequential infilling sampling (SIS) methods, or sequential methods [21].

SFS methods spread sample points "equally" over the entire design space to gather maximum system information and are thus referred to as "space filling." The common features of SFS methods are the following: (1) they are independent of the features of the input-output (I/O) system; (2) they design the placement of sample points a priori and thus do not benefit from any new finding about system features; and (3) they focus on the approximation of the system over the entire design space. Table 1 is a near-exhaustive list of SFS methods and their corresponding references. Here, the sample size is often finite and fixed in advance due to time, cost, or related budget constraints, and all the sample points

Contributed by the Design Theory and Methodology Committee of ASME for publication in the JOURNAL OF MECHANICAL DESIGN. Manuscript received November 16, 2006; final manuscript received September 10, 2007; published online February 28, 2008. Review conducted by Yan Jin.
Paper presented at the ASME 2006 Design Engineering Technical Conferences and Computers and Information in Engineering Conference (DETC2006), Philadelphia, PA, September 10–13, 2006.

Journal of Mechanical Design, April 2008, Vol. 130 / 041101-1
Copyright © 2008 by ASME
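As a rough illustration of the space-filling idea discussed in this section, the sketch below generates a Latin hypercube design in pure Python. Latin hypercube sampling is chosen here only as one familiar single-stage design; Table 1 is not reproduced in this excerpt, so that choice, the function name `latin_hypercube`, the 8-point 2-variable setup, and the fixed seed are all assumptions. The sketch reflects the three SFS features listed in the text: the placement never looks at the system response, it is fixed before any simulation is run, and it covers the whole (unit) design space.

```python
import random

def latin_hypercube(n_points, n_dims, rng):
    """Return n_points samples in [0, 1)^n_dims with exactly one point
    per stratum (bin of width 1/n_points) in every dimension."""
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        # Place one point uniformly at random inside each stratum.
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]

# Assumed setup: 8 sample points for 2 design variables scaled to [0, 1).
rng = random.Random(0)  # fixed seed so the design is reproducible
design = latin_hypercube(8, 2, rng)

for point in design:
    print(tuple(round(v, 3) for v in point))
```

Because the design is generated before any response is evaluated, it cannot concentrate points near a narrow optimum, which is the limitation that sequential methods are meant to address.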