Tiefu Shao
Graduate Research Assistant

Sundar Krishnamurty
Associate Professor
e-mail: skrishna@ecs.umass.edu

Department of Mechanical and Industrial Engineering, University of Massachusetts Amherst, Amherst, MA 01003

A Clustering-Based Surrogate Model Updating Approach to Simulation-Based Engineering Design

This paper addresses the critical issue of effectiveness and efficiency in simulation-based optimization using surrogate models as predictive models in engineering design. Specifically, it presents a novel clustering-based multilocation search (CMLS) procedure to iteratively improve the fidelity and efficacy of Kriging models in the context of design decisions. The application of this approach will overcome the potential drawback in surrogate-model-based design optimization, namely, that the use of surrogate models may result in suboptimal solutions due to the possible smoothing out of the global optimal point if the sampling scheme fails to capture the critical points of interest with enough fidelity or clarity. The paper details how the problem of smoothing out the best (SOB) can remain unsolved in multimodal systems, even if a sequential model updating strategy has been employed, and lead to erroneous outcomes. To overcome the SOB defect, this paper presents the CMLS method, which uses a novel clustering-based methodical procedure to screen out distinct potential optimal points for subsequent model validation and updating from a design decision perspective. It is embedded within a genetic algorithm setup to capture the buried, transient, yet inherent data pattern in the design evolution based on the principles of data mining, which is then used to improve the overall performance and effectiveness of surrogate-model-based design optimization. Four illustrative case studies, including a 21 bar truss problem, are detailed to demonstrate the application of the CMLS methodology, and the results are discussed.
DOI: 10.1115/1.2838329

Keywords: simulation-based design, optimization, surrogate model, Kriging, sequential updating, data mining, single-linkage clustering, genetic algorithm

Introduction

In simulation-based engineering design, physics-based high-fidelity numerical models that are safe to operate, easy to modify, and can be automated for design optimization purposes are typically employed [1]. The use of such numerical models, however, can be computationally intensive [2–5]. As countermeasures, cost-effective surrogate models (or metamodels) are usually employed to speed up simulation-based design optimization [5–19]. The challenge is then how to design, develop, and implement a surrogate model that is robust, reliable, and accurate.

The quality or fidelity of a surrogate model is greatly influenced by the sampling scheme employed. If the scheme is not adequate for capturing the germane features of the unknown system information, there can be serious distortion in the resulting surrogate models, leading to erroneous design outcomes. In fact, the well-known noise suppression associated with regression models, which is typically exploited in building surrogate models for stochastic systems, amounts to smoothing out data features. This could perhaps be a desired property in certain situations, but when the global optimal design point is accidentally treated as noise and smoothed out, the subsequent surrogate-model-based design optimization (SMBDO) will inevitably result in an erroneous optimal design solution [20].

Ideally, sampling schemes should be designed such that they can capture critical system features with minimum sample size. Yet, in reality, the optimality features of the studied unknown system are not known a priori.
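The smoothing-out effect described above can be illustrated with a minimal sketch. It is not the paper's method: the test function `true_response`, the eleven-point equally spaced sample, and the quadratic least-squares fit (a simple stand-in for a smoothing regression surrogate, not a Kriging model) are all assumptions chosen for illustration. The global optimum at x = 0.3 is deliberately included in the sample, yet the regression treats it much like an outlier.

```python
import numpy as np


def true_response(x):
    """Hypothetical multimodal test function (not from the paper):
    a broad bowl centred at 0.7 plus a narrow, deep well at 0.3."""
    x = np.asarray(x, dtype=float)
    return (x - 0.7) ** 2 - 2.0 * np.exp(-((x - 0.3) / 0.02) ** 2)


# Eleven equally spaced samples; the global optimum near x = 0.3 IS sampled.
x_samples = np.linspace(0.0, 1.0, 11)
y_samples = true_response(x_samples)

# Quadratic least-squares regression as an assumed smoothing surrogate.
coeffs = np.polyfit(x_samples, y_samples, deg=2)

# Dense grid search for the minimizer of the true function and of the surrogate.
x_grid = np.linspace(0.0, 1.0, 10001)
x_true = float(x_grid[np.argmin(true_response(x_grid))])
x_sur = float(x_grid[np.argmin(np.polyval(coeffs, x_grid))])

f_true = float(true_response(x_true))    # true global minimum value
f_at_sur = float(true_response(x_sur))   # true response at the surrogate optimum

print(f"true optimum:      x = {x_true:.3f}, f = {f_true:.3f}")
print(f"surrogate optimum: x = {x_sur:.3f}, true f there = {f_at_sur:.3f}")
```

In this sketch the surrogate optimum lands in the broad bowl, far from the narrow well, even though the well was sampled. This is the kind of outcome a model updating step would need to detect and correct.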
Therefore, in the absence of such system information, the most efficient sampling scheme that will lead to a perfect surrogate model cannot be known a priori, and the surrogate modeling process for SMBDO has to be achieved iteratively, with a built-in mechanism to overcome the problem of smoothing out the best (SOB).

Sampling Methods in Surrogate-Model-Based Design Optimization. The fundamental challenge to efficacious surrogate modeling for design optimization is how to construct a sufficiently high-fidelity model for finding true optimal design solutions using the least number of sample points. Built upon the classic design of experiment (DOE) sampling methods, there are two groups of methods designed for building surrogate models for the deterministic numerical models [5,21–24]: one is space filling sampling (SFS) methods or single-stage methods [21]; the other is sequential infilling sampling (SIS) methods or sequential methods [21].

SFS methods spread sample points into the entire design space "equally" for gathering maximum system information and are thus referred to as "space filling." The common features of SFS methods are the following: (1) they are independent of the features of the input-output (I/O) system; (2) they design the placement of sample points a priori, and thus do not benefit from any new finding about system features; and (3) they focus on the approximation of the system over the entire design space. Table 1 is a near-exhaustive list of SFS methods and their corresponding references. Here, the sample size is often finite and prefixed due to time, cost, or related budget constraints, and all the sample points

Contributed by the Design Theory and Methodology Committee of ASME for publication in the JOURNAL OF MECHANICAL DESIGN. Manuscript received November 16, 2006; final manuscript received September 10, 2007; published online February 28, 2008. Review conducted by Yan Jin.
Paper presented at the ASME 2006 Design Engineering Technical Conferences and Computers and Information in Engineering Conference (DETC2006), Philadelphia, PA, September 10–13, 2006.

Journal of Mechanical Design, April 2008, Vol. 130 / 041101-1. Copyright © 2008 by ASME.