Computational Geosciences
https://doi.org/10.1007/s10596-018-9765-1

ORIGINAL PAPER

Surrogate-based parameter inference in debris flow model

Maria Navarro 1 · Olivier P. Le Maître 2 · Ibrahim Hoteit 3 · David L. George 4 · Kyle T. Mandli 5 · Omar M. Knio 1

Received: 4 December 2017 / Accepted: 27 July 2018
© Springer Nature Switzerland AG 2018

Abstract
This work tackles the problem of calibrating the unknown parameters of a debris flow model, with the drawback that information regarding the treatment and processing of the experimental data is not available. In particular, we focus on the evolution over time of the flow thickness of the debris with dam-break initial conditions. The proposed methodology consists of establishing an approximation of the numerical model using a polynomial chaos expansion, which is then used in place of the original model to reduce the computational burden. The values of the parameters are then inferred through a Bayesian approach, with a particular focus on the inference discrepancies exhibited by some of the important features predicted by the model. We build the model approximation using a preconditioned non-intrusive method and show that a suitable prior parameter distribution is critical to the construction of an accurate surrogate model. The results of the Bayesian inference suggest that directly using the available experimental data could lead to incorrect conclusions, including the over-determination of parameters. To avoid such drawbacks, we propose to base the inference on a few significant features extracted from the original data. Our experiments confirm the validity of this approach and show that it does not lead to a significant loss of information. It is also computationally more efficient than the direct approach and can avoid the construction of an elaborate error model.

Keywords Bayesian inference · Polynomial chaos expansion · Debris flow · Uncertainty quantification

Maria Navarro marianj61@gmail.com
Olivier P. Le Maître olm@limsi.fr
Ibrahim Hoteit ibrahim.hoteit@kaust.edu.sa
David L. George dgeorge@usgs.gov
Kyle T. Mandli kyle.mandli@columbia.edu
Omar M. Knio omar.knio@kaust.edu.sa

1 Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
2 LIMSI-CNRS, F-91403 Orsay Cedex, Paris, France
3 Physical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
4 U.S. Geological Survey, Vancouver, WA 98683, USA
5 Department of Applied Physics and Applied Mathematics, Columbia University in the City of New York, New York, NY 10027, USA

1 Introduction

The present work explores the possibility of exploiting experimental data resulting from the measurements in [45] for the purpose of inferring selected parameters in the D-CLAW debris flow model [14, 23]. The inference focuses on a Bayesian approach, which has the advantage of providing a complete characterization of the parameters' uncertainty through their resulting posterior distribution. While conceptually simple, performing a Bayesian inference raises several computational and practical difficulties at every one of its constitutive steps. One objective of the present work is to highlight these difficulties and to underline the issues and risks of applying the Bayesian procedure naively to the available measurements.

The key issue is related to the use of complex, non-linear models such as the one used here. In the context of this study, the use of D-CLAW makes the derivation of closed-form expressions for the prediction of physical measurements difficult at best, while also being computationally complex and costly; tackling this problem is highly non-trivial. In this context, we (1) constructed a polynomial chaos expansion [15, 31] with the intention of using this expansion as a surrogate model for sampling the Bayesian