Total variation of atmospheric data: covariance minimization about objective functions to detect conditions of interest

Nicholas Hamilton
National Renewable Energy Laboratory, Golden, Colorado, USA
Correspondence: Nicholas Hamilton (nicholas.hamilton@nrel.gov)

Abstract. Identification of atmospheric conditions within a multivariable atmospheric data set is a necessary step in the validation of emerging and existing high-fidelity models used to simulate wind plant flows and operation. Most often, conditions of interest are determined as those that occur most frequently, given the need for well-converged statistics from observations against which model results are compared. Aggregation of observations without regard to covariance between time series discounts the dynamical nature of the atmosphere and is not sufficiently representative of wind plant operating conditions. Identification and characterization of continuous time periods with atmospheric conditions that have a high value for analysis or simulation sets the stage for more advanced model validation and the development of real-time control and operational strategies. The current work explores a single metric for variation of a multivariate data sample that quantifies variability within each channel as well as covariance between channels. The total variation is used to identify periods of interest that conform to desired objective functions, such as quiescent conditions, ramps or waves of wind speed, and changes in wind direction. The direct detection and classification of events or periods of interest within atmospheric data sets is vital to developing our understanding of wind plant response and to the formulation of forecasting and control models.

1 Introduction

Parsing multivariate data sets that are ever growing in size and complexity can be a daunting task for researchers seeking to identify periods or events of interest in time series data (Preston et al., 2009; Shahabi and Yan, 2003).
This is especially true for wind energy research seeking to validate high-fidelity numerical models against field observations (Barthelmie et al., 2015; Larsen et al., 2013; Sørensen and Shen, 2002). Wind plants operate continuously over time periods spanning years and across a broad range of atmospheric conditions, each of which implicitly impacts the operation of the wind plant, whether in terms of power production, operations and maintenance costs, or energy forecasting for grid integration.

Field observations of wind plants are typically collected by instrumentation mounted to wind turbines or meteorological towers (met masts) and by supervisory control and data acquisition (SCADA) systems. Wind plant data sets typically include measurements of wind speed and direction, local temperature and pressure, and wind turbine operational data, such as operational status, power production, and nacelle position. Each of the atmospheric quantities of interest may be classified as a nonhomogeneous stochastic variable, and these variables are fundamentally connected (i.e., strongly interdependent).

https://doi.org/10.5194/amt-2019-200
Preprint. Discussion started: 11 September 2019
© Author(s) 2019. CC BY 4.0 License.