Challenges of the application of data-driven models for the real-time optimization of an industrial air separation plant

Dionysios P. Xenos 1,*, Olaf Kahrs 2, Matteo Cicciotti 3, Fernando Moreno Leira 4 and Nina F. Thornhill 1

Abstract— The optimization of the operation of chemical plants may require the development of mathematical models of the process units of a plant. These mathematical models can be either first-principles or data-driven models. The former can be too complex for use in optimization, especially in online applications such as real-time optimization. The latter can be developed from available measured process data. Although data-driven models offer several benefits for online applications, their development in a practical industrial implementation poses significant challenges. This paper discusses the important aspects of building data-driven models and demonstrates the effects of these models on the optimization results. It demonstrates a real-time optimization framework applied to an industrial air compressor station of an air separation plant when the models are based on operating data.

I. INTRODUCTION

In recent years, there has been much interest in reducing the energy consumption of large industrial plants. Environmental awareness, government legislation, increased competition among companies and rising energy prices are the major reasons why industrial plants need to improve their efficiency and operate closer to their optimum. Optimization tools can help plant operators reduce operational costs without purchasing new equipment, which would add to the capital cost of the initial design. These tools rely on mathematical models which represent the behaviour of the process units of the plant.
The mathematical models are usually either first-principles models or models based on raw operating data. The latter are known as data-driven, black-box, surrogate or meta models. Although a data-driven model does not have any physical meaning, it can be suitable for online optimization applications because it offers several benefits over a large and complex first-principles model, for example reduced computational time to find a solution and ease of development.

Available process measurements of the plant can be used to develop computationally cheap models tailored to optimization frameworks. However, having adequate and accurate online sensors installed is one of the major obstacles in the process industry, especially for older machines. Each technical upgrade needs to be justified by safety conditions or improved economics. This paper discusses the challenges of using data-driven models in industrial applications and presents results from an example of the optimization of a network of compressors of an air separation plant.

1 Department of Chemical Engineering, Centre for Process Systems Engineering, Imperial College London, London SW7 2AZ, UK
* d.xenos@imperial.ac.uk
2 Corporate Operational Excellence - Technical Process Optimisation, BASF SE, 67056 Ludwigshafen, Germany
3 Process Control Task Force, BASF SE, 67056 Ludwigshafen, Germany
4 Automation Technology - Advanced Process Control, BASF SE, 67056 Ludwigshafen, Germany

II. OPTIMIZATION OF COMPRESSOR STATIONS

Compressor stations comprise a network of compressors operating in parallel and can be found in process systems [1], [2], in natural gas boosting stations [3] and in conventional natural gas transportation stations [4]. Han et al. [2] reported that the energy consumed by the air and gas compressor network of a terephthalic acid manufacturing plant is estimated at approximately 75%-85% of the plant's total consumption.
Air compressors can account for more than 70% of the overall consumption of air separation plants of several tens of MW [1]. The optimization of the operation of compressor stations can lead to a significant reduction in energy cost compared to typical industrial operation with strategies such as equal split and equal surge margin. A summary of typical load sharing techniques and other best practices is given by Xenos et al. [1]. The optimization of the operation of the compressors exploits the fact that it is difficult or impossible for different compressors of a compressor station to have identical characteristics and efficiencies [5], [6]. Moreover, these characteristics and efficiencies change over time due to fouling, erosion and non-uniform maintenance plans, which result in dissimilar compressor maps for the same compressor at different time periods [3], [7].

III. ONLINE OPTIMIZATION FRAMEWORK

Paparella et al. [3] and Xenos et al. [1] presented online optimization frameworks for minimizing the power consumption of compressors in real time. The basic components of a Real Time Optimization (RTO) application for industrial multi-stage centrifugal compressors are described in [1]. The sensors of the monitoring system collect process data such as mass flow rates, pressures and temperatures. A steady-state identification algorithm examines key process variables and identifies when the operation is in steady state. Once the operation is in steady state, the collected data are validated. The next step is to use the validated data to update the models of the online compressors (parameter estimation). An optimization model employs these data-driven models and estimates the set points of the controlled variables for optimal load sharing.

European Control Conference (ECC), Aalborg, DENMARK, 29 Jun 2016 - 01 Jul 2016. 2016 EUROPEAN CONTROL CONFERENCE (ECC). IEEE. 1025-1030. 01 Jan 2015
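The sequence of steps in Section III (steady-state identification, parameter estimation, load-sharing optimization) can be illustrated with a minimal Python sketch. It is not the framework of [1] or [3]. It assumes, purely for illustration, a relative-standard-deviation test for steady state, a quadratic power-versus-flow model per compressor, two compressors in parallel, and a grid search in place of a proper optimization solver. The data validation step is omitted. All function names, tolerances, operating data and coefficients are hypothetical.

```python
from statistics import mean, pstdev


def is_steady(window, rel_tol=0.01):
    """Assumed test: the window is steady if its relative spread is small."""
    avg = mean(window)
    return avg != 0 and pstdev(window) / abs(avg) <= rel_tol


def fit_quadratic(flows, powers):
    """Least-squares fit of power = a + b*flow + c*flow**2 (normal equations)."""
    # Build the 3x3 normal-equation system A x = y for the basis [1, m, m^2].
    s = [sum(m ** k for m in flows) for k in range(5)]
    A = [[s[i + j] for j in range(3)] for i in range(3)]
    y = [sum(p * m ** i for m, p in zip(flows, powers)) for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        y[col], y[pivot] = y[pivot], y[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            A[r] = [ar - f * ac for ar, ac in zip(A[r], A[col])]
            y[r] -= f * y[col]
    # Back substitution.
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (y[i] - sum(A[i][j] * x[j] for j in range(i + 1, 3))) / A[i][i]
    return tuple(x)


def power(model, flow):
    """Evaluate the fitted quadratic power model at a given flow."""
    a, b, c = model
    return a + b * flow + c * flow ** 2


def best_split(model_1, model_2, demand, lo, hi, steps=1000):
    """Grid search over the flow of compressor 1; compressor 2 takes the rest."""
    best = None
    for k in range(steps + 1):
        m1 = lo + (hi - lo) * k / steps
        m2 = demand - m1
        if not lo <= m2 <= hi:  # respect the flow limits of compressor 2
            continue
        total = power(model_1, m1) + power(model_2, m2)
        if best is None or total < best[2]:
            best = (m1, m2, total)
    return best


# Synthetic (made-up) operating data for two dissimilar compressors.
flows = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
powers_1 = [2.0 + 0.10 * m + 0.010 * m ** 2 for m in flows]
powers_2 = [2.5 + 0.05 * m + 0.015 * m ** 2 for m in flows]

# Update the models and re-optimize only when the key variable is steady.
if is_steady([15.0, 15.1, 14.9, 15.0]):
    model_1 = fit_quadratic(flows, powers_1)
    model_2 = fit_quadratic(flows, powers_2)
    m1, m2, optimal = best_split(model_1, model_2, demand=30.0, lo=10.0, hi=20.0)
    equal = power(model_1, 15.0) + power(model_2, 15.0)
    print(f"optimal split: {m1:.2f} / {m2:.2f}, power {optimal:.3f}")
    print(f"equal split:   15.00 / 15.00, power {equal:.3f}")
```

With these invented coefficients the search settles near flows of 17 and 13 rather than the equal split of 15 each, and the total power is slightly lower. This illustrates why dissimilar compressor characteristics leave room for optimization over the equal split strategy. In a real application the grid search would give way to a constrained solver, and the models would include pressures, temperatures and surge limits.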