Data analysis and inference for an industrial deethanizer

Francesco Corona a, Michela Mulas b, Roberto Baratti c and Jose Romagnoli d

a Dept. of Information and Computer Science, Helsinki University of Technology, P.O. Box 5400, FI-02015 HUT, Finland
b Biotechnology and Chemical Technology Dept., Helsinki University of Technology, P.O. Box 6100, FI-02015 HUT, Finland
c Dept. of Chemical Engineering and Materials, University of Cagliari, Piazza d’Armi, I-09123 Cagliari, Italy
d Cain Dept. of Chemical Engineering, Louisiana State University, South Stadium Road, LA-70803 Baton Rouge, USA

In this paper, we present an application of data-derived approaches for analyzing and monitoring an industrial deethanizer column. The discussed methods are used for visualizing process measurements, extracting operational information and designing an estimation model. Emphasis is given to modeling the data with standard paradigms such as the Self-Organizing Map (SOM) and the Multi-Layer Perceptron (MLP). The effectiveness of these data-derived techniques is validated on a full-scale application where the goal is to identify the significant operational modes and the most sensitive process variables before developing an alternative control scheme.

1. Introduction

A modern process plant is under tremendous pressure to maintain and improve product quality and profit under stringent environmental and safety constraints. For efficient operation, any decision-making action related to the plant requires knowledge of the actual state of the process. The availability of easily accessible displays and intuitive knowledge of the process states is invaluable, with immediate implications for profitability, management planning, environmental responsibility and safety. In this paper, we discuss the implementation and direct application of a strategy to model, visualize and analyze the information encoded in industrial process data.
The approach is based on a classical machine learning method for dimensionality reduction and quantization, the Self-Organizing Map, SOM (Kohonen, 2001). The SOM combines many of the main properties of other general techniques and shares many commonalities with two standard methods: Principal Components Analysis for data projection and K-means for clustering. In addition, the SOM comes with a set of tools that allow for efficient visualization of high-dimensional data. The use of the Self-Organizing Map in the exploratory stage of data analysis is discussed in (Kaski, 1997; Vesanto, 2002), and the method is widely employed in many fields.
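
To make the combination of quantization and projection concrete, the following is a minimal sketch of a SOM trained with the standard online update rule, written in plain Python. It is not the implementation used in this work: the grid size, the linear decay schedules, the function names (best_matching_unit, train_som) and the synthetic two-cluster data are all assumptions chosen for illustration. Each prototype vector quantizes a region of the input space, as in K-means, while its fixed position on the two-dimensional grid provides the low-dimensional projection.

```python
import math
import random


def best_matching_unit(protos, x):
    """Return the grid position whose prototype is closest to x (squared Euclidean)."""
    return min(protos,
               key=lambda pos: sum((a - b) ** 2 for a, b in zip(protos[pos], x)))


def train_som(data, rows=4, cols=4, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM with the online update rule.

    data: list of equal-length lists of floats.
    Returns a dict mapping grid position (row, col) to its prototype vector.
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Initialize each prototype as a copy of a randomly chosen data point.
    protos = {(r, c): list(rng.choice(data))
              for r in range(rows) for c in range(cols)}
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays to zero
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighborhood radius shrinks
            bmu = best_matching_unit(protos, x)
            for pos, w in protos.items():
                # Gaussian neighborhood measured on the grid, not in input space.
                d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return protos


# Synthetic stand-in for process measurements: two well-separated clusters
# in three dimensions (hypothetical data, not the deethanizer measurements).
data_rng = random.Random(1)
data = ([[data_rng.gauss(0.0, 0.3) for _ in range(3)] for _ in range(50)]
        + [[data_rng.gauss(5.0, 0.3) for _ in range(3)] for _ in range(50)])

protos = train_som(data)
unit_a = best_matching_unit(protos, [0.0, 0.0, 0.0])
unit_b = best_matching_unit(protos, [5.0, 5.0, 5.0])
print("Grid unit for a point near the first cluster: ", unit_a)
print("Grid unit for a point near the second cluster:", unit_b)
```

In this sketch, observations are mapped to grid units through best_matching_unit, so inspecting which units respond to which samples gives a two-dimensional view of the data. A dedicated SOM library would normally be preferred for real process data, where the variables should also be scaled before training.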