IFAC PapersOnLine 51-21 (2018) 111–116. Available online at www.sciencedirect.com. 2405-8963 © 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Peer review under responsibility of International Federation of Automatic Control. 10.1016/j.ifacol.2018.09.401

1. INTRODUCTION

Big data is a current term across almost all businesses and plays a key part in Industry 4.0. The term big data is used here to characterize data that is not just large but complex in nature (MacGregor, 2017). Furthermore, there is not just one issue with big data: the objectives differ depending on whether one is in sales, marketing, finance, manufacturing, etc., and there are many different issues to be solved (data collection, warehousing, integration, and analytics). Most of these issues must be addressed before the data can be used to extract actionable information. The focus in this paper is on the issues that one must consider to effectively extract information, and on how we use models to aid flotation operation.

To analyse historical data, one needs to make use of models, usually empirical, such as regression, data mining or latent variable models. George Box, a famous statistics professor, often said "All models are wrong but some are useful". The problem is that most people think of empirical models as if they were interchangeable, irrespective of the nature of the data or the objectives of the problem. Whether a model is useful depends upon three factors (MacGregor, 2017):
(i) The objectives of the model
(ii) The nature of the data used for the modeling
(iii) The regression method used to build the model

In terms of objectives, there are basically two major classes of models: those intended for passive use and those intended for active use.
Passive models are intended to passively observe the process in the future. Such passive applications include classification, inferential or soft sensors, and process monitoring. For such passive uses one does not need causal models; rather, one just wants to model the normal variations common to the operating process. Historical data is ideal for building such models.

Models for active use are intended to be used to actively alter the process. Such active applications include using the models to optimize or control the process, to troubleshoot process problems, or to gain causal information from the data. For active use one needs causal models. Causality implies that, for any active changes in the adjustable or manipulable variables in the process, the model will reliably predict the changes in the output of interest (MacGregor, 2017). The problem is that to guarantee causality in any set of adjustable process variables one needs to have independent variation in those variables, such as would result from a designed experiment performed on the plant. But historical plant operating data almost never contains such information; rather, most variables vary in a highly correlated manner and

Keywords: modeling, flotation, industrial

Abstract: Simulators are a powerful tool for training plant operators and for developing and testing supervisory control strategies. However, the phenomenological models describing the process are relatively complex, characterized by nonlinear relationships, and their parameters depend on many local factors, such as the plant configuration, the individual characteristics of the equipment, the availability of on-line measurements and the characteristics of the feed, among others.
In this work, the previously developed phenomenological model is adapted to the particular characteristics of the rougher circuit of an industrial flotation plant, considering its particular layout and the information available to feed the simulator. The rougher circuit consists of three lines of 8 mechanical cells processing a feed of 4000 tons/h. The new model predictions were tested for a family of feed characteristics, including variation of mineralogical species under different operating conditions. Since some variables are commonly unmeasured during operation, additional data were incorporated to improve the model predictability. The use of the simulator is illustrated in several examples, together with a discussion of the limitations of the model predictions due to some particularities found in the historical operating data. Copyright 2018 IFAC.

Adapting a Phenomenological Model of a Rougher Flotation Circuit to Industrial Historical Operating Data Base

L. Bergh*, J. Yianatos*, C. Acuña*, K. Inostroza*

* Department of Chemical Engineering, Santa Maria University, Valparaíso, Chile (Tel: 56-32-2654229; e-mail: luis.bergh@usm.cl).
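
The introduction's distinction between passive and active use of empirical models can be illustrated with a small synthetic experiment. The sketch below is not taken from the paper and does not use plant data: the process `true_process`, the variables `x1` and `x2`, and all noise levels are hypothetical choices made only for illustration. It assumes a process in which only `x1` causally affects the output, while in the "historical" data `x2` moves almost in lockstep with `x1`. A regression on `x2` alone then predicts well passively but gives a misleading answer when `x2` is moved on its own, whereas a small factorial experiment with independent variation recovers the causal effects.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_process(x1, x2, noise):
    # Hypothetical ground truth (an assumption): only x1 affects y.
    return 2.0 * x1 + 0.0 * x2 + noise

# "Historical" data: x2 tracks x1 closely, as correlated plant variables do.
n = 500
x1_hist = rng.normal(size=n)
x2_hist = x1_hist + 0.05 * rng.normal(size=n)
y_hist = true_process(x1_hist, x2_hist, 0.1 * rng.normal(size=n))

# Passive model: ordinary least squares of y on x2 only (plus intercept).
A_hist = np.column_stack([x2_hist, np.ones(n)])
slope_passive, intercept_passive = np.linalg.lstsq(A_hist, y_hist, rcond=None)[0]
resid = y_hist - A_hist @ np.array([slope_passive, intercept_passive])
r2_passive = 1.0 - resid.var() / y_hist.var()

# Active use: raise x2 by one unit while holding x1 fixed.
predicted_change = slope_passive * 1.0               # what the passive model claims
true_change = true_process(0.0, 1.0, 0.0) - true_process(0.0, 0.0, 0.0)

# Designed experiment: replicated 2x2 factorial, x1 and x2 varied independently.
levels = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
X_doe = np.tile(levels, (50, 1))
y_doe = true_process(X_doe[:, 0], X_doe[:, 1], 0.1 * rng.normal(size=len(X_doe)))
A_doe = np.column_stack([X_doe, np.ones(len(X_doe))])
coef_doe = np.linalg.lstsq(A_doe, y_doe, rcond=None)[0]  # [effect x1, effect x2, intercept]

print(f"passive fit on x2: slope={slope_passive:.2f}, R^2={r2_passive:.3f}")
print(f"change in y for +1 in x2: predicted={predicted_change:.2f}, true={true_change:.2f}")
print(f"designed experiment effects: x1={coef_doe[0]:.2f}, x2={coef_doe[1]:.2f}")
```

Under these assumptions the passive fit has a high R² on the historical data, yet it attributes to `x2` an effect that actually belongs to `x1`. This is the sense in which a model that is adequate for monitoring or soft sensing may be unsuitable for optimization or control.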