A Data-Driven Smart Proxy Model for a Comprehensive Reservoir Simulation

Faisal Alenezi
Department of Petroleum and Natural Gas Engineering
West Virginia University
Email: falenezi@mix.wvu.edu

Shahab Mohaghegh
Department of Petroleum and Natural Gas Engineering
West Virginia University
Email: shahab.mohaghegh@mail.wvu.edu

Abstract— Reservoir simulation is one of the most important tools for studying fluid flow behavior in oil and gas reservoirs. A simulation model is constructed from comprehensive geological information, and a comprehensive numerical reservoir model can have tens of millions of grid blocks. Running such a model for different reservoir simulation scenarios is therefore computationally expensive and time consuming. Many efforts have been made to reduce the computational cost using proxy models. A proxy model substitutes for the complex numerical simulation by producing a meaningful representation of the complex system in a very short time. Conventional proxy models are either statistical or mathematical approaches. These conventional approaches are still limited by the complexity of the reservoir and by the number of numerical simulation runs needed to build the proxy model. In this study, a smart proxy model based on artificial intelligence and data mining is presented. A grid-based smart proxy model is developed to reproduce the dynamic reservoir properties of a full-field numerical simulation in a few seconds. A comprehensive spatio-temporal database is built from the conducted numerical simulation run. The smart proxy model is trained, calibrated, and verified using data from this database. The smart proxy model is able to produce pressure and saturation at each reservoir grid block accurately and with significantly less computational time than the numerical reservoir simulation model.

Keywords—Artificial Intelligence, Data Mining, Proxy Modeling, Reservoir Simulation.

I. INTRODUCTION

The petroleum industry strives to find oil and gas reserves, develop these resources, meet the world's energy demand, and maximize profits. Reservoir simulation is one of the most important tools in the development and management of oil and gas reservoirs, and it is necessary for planning reservoir engineering strategy. The key goal of reservoir simulation is to predict the future performance of the reservoir and to find ways and means of optimizing the recovery of hydrocarbons under different operating conditions. Accurate reservoir simulation requires a comprehensive description of the reservoir properties. To date, computational science, which addresses the numerical solution of complex, multi-physics, non-linear partial differential equations, is at the forefront of engineering problem solving and optimization [1].

Due to the complexity of a reservoir, it is sometimes computationally prohibitive to develop and run numerical simulation models. The petroleum industry's investment in reservoir simulation tools is therefore expensive, and the rate of return on these investments should be calculated to maximize the benefits of reservoir simulation. Reservoir simulation proxy models are one way to increase the return on investment in reservoir simulation. Proxy modeling (also known as surrogate modeling) is a computationally inexpensive alternative to full numerical simulation in assisted history matching, production optimization, and forecasting. A proxy model is defined as a mathematical, statistical, or data-driven function that replicates the simulation model output for selected input parameters [2]. Proxy model results are not meant to mimic the numerical simulation results with 100% accuracy; rather, given the short time needed to run these models, their outputs fall within a reasonable range of error. Reducing the computational time to a few seconds makes these models efficient and attractive to reservoir engineers [3].
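The definition of a proxy as a fitted function that replicates simulator output for selected inputs can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `expensive_simulator` is a hypothetical toy function standing in for a numerical simulation run, the two inputs and their ranges are arbitrary, and the proxy is a simple least-squares quadratic fit, not the smart proxy developed in this paper. Because the toy function is itself quadratic, the fit here is nearly exact; a real simulator would leave a nonzero approximation error.

```python
import numpy as np


def expensive_simulator(x):
    """Hypothetical stand-in for one full simulation run (toy function, not a reservoir model)."""
    perm, rate = x
    return 3000.0 - 40.0 * rate + 15.0 * perm - 2.0 * perm * rate + 0.5 * rate**2


def quadratic_features(X):
    """Build the columns 1, x1, x2, x1*x2, x1^2, x2^2 for each input row."""
    X = np.atleast_2d(X)
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])


def fit_proxy(X_train, y_train):
    """Fit the quadratic coefficients by least squares and return a fast predictor."""
    coeffs, *_ = np.linalg.lstsq(quadratic_features(X_train), y_train, rcond=None)
    return lambda X: quadratic_features(X) @ coeffs


# A small set of "simulation runs" at sampled inputs (ranges are illustrative).
rng = np.random.default_rng(0)
low, high = [0.1, 1.0], [2.0, 10.0]
X_train = rng.uniform(low, high, size=(20, 2))
y_train = np.array([expensive_simulator(x) for x in X_train])

proxy = fit_proxy(X_train, y_train)

# Compare the proxy against the simulator at inputs it has not seen.
X_new = rng.uniform(low, high, size=(5, 2))
y_true = np.array([expensive_simulator(x) for x in X_new])
max_abs_error = float(np.max(np.abs(proxy(X_new) - y_true)))
print(f"max abs proxy error on unseen inputs: {max_abs_error:.2e}")
```

Once fitted, evaluating `proxy` costs one small matrix product per query, which is the source of the speedup described above; the cost that remains is the set of simulator runs needed to generate the training data.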
There are several approaches for generating proxy models. Response surface methodology (RSM), reduced order models (ROM), and reduced physics models (RPM) were the first techniques introduced in this field. The most widely used approach is response surface methodology, which consists of a group of mathematical and statistical techniques used to develop an adequate functional relationship between a response of interest and a number of associated input variables [4]. In recent years, a new technique for generating proxy models has been introduced to reservoir simulation. It is neither statistical nor mathematical; it is a smart approach based on data mining and artificial intelligence.

II. DATA MINING AND ARTIFICIAL INTELLIGENCE TECHNIQUE

The amount of data in the world is increasing dramatically. Data mining is about solving problems by analyzing and discovering the patterns already present in databases [5]. Artificial intelligence is a powerful technique that teaches machines how to process data. Data mining and artificial intelligence have been applied in the petroleum engineering field. In his series of articles in the Society of Petroleum Engineers journal, Shahab D. Mohaghegh presented three types of virtual intelligence (neural networks, genetic algorithms, and fuzzy logic) and their applications in the oil and gas industry

978-1-4673-8956-3/16/$31.00 ©2016 IEEE