Adaptive Multi-Model Reinforcement Learning for Online
Database Tuning
Yaniv Gur
IBM Almaden Research Center
San Jose, CA
guryaniv@us.ibm.com
Dongsheng Yang
Princeton University
Princeton, NJ
dy5@princeton.edu
Frederik Stalschus
DHBW Stuttgart
Stuttgart, Germany
frederik.stalschus@ibm.com
Berthold Reinwald
IBM Almaden Research Center
San Jose, CA
reinwald@us.ibm.com
ABSTRACT
Mainstream DBMSs provide hundreds of knobs for performance
tuning. Tuning those knobs requires experienced database ad-
ministrators (DBA), who are often unavailable for owners of
small-scale databases, a common scenario in the era of cloud
computing. Therefore, algorithms that can automatically tune
database performance with minimal human guidance are of
increasing importance. Developing an automatic database tuner
poses a number of challenges. First, out-of-the-box machine-learning
solutions cannot be applied directly to this problem and must be
adapted to perform well on it. Second, training samples
are scarce due to the time it takes to collect each data point and
the limited accessibility to query data submitted by the database
users. Third, databases are complicated systems with unstable
performance, which leads to noisy training data. Furthermore,
in a realistic online environment, workloads can change when
users run different applications at different times. Although there
are several research projects for automatic database tuning, they
have not fully addressed this challenge, and they are mainly de-
signed for offline training where the workloads do not change.
In this paper, we tackle the challenge of online tuning in
an environment with evolving workloads by proposing a multi-model
tuning algorithm that leverages multiple Deep Deterministic Pol-
icy Gradient (DDPG) reinforcement learning models trained on
varying workloads. To evaluate our approach, we implemented
a system for tuning a PostgreSQL database. The results
show that our system automatically tunes a PostgreSQL database,
improves its performance on OLTP workloads, and adapts to
changing workloads using the multi-model approach.
1 INTRODUCTION
Modern DBMSs have hundreds of configuration knobs that affect
their performance. A DBMS that is not configured properly for
the current workload may deliver sub-optimal performance and
use system resources inefficiently, leaving hundreds of users
without the performance they need for their
applications. The role of monitoring and configuring a DBMS
was traditionally done by a database administrator (DBA), an
expert dedicated to this task. Nowadays, however, multiple DBMS
instances are deployed on the cloud and each instance can host
hundreds of databases; monitoring and configuring such a large-scale
database infrastructure would therefore require a large number of
DBAs, leading to high operating costs.

© 2021 Copyright held by the owner/author(s). Published in Proceedings of the
24th International Conference on Extending Database Technology (EDBT), March
23-26, 2021, ISBN 978-3-89318-084-4 on OpenProceedings.org.
Distribution of this paper is permitted under the terms of the Creative Commons
license CC-by-nc-nd 4.0.
Over the last few years, several database vendors have identi-
fied the potential of using machine learning to automate different
database tasks on the cloud, such as automatic indexing, configu-
ration, and provisioning. A few examples include the autonomous
database from Oracle [11] and the self-driving database from
Alibaba [1]. The study of autonomous databases using AI is a
very active research area that has already yielded a large number
of papers; the most popular machine-learning paradigm in
recent work is reinforcement learning [7, 9, 14, 18]. Born as a
machine-learning branch for solving complex control problems,
reinforcement learning is a natural choice for automatic database
tuning tasks.
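Concretely, database tuning can be framed as a reinforcement-learning loop in which the state is a vector of DBMS runtime metrics, the action is a knob configuration, and the reward reflects the resulting performance change. The sketch below is a hypothetical simplification: the knob names, the metric handling, and the `apply_knobs`/`measure_throughput` helpers are illustrative assumptions, not details from this paper.

```python
import numpy as np

# Illustrative knob set; the actual knobs and metrics a tuner uses
# are a design choice, not prescribed here.
KNOBS = ["shared_buffers_mb", "work_mem_mb", "effective_cache_size_mb"]

def observe_state(metrics: dict) -> np.ndarray:
    """Build a fixed-order state vector from DBMS runtime metrics."""
    return np.array([metrics[k] for k in sorted(metrics)], dtype=np.float32)

def tuning_step(agent, metrics, baseline, apply_knobs, measure_throughput):
    """One interaction of the RL loop: observe, act, apply knobs, get reward."""
    state = observe_state(metrics)
    action = agent.act(state)                  # continuous knob values
    apply_knobs(dict(zip(KNOBS, action)))      # reconfigure the DBMS
    throughput = measure_throughput()          # run the workload, measure
    reward = (throughput - baseline) / baseline  # relative gain (one simple choice)
    return state, action, reward
```

In a DDPG setting, the `agent.act` call would be an actor network mapping metric vectors to continuous knob values, and the returned transition would be stored in a replay buffer for training.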
One of the main challenges of operating an automatic DBMS
tuning system on the cloud is that the database environment
is dynamic: system resources, workloads, and database size
can change over the course of a day; therefore, an automatic
tuning system must be flexible enough to adapt to these changes
and provide optimal performance for a given environment state.
In this paper, we address the problem of changing workloads in
an online tuning setting, and we employ reinforcement learning
for this task. While query-aware formulations for tuning were
previously proposed [7, 16], the problem of changing workloads
in an online tuning setting was not fully addressed.
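The multi-model idea can be sketched as follows. The switching rule used here, a recent-reward window with a fixed threshold followed by a probe-based re-evaluation of all models, is an illustrative assumption and not necessarily the selection criterion used in this paper.

```python
from collections import deque

class MultiModelTuner:
    """Keeps several workload-specific models; switches when reward drops."""

    def __init__(self, models, window=10, drop_threshold=0.0):
        self.models = models            # pre-trained DDPG-style models
        self.current = 0                # index of the active model
        self.window = window
        self.drop_threshold = drop_threshold
        self.recent = deque(maxlen=window)

    def record(self, reward):
        """Record the reward observed under the active model."""
        self.recent.append(reward)

    def degraded(self):
        # Degraded if the recent average reward falls below the threshold,
        # suggesting the workload no longer matches the active model.
        return (len(self.recent) == self.window and
                sum(self.recent) / self.window < self.drop_threshold)

    def select(self, evaluate):
        """evaluate(model) -> score on the current workload (e.g. probe reward)."""
        if self.degraded():
            scores = [evaluate(m) for m in self.models]
            self.current = max(range(len(scores)), key=scores.__getitem__)
            self.recent.clear()
        return self.models[self.current]
```

Under this sketch, a workload shift shows up as a sustained reward drop, which triggers a one-off comparison of all stored models on the current workload before tuning resumes with the best one.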
Our main contributions in this paper are as follows:
• We propose a multi-model online tuning algorithm, sensi-
tive to workload changes, that leverages multiple DDPG
reinforcement learning models and selects the optimal
model for evolving workloads.
• We propose a simple reward function formulation for offline
and online tuning and show that it yields a more
stable learning curve than prior work [18].
• We demonstrate the offline and online tuning algorithms
on a PostgreSQL database and show that the performance
of the database can be significantly improved over the
baseline default performance.
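As an illustration of why the reward formulation matters, the sketch below contrasts a simple single-baseline reward with a simplified multi-term reward that also depends on the previous measurement. Both forms are assumptions for illustration; neither is the exact formula of this paper or of [18].

```python
def simple_reward(t_now: float, t_default: float) -> float:
    """Relative throughput gain over the default configuration.

    This single-baseline form is an illustrative assumption; the
    paper's exact formulation appears in its method section.
    """
    return (t_now - t_default) / t_default

def two_term_reward(t_now: float, t_default: float, t_prev: float) -> float:
    """A reward that also compares against the previous step, in the
    spirit of multi-term formulations from prior work (simplified here).
    """
    d_default = (t_now - t_default) / t_default  # gain vs. default baseline
    d_prev = (t_now - t_prev) / t_prev           # gain vs. previous step
    return d_default + d_prev
```

With noisy throughput readings that hover above the default baseline, the single-baseline reward keeps a consistent sign, while the previous-step term flips sign on every small fluctuation, which plausibly explains why a simpler reward can produce a more stable learning curve.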
2 RELATED WORK
In recent years, multiple studies have addressed the problem of
automatic DBMS tuning using various machine-learning tech-
niques. In [5], a method called adaptive sampling was used to
automate knob configuration selection by sampling from
past experience, and in OtterTune [16], Gaussian Process (GP)
regression was used to recommend the best knob settings. Reinforcement
learning over continuous action configuration space
Short Paper · Series ISSN: 2367-2005 · DOI: 10.5441/002/edbt.2021.48