Developing Time Series Forecasting Models with Generative Large
Language Models
JUAN MORALES-GARCÍA, Universidad Católica de Murcia (UCAM), Spain
ANTONIO LLANES, Universidad Católica de Murcia (UCAM), Spain
FRANCISCO ARCAS-TÚNEZ, Universidad Católica de Murcia (UCAM), Spain
FERNANDO TERROSO-SÁENZ, Universidad Politécnica de Cartagena (UPCT), Spain
Nowadays, Generative Large Language Models (GLLMs) have made a significant impact in the field of Artificial Intelligence
(AI). One of the domains extensively explored for these models is their ability as generators of functional source code for
software projects. Nevertheless, their potential as assistants to write the code needed to generate and model Machine Learning
(ML) or Deep Learning (DL) architectures has not been fully explored to date. For this reason, this work focuses on evaluating
the extent to which different tools based on GLLMs, such as ChatGPT or Copilot, are able to correctly define the source code
necessary to generate viable predictive models. The use case defined is the forecasting of a time series that reports the indoor
temperature of a greenhouse. The results indicate that, while it is possible to achieve good accuracy metrics with simple
predictive models generated by GLLMs, the composition of predictive models with complex architectures using GLLMs is still
far from improving on the accuracy of predictive models built by human data scientists.
CCS Concepts: • Computing methodologies → Machine learning algorithms; Artificial intelligence; • Information
systems → Information systems applications.
Additional Key Words and Phrases: Deep Learning, Generative Large Language Models (GLLMs), ChatGPT, Copilot, Time
series forecasting
1 INTRODUCTION
The creation and evolution of Artificial Intelligence (AI) has been one of the most significant advances in the
technology and computer-science fields in the last decades [13]. In recent years, a new wave of innovation in AI
has led to the development of Generative Large Language Models (GLLMs), such as OpenAI ChatGPT or GitHub
Copilot, which are increasingly dominant in many areas [19]. Because of their ability to operate through natural
language, they are intended as intelligent assistants in a wide range of domains [9].
In this context, the creation of Deep Learning (DL) and Machine Learning (ML) models to solve certain cognitive
tasks, such as image recognition, video analysis or time series forecasting, was a task traditionally reserved for
highly skilled programmers or data scientists who designed the algorithms, implemented their logic and carefully
tuned their hyperparameters [17].
Due to the aforementioned ability of GLLMs to operate as assistants in many different fields, the current
paper examines the feasibility of applying such models to automatically generate the source code to instantiate
Authors’ addresses: Juan Morales-García, jmorales8@ucam.edu, Universidad Católica de Murcia (UCAM), Murcia, Spain; Antonio Llanes,
allanes@ucam.edu, Universidad Católica de Murcia (UCAM), Murcia, Spain; Francisco Arcas-Túnez, farcas@ucam.edu, Universidad Católica
de Murcia (UCAM), Murcia, Spain; Fernando Terroso-Sáenz, fernando.terroso@upct.es, Universidad Politécnica de Cartagena (UPCT), Murcia,
Spain.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that
copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page.
Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy
otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from
permissions@acm.org.
© 2024 Copyright held by the owner/author(s).
ACM 2157-6912/2024/5-ART
https://doi.org/10.1145/3663485
ACM Trans. Intell. Syst. Technol.