Developing Time Series Forecasting Models with Generative Large Language Models

JUAN MORALES-GARCÍA, Universidad Católica de Murcia (UCAM), Spain
ANTONIO LLANES, Universidad Católica de Murcia (UCAM), Spain
FRANCISCO ARCAS-TÚNEZ, Universidad Católica de Murcia (UCAM), Spain
FERNANDO TERROSO-SÁENZ, Universidad Politécnica de Cartagena (UPCT), Spain

Nowadays, Generative Large Language Models (GLLMs) have made a significant impact in the field of Artificial Intelligence (AI). One of the domains extensively explored for these models is their ability as generators of functional source code for software projects. Nevertheless, their potential as assistants to write the code needed to generate and model Machine Learning (ML) or Deep Learning (DL) architectures has not been fully explored to date. For this reason, this work focuses on evaluating the extent to which different tools based on GLLMs, such as ChatGPT or Copilot, are able to correctly define the source code necessary to generate viable predictive models. The use case defined is the forecasting of a time series that reports the indoor temperature of a greenhouse. The results indicate that, while it is possible to achieve good accuracy metrics with simple predictive models generated by GLLMs, the composition of predictive models with complex architectures using GLLMs is still far from improving the accuracy of predictive models generated by human data scientists.

CCS Concepts: • Computing methodologies → Machine learning algorithms; Artificial intelligence; • Information systems → Information systems applications.

Additional Key Words and Phrases: Deep Learning, Generative Large Language Models (GLLMs), ChatGPT, Copilot, Time series forecasting

1 INTRODUCTION

The creation and evolution of Artificial Intelligence (AI) has been one of the most significant advances in the technology and computer-science fields in the last decades [13].
In recent years, a new wave of innovation in AI has led to the development of Generative Large Language Models (GLLMs), such as OpenAI ChatGPT or GitHub Copilot, which are increasingly dominant in all areas [19]. Because of their ability to operate through natural language, they are intended as intelligent assistants in a wide range of domains [9]. In this context, the creation of Deep Learning (DL) and Machine Learning (ML) models to solve certain cognitive tasks, such as image recognition, video analysis or time series forecasting, was a task traditionally reserved for highly skilled programmers or data scientists who designed the algorithms, implemented their logic and carefully tuned their hyperparameters [17]. Due to the aforementioned ability of GLLMs to operate as assistants in many different fields, the current paper examines the feasibility of applying such models to automatically generate the source code to instantiate

Authors’ addresses: Juan Morales-García, jmorales8@ucam.edu, Universidad Católica de Murcia (UCAM), Murcia, Spain; Antonio Llanes, allanes@ucam.edu, Universidad Católica de Murcia (UCAM), Murcia, Spain; Francisco Arcas-Túnez, farcas@ucam.edu, Universidad Católica de Murcia (UCAM), Murcia, Spain; Fernando Terroso-Sáenz, fernando.terroso@upct.es, Universidad Politécnica de Cartagena (UPCT), Murcia, Spain.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2024 Copyright held by the owner/author(s).
ACM 2157-6912/2024/5-ART https://doi.org/10.1145/3663485 ACM Trans. Intell. Syst. Technol.