https://doi.org/10.1140/epjs/s11734-025-01710-z
Regular Article
Predicting temperatures in Brazilian states capitals via Machine Learning
1 Federal University of Paraná, 81531-980, Curitiba, PR, Brazil
2 Institute of Physics, University of São Paulo, 05508-090, São Paulo, SP, Brazil
3 Department of Physics and Interdisciplinary Center for Science, Technology and Innovation, Center for Modeling and Scientific Computing, Federal University of Paraná, 81531-980, Curitiba, PR, Brazil
4 Graduate Program in Science, State University of Ponta Grossa, 84030-900, Ponta Grossa, PR, Brazil
5 Department of Mathematics and Statistics, State University of Ponta Grossa, 84030-900, Ponta Grossa, PR, Brazil
6 Potsdam Institute for Climate Impact Research, Telegrafenberg A31, 14473, Potsdam, Germany
7 Department of Physics, Humboldt University Berlin, Newtonstraße 15, 12489, Berlin, Germany
Received: 31 March 2025
Accepted: 22 May 2025
Published online: 2 June 2025
Climate change refers to substantial long-term variations in weather patterns. In this work, we employ a Machine Learning (ML) technique, the Random Forest (RF) algorithm, to forecast the monthly average temperature for the Brazilian state capitals (27 cities) and for the whole country, using data from January 1961 to December 2022. To forecast the temperature at month k, we consider as features in RF: (i) global emissions of carbon dioxide (CO₂), methane (CH₄), and nitrous oxide (N₂O) at month k; (ii) temperatures from the previous three months, i.e., months k-1, k-2, and k-3; (iii) a combination of (i) and (ii). By investigating breakpoints in the time series, we find that the series for 24 cities and for the gases present breakpoints in the 1980s and 1990s. After the breakpoints, we find an increase in both the temperature and the gas emissions. We then group the cities according to their geographical position and employ the RF algorithm to forecast the temperature from August 2010 to December 2022. All three feature sets, (i), (ii), and (iii), yield precise forecasts, with a normalized root mean squared error (NRMSE) below 0.083 for the considered cases. In our simulations, the best-forecast region is the Northeast with strategy (iii) (NRMSE = 0.012). Furthermore, we investigate the forecasting of anomalous temperature data by removing the annual component of each time series. In this case, the best forecasts are obtained with strategy (i), and the best-forecast region is again the Northeast (NRMSE = 0.090).
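The three feature strategies and the error measure can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `build_features` and `nrmse`, the input layout, and the choice to normalize the RMSE by the range of the observed values are all assumptions made for illustration, since the abstract does not state which normalization is used.

```python
from math import sqrt


def build_features(temps, gases, strategy, n_lags=3):
    """Build (X, y) for forecasting the temperature at month k.

    temps    : list of monthly mean temperatures.
    gases    : list of (co2, ch4, n2o) tuples aligned with temps.
    strategy : "i"   -> gas emissions at month k,
               "ii"  -> temperatures at months k-1 .. k-n_lags,
               "iii" -> both of the above.
    All names and the data layout are hypothetical.
    """
    if strategy not in ("i", "ii", "iii"):
        raise ValueError("strategy must be 'i', 'ii', or 'iii'")
    if len(temps) != len(gases):
        raise ValueError("temps and gases must have the same length")

    X, y = [], []
    # Start at k = n_lags for every strategy so that the rows stay
    # aligned across strategies, even though (i) needs no lags.
    for k in range(n_lags, len(temps)):
        row = []
        if strategy in ("i", "iii"):
            row.extend(gases[k])
        if strategy in ("ii", "iii"):
            row.extend(temps[k - lag] for lag in range(1, n_lags + 1))
        X.append(row)
        y.append(temps[k])
    return X, y


def nrmse(observed, predicted):
    """RMSE divided by the range of the observed values (assumed normalization)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return sqrt(mse) / (max(observed) - min(observed))
```

The resulting `X` and `y` could then be split chronologically and passed to any random forest regressor, for example scikit-learn's `RandomForestRegressor`, with `nrmse` evaluated on the held-out months.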
Copyright comment Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
© The Author(s), under exclusive licence to EDP Sciences, Springer-Verlag GmbH Germany, part of Springer Nature 2025