Machine learning methods’ comparison for land surface
                                temperatures forecasting due to climate classification⋆
                                Tetiana Hovorushchenko1,†, Vitalii Alekseiko1,∗,† and Vitaly Levashenko2,†
                                1 Khmelnytskyi National University, Institutska str., 11, Khmelnytskyi, 29016, Ukraine
                                2 Zilina University, Univerzitná 8215, 010 26 Žilina, Slovakia


                                                 Abstract
                                                 The application of machine learning methods for short- and medium-term forecasting of the average
                                                 monthly temperature of the Earth’s surface, taking into account climatic zoning, is considered. The
                                                 peculiarities of predicting the temperature of a moving surface in the context of the regression problem
                                                 using machine learning methods are described. A comparison of the forecasting accuracy of methods based
                                                 on metrics was made. Peculiarities of calculating the metrics, according to the values of the investigated
                                                 parameters, are considered. The speed of operation of the methods was analyzed and statistical indicators
                                                 were calculated. To visualize the effectiveness of the methods, Taylor diagrams were constructed. The most
                                                 effective methods for forecasting the temperature of the Earth’s surface have been determined.

                                                 Keywords
                                                 machine learning (ML), forecasting, land surface temperature, climate zone, models’ evaluation,
                                                 regression, climate changes.1


                                1. Introduction
                                The problem of changing the temperature of the Earth’s surface has a wide range of consequences
                                that affect almost all aspects of people’s lives, including the spheres of agriculture, health care,
                                economy, energy, infrastructure, tourism, forest and water management.
                                   Social aspects are also particularly acute. Climate changes increasingly become the cause of
                                population migration, climate refugees appear, which in turn leads to changes in the social structure
                                of communities and creates new civilizational challenges. Today, climate change is felt all over the
                                planet, but certain regions are particularly vulnerable [1, 2]. In view of this, the question arises of
                                predicting possible climate changes in order to develop strategies for avoiding and mitigating the
                                consequences.
                                   Nowadays, there are several approaches to predict climate parameters, but the rapid development
                                of machine learning technologies has led to the emergence of new and effective methods that are
                                competitive to numerical knowledge-based alternatives [3, 4]. Machine learning (ML) technologies
                                are capable of processing large volumes of data more efficiently and capturing patterns, which makes
                                them extremely effective in solving the problem of forecasting.

                                2. Peculiarities of land surface temperature forecasting
                                Forecasting of climate parameters is extremely relevant in the context of modern climate changes
                                [5]. One of the main and most widely studied parameters is temperature. Modern scientific research
                                is aimed at determining the main trends in air [6, 7], land [8] and water [9] surface temperature


                                AdvAIT-2024: 1st International Workshop on Advanced Applied Information Technologies, December 5, 2024, Khmelnytskyi,
                                Ukraine - Zilina, Slovakia
                                ∗ Corresponding author.
                                † These authors contributed equally.

                                   tat_yana@ukr.net (T. Hovorushchenko); vitalii.alekseiko@gmail.com (V. Alekseiko); vitaly.levashenko@fri.uniza.sk
                                (V. Levashenko)
                                    0000-0002-7942-1857 (T. Hovorushchenko); 0000-0003-1562-9154 (V. Alekseiko); 0000-0003-1932-3603 (V. Levashenko)
                                            © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
changes. The study of these parameters has a key influence on determining the priorities of greening,
urban planning and landscape design [10].
   Forecasting the temperature of the Earth’s surface is associated with some complexities and
peculiarities due to the dynamic nature of the climate system and the Earth’s surface itself. The key
factors that determine these features are:

   – spatial variability;
   – temporal variability;
   – feedback mechanisms;
   – uncertainties in models;
   – anthropogenic impact;
   – extreme events;
   – availability and quality of data.

    The Earth’s surface temperature varies greatly among regions due to factors such as latitude,
proximity to oceans, elevation above sea level, and types of land cover (such as forests, deserts).
Forecasting must take into account these spatial variations, which may affect local weather
conditions.
    It should be noted that the temperature of the Earth’s surface fluctuates not only seasonally, but
also daily due to diurnal cycles (day-night). However, when studying the main trends in the average
monthly temperature, it is advisable to ignore daily cycles. Also, weather patterns and climate
phenomena such as El Niño and La Niña can cause interannual variability. In addition, they are able
to influence long-term temperature trends.
    The Earth’s climate system is driven by various feedback mechanisms, such as the albedo effect
(the reflectivity of the Earth’s surface), the concentration of greenhouse gases (e.g. CO2) and the
accumulation of heat in the ocean. These feedbacks can amplify or weaken temperature changes,
making predictions difficult.
    Climate models that simulate Earth’s climate include physical processes such as radiation,
convection, and ocean currents. These models contain uncertainties due to imperfect knowledge of
the parameters and the complexity of the interaction between different components of the climate
system.
    Another important factor is human activity, including industrial emissions, land-use change, and
urbanization [11], which contribute to warming trends [12]. Predicting how these factors will evolve
and interact with natural climate variability adds another layer of complexity to temperature
projections.
    Forecasting extreme temperature events, such as heat waves and cold snaps, requires
understanding not only trends in average temperature, but also the likelihood and intensity of such
events under changing climate conditions.
    Surface temperature forecasting is based on historical data from weather stations, satellites, and
other sources. Ensuring the accuracy and reliability of this data, especially in remote or data-poor
regions, can present challenges for forecasting models.
    Solving these complexities involves integrating observations, improving modeling methods, and
understanding of the Earth’s climate system. Technological advances and computing power continue
to improve our ability to more accurately predict surface temperatures on different time scales.


3. Methodology
3.1. Dataset
In the research it was used a dataset GlobalLandTemperatures [13] with Creative Commons License
(CC0: Public Domain) from Kaggle.
    This dataset includes Earth’s surface temperatures data from 1743 to 2013. Original tables content
following information:
    – dt (includes month and year, when the temperature was observed);
    – AverageTemperature (average monthly temperature);
    – AverageTemperatureUncertainty (with uncertainty values of measurement);
    – Country (includes country or territory, where the temperature was observed);
    Due to needs of research the dataset was modified. It was added columns ‘ClimateZone’ with
abbreviation of climate zones according to the World Climate Data [14] and ‘MainClimateZone’ with
letter, which means belonging to one of five main climate zones. Table 1 shows number of countries
of each main climate zone.
    Although the dataset cannot be fully called balanced, this is explained by the peculiarities of the
location of countries on the globe and the geopolitical situation. There are some important aspects:

   – Area of Countries;
   – Geopolitical factors;
   – Selecting Data Sources.

   First of all, large countries can have a variety of climate zones. For example, some countries cover
a vast territory with varying climate conditions: from arctic to temperate in Canada or from
temperate to arid in the USA. This may result in uneven presentation of temperature data.
   Secondly, political, economic and sociocultural differences between countries can also affect the
balance of the dataset. For example, access to climate observation technologies may be uneven across
countries, which may affect the accuracy of the data.
   Finally, different countries may have different climate monitoring systems and different data
sources. Some countries may be active in collecting data, while others may be less active. This can
also affect the balance of the dataset.

Table 1
Number of countries of each main climate zone
                       Main Climate Zone                Number of countries
                               A                               94
                               B                               45
                               C                               64
                               D                               26
                               E                                6


    In general, the imbalance of the dataset with the temperatures of the earth’s surface by country
is a complex issue associated with many factors. For more accurate climate analysis and modeling,
it is important to consider all these aspects.
    In the research was used data with similar values of uncertainty, but sometimes this values are
different, so forecasting may be more or less accurate for some regions. To avoid any discrimination,
it was used data for all countries and territories with relevant information.

3.2. Machine learning methods
It was conducted a study of the operation of various methods for different climate zones. To do this,
it was developed several models to forecast temperature for the period from 2000 to 2013.
Temperature data up to the year 2000 were used for model fitting.
    The following methods were chosen for the study:

   – neural network;
   – decision trees;
   – random forest;
   – K nearest neighbors;
   – method of support vectors;
   – gradient boosting;
   – Ada boost;
   – XG boost;
   – light GBM.

   Due to the climatic features of different regions of the Earth, it is advisable to conduct separate
studies for each of the climatic zones in order to identify the methods that are best adapted to the
corresponding temperature dependencies [15, 16].

3.2.1. Neural Network
A neural network (NN) is a set of algorithms modeled after the human brain designed for pattern
recognition. A neural network interprets the data using a kind of machine perception, labeling or
clustering of the raw data. Neural networks consist of layers of interconnected nodes (“neurons”)
that process input data, learn from it, and make decisions based on learned patterns. Each node is
assigned a weight that is adjusted during learning to minimize the prediction error.
    In the context of regression tasks for predicting numerical series, neural networks can model
complex relationships between inputs and outputs. Recurrent neural networks (RNNs), long-short-
term memory (LSTM) networks, and supervised recurrent units (GRUs) are particularly well-suited
to time series forecasting because they can capture temporal dependencies in data. By learning from
historical data, these networks learn patterns and trends that can be used to predict future values.

3.2.2. Decision Trees
Decision Tree (DT) is a non-parametric supervised learning method used for classification and
regression problems. It partitions the dataset into subsets based on the most important feature at
each node, making decisions based on feature values. Each branch of the tree represents a decision
rule, and each leaf represents an outcome. The decision rule can be represented as:

              if xi < a then go to left subtree, else go to right subtree,                      (1)
   where:
   xi – feature;
   a – threshold.
   In the context of a regression problem, decision trees can be used by partitioning the data into
subsets based on input feature values and predicting a numerical value for each subset [17]. In time
series forecasting, decision trees can model the relationship between time-based features and a target
variable. Although decision trees are easy to understand and interpret, individual decision trees can
be prone to overfitting and as a result may not perform well with complex patterns, but at the same
time the method is fundamental to ensemble methods such as random forest and gradient boosting.

3.2.3. Random Forest
Random Forest (RF) is an ensemble learning method. The work of the method is based on building
several decision trees during the learning process and deriving class membership (classification task)
or average prediction (regression task) of individual trees [18]. A random forest combines the
simplicity of decision trees with improved accuracy, robustness, and robustness to overfitting by
averaging the results of multiple trees that may individually be subject to overfitting [19].

                                                N
                                           1
                                        y = � yi ,                                              (2)
                                           N
                                               i=1
   where:
   yi – prediction of the i-th tree;
   N – total number of trees.
   In the context of numerical series prediction, each tree is trained on a random subset of data and
features, and their predictions are averaged to produce a final prediction. Random forests are quite
robust and handle a large number of input variables well.

3.2.4. K-nearest neighbors
K-Nearest Neighbors (KNN) is a simple instance-based learning algorithm that classifies a data point
based on how its neighbors are classed. In KNN, the parameter “K” represents the number of nearest
neighbors to consider. The algorithm calculates the distance between the new data point and the
training points and then assigns a class based on the K-nearest neighbor majority votes.
    KNN can be applied to regression tasks by averaging the numerical values of the K-nearest
neighbors. For time series forecasting, KNN can predict the future value by finding similar historical
patterns and averaging their subsequent values. KNN is simple to implement, but can be
computationally expensive and sensitive to the choice of K as well as the distance metric used.

3.2.5. Support Vector Regression
Support Vector Machine (SVM) is a supervised learning algorithm that can be used for classification
or regression. The algorithm of the method consists in finding the hyperplane that best divides the
data into classes [18]. In cases where the data cannot be partitioned linearly, SVM uses a
transformation of the data into a higher dimensional space where a hyperplane can be used for
partitioning.
   In regression problems, the support vector method is known as support vector regression (SVR).
SVR tries to find a function that deviates from the actual observed values by an amount that does
not exceed a given threshold and is as smooth as possible. For time series forecasting, SVR can
capture the underlying trend and seasonality in the data, although this often requires careful
parameter tuning and kernel selection.

3.2.6. Gradient Boosting
Gradient Boosting (GB) is a complex technique that sequentially builds models, where each new
model tries to correct mistakes made by previous models. This approach uses a gradient descent
algorithm to minimize the loss function. The method is powerful for both classification and
regression tasks [19]. It is very efficient and accurate in forecasting, although it may require
significant resources for intensive calculations.
   The loss function in regression problems is often represented by mean squared error or mean
absolute error. Variations of the gradient boosting method, in particular XGBoost and LightGBM,
are known for their high accuracy and ability to handle complex datasets.

3.2.7. Ada boost.R
Adaptive Boosting (AB, AdaBoost) is an ensemble learning technique that combines several weak
classifiers to create a strong classifier. This method focuses on cases that previous classifiers
misclassified and adjusts their weights accordingly, thus increasing the accuracy of the model. Each
subsequent model in the sequence is tuned to correct the errors of the previous ones, making it highly
effective at improving forecasting performance.
   AdaBoost can be adapted for the regression problem (AdaBoost.R). In this context, the method
combines the predictions of several weak methods, typically decision trees, to create a strong
predictive model. Each such method focuses on correcting the mistakes of the previous ones. For
numerical series prediction, AdaBoost.R can improve prediction accuracy by highlighting hard-to-
predict data points during fitting.
3.2.8. XG boost
XGBoost (XGB – Extreme Gradient Boosting) is a powerful and efficient implementation of gradient
boosting. The method includes numerous optimizations such as parallel processing, tree pruning,
and missing value handling, making it faster and more accurate than traditional gradient boosting
methods [16]. XGBoost is widely used in competitive machine learning due to its performance and
flexibility.
    XGBoost is a powerful tool for regression tasks and is widely used for predicting numerical series.
It includes optimizations such as parallel processing and regularization to prevent overfitting.
XGBoost builds trees sequentially, where each tree aims to reduce the residual errors of previous
trees. The method is known for its high performance and scalability, making it a popular choice for
forecasting tasks.

3.2.9. Light GBM
LightGBM (LGBM – Light Gradient Boosting Machine) is a gradient boosting framework that uses
tree-based learning algorithms. It is designed to be highly efficient and scalable, suitable for large
datasets. LightGBM uses histogram-based algorithms, which provides faster fitting and less memory
usage compared to traditional gradient boosting frameworks.
   LightGBM is particularly effective for regression tasks, including predicting number series. The
method uses histogram-based algorithms to efficiently group and divide data. LightGBM handles
large datasets and high-dimensional data efficiently, making it suitable for numerical series
prediction. It builds trees sequentially, with each tree correcting the errors of previous ones, similar
to other gradient boosting methods.

3.3. Models’ evaluation
To evaluate the effectiveness of the predictive model in the regression problem, various aspects of
performance are measured. The most common metrics include [20, 21, 22]:
   – Mean Absolute Error (MAE):
   Indicates the average of the absolute differences between predicted and actual values. Estimates
the accuracy of forecasts without considering the direction of errors [22].

                                           1  n
                                  MAE =      � |yi − y� i |,                                      (3)
                                           n  i=1
  where n – number of observations;
  yi – the actual value of the i-th observation;
  ŷi – the predicted value of the i-th observation.
  – Mean Squared Error (MSE):
  Indicates the mean of the squared differences between predicted and actual values, giving greater
weight to larger errors.

                                          1 n
                                 MSE =     � (y − y� i )2                                         (4)
                                          n i=1 i
    – Root Mean Square Error (RMSE):
    The square root of the root mean square error has the same units as the raw data, making it easier
to interpret [22].


                                       1 n                                                        (5)
                               RMSE = � � (yi − y� i )2
                                       n i=1
   – R-squared (R²):
   Indicates the proportion of variance in the dependent variable that can be predicted from the
independent variable(s). The value ranges from 0 to 1, with higher values indicating a better fit of
the model to the tasks at hand.

                                      ∑ni=1(yi − y� i )2
                                  2
                                 R =1− n                 ,                                     (6)
                                      ∑i=1(yi − y�i )2
      where y� i – the average among the actual values.
   – Mean Absolute Percentage Error (MAPE):
   Determines the average value of the absolute percentage errors. MAPE expresses the error as a
percentage of the actual values.

                                        100%  n   yi − y� i
                             MAPE =          � �            �                                  (7)
                                          n   i=1    yi
   – Symmetric Mean Absolute Percentage Error (sMAPE):
   It is a type of MAPE that takes into account positive and negative deviations symmetrically.

                                      100%   n      |yi − y� i |
                         sMAPE =           �                                                   (8)
                                        n    i=1 (|yi | + |y
                                                           � i |)/2
   – Mean Bias Deviation (MBD):
   Determines the average bias in the forecasts, indicating whether the model is systematically over-
or under-predicting.

                                          1  n
                                 MBD =      � (yi − y� i )                                     (9)
                                          n  i=1
    – Median Absolute Error (MedAE):
    Is the median of the absolute differences between the predicted and actual values. The mean
absolute error is less sensitive to outliers compared to the mean absolute error (MAE), making it a
reliable measure of model performance when there is significant outliers in the data.

                              MedAE = median(|yi − y� i |)                                     (10)
   A Taylor diagram is a graphical tool used to evaluate the performance of models by comparing
their results to observations. The chart combines three statistics into one graph: correlation
coefficient, standard deviation, and root mean square error (RMSE).
   The standard deviation σ represents the variability or spread of the data. Calculated for both
observation data and model data:


                                      1 n                                                      (11)
                                 σ = � � (xi − x� i )2 ,
                                      n i=1
  where:
  n – number of observations;
  xi – each individual observation;
  x� – the mean value of the observations.
  Correlation coefficient r between observation data and model data. It indicates how well the
model results match the observed data in terms of patterns and time intervals.

                            ∑ni=1(xobsi − x� obs )(xmodeli − x� model )
                  r=                                                          ,                (12)
                       �∑ni=1(xobsi − x� obs )2 ∑ni=1(xmodeli − x� model )2
   where:
   xobsi – individual values of observations;
   xmodeli – individual model values;
   x� obs – the mean value of the observations;
   x� model – model mean value.
   The centered root mean square error E' reflects the total difference between the model output and
the observations, taking into account both the variance and the bias.


                       1 n                                           2
                                                                                                 (13)
                 E′ = � � �(xmodeli − x� model ) − (xobsi − x� obs )�
                       n i=1
   In this way, a Taylor chart allows you to compare multiple models on a single graph, making it
easier to visualize and interpret the relative performance of different machine learning techniques.
   The chart layout makes it easy to see how close the model's performance is to ideal (represented
by a control point where the correlation is 1, the standard deviation matches the observed, and the
RMSE is zero).
   Thus, the Taylor plot is a powerful tool for evaluating and comparing the performance of machine
learning methods, providing a visual and quantitative assessment of their ability to accurately
reproduce observed data patterns.

4. Results
In general, most machine learning methods have demonstrated high predictive accuracy on test data.
The calculated metrics are presented in Tables 3.1 – 3.5, separately for each climate zone.
   For the tropical climate zone (Table 2), KNN (72.1%) and LGBM (73.6%) methods showed the
highest efficiency according to the R2 metric. At the same time, their MAPE was 0.68% and 0.65%,
respectively. However, the complex temperature patterns associated with the geographical location
of most countries do not allow a highly accurate forecast to be made. In particular, this is due to the
location of the studied territories in different hemispheres, as well as some countries located in both
hemispheres. The impossibility of making a high-precision forecast necessitates further studies of
the tropical climate zone, in particular by conducting separate studies for different hemispheres, as
well as climate subzones.

Table 2
ML methods’ metrics evaluation for land surface temperature forecasting in zone A
                                                    Metrics
 Method                                                MAPE,        sMAPE,
              MAE         MSE       RMSE        R2                               MBD       MedAE
                                                           %            %
   NN         0.3721     0.2117     0.4601    0.0854    1.4097       1.4189      0.1909     0.3157
    DT        0.3039     0.1418     0.3766    0.3872    1.1489       1.1586      0.2734     0.2622
    RF        0.2508     0.1041     0.3226    0.5503    0.9475       0.9534      0.1811     0.2383
   KNN        0.1803     0.0645     0.254     0.7212    0.6812       0.6841      0.0857     0.1422
   SVR        0.6606     0.5412     0.7356    –1.338    2.4935       2.5322      0.6590     0.6462
    GB        0.2643     0.1154     0.3397    0.5015     0.9963      1.0037      0.2232     0.2432
   AB         1.0841     1.6211     1.2732    –6.003     4.1325      4.2554      1.0781     1.1265
  LGBM        0.1708     0.0611     0.2471    0.7362     0.6471      0.6486      0.0238     0.1275
   XGB        0.2964     0.1363     0.3692    0.4110     1.1201      1.1293      0.2647     0.2625

    In the arid climate zone (Table 3), temperature patterns are clearly observed, which are common
to both hemispheres, except for the shift caused by different seasons in different hemispheres. This
allows for accurate forecasting using various methods. Thus, NN, DT, RF, KNN, GB, LGBM and XGB
show a high R2 metric value (above 96%). In addition, these methods have a MAPE below 5%, which
indicates the high efficiency of the methods.
    Temperature dependencies observed in the temperate climate zone are also well amenable to
processing by machine learning methods (Table 4). Immediately three methods (RF, LGBM and XGB)
show high performance, while another 3 methods (DT, KNN, GB) also show high performance,
although they are minimally inferior.
   Forecasting the land surface temperature in the continental climate zone using machine learning
methods demonstrates a fairly high accuracy (Table 5). All significant methods have an R2 score above
94%. The gradient enhancement method has the best results. He is somewhat influenced by LGBT and
XGB methods. The evaluation of MAPE and sMAPE metrics for continental and polar climatic zones was
not carried out, since in these zones temperature values close to 0 are quite often observed, when extended
to which, according to formulas 7 and 8, very high indicators are produced, which in fact do not reflect
real situation.
    For temperatures in the polar climate zone, the methods do not work very well (Table 6). The
highest rates are observed among LGBM. R2 is 92%.
    Since the time spent on creating a forecast plays a rather important role, it is advisable to choose
faster methods, provided the same accuracy of forecasting. Table 7 shows the running time of each
method for data from each climate zone. Comparing the results, we can conclude that the methods
of decision trees and k-nearest neighbors work the fastest. Analyzing the forecasting accuracy, it can
be concluded that the KNN, RF, GB, LGBM, XGB methods are quite effective for creating a forecast
in the short and medium term. These methods are able to fairly accurately predict the average
monthly temperature for the next decade. Tables 8 and 9 show the standard deviation, correlation
coefficients, and centered root mean square error for each of the methods in each climate zone.

Table 3
ML methods’ metrics evaluation for land surface temperature forecasting in zone B
                                                        Metrics
  Method                                                   MAPE,       sMAPE,
               MAE         MSE       RMSE         R2                                MBD       MedAE
                                                            %            %
    NN        0.7624      0.8241     0.9078     0.9626      3.6509      3.5874     –0.017      0.7348
    DT        0.4069      0.3311     0.5754     0.985       2.0901      2.072      –0.006      0.2769
    RF        0.3826      0.3043     0.5516     0.9862      1.9779      1.9651      0.011      0.2779
   KNN        0.4383      0.3461     0.5883     0.9843      2.1835      2.193       0.2083     0.3614
   SVR        2.4464       7.337     2.7087     0.667       10.264     10.9008      2.3992      2.79
    GB        0.4114      0.2941     0.5423     0.9867      2.0614      2.0606      0.1255     0.3287
    AB        3.3319      12.209     3.4941     0.4459      14.539     15.7484      3.3247     3.5822
  LGBM        0.3659      0.2488     0.4988     0.9887      1.8651      1.8592      –0.021     0.2597
   XGB        0.3792      0.2954     0.5436     0.9866      1.9607      1.9435      –0.01      0.2425


Table 4
ML methods’ metrics evaluation for land surface temperature forecasting in zone C
                                                        Metrics
  Method                                                   MAPE,       sMAPE,
               MAE         MSE       RMSE         R2                                MBD        MedAE
                                                            %            %
    NN        0.6968      0.8324     0.9124     0.9431      4.9592      4.8864      0.056      0.5704
    DT       0.4892    0.4466     0.6683   0.9695       3.6017   3.5947      0.04    0.3724
    RF       0.4528    0.4193     0.6475   0.9713       3.347    3.3468     0.0782   0.3408
   KNN       0.4826    0.4786     0.6918   0.9673       3.4636   3.5051     0.2661   0.3478
   SVR        1.47      3.012     1.736     0.794       9.2742   9.7888     1.3526   1.4363
    GB       0.4602    0.4422     0.665    0.9698       3.381    3.3721     0.1335   0.3555
    AB       1.7784    4.4077     2.0995   0.6986       12.396   13.5777    1.741    1.7006
  LGBM       0.4175    0.3879     0.6228   0.9735       3.0887   3.0719     0.0347   0.3008
   XGB       0.4488    0.4137     0.6432   0.9717       3.3478    3.335     0.0420   0.3496

Table 5
ML methods’ metrics evaluation for land surface temperature forecasting in zone D
                                                    Metrics
 Method                                                MAPE,     sMAPE,
             MAE        MSE       RMSE       R2                             MBD      MedAE
                                                        %          %
   NN        1.121      2.095     1.447    0.9725      –0.137    0.9426     1.121    2.095

    DT       0.9041     1.391     1.179    0.9817       –0.16    0.7756     0.9041   1.391

    RF       0.8834     1.286     1.134    0.9831       0.0497   0.7208     0.8834   1.286

   KNN       0.944      1.355     1.164    0.9822       0.4538   0.8171     0.944    1.355

   SVR       1.782      4.497     2.121    0.9409       1.1671   1.6315     1.782    4.497

    GB       0.8221     1.126     1.061    0.9852       0.0787   0.6538     0.8221   1.126

    AB       1.6938    4.3075     2.0754   0.9434       1.221    1.5527     1.6938   4.3075

  LGBM       0.8315    1.1502     1.0724   0.9849       0.1413    0.608     0.8315   1.1502

   XGB       0.822     1.2356     1.1116   0.9838      –0.154    0.6457     0.822    1.2356


Table 6
ML methods’ metrics evaluation for land surface temperature forecasting in zone E
                                                    Metrics
 Method                                                MAPE,     sMAPE,
             MAE        MSE       RMSE       R2                             MBD      MedAE
                                                        %          %
   NN        1.121      2.095     1.447    0.9725      –0.137    0.9426     1.121    2.095

    DT       0.9041     1.391     1.179    0.9817       –0.16    0.7756     0.9041   1.391

    RF       0.8834     1.286     1.134    0.9831       0.0497   0.7208     0.8834   1.286

   KNN       0.944      1.355     1.164    0.9822       0.4538   0.8171     0.944    1.355

   SVR       1.782      4.497     2.121    0.9409       1.1671   1.6315     1.782    4.497

    GB       0.8221     1.126     1.061    0.9852       0.0787   0.6538     0.8221   1.126

    AB       1.6938    4.3075     2.0754   0.9434       1.221    1.5527     1.6938   4.3075

  LGBM       0.8315    1.1502     1.0724   0.9849       0.1413    0.608     0.8315   1.1502

   XGB       0.822     1.2356     1.1116   0.9838      –0.154    0.6457     0.822    1.2356
Table 7
Time spent on creating a forecast
                                                               Time, sec
    Method
                         Zone A               Zone B               Zone C          Zone D              Zone E
        NN                     0.4             0.57                 1.36             0.45               0.66
        DT                 0.01                0.01                 0.01             0.01               0.01
         RF                0.47                0.59                 0.56             0.52               0.54
        KNN                0.01                0.01                 0.01             0.01               0.01
        SVR                0.12                0.07                 0.06             0.08               0.14
        GB                 0.21                0.24                 0.21             0.19               0.26
        AB                 0.35                0.39                 0.32             0.33                0.3
     LGBM                  0.07                0.10                 0.08             0.07               0.07
        XGB                0.09                0.12                 0.14             0.13               0.12

Table 8
Taylor statistics for ML methods in zones A, B and C
                                                       Taylor statistics
Method                   Zone A                            Zone B                            Zone C
                σ          r          E'         σ             r            E'      σ           r            E'
  NN          0.369      0.542       0.446      3.975       0.992          0.554   3.207      0.982     0.672
  DT          0.439      0.845       0.374      4.565       0.993          0.561   3.714      0.983     0.659
   RF         0.403      0.832       0.313      4.589       0.993          0.541   3.698      0.986     0.635
 KNN          0.411      0.868       0.244      4.609       0.993          0.582   3.679      0.986     0.676
  SVR         0.323      0.737       0.718      3.637       0.986          2.494   3.013      0.977     1.534
  GB          0.356      0.854       0.316       4.6        0.994          0.534   3.574      0.987     0.616
  AB          0.928      0.709       1.192      3.839       0.988          3.388   3.929      0.955     2.097
 LGBM         0.4546     0.863       0.246      4.703       0.994          0.499   3.668      0.987     0.603
 XGB          0.426      0.846       0.365      4.576       0.993          0.531   3.685     0.986      0.628

Table 9
Taylor statistics for ML methods in zones D and E
                                              Taylor statistics
  Method                      Zone D                                               Zone E
                      σ           r           E'             σ                        r               E'
     NN            3.9749     0.9918        0.5540        2.1978                   0.9324           1.2512
     DT               4.5649         0.9927           0.5607          2.766          0.95           0.9195
        RF            4.5891         0.9932           0.5415         2.7418         0.9536          0.8693
    KNN               4.6088         0.9932           0.5821         2.6977         0.9574          0.8793
    SVR               3.6370         0.9864           2.4939         2.6935         0.9494          2.3643
     GB               4.6002         0.9938           0.5341         2.7258         0.9676          0.7720
     AB               3.8389         0.9882           3.3879         1.9441         0.9437          2.2599
   LGBM               4.7031         0.9944           0.4987         2.6913         0.9673          0.7147
    XGB               4.5761         0.9934           0.5306         2.7396         0.9547          0.8717
   Figure 1 shows Taylor diagrams for the considered machine learning methods for each of the
climate zones.


                 a) Zone A
                                                               b) Zone B


                 c) Zone C                                     d) Zone D


                                         e) Zone E
Figure 1: Taylor diagrams for the considered machine learning methods.
5. Conclusions
The conducted research made it possible to identify the most effective methods for forecasting the
temperature of the Earth’s surface in terms of the accuracy of the forecast and the time spent in each
of the climatic zones. Analysis of the forecast, taking into account climatic zoning, allows to more
clearly determine the patterns of individual territories and make a more accurate forecast. The
proposed approach makes it possible to monitor the main trends of climatic changes in the context
of changes in the temperature of the Earth’s surface in the short- and medium-term perspectives.
The proposed machine learning methods are able to make an accurate and quick forecast of the main
trends in the change of the average monthly temperature of the Earth's surface for the next decade.
Evaluation of machine learning methods was carried out on the basis of metrics. The values of
standard deviation, correlation coefficient and centered mean squared error for each of the methods
were also calculated. To visualize the effectiveness of the methods, Taylor diagrams were
constructed. This research makes it possible to form a basis for further study of changes in climatic
indicators in the context of individual territories and the search for the most appropriate machine
learning methods for forecasting climatic changes, taking into account climatic zoning.

Acknowledments
This work was supported by the project “Earth Observation for Early Warning of Land Degradation
at European Frontier (EWALD)” under the European Union’s Framework Programme for Research
and Innovation Horizon Europe – the Framework Programme for Research and Innovation (2021-
2027), Grant Agreement No. ID 101086250.

Declaration on Generative AI
During the preparation of this work, the authors used Grammarly in order to: grammar and spelling
check; DeepL Translate in order to: some phrases translation into English. After using these
tools/services, the authors reviewed and edited the content as needed and take full responsibility for
the publication’s content.

References
[1] P. Hryhoruk, S. Grygoruk, N. Khrushch, T. Hovorushchenko. Using non-metric
    multidimensional scaling for assessment of regions’ economy in the context of their sustainable
    development. CEUR-WS. (2020). Vol. 2713. 315-333.
[2] S. Yıldırım, S.H. Bostancı, D.Ç. Yıldırım. Parameters for the Study of Climate Refugees. In: P.
    Singh, B. Ao, A. Yadav (eds) Global Climate Change and Environmental Refugees. Springer,
    Cham. (2023). doi:10.1007/978-3-031-24833-7_11.
[3] C. O. de Burgh-Day, T. Leeuwenburg. Machine learning for numerical weather and climate
    modelling: a review, Geosci. Model Dev., 16, 6433–6477, (2023).
[4] L. Chen, B. Han, X. Wang, J. Zhao, W. Yang, Z. Yang. Machine Learning Methods in Weather
    and Climate Applications: A Survey. Appl. Sci. (2023), 13, 12019. doi:10.3390/app132112019.
[5] T. Hovorushchenko, V. Alekseiko. Land surface temperature forecasting in the context of the
    development of sustainable cities and communities. Computer Systems and Information
    Technologies, 3, (2024). 6–12. doi:10.31891/csit-2024-3-1.
[6] D. Fister, J. Pérez-Aracil, C. Peláez-Rodríguez, J. Del Ser, S. Salcedo-Sanz. Accurate long-term air
    temperature prediction with Machine Learning models and data reduction techniques, Applied
    Soft Computing, Volume 136, (2023). 110118, ISSN 1568-4946.
[7] M. Jamei, M. Karbasi, M. Ali, A. Malik, X. Chu, Z. M. Yaseen, A novel global solar exposure
    forecasting model based on air temperature: Designing a new multi-processing ensemble deep
     learning paradigm. Expert Systems with Applications, Volume 222, (2023), 119811, ISSN 0957-
     4174, doi:10.1016/j.eswa.2023.119811.
[8] C. B. Pande, J. C. Egbueri, R. Costache, L. M. Sidek, Q. Wang, F. Alshehri, N. Md Din, V. K.
     Gautam, S. C. Pal. Predictive modeling of land surface temperature (LST) based on Landsat-8
     satellite data and machine learning models for sustainable development, Journal of Cleaner
     Production, Volume 444, (2024). 141035, ISSN 0959-6526.
[9] F. Di Nunno, S. Zhu, M. Ptak, M. Sojka, F. Granata. A stacked machine learning model for multi-
     step ahead prediction of lake surface water temperature, Science of The Total Environment,
     Volume 890, (2023). 164323, ISSN 0048-9697. doi:10.1016/j.scitotenv.2023.164323.
[10] O. E. Adeyeri, A. H. Folorunsho, K. I. Ayegbusi, V. Bobde, T. E. Adeliyi, C. E. Ndehedehe, A. A.
     Akinsanola. Land surface dynamics and meteorological forcings modulate land surface
     temperature characteristics, Sustainable Cities and Society, Volume 101, (2024), 105072, ISSN
     2210-6707, doi:10.1016/j.scs.2023.105072.
[11] N. Gupta, B. H. Aithal. Urban land surface temperature forecasting: a data-driven approach
     using regression and neural network models. Geocarto International, 39(1). (2024).
     https://doi.org/10.1080/10106049.2023.2299145.
[12] L. Tian, Y. Tao, M. Li, C. Qian, T. Li, Y. Wu, F. Ren . Prediction of Land Surface Temperature
     Considering Future Land Use Change Effects under Climate Change Scenarios in Nanjing City,
     China. Remote Sensing. 15(11):2914. (2023). doi:10.3390/rs15112914.
[13] Kaggle.                              Globallandtemperature.                              (2018).
     https://www.kaggle.com/datasets/sambapython/globallandtemperature.
[14] List of countries by climate zone and average yearly temperatures. (2024)
     https://weatherandclimate.com/countries#google_vignette
[15] O. Pavlova, V. Alekseiko. The concept of an information system for forecasting the temperature
     regime of the earth’s surface based on machine learning. Computer Systems and Information
     Technologies, №2, (2024). pp. 6–13. doi:10.31891/csit-2024-2-1
[16] S. Sharafi, M. Mohammadi Ghaleni. Revealing accuracy in climate dynamics: enhancing
     evapotranspiration estimation using advanced quantile regression and machine learning
     models. Appl Water Sci 14, 162. (2024). doi:10.1007/s13201-024-02211-5
[17] A. Nailman. Comparing machine learning algorithms for regression. Machine Learning Models.
     (2024, May 31). https://machinelearningmodels.org/comparing-machine-learning-algorithms-
     for-regression/
[18] B. Lefoula, A. Hebal, , & D. Bengora. Performance of machine learning methods for modeling
     reservoir management based on irregular daily data sets: a case study of Zit Emba dam. Earth
     Science Informatics, 17(1), (2023) pp. 145–161. doi:10.1007/s12145-023-01160-y.
[19] A. Nailman. Supervised Machine Learning types: Exploring the different approaches. Machine
     Learning Models. (2024, May 28). https://machinelearningmodels.org/supervised-machine-
     learning-types-exploring-the-different-approaches/.
[20] J. Chen. Analysis of Statistic Metrics in Different Types of Machine Learning. Highlights in
     Science, Engineering and Technology, 88, pp. 182–188. (2024). doi:10.54097/c4mz2q66.
[21] V. Plevris, G. Solorzano, N. Bakas, M. Ben Seghier. Investigation of performance metrics in
     regression analysis and machine learning-based prediction models. The 8th European Congress
     on Computational Methods in Applied Sciences and Engineering ECCOMAS Congress 2022. 5
     – 9 June 2022, Oslo, Norway. (2022). doi:10.23967/eccomas.2022.155.
[22] B. Wohlwend. Regression model evaluation metrics: R-Squared, Adjusted R-Squared, MSE,
     RMSE, and MAE. Medium. (2023, August 12). https://medium.com/@brandon93.w/regression-
     model-evaluation-metrics-r-squared-adjusted-r-squared-mse-rmse-and-mae-24dcc0e4cbd3