=Paper=
{{Paper
|id=Vol-3135/darliap_paper11
|storemode=property
|title=Predicting job execution time on a high-performance computing cluster using a hierarchical data-driven methodology
|pdfUrl=https://ceur-ws.org/Vol-3135/darliap_paper11.pdf
|volume=Vol-3135
|authors=Paolo Bethaz,Bartolomeo Vacchetti,Enrica Capitelli,Vladi Nosenzo,Luca Chiosso,Tania Cerquitelli
|dblpUrl=https://dblp.org/rec/conf/edbt/BethazVCNCC22
}}
==Predicting job execution time on a high-performance computing cluster using a hierarchical data-driven methodology==
Paolo Bethaz¹, Bartolomeo Vacchetti¹, Enrica Capitelli², Vladi Nosenzo², Luca Chiosso³ and Tania Cerquitelli¹

¹ Department of Control and Computer Engineering, Politecnico di Torino, Italy
² Iveco Group, Torino, Italy
³ NPO Torino Srl, Torino, Italy
Abstract

Nowadays, evaluating the performance of a vehicle before the production phase is both challenging and important. In the automotive industry, many virtual simulations are needed to model the vehicle behavior in the best possible way. However, these simulations require a lot of time, and the user does not know their runtime in advance. Knowing the required time in advance would allow the user to manage the simulations more effectively and to choose the best strategy for using the available computational resources. For this reason, we present an innovative, hierarchical data-driven method to estimate the execution time of these jobs in advance. Our approach integrates unsupervised techniques, such as constrained k-means clustering, with classification and regression algorithms based on tree structures. Numerous experiments were conducted on a real dataset to verify the effectiveness of the proposed approach, and the experimental results show that the proposed method is promising.

Keywords

Execution-time prediction, data-driven model, hierarchical model
Published in the Workshop Proceedings of the EDBT/ICDT 2022 Joint Conference (March 29-April 1, 2022), Edinburgh, UK.
Contacts: paolo.bethaz@polito.it (P. Bethaz); bartolomeo.vacchetti@polito.it (B. Vacchetti); enrica.capitelli@external.cnhind.com (E. Capitelli); vladi.nosenzo@ivecogroup.com (V. Nosenzo); luca.chiosso@external.nposervices.com (L. Chiosso); tania.cerquitelli@polito.it (T. Cerquitelli)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

1. Introduction

Today, more and more manufacturing industries rely on either online data centers or physical HPC clusters to run a large variety of tasks that perform analyses and simulations. These tasks, or jobs, range from simulating individual mechanical components to analyzing entire manufacturing processes. In this way, it is possible to shorten the time to market of the final product while reducing the number of errors made during the process. However, the execution of these jobs often requires resources that may not be immediately available, thus delaying the job execution and increasing the time needed to obtain the final results. In this paper, we present a data-driven methodology for predicting the jobs' execution time. We focus on predicting the execution time of simulations and analyses because it directly affects the waiting time of other jobs before they are submitted, and the problem of unknown waiting time can lead to wasted cluster resources. Since the number of analyses and simulations that must be performed for a single product is considerable, it is important to find a method that avoids wasting cluster resources and increasing delays. This issue is relevant in the context of software application development for the industrial domain, and we address it with an innovative methodology based on data analysis and machine learning techniques. To predict the execution time of jobs, we take into account not only the HPC resources required by the different jobs, but also the settings of the different solvers, i.e., the various kinds of software used for analysis and simulation. Each simulation is characterized by a series of parameters (specific to each solver) that describe its configuration. These parameters are entered manually by the user when the job is submitted and are then extracted automatically from the server used to run the simulations. The proposed approach is based on a hierarchical classification model. Our methodology relies on three separate models. The first model performs a preliminary binary classification and divides the data accordingly. The other two models then classify the two resulting portions of the data, one portion per model. In this way we can classify the data into four different classes while reducing the task from a multiclass problem to a binary one at each step.
The rest of the paper is organized as follows. Section 2 reviews the literature related to HPC, from resource allocation to runtime prediction. Section 3 describes the proposed methodology, while Section 4 presents our results so far. Finally, in Section 5, we discuss our methods and future steps to continue this research.

2. Literature Review

There are several approaches and studies investigating how to improve HPC resource allocation. Research activities can be classified as: (i) predicting job failures, (ii) developing scheduling algorithms based on machine learning techniques, and (iii) using simulation execution time estimation to predict the waiting time of jobs that have yet to be submitted.

The first research strand's goal is to estimate whether a specific job will fail or complete its execution. The intent is to stop prematurely those jobs that the algorithms predict as failures [8, 6, 9]. By learning which jobs are more likely to fail and stopping them, the HPC is able to improve its performance while saving energy that would otherwise be wasted [12, 7]. However, this type of approach requires a lot of data in order to identify a pattern between the attributes and the target variables. The most significant variables can be extracted in different ways, either through some feature engineering process or from different databases. Even so, parsing and transformation operations have proven extremely useful for obtaining better prediction results [5]. The issue here is that some failures are rare, hence not easy to predict, yet they still consume a lot of resources [10]. For example, Liu et al. [8] integrated two algorithms to estimate whether a job fails or not. The first algorithm is a clustering one: it measures the correlation among jobs from different contexts. The other one is a multitask learning algorithm trained on correlated jobs. Since our data is not sufficient to achieve meaningful results, in this paper we do not tackle this problem.

Another research approach focuses on the optimization of the available resources performed by a scheduler. After analyzing the behavior of successful and failed jobs, Jassas et al. [6] investigated scheduling algorithms intended to optimize the reliability and availability of cloud applications. Since the number of resources cannot be unlimited, large-scale HPC clusters exploit waiting queues. However, it has been proved by Nurmi et al. [11] that, regardless of the performance of a resource scheduler, one of the main factors that impacts the prediction efficiency is the amount of time that a job has to wait before being submitted. Following this idea, the authors in [2] also try to predict the waiting time of a job using a hierarchical classification approach. However, the estimation of the waiting time is impacted by many factors, such as HPC specifics. Technical specifics aside, one factor that impacts the waiting time of every simulation is the execution time of the jobs that are running on the HPC. Some studies have focused on this approach, such as [4, 14]. While we agree on the centrality of the execution time, we have adopted a different strategy compared to the studies mentioned before. As a matter of fact, we use pretty much the same toolbox, i.e., clustering for data preprocessing and classification and/or regression to estimate the execution time; however, as far as we know, we differentiate ourselves from previous work through the use of a hierarchical approach, which will be explained in detail in the following section.

3. Data-driven methodology

The building blocks of the proposed methodology, shown in Figure 1, are as follows: i) data cleaning, ii) model building and iii) model evaluation. Each of these steps is described in the related subsection.

After the data collection phase, the generated dataset contains a record for each submitted job. These jobs may have been executed by different solvers. Here, with solver we mean the software used to run the simulation, each of which is characterized by different model variables. Due to the different parameters that characterize each solver, and due to the very different execution times between solvers, we decided to consider only one solver at a time, thus avoiding working with a dataset that is too sparse.

Figure 1: Schema of the proposed methodology

3.1. Data cleaning

Since the parameters characterizing each job are entered manually by the user during the job submission phase, it is essential to check the correctness of the available data before using them for subsequent analysis.

The data cleaning phase started with a collaboration with domain experts, who, thanks to their knowledge, helped us define the admissibility ranges for each collected variable, including the execution time. This phase helped us to better understand the available data.
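As an illustration, the admissibility screening described above can be sketched as a plain range filter. The variable names and bounds below are hypothetical placeholders, not the actual parameters collected from the PBS server:

```python
import pandas as pd

# Hypothetical admissibility ranges, as a domain expert might provide them:
# variable name -> (min admissible value, max admissible value)
ADMISSIBLE = {
    "n_cores": (1, 512),          # requested CPU cores
    "memory_gb": (0.5, 1024.0),   # requested memory
    "exec_time_s": (1, 604800),   # execution time: 1 second to 7 days
}

def drop_inadmissible(jobs: pd.DataFrame) -> pd.DataFrame:
    """Keep only jobs whose parameters fall inside every admissibility range."""
    mask = pd.Series(True, index=jobs.index)
    for col, (lo, hi) in ADMISSIBLE.items():
        mask &= jobs[col].between(lo, hi)
    return jobs[mask]

jobs = pd.DataFrame({
    "n_cores": [4, 0, 16],             # 0 cores is inadmissible
    "memory_gb": [8.0, 4.0, 2048.0],   # 2048 GB exceeds the range
    "exec_time_s": [120, 300, 90],
})
clean = drop_inadmissible(jobs)        # only the first job survives
```

Such a filter only enforces expert-defined bounds; the statistical thresholds discussed next refine it for the execution time.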
This analysis also led us to the decision of eliminating all jobs associated with anomalous values of execution time. Specifically, all jobs with an execution time that was too low or too high compared to the execution times of other jobs run by the same solver were eliminated. To define when an execution time should be considered too high or too low, we tried the following three approaches to define appropriate thresholds, beyond which a point must be considered an outlier [15]:

Interquartile Range (IQR): the IQR is the difference between the values ranking at 25% (Q1) and 75% (Q3) of a data set. The IQR thresholding strategy calculates the thresholds as follows:

• min threshold: Q1 - (1.5*IQR)
• max threshold: Q3 + (1.5*IQR)

95th centile: the minimum threshold is set by taking the value ranking at 5% and the maximum threshold is set by taking the value ranking at 95%;

99th centile: the strategy is identical to that described in the previous point, but here the thresholds are defined so that fewer outliers are identified: the minimum threshold is set by taking the value ranking at 1% and the maximum threshold is set by taking the value ranking at 99%.

We compared the number of jobs labeled as outliers by each of the 3 approaches, and in subsequent experiments we evaluated how the performance of a predictive model varies depending on the preprocessing used.

3.2. Model Building

The task of our model is to predict the execution time of a simulation running on an HPC cluster. Since the goal is to predict a time, which is a continuous value, this could be tagged as a regression task. However, treating this task as a regression one led to poor experimental results. This behavior can be justified by the fact that the data collection phase is quite recent, so the data available at the moment are not numerous. For this reason, we decided to treat it as a classification problem; this was made possible by categorizing the available runtimes, dividing them into classes representing contiguous time intervals. After this categorization, a classification algorithm can then try to predict in which range of values the execution time of the analyzed job will fall. In other words, we have a multiclass problem. However, due to the scarce amount of data in our possession, the performance of our initial model was not sufficient: when the amount of data is limited, it is difficult for a classification algorithm to make good predictions in a multiclass context. On the other hand, we did not want to oversimplify the problem by reducing the number of classes considered. Thus, by implementing a hierarchical classification approach, we were able to improve the goodness of the predictions without sacrificing too much detail. The good results obtained with the hierarchical classification approach pushed us to also try a mix of classification and regression algorithms. However, these results, while in some cases better than the plain regression approach, were not satisfying. This led us to choose the hierarchical classification approach. The following subsections describe in more detail the structures, algorithms and techniques that were used for both the regression and the classification approaches. Because the amount of data in our possession is limited, we decided to rely on the XGBoost method for both the mixed regression and the classification approach. XGBoost is a tree model that relies on the Extreme Gradient Boosting technique [3].

3.2.1. Classification Approach

Unlike a single-level classification, where the model is trained only once on all available data, the hierarchical approach uses a binary tree structure in which each node of the tree corresponds to a binary classification, where a model predicts to which of the two classes the job belongs. The depth of the tree depends on the total number of classes that we want to obtain (each of which represents a temporal range of values). The hierarchical binary structure we used in this methodology has two levels of depth, to which correspond 3 predictive models (nodes) and 4 total classes (leaves). Solutions with different depths have also been tested experimentally, but the one with four classes demonstrated the best compromise between good model performance and sufficiently detailed classes. An example of how the structure looks is reported in Figure 2. In this way every classifier has to deal with a binary classification problem, but overall the classes considered are four.

Figure 2: Hierarchical Structure Schema

From the figure, it is evident that the key to the entire structure is the three conditions that determine into which sub-branch the obtained prediction must go. To each of these conditions corresponds a subdivision of the dataset (or of a portion of it) into two classes that represent different time intervals of the execution time of the jobs.
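A minimal sketch of this two-level binary hierarchy, on synthetic data and with scikit-learn's gradient boosting standing in for XGBoost; the features and time thresholds are illustrative, not the paper's actual configuration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))            # synthetic job features
runtime = np.exp(X[:, 0]) * 60           # synthetic execution times (seconds)

# Illustrative time thresholds splitting runtimes into 4 contiguous classes.
t_root, t_left, t_right = np.percentile(runtime, [50, 25, 75])

# Root node: binary split between the "fast" and "slow" halves.
y_root = (runtime > t_root).astype(int)
root = GradientBoostingClassifier().fit(X, y_root)

# Second level: one binary classifier per sub-branch, trained only on its portion.
fast, slow = runtime <= t_root, runtime > t_root
left = GradientBoostingClassifier().fit(X[fast], (runtime[fast] > t_left).astype(int))
right = GradientBoostingClassifier().fit(X[slow], (runtime[slow] > t_right).astype(int))

def predict_class(x):
    """Route a job through the tree; returns one of 4 leaf classes (0..3)."""
    x = x.reshape(1, -1)
    if root.predict(x)[0] == 0:          # predicted fast half
        return 0 + left.predict(x)[0]    # class 0 or 1
    return 2 + right.predict(x)[0]       # class 2 or 3

pred = predict_class(X[0])
```

Each of the three models thus faces only a binary problem, while the routing produces four leaf classes overall.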
The overall performance of the classification method is greatly influenced by the identified thresholds, so it is important to define them as well as possible. To do this, two different approaches have been tested:

• balanced approach: in each division of the identified hierarchical structure, the available data is divided into two classes, each containing the same number of jobs. This technique prevents any unbalanced-class problem, allowing the predictive model to be trained on balanced classes;
• k-means approach: the classes to use for training a predictive model in each node of the structure are chosen automatically through a clustering algorithm. In particular, the k-means algorithm is used, with K=2. In addition, to prevent the proposed solution from leading to an unbalanced-classes problem, we used a constrained version of the k-means algorithm, in which a minimum size for each cluster can be specified. In particular, the constraint we used here is that each identified class had to contain at least 40% of the total jobs.

3.2.2. Mixed Approach

Several regression algorithms (XGBoost [3], RandomForest [1], Lasso [13]) were tested and compared with each other, evaluating their performance based on the R2 value obtained. Lasso Regression is a regression analysis method that enhances its prediction accuracy by combining variable selection and regularization techniques, while RandomForest and XGBoost are both tree-based algorithms exploiting several decision trees, differing from each other in how the trees are constructed and how the results are combined.

Using a regression algorithm has the advantage of yielding a punctual value of the estimated runtime. However, due to the limited availability of initial data, a predictive model built on a single level is not very robust. For this reason, we decided to test a mixed hierarchical regression approach. With mixed approach we mean a combination of classification and regression: at the first prediction layer we have a classifier, while at the second prediction layer we use two regressors. The two regressors at the second prediction layer thus have to estimate the execution time of jobs that belong to two different time intervals, one interval for each model. In this way we simplified the problem while keeping the final prediction as a continuous value. Figure 3 shows the scheme of the proposed mixed approach. Regarding the techniques used to obtain the two classes in the first level of classification, both approaches described in the previous classification case were tested.

Figure 3: Mixed Approach Structure

3.3. Model Evaluation

For the evaluation of every algorithm used in all the performed experiments, we exploited the Leave-One-Out Cross Validation (LOOCV) technique. Even if it is a computationally expensive technique, LOOCV results in a reliable and unbiased estimate of model performance. Moreover, this method turns out to be particularly useful when the available data are limited, as in our case, in which the data collection phase has begun only recently.

LOOCV is an extreme case of k-fold validation, where the value of k is equal to the number of available items (N). Its operations can be summarized in the following steps:

1. Split the dataset into N disjoint groups, where each group contains a single element;
2. For each group:
• Take the element in the considered group as the test set
• Take the remaining N-1 groups as the training set
• Fit a model on the training set and evaluate it on the test set
• Retain the evaluation of the model and then discard it
3. The model performance is estimated as the average of the N experiments executed.

4. Preliminary experimental results

The experiments presented here show the actual results obtained, which motivate us to rely on the hierarchical classification approach. Section 4.1 offers insight into the data that we have used to train our models. Section 4.2 discusses the effectiveness of the proposed techniques in correctly identifying outliers and how they impact the dataset cardinality.
Section 4.3 shows the performance of the classifier models, both single-level and hierarchical. Section 4.4 presents the results obtained with the different regression algorithms and the hierarchical mixed approach.

4.1. Dataset Description

Our data was extracted from a PBS (Portable Batch System) server used to run simulations of various natures, from aerodynamics to virtual crash tests, related to the automotive context. Our data belong to two main categories, i.e., explicit and implicit jobs. The implicit methods use a "step by step" algorithm, in which an appropriate convergence criterion allows the analysis to continue or not, reducing the time increment, depending on the accuracy of the results at the end of each step. The explicit methods do not have problems of non-convergence, since in this case the time increment is defined at the beginning and remains constant during the calculation.

After the data collection phase, the dataset contains about 6000 records, each of which represents a job submitted to the cluster. Jobs can be performed by five different solvers, depending on the type of analysis to be run. A summary of the solvers contained in our dataset is given in Table 1. Due to the different parameters that characterize each solver, and due to the very different execution times between solvers, the analyses described below consider only one solver at a time.

Table 1: Solvers
Type | Solver name | Application Field
Explicit | Adams | Multi body approach
Explicit | Radioss | Crash Simulation
Implicit | Abaqus | Linear/Non linear analysis
Implicit | Nastran | FE Analysis
Implicit | Optistruct | FE Analysis and optimization

As a demonstration of the differences between the solvers, Figure 4 shows the kernel density estimate (KDE) plot for the execution times of an explicit solver (Adams) and an implicit solver (Optistruct). The KDE represents the data using a continuous probability density curve, and the x-axis in the figure shows how the jobs belonging to the two solvers occupy very different ranges of execution times, with much greater times for the explicit solver.

Figure 4: KDE plot for execution times of two different solvers

4.2. Data Cleaning

In this preprocessing phase we tried to remove all the jobs having an anomalous execution time compared to the runtimes of the other jobs. To do this, 3 different methodologies were tested, as discussed in Section 3.1, based on: i) the interquartile range (IQR), ii) the 95th centile, and iii) the 99th centile. The percentage of jobs labeled as outliers by each of the three techniques, separately by solver, is shown in Table 2.

Table 2: The percentage of outlier jobs for each solver
Solver | IQR | 95th centile | 99th centile
Adams | 14% | 10% | 2%
Radioss | 1% | 10% | 2%
Nastran | 11% | 10% | 3%
Optistruct | 8% | 11% | 3%
Abaqus | 11% | 10% | 2%

Since we cannot know in advance which of the 3 techniques will lead to greater benefits, in the following analysis we compared the results obtained with the different preprocessing techniques, evaluating then the best of them. However, from Table 2 we can see that the IQR and 95th centile techniques show rather similar results (with an average difference of about 3%), while the 99th centile technique often identifies a very low percentage of outliers compared to the other techniques. For this reason, in the following analyses we will consider only the IQR and the 99th centile techniques, comparing the results obtained with these two different approaches.

4.3. Classification Model Evaluation

We have conducted a series of experiments with the proposed hierarchical approach, testing different preprocessing techniques and different strategies for identifying thresholds.

Table 3 contains the F-score values obtained using the 99th centile as the preprocessing step and the k-means for threshold identification, since this is the configuration with which the best results were obtained.
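Under the same kind of assumptions as above (synthetic data, a simple scikit-learn classifier instead of XGBoost), the LOOCV protocol of Section 3.3 combined with per-class F-scores can be sketched as:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))                                # small synthetic job dataset
y = (X[:, 0] + 0.1 * rng.normal(size=60) > 0).astype(int)   # binary runtime class

# Leave-One-Out: N folds, each holding out exactly one job as the test set.
y_pred = np.empty_like(y)
for train_idx, test_idx in LeaveOneOut().split(X):
    model = DecisionTreeClassifier(random_state=0)
    model.fit(X[train_idx], y[train_idx])
    y_pred[test_idx] = model.predict(X[test_idx])

# One F-score per class, computed over the N held-out predictions.
f_per_class = f1_score(y, y_pred, average=None)
```

Aggregating the N single-element evaluations in this way is what makes LOOCV usable even on the small per-solver datasets considered here.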
For each solver, the first two columns of Table 3 show the F-score values obtained for the first level of classification (where the first two classes are identified), while the remaining columns contain the F-score values obtained in the second level of classification (classes 1 and 2 for the left sub-branch, classes 3 and 4 for the right sub-branch).

Table 3: F-score for Hierarchical Classification
Solver | 1st level: 1 | 1st level: 2 | 2nd level: 1 | 2nd level: 2 | 2nd level: 3 | 2nd level: 4
Adams | 0.90 | 0.82 | 0.91 | 0.86 | 0.82 | 0.66
Radioss | 0.71 | 0.72 | 0.76 | 0.25 | 0.79 | 0.61
Nastran | 0.85 | 0.79 | 0.89 | 0.86 | 0.93 | 0.90
Optistruct | 0.91 | 0.94 | 0.89 | 0.81 | 0.82 | 0.64
Abaqus | 0.82 | 0.69 | 0.81 | 0.67 | 0.88 | 0.80

To better illustrate our methodology, a focus on a specific solver is also represented in Figure 5, which shows the hierarchical approach specifically for the Nastran solver, indicating the relevance of the identified classes in a real context. The predictive model used in all the nodes is XGBoost, and the figure reports the F-score values for each prediction. Here, the implemented model is able to predict quite efficiently whether the execution time of an analyzed job will be less than 8 minutes, between 8 and 16 minutes, between 16 and 30 minutes, or greater than 30 minutes. For solvers with different characteristics, different thresholds will obviously be obtained; however, the results in Table 3 indicate that, except for the Radioss solver (where the results obtained are below the average behavior), for all solvers we are able to predict quite accurately which class a job should belong to.

Figure 5: Nastran Hierarchical Approach

In order to validate the results obtained with the hierarchical approach, we also conducted a series of experiments with a more traditional methodology and compared them with our results. The traditional methodology consists in a "single" level approach, in which the algorithm used is always an XGBoost, but there is no hierarchical structure. The results in Table 4 represent the baseline against which we want to compare our methodology. The table shows the results obtained with both preprocessing techniques (IQR vs 99th centile) and with both strategies to define the subdivision into classes (k-means vs balanced approach). For reasons of space, each solver is indicated in this table only by its initials (Ad, R, N, O, Ab); moreover, 'K-M' stands for k-means, while 'Bal' indicates the balanced approach. Comparing the F-score values obtained in the two methodologies, it is evident how the hierarchical structure obtains better performance than the baseline approach, whatever preprocessing technique is used.

Table 4: Baseline Classification F-score
Solver | IQR: 1 | IQR: 2 | IQR: 3 | IQR: 4 | 99%: 1 | 99%: 2 | 99%: 3 | 99%: 4
Ad. K-M | 0.79 | 0.52 | 0.28 | 0.75 | 0.80 | 0.63 | 0.46 | 0.60
Ad. Bal | 0.75 | 0.64 | 0.49 | 0.72 | 0.78 | 0.71 | 0.59 | 0.71
R. K-M | 0.44 | 0.44 | 0.23 | 0.14 | 0.44 | 0.48 | 0.23 | 0.62
R. Bal | 0.26 | 0.55 | 0.54 | 0.57 | 0.21 | 0.35 | 0.42 | 0.64
N. K-M | 0.66 | 0.49 | 0.87 | 0.57 | 0.85 | 0.75 | 0.66 | 0.57
N. Bal | 0.83 | 0.83 | 0.73 | 0.67 | 0.79 | 0.69 | 0.71 | 0.65
O. K-M | 0.67 | 0.84 | 0.68 | 0.68 | 0.67 | 0.64 | 0.73 | 0.84
O. Bal | 0.64 | 0.60 | 0.56 | 0.76 | 0.72 | 0.54 | 0.70 | 0.77
Ab. K-M | 0.56 | 0.77 | 0.58 | 0.55 | 0.76 | 0.45 | 0.65 | 0.75
Ab. Bal | 0.60 | 0.61 | 0.57 | 0.62 | 0.55 | 0.69 | 0.60 | 0.64

4.4. Mixed Model Evaluation

In the approach that exploits regression algorithms to estimate the execution time, the output of the predictive model is a continuous numerical value that indicates the runtime of the considered job. Table 5 contains the R2 values obtained with three different regressors: the RandomForest Regressor (RF), the XGBoost Regressor and the Lasso Regressor. The results obtained with RandomForest and XGBoost are very similar, both of them much better than the results obtained with the Lasso Regressor, which does not perform well. Furthermore, from the baseline table, we see that the results obtained after removing the outliers through the 99th centile are on average higher than those obtained using the IQR. For this reason we use the 99th centile technique for testing our approach.

Currently, the mixed approach is still experimental and for now it has been tested on only one solver. The results are shown in Figure 6, where the considered solver is Nastran and the algorithm used is XGBoost (both for classification and regression). We can see that the first classification level is the same obtained in Figure 5, with the same values of F1-score. Then, unlike the classification approach, in the mixed approach we used two regression models in the second level (one for each sub-branch), able to predict the value of the execution time of the considered job.
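A minimal sketch of such a mixed hierarchy, again on synthetic data and with scikit-learn's gradient boosting standing in for XGBoost: a first-level classifier routes each job to one of two second-level regressors, which return a continuous runtime estimate. The features and the median split are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))            # synthetic job features
runtime = np.exp(X[:, 0]) * 60           # synthetic execution times (seconds)

split = np.median(runtime)               # illustrative first-level time threshold
is_slow = (runtime > split).astype(int)

# Level 1: binary classifier deciding the time interval of a job.
clf = GradientBoostingClassifier().fit(X, is_slow)

# Level 2: one regressor per interval, each trained only on its own portion.
reg_fast = GradientBoostingRegressor().fit(X[is_slow == 0], runtime[is_slow == 0])
reg_slow = GradientBoostingRegressor().fit(X[is_slow == 1], runtime[is_slow == 1])

def predict_runtime(x):
    """Route through the classifier, then regress within the chosen interval."""
    x = x.reshape(1, -1)
    reg = reg_slow if clf.predict(x)[0] == 1 else reg_fast
    return float(reg.predict(x)[0])

est = predict_runtime(X[0])
```

Each regressor only has to model a narrower range of runtimes, which is the simplification the mixed approach relies on, while the output remains a continuous value.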
Table 5: Baseline Regression R2
Solver | RF IQR | RF 99% | XGBoost IQR | XGBoost 99% | Lasso IQR | Lasso 99%
Adams | 0.31 | 0.70 | 0.36 | 0.69 | 0.23 | 0.27
Radioss | 0.27 | 0.27 | 0.27 | 0.28 | 0.09 | 0.12
Nastran | 0.38 | 0.39 | 0.40 | 0.37 | 0.08 | 0.09
Optistruct | 0.63 | 0.45 | 0.60 | 0.44 | 0.40 | 0.43
Abaqus | 0.36 | 0.41 | 0.38 | 0.42 | 0.10 | 0.09

By having a first classification level that can distinguish two categories of jobs, the R2 values obtained in the second regression level are now higher than those obtained with the Nastran solver using the baseline approach. Moreover, the mean absolute error (MAE) values shown in Figure 6 indicate that the average error associated with jobs that last less than 16 minutes is just over one minute (69 seconds), therefore an acceptable error in our use case. The mean absolute error associated with jobs that last longer than 16 minutes, instead, turns out to be about 11 minutes, so a bit more impactful.

Figure 6: Nastran Mixed Approach

The error grows as execution times increase. So, for the considered solver, the better choice could be to adopt regression techniques to estimate low execution times, and to use a classification approach instead to predict the classes to which a job will belong when dealing with longer execution times.

5. Discussion and Future Research Direction

We have presented a hierarchical classification model which is able to address a multiclass problem by dividing its complexity among multiple prediction layers. In this way every classifier has to deal with a binary classification problem instead of a multiclass one. Even if the amount of data in the considered use case is limited, our approach has shown promising performance also compared to single-prediction-level approaches. Especially in the classification context, the hierarchical approach performs better than the normal approach. We have also tried some experiments in which more than one solver is taken into account. However, due to the fact that every solver takes into account a different set of variables, the resulting dataset is very sparse. This issue, combined with the limited amount of data, leads to poor predictions with both the hierarchical approach and a single-prediction-level method. Once the data in our possession reaches a higher numerosity, it will allow us to investigate whether or not our hierarchical approach can outperform more classical approaches in a more complex environment: a higher amount of data means that the impact of the different variables taken into account will be reduced. Currently we are still working on this project and we already have different improvements that we want to address. We will keep gathering more data that will allow us to build more stable and robust models. We also intend to further investigate the possibility of building a model that is able to make predictions on the whole set of solvers, instead of relying on a different model for every solver. Finally, we intend to integrate our model inside the HPC structure in order to help it assess the pending time of newly submitted jobs.

References

[1] Leo Breiman. 2001. Random forests. Machine Learning 45, 1 (2001), 5–32.
[2] Fabio Carfi, Enrica Capitelli, Vladi Massimo Nosenzo, and Tania Cerquitelli. 2021. Estimating the job's pending time on a High-Performance Computing cluster through a hierarchical data-driven methodology. In DOLAP, EDBT 2021, Nicosia, Cyprus.
[3] Tianqi Chen and Carlos Guestrin. 2016. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16). Association for Computing Machinery, New York, NY, USA, 785–794. https://doi.org/10.1145/2939672.2939785
[4] Mariza Ferro, Vinicius P Klôh, Matheus Gritz, Vitor de Sá, and Bruno Schulze. 2021. Predicting Runtime in HPC Environments for an Efficient Use of Computational Resources. In Anais do XXII Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD 2021). SBC, Belo Horizonte, 72–83.
[5] S. Ganguly, A. Consul, A. Khan, B. Bussone, J. Richards, and A. Miguel. 2016. A Practical Approach to Hard Disk Failure Prediction in Cloud Platforms: Big Data Model for Failure Management in Datacenters. In 2016 IEEE Second International Conference on Big Data Computing Service and Applications (BigDataService). Oxford, United Kingdom, 105–116. https://doi.org/10.1109/BigDataService.2016.10
[6] M. Jassas and Q. H. Mahmoud. 2018. Failure Analysis and Characterization of Scheduling Jobs in Google Cluster Trace. In IECON 2018 - 44th Annual Conference of the IEEE Industrial Electronics Society. Omni Shoreham, United States, 3102–3107. https://doi.org/10.1109/IECON.2018.8592822
[7] P. Li, B. Zhang, Y. Weng, and R. Rajagopal. 2017. A Sparse Linear Model and Significance Test for Individual Consumption Prediction. IEEE Transactions on Power Systems 32, 6 (2017), 4489–4500. https://doi.org/10.1109/TPWRS.2017.2679110
[8] Chunhong Liu, Liping Dai, Yi Lai, Guinbing Lai, and Wentao Mao. 2020. Failure prediction of tasks in the cloud at an earlier stage: a solution based on domain information mining. Computing 102 (2020), 2001–2023. https://doi.org/10.1007/s00607-020-00800-1
[9] C. Liu, J. Han, Y. Shang, C. Liu, B. Cheng, and J. Chen. 2017. Predicting of Job Failure in Compute Cloud Based on Online Extreme Learning Machine: A Comparative Study. IEEE Access 5 (2017), 9359–9368. https://doi.org/10.1109/ACCESS.2017.2706740
[10] J. M. Navarro, G. H. A. Parada, and J. C. Dueñas. 2014. System Failure Prediction through Rare-Events Elastic-Net Logistic Regression. In 2014 2nd International Conference on Artificial Intelligence, Modelling and Simulation. IEEE, Madrid, Spain, 120–125. https://doi.org/10.1109/AIMS.2014.19
[11] D. Nurmi, A. Mandal, J. Brevik, C. Koelbel, R. Wolski, and K. Kennedy. 2006. Evaluation of a Workflow Scheduler Using Integrated Performance Modelling and Batch Queue Wait Time Prediction. In SC '06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing. IEEE, Tampa, FL, USA. https://doi.org/10.1109/SC.2006.29
[12] A. Rosà, L. Y. Chen, and W. Binder. 2017. Failure Analysis and Prediction for Big-Data Systems. IEEE Transactions on Services Computing 10, 6 (2017), 984–998. https://doi.org/10.1109/TSC.2016.2543718
[13] Robert Tibshirani. 2011. Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society Series B 73 (06 2011), 273–282. https://doi.org/10.2307/41262671
[14] Hao Wang, Yi-Qin Dai, Jie Yu, and Yong Dong. 2021. Predicting running time of aerodynamic jobs in HPC system by combining supervised and unsupervised learning method. Advances in Aerodynamics 3 (03 2021). https://doi.org/10.21203/rs.3.rs-360961/v1
[15] Jiawei Yang, Susanto Rahardja, and Pasi Fränti. 2019. Outlier Detection: How to Threshold Outlier Scores?. In Proceedings of the International Conference on Artificial Intelligence, Information Processing and Cloud Computing (AIIPCC '19). Association for Computing Machinery, New York, NY, USA, Article 37, 6 pages. https://doi.org/10.1145/3371425.3371427