<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Journal of Intelligent &amp; Fuzzy Systems 45 (2023) 6959-6981. doi:10.3233/JIFS</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.3233/JIFS-222362</article-id>
      <title-group>
        <article-title>Developing an Algorithm Selector for Green Configuration in Scheduling Problems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Carlos March</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christian Pérez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miguel Salido</string-name>
          <email>msalido@upv.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universitat Politècnica de València, Instituto de Automática e Informática Industrial</institution>
          ,
          <addr-line>Camí de Vera S/N, Valencia</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universitat Politècnica de València, valgrAI - Valencian Graduate School and Research Network of Artificial Intelligence</institution>
          ,
          <addr-line>Camí de Vera S/N, Valencia</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>45</volume>
      <fpage>6959</fpage>
      <lpage>6981</lpage>
      <abstract>
        <p>The Job Shop Scheduling Problem (JSP) is central to operations research, primarily optimizing energy efficiency due to its profound environmental and economic implications. Efficient scheduling enhances production metrics and mitigates energy consumption, thus effectively balancing productivity and sustainability objectives. Given the intricate and diverse nature of JSP instances, along with the array of algorithms developed to tackle these challenges, an intelligent algorithm selection tool becomes paramount. This paper introduces a framework designed to identify key problem features that characterize its complexity and guide the selection of suitable algorithms. Leveraging machine learning techniques, particularly XGBoost, the framework recommends optimal solvers such as GUROBI, CPLEX, and GECODE for efficient JSP scheduling. GUROBI excels with smaller instances, while GECODE demonstrates robust scalability for complex scenarios. The proposed algorithm selector achieves an accuracy of 84.51% in recommending the best algorithm for solving new JSP instances, highlighting its efficacy in algorithm selection. By refining feature extraction methodologies, the framework aims to broaden its applicability across diverse JSP scenarios, thereby advancing efficiency and sustainability in manufacturing logistics.</p>
      </abstract>
      <kwd-group>
        <kwd>Job Shop Scheduling Problem</kwd>
        <kwd>Energy Efficiency</kwd>
        <kwd>Algorithm Selection</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Feature Extraction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The Job Shop Scheduling Problem (JSP) is a cornerstone
issue in operations research and optimization, serving as a
critical benchmark for assessing the performance of various
algorithms. JSP entails the complex task of scheduling jobs
on machines in a manufacturing environment to optimize
several performance metrics, such as makespan, flow time,
tardiness, resource utilization, and energy consumption [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Effective benchmarking of JSP solutions requires a
multifaceted evaluation of these metrics, particularly focusing
on makespan, energy consumption, and tardiness to gauge
scheduling efficiency and resource utilization [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Tools like
JSPLIB play a vital role in these benchmarking efforts by
providing researchers with diverse instances derived from
significant studies and experiments, thereby enhancing the
evaluation of algorithms [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Understanding the characteristics of problem instances
is essential for effective benchmarking in JSP. Critical
factors include the number of jobs and machines, variability
in processing times, machine availability, and precedence
relationships, all of which significantly impact algorithm
performance [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Additionally, considering energy
consumption, which varies based on machine speed and operational
factors, adds another layer of complexity [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Achieving a
balance between energy consumption and scheduling
decisions is crucial for attaining energy efficiency without
compromising production goals [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
JSP’s focus on energy efficiency has intensified in recent
years due to its substantial environmental and economic
impacts [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Researchers have investigated strategies such as
employing speed-adjustable machines and vehicles to
minimize energy consumption while maintaining productivity
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Advanced algorithms and optimization techniques have
been developed to address these energy-related challenges,
taking into account factors like machine speed, idle time,
and energy requirements [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Real-world implementations
of these strategies have demonstrated tangible benefits,
including cost savings and positive environmental effects [10].
      </p>
      <p>In addition to traditional optimization methods, machine
learning techniques are increasingly being utilized to
recommend algorithms for solving problems within the JSP
family. For instance, Müller et al. designed a system capable
of selecting the most suitable solver for addressing
Flexible JSP by leveraging machine learning approaches [11].
Similarly, Strassl and Musliu [12] analyzed JSP instances
without energy consumption from the literature to extract
features that inform algorithm performance, resulting in a
homogeneous set of instances with consistent
characteristics. These features were then used to train various models,
with Random Forest achieving the highest accuracy at 90%
[12].</p>
      <p>In conclusion, the integration of machine learning
techniques into JSP research provides new avenues for
improving algorithm selection and performance, particularly in
handling complex and varied instances. This integration
enhances the efficiency and effectiveness of job shop
scheduling by combining the strengths of traditional optimization
approaches with innovative machine learning methods. The
ongoing advancements in this field are driving both
academic research and practical applications toward more
sustainable and innovative solutions.</p>
    </sec>
    <sec id="sec-1b">
      <title>2. Problem Description and Model Formulation</title>
      <p>The JSP tackled in this study emphasizes its intricate energy
considerations. The JSP poses a significant computational
challenge: it is NP-complete, and optimal solutions are hard to
find within reasonable time frames.</p>
      <p>The core challenge of the JSP involves optimizing task
allocation across multiple jobs and machines while
minimizing key criteria, notably the total job completion time.
However, achieving this optimization is complex due to various
real-world constraints and dependencies, contributing to
the JSP’s NP-completeness. The combinatorial explosion of
possible job and machine combinations further complicates
the problem, making exhaustive exploration impractical as
the number of jobs and machines grows.</p>
      <sec id="sec-1b-1">
        <title>2.1. Mixed Integer Programming</title>
        <p>The JSP involves various sets, parameters, variables, and
constraints crucial for formulation and solution.</p>
        <p>Sets:
• J = {1, . . . , n}, the set of jobs.
• M = {1, . . . , m}, the set of machines.
• S = {1, . . . , v}, the set of speeds.
• T_j, ∀j ∈ J, the set of tasks of job j. In the standard JSP, |T_j| = |M|.</p>
        <p>Parameters:
• d_jt, ∀j ∈ J, ∀t ∈ T_j, the due date of task t of job j.
• r_jt, ∀j ∈ J, ∀t ∈ T_j, the release date of task t of job j.
• p_jtms, ∀j ∈ J, ∀t ∈ T_j, ∀m ∈ M, ∀s ∈ S, the processing time of task t of job j on machine m with speed s.
• e_jts, ∀j ∈ J, ∀t ∈ T_j, ∀s ∈ S, the energy consumption for processing task t of job j with speed s.</p>
        <p>Variables:
• c_jt, ∀j ∈ J, ∀t ∈ T_j, the completion time of task t of job j.
• T_jt, ∀j ∈ J, ∀t ∈ T_j, the tardiness of task t of job j with respect to its due date.
• x_jtms ∈ {0, 1}, ∀j ∈ J, ∀t ∈ T_j, ∀m ∈ M, ∀s ∈ S, binary assignment variables (i.e., x_jtms = 1 denotes that task t of job j is performed with speed s on machine m).
• y_jtj′t′ ∈ {0, 1}, ∀j, j′ ∈ J, j ≠ j′, ∀t ∈ T_j, ∀t′ ∈ T_j′, binary sequencing variables (i.e., y_jtj′t′ = 1 denotes that task t of job j precedes task t′ of job j′ on their shared machine).</p>
        <p>The model selects, from the set of feasible schedules Φ, the solution
x* = arg min_{x ∈ Φ} [C_max(x), E_total(x), T_total(x)] (1)
subject to:
Σ_{m ∈ M} Σ_{s ∈ S} x_jtms = 1, ∀j ∈ J, ∀t ∈ T_j (2)
c_jt ≥ r_jt + p_jtms, ∀j ∈ J, ∀t ∈ T_j, whenever x_jtms = 1 (3)
c_jt ≥ c_j(t−1) + p_jtms, ∀j ∈ J, ∀t ∈ T_j with t > 1, whenever x_jtms = 1 (4)
c_j′t′ ≥ c_jt + p_j′t′ms or c_jt ≥ c_j′t′ + p_jtms, for distinct tasks assigned to the same machine m, as selected by y_jtj′t′ (5)
T_jt ≥ c_jt − d_jt, ∀j ∈ J, ∀t ∈ T_j (6)
c_jt ≥ 0, T_jt ≥ 0, ∀j ∈ J, ∀t ∈ T_j (7)
Constraint 2 assigns exactly one machine and one speed to each task; Constraints 3 and 4 enforce the release dates and the precedence between consecutive tasks of the same job; Constraint 5 prevents two tasks from overlapping on the same machine; and Constraints 6 and 7 define tardiness and the non-negativity of the time variables.</p>
        <p>This model seeks the optimal solution x* that minimizes
the three measures mentioned in Equation 1, considering
the associated constraints: the maximum makespan of all
job tasks C_max(x), the total energy consumption E_total(x),
and the total tardiness T_total(x). The simultaneous
optimization of these objectives requires a delicate balance between
the various considerations and constraints of the problem.
Therefore, two approaches to optimizing the problem
solutions are proposed, allowing us to analyze the methods’
behavior better.</p>
      </sec>
      <sec id="sec-1b-2">
        <title>2.2. Mono-objective optimization</title>
        <p>This section presents the mono-objective optimization for
a specific scheduling problem involving multiple jobs and
machines, emphasizing key performance metrics such as
makespan, energy consumption, and total tardiness:
C_max = max_{j ∈ J, t ∈ T_j} c_jt (8)
E_total = Σ_{j ∈ J} Σ_{t ∈ T_j} Σ_{m ∈ M} Σ_{s ∈ S} e_jts · x_jtms (9)
T_total = Σ_{j ∈ J} Σ_{t ∈ T_j} T_jt (10)</p>
        <p>Equation 8 represents the makespan, which is the
maximum completion time among all job tasks and, hence,
among all machines. Equation 9 describes energy consumption
by computing the total energy consumed by all job tasks across
all machines. Lastly, Equation 10 is formulated to show the
total tardiness, which represents the number of time units of
each job or operation that are performed outside its time
window, i.e., the period of time between the release date and
the due date.</p>
        <p>f(x) = (C_max − f1−) / (f1+ − f1−) + (E_total − f2−) / (f2+ − f2−) + T_total / (1 + T_total) (11)</p>
        <p>Minimizing the objective Function 11 aims to find a
solution that achieves a balanced trade-off among its
components. The values f1+, f2+ and f1−, f2− are used to normalize
the x* solution obtained in the three-dimensional objective
space. This allows a correct comparison between the values
of the objective function when minimizing the problem, giving
the same weight to all the parts and avoiding any of the
variables dominating the search.</p>
        <p>f1+ = Σ_{j ∈ J} Σ_{t ∈ T_j} max_{m ∈ M, s ∈ S} p_jtms (12)
f2+ = Σ_{j ∈ J} Σ_{t ∈ T_j} max_{s ∈ S} e_jts (13)
f1− = max_{m ∈ M} Σ_{j ∈ J} Σ_{t ∈ T_j} min_{s ∈ S} p_jtms (14)
f2− = Σ_{j ∈ J} Σ_{t ∈ T_j} min_{s ∈ S} e_jts (15)</p>
        <p>Equations 12 to 15 determine both the
maximum (f1+ and f2+) and minimum (f1− and f2−) values
for makespan and energy consumption, respectively. For
makespan, these values signify the maximum and minimum
completion times across all machines, accounting for the
maximum and minimum processing times of job tasks on
each machine. Similarly, in terms of energy consumption,
they represent the maximum and minimum energy utilized
among all machines, considering the maximum and
minimum energy consumption of all job tasks on each machine.</p>
      </sec>
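      <p>As an illustration of Equations 8 to 11, the sketch below computes the three objective values and the normalized mono-objective for a candidate schedule. The flat per-task encoding and all numbers are invented for illustration; this is not the paper's implementation.</p>
      <preformat>
```python
# Hypothetical flat encoding of a solved schedule: one record per task,
# holding its completion time c, tardiness T, and consumed energy e.
schedule = [
    {"c": 7.0, "T": 0.0, "e": 3.0},
    {"c": 12.0, "T": 2.0, "e": 5.0},
    {"c": 9.0, "T": 1.0, "e": 4.0},
]

def objectives(tasks):
    """Return (C_max, E_total, T_total) as in Equations 8-10."""
    c_max = max(t["c"] for t in tasks)    # makespan (Eq. 8)
    e_total = sum(t["e"] for t in tasks)  # total energy (Eq. 9)
    t_total = sum(t["T"] for t in tasks)  # total tardiness (Eq. 10)
    return c_max, e_total, t_total

def mono_objective(tasks, f1_min, f1_max, f2_min, f2_max):
    """Normalized mono-objective of Equation 11; the f1/f2 arguments are
    the makespan and energy bounds of Equations 12-15."""
    c_max, e_total, t_total = objectives(tasks)
    term_makespan = (c_max - f1_min) / (f1_max - f1_min)
    term_energy = (e_total - f2_min) / (f2_max - f2_min)
    term_tardiness = t_total / (1.0 + t_total)
    return term_makespan + term_energy + term_tardiness
```
      </preformat>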
    </sec>
    <sec id="sec-2">
      <title>3. Algorithm Selector</title>
      <p>The selection of algorithms for a given problem
involves identifying the most appropriate algorithm from a
collection capable of solving it, taking into account the
specific characteristics of the problem. Rubinoff [13] formalized
this process of algorithm selection, defining key
elements: the problem space P, representing all
instances of the problem; the algorithm space A, encompassing
algorithms capable of solving any p ∈ P; and a
performance metric m, which quantifies algorithm effectiveness
for solving p ∈ P.</p>
      <p>The core objective is to establish a function S : P → A
that, for each problem instance p ∈ P, selects the
optimal algorithm from A based on metric m. To effectively
characterize each p, a feature set F is constructed to
represent p and assist in the decision-making process of
S. Consequently, S is defined as a composite function S =
g ∘ f, where f : P → R^|F| maps p to its feature vector
in R^|F|, and g : R^|F| → A selects the algorithm from A
based on this feature representation. The choice of F is
critical, as it must be informative and accurately represent
the characteristics of the problem.</p>
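      <p>The composition S = g ∘ f can be sketched as follows; the two-feature map and the nearest-centroid chooser g are illustrative stand-ins for the trained model, not the paper's method.</p>
      <preformat>
```python
# Sketch of the selector S = g . f: a feature map f turns an instance
# into a vector, and a chooser g maps that vector to an algorithm name.
def f(instance):
    """Feature map f: P -> R^|F| (here a tiny illustrative vector)."""
    return [float(instance["jobs"]), float(instance["machines"])]

def make_g(centroids):
    """Chooser g: R^|F| -> A, picking the algorithm whose reference
    centroid is closest to the feature vector (a stand-in for the
    trained classifier)."""
    def g(vec):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(vec, c))
        return min(centroids, key=lambda name: dist(centroids[name]))
    return g

# Invented reference points, one per algorithm.
centroids = {"GUROBI": [10.0, 5.0], "GECODE": [80.0, 20.0]}
g = make_g(centroids)

def S(instance):
    return g(f(instance))  # S = g composed with f
```
      </preformat>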
      <p>Considering the modeling of algorithm selectors in Figure
1, an algorithm selector structure is proposed. It is composed
of a training phase in which the features of a set of instances
are processed to generate a dataset, and the instances are
solved using three solvers: GECODE, CPLEX, and GUROBI.
Once the instances have been solved, the extracted features
are associated with the best algorithm that solved each
instance, and different machine learning models are trained
in order to determine the one that obtains the best accuracy,
which is then used to recommend solvers for future instances.</p>
      <p>The following subsections detail each of the training
processes.</p>
      <sec id="sec-2-1">
        <title>3.1. Feature processing</title>
        <p>For each instance, we extract the typical characteristics of a
JSP (Job Shop Scheduling) problem, such as the number of
jobs |J|, the number of machines |M|, the type of release-date
and due-date constraint (RDDD), and the number of
speeds |S|. Additionally, we extract other features that are
obtained in a less direct manner and aim to be as
informative as possible about the complexity of the instance they
represent. The extra features extracted are the following
(Equations 16 to 26).</p>
        <p>The maximum processing time (Equation 16) represents the longest
time required to complete any single operation within the
job set. The mean processing time (Equation 17) gives
the average duration of the operations, providing an
overall sense of the job length. The minimum processing time
(Equation 18) shows the shortest time needed for any
operation, indicating the fastest job segment:
max_{j, t, m, s} p_jtms (16)
mean_{j, t, m, s} p_jtms (17)
min_{j, t, m, s} p_jtms (18)</p>
        <p>The maximum energy consumption (Equation 19)
indicates the highest energy required for any single operation.
The mean energy consumption (Equation 20) provides the
average energy used across all operations, reflecting the
overall energy profile. The minimum energy consumption
(Equation 21) shows the lowest energy usage for an
operation, highlighting the least energy-intensive job segment:
max_{j, t, s} e_jts (19)
mean_{j, t, s} e_jts (20)
min_{j, t, s} e_jts (21)</p>
        <p>The sums of the maximum (Equation 22) and minimum
(Equation 23) processing times, and of the maximum
(Equation 24) and minimum (Equation 25) energy consumption,
are obtained by adding, for each operation, its maximum and
minimum processing time and energy consumption, respectively.
These values provide insights into the total requirements of
the job set under extreme conditions:
Σ_{j ∈ J} Σ_{t ∈ T_j} max_{m, s} p_jtms (22)
Σ_{j ∈ J} Σ_{t ∈ T_j} min_{m, s} p_jtms (23)
Σ_{j ∈ J} Σ_{t ∈ T_j} max_{s} e_jts (24)
Σ_{j ∈ J} Σ_{t ∈ T_j} min_{s} e_jts (25)</p>
        <p>Maximum tardiness (Equation 26) represents the
maximum possible delay in a solution. If there are no release
or due date constraints (RDDD = 0), it is set to −1.
Otherwise, it sums the maximum processing times, indicating the
worst-case delay scenario:
T_max = −1 if RDDD = 0; T_max = Σ_{j ∈ J} Σ_{t ∈ T_j} max_{m, s} p_jtms otherwise (26)</p>
        <p>TW = −1 if RDDD = 0;
TW = (Σ_{j ∈ J} (d_j − r_j) / Σ_{t ∈ T_j} p̄_jt) / |J| if release and due dates are given per job;
TW = (Σ_{j ∈ J} Σ_{t ∈ T_j} (d_jt − r_jt) / p̄_jt) / (|J| · |T|) if they are given per operation,
where p̄ denotes the mean processing time of a task (27)</p>
        <p>Time-Window (Equation 27) represents the number of
times a job or operation can be performed within its time
window. This metric varies based on the type of release and
due date constraints: no constraints, job-level constraints, or
operation-level constraints, indicating flexibility in
scheduling.</p>
        <p>Overlap (Equation 28) represents the degree of overlap
between the time windows of the jobs or operations. This
metric assesses how much the scheduling windows for different
jobs or operations coincide, which impacts the complexity
and difficulty of scheduling.</p>
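        <p>A minimal sketch of the feature extraction of Equations 16 to 21 and 26, assuming a nested-list layout of per-task processing-time and energy options; the layout and values are our own, not the paper's code.</p>
        <preformat>
```python
# proc[j][t] lists the processing-time options (machine/speed choices)
# of task t of job j; energy[j][t] lists its energy options (speeds).
proc = [[[4.0, 6.0], [2.0, 3.0]],
        [[5.0, 8.0], [1.0, 2.0]]]
energy = [[[2.0, 4.0], [1.0, 3.0]],
          [[3.0, 5.0], [1.0, 1.5]]]

def flatten(table):
    return [v for job in table for task in job for v in task]

def basic_features(proc, energy, rddd):
    """Compute the max/mean/min features (Eq. 16-21) and the maximum
    tardiness bound (Eq. 26)."""
    p = flatten(proc)
    e = flatten(energy)
    feats = {
        "p_max": max(p), "p_mean": sum(p) / len(p), "p_min": min(p),
        "e_max": max(e), "e_mean": sum(e) / len(e), "e_min": min(e),
    }
    # Eq. 26: -1 without release/due dates, else the sum of each task's
    # worst-case processing time.
    if rddd == 0:
        feats["t_max"] = -1.0
    else:
        feats["t_max"] = sum(max(task) for job in proc for task in job)
    return feats
```
        </preformat>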
      </sec>
      <sec id="sec-2-2">
        <title>3.2. Machine Learning Models</title>
        <p>Once the instances have been vectorized and solved, a
tabular data set is constructed with the characteristics of each
instance and the solver that has found the best solution,
that is, the one that has obtained the lowest value of the
objective function.</p>
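        <p>Labeling each instance with the best solver then reduces to an argmin over the solvers' mono-objective values; a minimal sketch with invented values:</p>
        <preformat>
```python
# Label each instance with the solver that achieved the lowest value of
# the mono-objective function (illustrative values, not real results).
results = {
    "inst_001": {"GUROBI": 1.20, "CPLEX": 1.35, "GECODE": 1.50},
    "inst_002": {"GUROBI": 2.10, "CPLEX": 2.05, "GECODE": 1.90},
}

def best_solver(objective_by_solver):
    # min over the solver names, keyed by their objective value
    return min(objective_by_solver, key=objective_by_solver.get)

labels = {name: best_solver(scores) for name, scores in results.items()}
```
        </preformat>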
        <p>This dataset has been separated into two subsets, a
training subset with a size of 80% and a test subset with the
remaining 20%. In addition, it has been ensured that the
same number in proportion of instances exists in the two
subsets.</p>
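        <p>The stratified 80/20 split described above can be sketched in pure Python; scikit-learn's stratified splitting would be a natural alternative, and this sketch is ours, not the paper's code.</p>
        <preformat>
```python
import random

def stratified_split(items, labels, test_ratio=0.2, seed=0):
    """80/20 split that keeps the per-class proportions, mirroring the
    stratification described in the text."""
    rng = random.Random(seed)
    by_class = {}
    for item, label in zip(items, labels):
        by_class.setdefault(label, []).append(item)
    train, test = [], []
    for label, members in by_class.items():
        members = members[:]
        rng.shuffle(members)
        n_test = round(len(members) * test_ratio)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test
```
        </preformat>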
        <p>The training dataset has been used to validate diferent
models using five-fold cross-validation. The validated
models are the following:
• Logistic Regression: This is a statistical method for
analyzing a dataset in which one or more
independent variables determine an outcome. The outcome
is measured with a dichotomous variable (i.e., two
possible outcomes). Logistic regression is
particularly useful for binary classification problems and
provides insights into the relationships between the
variables and the probability of the outcomes.
• Decision Tree: This is a decision support tool that
uses a tree-like model of decisions and their
possible consequences, including chance event outcomes,
resource costs, and utility. A decision tree is built by
splitting the dataset into subsets based on the value
of input features, with the goal of making the most
informative splits. This method is easy to interpret
and visualize, making it useful for understanding
the structure of the data.
• Gaussian Naive Bayes: This is a probabilistic
classifier based on Bayes’ theorem, with the assumption
that the features are independent given the class
label and that they follow a Gaussian distribution.
Despite its simplicity, Gaussian Naive Bayes can
perform well in various situations, especially when the
assumption of independence roughly holds true.
• K-Nearest Neighbors (KNN): This is a
nonparametric method used for classification and
regression. For classification, the input consists of
the k closest training examples in the feature space,
and the output is a class membership. The object
is assigned to the class most common among its k
nearest neighbors.
• Random Forest: This is an ensemble learning method
for classification and regression that constructs
multiple decision trees during training and outputs the
mode of the classes (classification) or mean
prediction (regression) of the individual trees. Random
forests improve the predictive accuracy and control
over-fitting by averaging multiple trees, reducing
the model’s variance.
• XGBoost [14]: This is an optimized distributed
gradient boosting library designed to be highly efficient,
flexible, and portable. It implements machine
learning algorithms under the gradient boosting
framework, which builds models in a stage-wise fashion
and generalizes them by optimizing for a
differentiable loss function. XGBoost is known for its speed
and performance, making it a popular choice for
structured/tabular data.
• Multi-Layer Perceptron (MLP): This is a class of
feedforward artificial neural networks that consist of at
least three layers of nodes: an input layer, a
hidden layer, and an output layer. Except for the input
nodes, each node (or neuron) uses a nonlinear
activation function. MLPs are capable of learning complex
mappings from inputs to outputs and are trained
using backpropagation.</p>
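        <p>The five-fold cross-validation protocol used to compare the models can be sketched as follows, with a trivial majority-class model standing in for the real classifiers:</p>
        <preformat>
```python
def cross_validate(model_factory, X, y, k=5):
    """k-fold cross-validation: train on k-1 folds, score on the
    held-out fold, and average the accuracies (a sketch of the
    protocol, not the paper's implementation)."""
    folds = [list(range(i, len(X), k)) for i in range(k)]
    accuracies = []
    for i in range(k):
        test_idx = set(folds[i])
        train = [(X[j], y[j]) for j in range(len(X)) if j not in test_idx]
        model = model_factory(train)
        hits = sum(1 for j in folds[i] if model(X[j]) == y[j])
        accuracies.append(hits / len(folds[i]))
    return sum(accuracies) / k

# A trivial majority-class "model" just to exercise the harness.
def majority_factory(train):
    votes = {}
    for _, label in train:
        votes[label] = votes.get(label, 0) + 1
    majority = max(votes, key=votes.get)
    return lambda x: majority
```
        </preformat>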
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Evaluation</title>
      <p>All experiments were conducted on a system equipped with
an Intel 3.60 GHz 12th generation Core i7 CPU and 64 GB
of RAM. The implementation was developed in Python 3.11.
Well-known solvers such as GUROBI [15], CPLEX [16], and
GECODE [17], which are interfaced through MiniZinc, were
utilized.</p>
      <p>To evaluate the quality of the solutions obtained, the
mono-objective function shown in Equation 11 is used to
compare the best solutions from the solvers. Other
important results, such as the average objective function, solving
time, optimality, satisfaction rate, and the number of
unsolved instances, are also presented.</p>
      <sec id="sec-3-1">
        <title>4.1. Instances</title>
        <p>Instance creation is one of the most important aspects of
evaluation as it allows a specific number of instances to be
configured to ensure the most comprehensive evaluation
possible, taking into account all possible combinations the
problem may encounter in real-life scenarios.</p>
        <p>The JSP benchmark used for testing is parameterized by the
number of jobs (J) and machines (M), which determine each
job’s tasks. These variables can take any natural number.
In this test set, the set {5, 10, 20, 25, 50, 100} is considered
for both J and M. The release and due date constraint (RDDD)
can take the values {0, 1, 2}, speed scaling can take the
values {1, 3, 5}, and the statistical distributions considered
are {uniform, normal, exponential}. For each configuration,
10 instances are generated with different seeds to ensure
substantial variation between them. Therefore, a total of
6 (jobs) × 6 (machines) × 3 (RDDD) × 3 (speeds) × 3 (distributions) ×
10 (seeds) = 9720 instances are obtained.</p>
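        <p>The instance count can be checked by enumerating the configuration grid; that jobs and machines draw from the same set is an assumption implied by the 6 × 6 arithmetic.</p>
        <preformat>
```python
import itertools

# Enumerate the benchmark grid described above (values from the text).
jobs = [5, 10, 20, 25, 50, 100]
machines = [5, 10, 20, 25, 50, 100]  # assumed same set as jobs
rddd = [0, 1, 2]
speeds = [1, 3, 5]
distributions = ["uniform", "normal", "exponential"]
seeds = range(10)

configs = list(itertools.product(jobs, machines, rddd, speeds,
                                 distributions, seeds))
# 6 * 6 * 3 * 3 * 3 * 10 = 9720 instances
```
        </preformat>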
        <p>The time allocated for resolving each instance depends
on the specific characteristics of the problem. An example
of this allocation is illustrated in Figure 2. In this
example, an instance with 50 jobs, 10 machines, no Release Date
or Due Date, and a single speed per machine is allocated
149 seconds. The principle is that if the maximum
allocation time for an instance is 300,000 milliseconds, each
characteristic’s impact on the allocation should be
equivalent. Therefore, each characteristic contributes at most
300,000/4 = 75,000 milliseconds. In this manner, for the
given example, if the Release Date and Due Date are
assigned at the operation level (RDDD = 2), they contribute
75,000 milliseconds. If they are absent (RDDD = 0), they
contribute 50 milliseconds. When assigned at the job level,
the contribution is determined by exponential interpolation
between these two cases.</p>
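        <p>A plausible reading of the RDDD contribution to the time budget interpolates geometrically between the 50 ms and 75,000 ms endpoints given above; the exact interpolation formula is our assumption, not taken from the paper.</p>
        <preformat>
```python
def rddd_budget_ms(rddd_level):
    """Contribution of the release/due-date characteristic to the time
    budget: 50 ms at level 0, 75000 ms at level 2, and a geometric
    ("exponential") interpolation at the job level 1. The formula is
    our reading of the text, not a published one."""
    low, high = 50.0, 75000.0
    return low * (high / low) ** (rddd_level / 2.0)
```
        </preformat>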
      </sec>
      <sec id="sec-3-2">
        <title>4.2. Results</title>
        <p>Upon defining the problem instances and setting appropriate
search time limits for each solver, the focus shifted toward
analyzing and interpreting the outcomes. This involved
evaluating the efficacy of the solvers employed, assessing
solution quality, and considering the broader implications
within the problem domain.</p>
        <p>Figure 3 illustrates the distribution of solved instances
among GUROBI, CPLEX, and GECODE across three
categories: optimal (best solution), satisfied (feasible but not
optimal), and unresolved (not solved).</p>
        <p>GUROBI emerged as the top performer overall,
solving the highest number of optimal solutions and
consistently demonstrating its ability to find acceptable solutions
even when optimal ones were infeasible. This underscores
GUROBI’s robust capability in efficiently managing a
diverse array of problem types. Conversely, GECODE excelled
in finding feasible solutions, significantly outperforming
other solvers in achieving satisfactory solutions. Moreover,
GECODE showed the fewest instances left unresolved,
highlighting its reliability in tackling complex problems without
abandoning them.</p>
        <p>In contrast, CPLEX, while proficient, faced challenges
with more complex problem instances, leading to a higher
incidence of unresolved cases. Although it achieved
reasonable numbers of optimal and satisfactory solutions, its
performance consistency was observed to be less reliable
compared to GUROBI and GECODE.</p>
        <p>Table 1 compares solution times and objective function
values from GUROBI, CPLEX, and GECODE across
different job and machine configurations. This analysis reveals
insights into each solver’s performance characteristics,
highlighting strengths and limitations in solving optimization
problems.</p>
        <p>For 5 to 20 jobs, GUROBI consistently shows shorter
solution times and competitive objective values compared to
CPLEX and GECODE. Its efficiency and precision make it
highly effective in simpler problem instances.</p>
        <p>In medium-sized scenarios (20 to 50 jobs), GUROBI
maintains an edge, particularly with fewer machines, though
CPLEX occasionally performs better in specific
configurations. GUROBI generally achieves superior objective
function values in varied problem setups.</p>
        <p>In complex cases (50 to 100 jobs), GECODE demonstrates
exceptional scalability despite encountering timeouts in
some instances. GUROBI and CPLEX struggle more often
with timeouts as problem size and complexity increase, yet
GUROBI often maintains competitive objective values.</p>
        <p>These insights underscore the importance of selecting
solvers based on problem specifics. GUROBI excels in
smaller to medium-sized instances, balancing efficiency
and high-quality solutions. CPLEX performs well in
certain medium-sized setups but faces scalability challenges.
GECODE shines in complex problems, offering robust
scalability and reliability despite occasional computational
hurdles. These findings aid practitioners in optimizing solver
choices and considering trade-offs between solution quality,
efficiency, and problem complexity.</p>
      </sec>
      <sec id="sec-3-3">
        <title>4.3. Complexity analysis</title>
        <p>Observing the results obtained by the methods used, a
relationship is observed between the parameters employed and
the complexity of the instances. This part of the study
focuses on the in-depth analysis of each parameter to observe
its contribution to the overall complexity of the instances.</p>
        <p>(Figure 4 panels: (a) by number of machines; (b) by number of jobs.)</p>
        <p>To delve deeper into the data presented in Table 1,
Figure 4 provides a general overview of the number of solved
instances. Organizing the data by the number of machines,
as shown in subfigure 4a, it is evident that as the number of
machines increases, the number of solved instances
progressively decreases except for the case of 5 machines, where
all instances are solved for all possible job configurations.
Looking at subfigure 4b, which is organized by the number
of jobs for all possible machine configurations, it can be
seen that all instances are solved for 5 and 10 jobs, but there
is a notable decrease for the rest. Focusing on the set of 20
and 25 jobs, it can be observed that there is a slight decrease
from 25 machines onwards; later, the reasons for this are
analyzed. For instances with 50 and 100 jobs, the decrease
in the number of solved instances is exponential. Only all
instances are solved for a configuration of 5 machines for
100 jobs and 5 and 10 machines for 50 jobs. This figure
illustrates how the number of jobs and machines affects the
possibility of obtaining a solution to the problem at hand.</p>
      </sec>
      <sec id="sec-3-4">
        <title>4.4. Algorithm selector results</title>
        <p>(Table 2 lists the validation accuracy of the tested models:
Logistic Regression, Gaussian Naive Bayes, Decision Tree,
K-Nearest Neighbors, Random Forest, XGBoost, and MLP.)</p>
        <p>Table 2 shows the validation results of the tested models.
As can be seen, XGBoost is the model with the best
validation accuracy. Training this model with the total training
data set and testing it with the test set finally yields an
accuracy of 84.51%. This indicates that XGBoost performs well
during the validation phase and generalizes effectively to
unseen data. The high accuracy suggests that XGBoost’s
ensemble learning approach, which combines multiple
decision trees to improve performance, is particularly
well-suited to this dataset. Moreover, the performance difference
between XGBoost and other models that also showed strong
results, such as Random Forest and MLP, supports its choice
as the final selector.</p>
        <p>The confusion matrix in Figure 5 presents the algorithm
selector’s classification results. Each cell value represents
the number of instances where the algorithm selector
predicted the algorithm in the corresponding column for a
problem best solved by the algorithm in the corresponding
row. For instance, the selector correctly identified GUROBI
for 611 out of the total instances where GUROBI was the
best choice. Similarly, GECODE was correctly identified 396
times. However, there are misclassifications, such as
predicting GUROBI when CPLEX was optimal, which occurred
70 times.</p>
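        <p>Precision and recall follow directly from such a confusion matrix: precision divides each diagonal entry by its column sum, recall by its row sum. A sketch with invented counts, not those of Figure 5:</p>
        <preformat>
```python
# Rows: true best solver; columns: predicted solver (illustrative
# counts, not the ones reported in Figure 5).
solvers = ["GUROBI", "CPLEX", "GECODE"]
confusion = [
    [90, 5, 5],    # true GUROBI
    [10, 75, 15],  # true CPLEX
    [5, 10, 85],   # true GECODE
]

def precision(conf, i):
    """Fraction of column-i predictions that were correct."""
    predicted = sum(row[i] for row in conf)
    return conf[i][i] / predicted

def recall(conf, i):
    """Fraction of row-i true cases that were detected."""
    return conf[i][i] / sum(conf[i])
```
        </preformat>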
        <p>An in-depth analysis of the precision and recall metrics
provides further insights into the performance of the
algorithm selector. GECODE achieved a precision of 90.20%,
indicating that 90.20% of the instances predicted as GECODE
were correctly identified. Its recall was 86.84%, signifying
that 86.84% of the actual GECODE instances were correctly
detected. This high precision and recall demonstrate the
algorithm selector’s robustness in identifying GECODE
instances accurately.</p>
        <p>On the other hand, CPLEX showed a precision of 62.76%,
meaning that only 62.76% of the predictions for CPLEX were
accurate, and a recall of 73.75%, which indicates that 73.75%
of the actual CPLEX instances were correctly classified. The
lower precision for CPLEX suggests a higher rate of false
positives, which could imply that the algorithm selector
often misclassifies other algorithms as CPLEX.</p>
        <p>For GUROBI, the precision was 89.85%, reflecting that
89.85% of the GUROBI predictions were correct, and the
recall was 86.66%, meaning that 86.66% of the actual GUROBI
instances were identified correctly. These values indicate a
strong performance, similar to GECODE, highlighting the
selector’s efficiency in recognizing GUROBI accurately.</p>
        <p>These metrics, precision, and recall, are crucial for
evaluating the algorithm selector’s effectiveness, as they
provide a more comprehensive understanding of its
performance beyond simple accuracy. They highlight the
selector’s strengths in accurately identifying certain algorithms
while also pointing out areas where misclassification occurs,
thus providing a clear direction for further improvements.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusions</title>
      <p>This study explores the complexities of JSP, emphasizing
its NP-completeness and diverse optimization goals such as
makespan, energy consumption, and tardiness. The
problem presents significant computational challenges due to
its combinatorial nature, making timely optimal solutions
crucial in operations research and manufacturing.</p>
      <p>An innovative aspect of this research is integrating
machine learning techniques to enhance algorithm selection
for JSP instances. By extracting comprehensive features like
job and machine characteristics, release dates, and energy
requirements, models such as XGBoost and Random Forest
were effectively trained. These models accurately
recommend suitable solvers like GUROBI, CPLEX, and GECODE,
streamlining decision-making for solving diverse and
complex scheduling problems.</p>
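      <p>The selection pipeline described above, extract features from an instance and let a trained model recommend a solver, can be sketched in miniature. The features, the training pairs, and the nearest-neighbour rule standing in for the XGBoost/Random Forest models are all illustrative assumptions, not the paper’s actual data or classifier:</p>

```python
import math

# Hypothetical instance features: (num_jobs, num_machines, energy_weight),
# each paired with the solver that performed best on it (invented data).
training = [
    ((10, 5, 0.2), "GUROBI"),
    ((15, 8, 0.5), "GUROBI"),
    ((60, 20, 0.8), "GECODE"),
    ((80, 25, 0.9), "GECODE"),
    ((40, 10, 0.4), "CPLEX"),
]

def select_solver(features):
    """1-nearest-neighbour stand-in for the trained classifier:
    recommend the solver of the most similar known instance."""
    _, solver = min(training, key=lambda pair: math.dist(pair[0], features))
    return solver

print(select_solver((70, 22, 0.85)))
```

      <p>In the study itself, this per-instance prediction replaces running every solver, which is what streamlines decision-making for large, heterogeneous instance sets.</p>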
      <p>GUROBI proved particularly eficient for smaller to
medium-sized instances, consistently delivering optimal
and satisfactory solutions across different configurations.
Meanwhile, GECODE demonstrated robust scalability,
excelling in complex scenarios despite occasional
computational challenges. This analysis underscores the importance
of selecting solvers based on specific problem parameters
to optimize solution quality and computational efficiency.</p>
      <p>Looking ahead, the study suggests refining feature
extraction methodologies to enhance the algorithm selector’s
accuracy across a broader range of JSP scenarios. Advancements
in solver performance under varying constraints promise
to expand the practical utility of scheduling optimization
tools in real-world manufacturing settings, emphasizing
efficiency and sustainability.</p>
      <p>Although the results obtained are not as high as those
reported in the literature, it should be noted that the
energy-aware JSP constitutes a more complex problem compared
to the standard JSP and flexible JSP found in the literature.
Furthermore, this study uses a smaller set of features than
those used in other studies, yet the accuracy achieved is not
significantly lower.</p>
      <p>In conclusion, this work advances both academic
understanding and practical applications by integrating traditional
optimization techniques with modern machine-learning
approaches. It offers tools that can significantly benefit
research and industrial practices, addressing contemporary
challenges in operations management and manufacturing
logistics.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <article-title>A survey of job shop scheduling problem: The types and models</article-title>
          ,
          <source>Computers &amp; Operations Research</source>
          <volume>142</volume>
          (
          <year>2022</year>
          )
          <fpage>105731</fpage>
          . doi:10.1016/j.cor.2022.105731.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D. B. M. M.</given-names>
            <surname>Fontes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Homayouni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Gonçalves</surname>
          </string-name>
          ,
          <article-title>A hybrid particle swarm optimization and simulated annealing algorithm for the job shop scheduling problem with transport resources</article-title>
          ,
          <source>European Journal of Operational Research</source>
          <volume>306</volume>
          (
          <year>2023</year>
          )
          <fpage>1140</fpage>
          -
          <lpage>1157</lpage>
          . doi:10.1016/j.ejor.2022.09.006.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Torres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Barber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Salido</surname>
          </string-name>
          ,
          <article-title>PSPLIB-ENERGY: an extension of the PSPLIB library to assess the energy optimization in the RCPSP</article-title>
          ,
          <source>Inteligencia Artificial</source>
          <volume>17</volume>
          (
          <year>2014</year>
          )
          <fpage>48</fpage>
          -
          <lpage>61</lpage>
          . doi:10.4114/intartif.vol17iss54pp48-61.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M. M. S.</given-names>
            <surname>El-Kholany</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gebser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Schekotihin</surname>
          </string-name>
          ,
          <article-title>Problem decomposition and multi-shot ASP solving for job-shop scheduling</article-title>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Salido</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gurrea</surname>
          </string-name>
          ,
          <article-title>A metaheuristic search technique for solving the warehouse stock management problem and the routing problem in a real company</article-title>
          ,
          <source>in: Artificial Intelligence XXXVII</source>
          , Springer International Publishing, Cham,
          <year>2020</year>
          , pp.
          <fpage>187</fpage>
          -
          <lpage>201</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Giret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Salido</surname>
          </string-name>
          ,
          <article-title>Multi-objective optimization for energy-efficient flexible job shop scheduling problem with transportation constraints</article-title>
          ,
          <source>Robotics and Computer-Integrated Manufacturing</source>
          <volume>59</volume>
          (
          <year>2019</year>
          )
          <fpage>143</fpage>
          -
          <lpage>157</lpage>
          . doi:10.1016/j.rcim.2019.04.006.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Climent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Nicoló</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Arbelaez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Salido</surname>
          </string-name>
          ,
          <article-title>A hybrid metaheuristic with learning for a real supply chain scheduling problem</article-title>
          ,
          <source>Engineering Applications of Artificial Intelligence</source>
          <volume>126</volume>
          (
          <year>2023</year>
          )
          <fpage>107188</fpage>
          . doi:10.1016/j.engappai.2023.107188.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Salido</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Escamilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Giret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Barber</surname>
          </string-name>
          ,
          <article-title>A genetic algorithm for energy-efficiency in job-shop scheduling</article-title>
          ,
          <source>Int J Adv Manuf Technol</source>
          <volume>85</volume>
          (
          <year>2016</year>
          )
          <fpage>1303</fpage>
          -
          <lpage>1314</lpage>
          . doi:10.1007/s00170-015-7987-0.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Jyothi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. B.</given-names>
            <surname>Dubey</surname>
          </string-name>
          ,
          Minimizing non-processing en-
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>