<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Resource allocation with cooperative agents</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stefania Costantini</string-name>
          <email>stefania.costantini@univaq.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni De Gasperis</string-name>
          <email>giovanni.degasperis@univaq.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pasquale De Meo</string-name>
          <email>pdemeo@unime.it</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Gullo</string-name>
          <email>francesco.gullo@univaq.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Provetti</string-name>
          <email>a.provetti@bbk.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Birkbeck, University of London, UK, and University of Milan</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of L'Aquila</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Messina</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>We study the problem of cooperative resource allocation in multi-agent systems, focusing on scenarios such as hospital networks. In our model, agents (e.g., hospitals) redistribute limited resources, such as medical personnel, in a way that satisfies both local constraints and global equity objectives. We devise ad-hoc optimization strategies for a static scenario, where resource needs are fixed over time. We empirically evaluate the proposed approaches through a set of experiments. Our results demonstrate that our approaches are highly effective.</p>
      </abstract>
      <kwd-group>
        <kwd>Multi-agent systems</kwd>
        <kwd>Cooperating agents</kwd>
        <kwd>Resource allocation</kwd>
        <kwd>Reinforcement learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The fair and efficient allocation of resources in decentralised environments has long been a fundamental
challenge in Artificial Intelligence and Economics [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In domains where autonomous agents pursue
common goals, such as healthcare networks (e.g., the British National Health Service) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], wireless
networks [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], or cloud computing networks [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], the ability to enable cooperation without centralised
authority is of both theoretical and practical importance.
      </p>
      <p>
        In this work, we consider a network of agents managing local resources to achieve individual
objectives while cooperating toward a collective goal through resource exchanges (e.g., lending). This
paradigm aligns with the concept of fairness, which ensures equitable outcomes while maintaining
system functionality. Our motivating example is an idealized model of the NHS healthcare network:
hospitals manage their physician rosters but may temporarily lend doctors to others during localized
emergencies (e.g., outbreaks of transmissible diseases). This scenario shares similarities with the
interbanking scenario [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ], though healthcare systems differ in their universal objective of avoiding hospital
failures, unlike banking systems where central banks underwrite systemic stability.
      </p>
      <p>
        Several approaches to multi-agent resource allocation (MARA) have been proposed so far,
including Distributed Constraint Optimization [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], Social Choice Theory [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and Market-Based
Coordination [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Among these, Nash Welfare Optimization (NWO) stands out as a principled method.
NWO models societal welfare as the geometric mean of individual utilities and balances efficiency and
fairness by prioritizing improvements for agents with lower allocations. It is therefore no surprise
that NWO has been widely studied for its theoretical guarantees and practical effectiveness [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ].
      </p>
      <p>However, we identify three critical limitations in its application to healthcare networks, namely:
(a) Minimum vs. Target Staffing: while NWO ensures that hospitals meet baseline staffing thresholds
(that is, each hospital keeps at least the minimum number of doctors needed for it to function), it ignores
aspirational targets (that is, the number of doctors each hospital would like to have) needed for optimal
service delivery. In some cases, it is appropriate to concentrate more resources in highly specialised
medical centres (such as hospitals specialising in rare diseases or involved in innovative clinical trials),
even if this may lead to slight inequalities in staff distribution. NWO does not meet this requirement.</p>
      <p>
        (b) Constraint Handling: NWO lacks explicit mechanisms to enforce global constraints, such as
a constant total number of doctors or hard lower bounds on minimum staffing thresholds.
Specifically, the constraint on the overall size of the labour force is a direct consequence of public
expenditure management policies, which in some countries impose a freeze on recruiting new staff [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>(c) Dynamic Adaptability: in emergencies (e.g., pandemics), staffing needs fluctuate rapidly. NWO (like
any other method based on an optimisation algorithm) requires replanning the allocation of
doctors from scratch after each change, which is computationally prohibitive for large hospital
networks. Moreover, the efficacy of a reallocation is questionable if a new distribution of the workforce
becomes necessary after only a brief period.</p>
      <p>In this work, we propose novel approaches that address limitations (a)-(c). We focus on a static
scenario, i.e., we assume that the staffing needs of each hospital are fixed over time. We defer the
study of a dynamic scenario (in which the demand for staff varies over time) to future work. Our main
contributions can be summarised as follows.</p>
      <p>
        1. We introduce a new optimisation problem where the objective function is the sum of the squares
of the differences between the number of doctors actually assigned to a hospital and the
corresponding target. Our problem can be formulated as a quadratic programming (QP) problem [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ],
which can be solved with very accurate and efficient solvers.
2. In defining our QP problem, we explicitly introduce constraints on the minimum number of
doctors to be allocated to each hospital, together with invariance of the total number of doctors in
the system. We also reformulate the NWO method to incorporate these constraints and obtain
a convex optimisation problem for which efficient solvers are available [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. We compare the QP
method with NWO and with a reallocation method called Progressive Taxation, which simulates a
tax system in which the wealthiest individuals donate some of their resources to increase social
justice [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. We also devise a hybrid formulation that combines the objective functions of the
QP and NWO problems. Experimental results show that the QP model excels in reducing the
difference between the number of doctors assigned to a hospital and the target, while the NWO
model is more effective in ensuring a fair distribution of doctors, where fairness is measured
by means of the Gini Index [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] of the staff sizes across all hospitals. We provide an
empirical evaluation whose results attest to the effectiveness of the proposed approaches, with
QP and NWO mostly prevailing over Progressive Taxation.
      </p>
      <p>While our approach is grounded in healthcare, its principles generalize to other domains requiring
cooperative resource redistribution.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Multi-agent Resource Allocation: Proposed Approaches</title>
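      <p>Let us begin with an idealised multi-agent resource allocation framework that models an NHS-style
healthcare domain. We are given a set of <inline-formula><tex-math>n</tex-math></inline-formula> hospitals, denoted as
<inline-formula><tex-math>\mathcal{H} = \{\vec{h}_1, \vec{h}_2, \ldots, \vec{h}_n\}</tex-math></inline-formula>, where each
hospital <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> has three dimensions: the current number of doctors available, <inline-formula><tex-math>c_i</tex-math></inline-formula>, the target roster, <inline-formula><tex-math>t_i</tex-math></inline-formula>, and
the minimum number of doctors needed for the hospital to operate, <inline-formula><tex-math>m_i</tex-math></inline-formula>. In a static scenario, the total
number of doctors available in <inline-formula><tex-math>\mathcal{H}</tex-math></inline-formula> is fixed, i.e., <inline-formula><tex-math>\sum_{i=1}^{n} c_i = N</tex-math></inline-formula>.</p>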
      <p>Clearly, if some hospital <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> has critically low levels of staff, i.e., <inline-formula><tex-math>c_i \leq m_i</tex-math></inline-formula>, the best option is to recruit
more doctors. Yet, this may not be possible, even for relatively long interim periods. Thus, we consider
transferring doctors from other hospitals to improve the overall efficiency of the hospital system <inline-formula><tex-math>\mathcal{H}</tex-math></inline-formula>. The
idea is that ‘wealthy’ hospitals (those operating at or near full roster) could lend doctors to <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula>. This
practice is common in NHS-style health systems, e.g., in summer, when population density in tourist
areas changes sharply. Similarly, emergency situations, such as the outbreak of epidemics and
their containment in the geographical areas where they have occurred, may require the emergency
re-assignment of medical staff to cope with the spike in hospital admissions.</p>
      <p>
        The goal now is to model the emergency re-distribution scenario, so as to compute solutions that
are (near-)optimal w.r.t. the global objective of maintaining all hospitals viable and operating, while
respecting all constraints on the capacity of individual hospitals. In the remainder of this section, we
describe the methods we devise for achieving this goal. In particular, Section 2.1 presents a Quadratic
Programming [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] formulation. We then describe two approaches to redistributing resources that have
been widely studied in economic theory, known as Progressive Taxation (Section 2.2) and Nash Welfare
Optimisation (Section 2.3), along with a hybrid approach that combines the QP and NWO objectives.
      </p>
      <sec id="sec-2-1">
        <title>2.1. Quadratic Programming (QP)</title>
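        <p>We introduce the notion of inter-hospital transfer matrix (for short, transfer matrix):</p>
        <p>Definition 1 (Transfer Matrix). Given a set <inline-formula><tex-math>\mathcal{H} = \{\vec{h}_1, \vec{h}_2, \ldots, \vec{h}_n\}</tex-math></inline-formula> of hospitals, the transfer matrix
<inline-formula><tex-math>\mathbf{X} \in \mathbb{R}^{n \times n}</tex-math></inline-formula> is a matrix whose entries <inline-formula><tex-math>x_{ij}</tex-math></inline-formula> quantify the flow of staff from <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> to <inline-formula><tex-math>\vec{h}_j</tex-math></inline-formula>. If <inline-formula><tex-math>x_{ij} &gt; 0</tex-math></inline-formula>, then <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> has
lent staff to <inline-formula><tex-math>\vec{h}_j</tex-math></inline-formula>. If <inline-formula><tex-math>x_{ij} = 0</tex-math></inline-formula>, then there is no flow of doctors from <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> to <inline-formula><tex-math>\vec{h}_j</tex-math></inline-formula>. If <inline-formula><tex-math>x_{ij} &lt; 0</tex-math></inline-formula>, then <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> has received
doctors from hospital <inline-formula><tex-math>\vec{h}_j</tex-math></inline-formula>.</p>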
        <p>Note that Definition 1 assumes that the lease of doctors between hospitals may be fractional. This
choice has important computational implications, as it enables effective and efficient algorithms. It
also offers a higher level of flexibility, in the sense that decision-makers receive suggestions on
how to redistribute the available workforce but retain some leeway in the final choice.</p>
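        <p>We can now define <inline-formula><tex-math>y_i</tex-math></inline-formula>, the staff level at <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> after redistribution, as the initial staff count at
<inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> plus the number of incoming doctors and minus the number of those transferred elsewhere:
<disp-formula><label>(1)</label><tex-math>y_i = c_i + \sum_{j=1}^{n} x_{ji} - \sum_{j=1}^{n} x_{ij}</tex-math></disp-formula>
We collect all coefficients <inline-formula><tex-math>y_i</tex-math></inline-formula> into a vector <inline-formula><tex-math>\mathbf{y} = [y_1, y_2, \ldots, y_n]</tex-math></inline-formula>. The following constraints apply:</p>
        <p>a) minimum satisfaction of individual hospital requests: after the redistribution of doctors, each
hospital <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> must have a number of doctors at least equal to its minimum <inline-formula><tex-math>m_i</tex-math></inline-formula>, i.e., <inline-formula><tex-math>y_i \geq m_i</tex-math></inline-formula>;</p>
        <p>b) retention of staff: the total number of doctors working across all hospitals after the redistribution must be
equal to the total number of doctors initially on duty, since no doctors have been hired or fired: <inline-formula><tex-math>\sum_{i=1}^{n} y_i = N</tex-math></inline-formula>.</p>
        <p>Now, a solution exists whenever the total number of available staff, <inline-formula><tex-math>N</tex-math></inline-formula>, is sufficient to cover the aggregated
minimal demands from hospitals, i.e., <inline-formula><tex-math>\sum_{i=1}^{n} m_i \leq N</tex-math></inline-formula>.</p>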
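        <p>The quantity <inline-formula><tex-math>y_i - t_i</tex-math></inline-formula> expresses the difference between the number of doctors on duty in the hospital
<inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> after the redistribution and its target number of doctors <inline-formula><tex-math>t_i</tex-math></inline-formula>. A key desideratum for staff reallocation
consists in making <inline-formula><tex-math>y_i - t_i</tex-math></inline-formula> as small as possible in absolute value (ideally, no staff should be asked to move to a new
workplace). In practice, we may have the following two sub-optimal scenarios. First, <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> may have
fewer doctors than its target (and therefore the difference <inline-formula><tex-math>y_i - t_i</tex-math></inline-formula> is negative). Second, <inline-formula><tex-math>\vec{h}_i</tex-math></inline-formula> may have
more doctors than it actually wants (and, thus, the difference <inline-formula><tex-math>y_i - t_i</tex-math></inline-formula> is positive). To penalise both kinds of
error in the same way, we take the square of <inline-formula><tex-math>y_i - t_i</tex-math></inline-formula> as an estimate of the error and we compute the sum of the
errors across all the hospitals as the objective function to be minimized. This results in the following
quadratic programming problem:
<disp-formula><label>(2)</label><tex-math>\min_{\mathbf{y}} \Big\{ \sum_{i=1}^{n} (y_i - t_i)^2 \Big\} \quad \text{s.t.} \quad \sum_{i=1}^{n} y_i = N \ \text{ and } \ y_i \geq m_i \ \ \forall i \in \{1, \ldots, n\}</tex-math></disp-formula></p>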
        <p>The above optimisation problem is a convex quadratic programming one. As such, it can be solved
by off-the-shelf solvers<fn><p>E.g., the qpsolvers unified Python module for quadratic programming: https://pypi.org/project/qpsolvers/.</p></fn>.</p>
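        <p>As a minimal sketch of problem (2), and not the solver used in the experiments, the code below exploits the problem's structure instead of calling a QP library. Writing the post-redistribution staff level as y_i, the target as t_i, the minimum as m_i and the total staff as N, the KKT conditions give y_i = max(m_i, t_i + lam) for a single scalar lam, which can be found by bisection so that the staff levels sum to N. The function name <monospace>solve_qp</monospace>, its arguments and the sample data are illustrative assumptions.</p>
        <preformat>
```python
def solve_qp(t, m, N, iters=200):
    """Minimise sum((y_i - t_i)**2) subject to sum(y) == N and y_i >= m_i.

    Illustrative sketch: uses the KKT form y_i = max(m_i, t_i + lam)
    and finds the scalar lam by bisection. Assumes m_i >= 0.
    """
    if sum(m) > N:
        raise ValueError("infeasible: minimum requirements exceed N")

    def total(lam):
        # Total staff implied by a shift lam; non-decreasing in lam.
        return sum(max(mi, ti + lam) for ti, mi in zip(t, m))

    # Bracket lam: at lo every hospital sits at its minimum (total is at
    # most N); at hi every hospital holds at least N doctors.
    lo = min(mi - ti for ti, mi in zip(t, m))
    hi = max(N - ti for ti in t)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if total(mid) > N:
            hi = mid
        else:
            lo = mid
    lam = (lo + hi) / 2.0
    return [max(mi, ti + lam) for ti, mi in zip(t, m)]


# Hypothetical instance: three hospitals, 45 doctors in total.
allocation = solve_qp(t=[10, 20, 30], m=[8, 5, 5], N=45)
```
        </preformat>
        <p>In this hypothetical instance the targets add up to 60 while only 45 doctors are available, so every hospital ends below its target and the first one is held at its minimum. For problems with additional constraints, a general-purpose QP solver would be the more natural choice.</p>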
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Progressive Taxation</title>
        <p>
          The progressive taxation model of [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] is a fiscal framework wherein the effective tax rate escalates
commensurately with increases in the taxable base (e.g., income or asset valuation). This mechanism
facilitates wealth redistribution and mitigates the gap between rich and poor individuals in society;
consequently, progressive taxation aligns with the principle of vertical equity: those with greater
resources have a greater capacity to contribute to government funding.
        </p>
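        <p>We formalise progressive taxation in a multi-agent model with the following redistribution rule.
Consider the top and bottom deciles:
<list list-type="bullet">
<list-item><p><inline-formula><tex-math>\mathcal{R}</tex-math></inline-formula> (Resource-abundant): the top 10% of hospitals by staff size <inline-formula><tex-math>c_i</tex-math></inline-formula>;</p></list-item>
<list-item><p><inline-formula><tex-math>\mathcal{P}</tex-math></inline-formula> (Resource-constrained): the bottom 10% of hospitals by staff size <inline-formula><tex-math>c_i</tex-math></inline-formula>.</p></list-item>
</list></p>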
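        <p>The progressive-taxation-based reallocation procedure iteratively applies the equations below
(where the “<inline-formula><tex-math>(k)</tex-math></inline-formula>” superscript denotes the value at iteration <inline-formula><tex-math>k</tex-math></inline-formula>).</p>
        <p>Taxation Phase:
<disp-formula><label>(3)</label><tex-math><![CDATA[T_{ij}^{(k)} = \begin{cases} -\dfrac{0.1\, c_i^{(k)}}{|\mathcal{P}|} & \text{if } i \in \mathcal{R},\ j \in \mathcal{P} \\ 0 & \text{otherwise} \end{cases}]]></tex-math></disp-formula></p>
        <p>Allocation Update:
<disp-formula><label>(4)</label><tex-math><![CDATA[\Delta c_i^{(k)} = \begin{cases} \sum_{j \in \mathcal{P}} T_{ij}^{(k)} = -0.1\, c_i^{(k)} & \text{if } i \in \mathcal{R} \\ -\sum_{j \in \mathcal{R}} T_{ji}^{(k)} = \dfrac{1}{|\mathcal{P}|} \sum_{j \in \mathcal{R}} 0.1\, c_j^{(k)} & \text{if } i \in \mathcal{P} \\ 0 & \text{otherwise} \end{cases}]]></tex-math></disp-formula>
<disp-formula><label>(5)</label><tex-math>c_i^{(k+1)} = \max \Big\{ m_i,\ c_i^{(k)} + \Delta c_i^{(k)} \Big\}</tex-math></disp-formula></p>
        <p>The resulting entries <inline-formula><tex-math>x_{ij}^{(k+1)}</tex-math></inline-formula> of the transfer matrix at the <inline-formula><tex-math>(k+1)</tex-math></inline-formula>-th iteration are equal to <inline-formula><tex-math>T_{ij}^{(k)}</tex-math></inline-formula> (resp.
<inline-formula><tex-math>-T_{ij}^{(k)}</tex-math></inline-formula>) only if <inline-formula><tex-math>i</tex-math></inline-formula> (resp. <inline-formula><tex-math>j</tex-math></inline-formula>) belongs to <inline-formula><tex-math>\mathcal{R}</tex-math></inline-formula>, <inline-formula><tex-math>j</tex-math></inline-formula> (resp. <inline-formula><tex-math>i</tex-math></inline-formula>) belongs to <inline-formula><tex-math>\mathcal{P}</tex-math></inline-formula>, and
<inline-formula><tex-math>c_i^{(k)} + \Delta c_i^{(k)}</tex-math></inline-formula> (resp. <inline-formula><tex-math>c_j^{(k)} + \Delta c_j^{(k)}</tex-math></inline-formula>) is no
less than <inline-formula><tex-math>m_i</tex-math></inline-formula> (resp. <inline-formula><tex-math>m_j</tex-math></inline-formula>). In all other cases, <inline-formula><tex-math>x_{ij}^{(k+1)} = 0</tex-math></inline-formula>.</p>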
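        <p>The rationale of the above equations is as follows. Hospitals in class <inline-formula><tex-math>\mathcal{R}</tex-math></inline-formula> donate 10% of their staff to
those in <inline-formula><tex-math>\mathcal{P}</tex-math></inline-formula>. Thus, 10% of the total staff in <inline-formula><tex-math>\mathcal{R}</tex-math></inline-formula> is equally distributed among hospitals in <inline-formula><tex-math>\mathcal{P}</tex-math></inline-formula>. Conversely,
hospitals in the middle deciles experience no change in staffing.</p>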
        <p>The above equations lead to a solution that satisfies the constraints of minimum satisfaction of individual
hospital requests and staff retention (see Section 2.1). The final solution is found either by running
the reallocation for a given number of iterations or by setting convergence bounds (although no formal
convergence is guaranteed in the general case).</p>
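        <p>One iteration of the redistribution rule described above can be sketched as follows. This is an illustrative reading of the taxation and allocation-update steps, not the authors' implementation: the function name <monospace>taxation_step</monospace>, the decile size computed as max(1, n // 10), and the tie-breaking by list position are all assumptions made here.</p>
        <preformat>
```python
def taxation_step(c, m, rate=0.1):
    """Apply one progressive-taxation iteration to staff levels c.

    c: current staff level per hospital; m: minimum staff per hospital.
    Hospitals in the top decile by staff size each give up `rate` of
    their staff; the pooled amount is split equally among the bottom decile.
    Assumes at least two hospitals, so the two groups do not overlap.
    """
    n = len(c)
    k = max(1, n // 10)                       # assumed decile size
    order = sorted(range(n), key=lambda i: c[i])
    poor = order[:k]                          # resource-constrained group
    rich = order[-k:]                         # resource-abundant group

    pool = sum(rate * c[i] for i in rich)
    delta = [0.0] * n
    for i in rich:
        delta[i] = -rate * c[i]
    for i in poor:
        delta[i] = pool / len(poor)

    # Allocation update: no hospital may drop below its minimum.
    return [max(m[i], c[i] + delta[i]) for i in range(n)]


# Hypothetical instance: ten hospitals with 10, 20, ..., 100 doctors.
staff = [10 * (i + 1) for i in range(10)]
minimum = [5] * 10
after_one_step = taxation_step(staff, minimum)
```
        </preformat>
        <p>In this hypothetical instance the largest hospital passes a tenth of its staff to the smallest one and all other hospitals are unchanged. Repeating the step a fixed number of times corresponds to the stopping rule mentioned above.</p>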
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Nash Welfare Optimization (NWO)</title>
        <p>
          Nash Welfare Optimization (NWO) [
          <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
          ] is a Pareto-efficient allocation mechanism whose
solutions avoid extreme disparities in resource distribution. This makes NWO well-suited for our goal
of redistributing doctors among hospitals, as it avoids scenarios where hospitals fall below minimum
operational staffing levels (<inline-formula><tex-math>y_i &lt; m_i</tex-math></inline-formula>) or become excessively overstaffed (<inline-formula><tex-math>y_i \gg t_i</tex-math></inline-formula>).
<fn><p>The 10% threshold used in Section 2.2 is arbitrary and can be adjusted by decision makers.</p></fn>
        </p>
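        <p>NWO solves the following constrained optimization problem, where the objective function is the
product of agents’ utilities:
<disp-formula><label>(6)</label><tex-math>\max_{\mathbf{y}} \Big\{ \prod_{i=1}^{n} (y_i - m_i + 1) \Big\} \quad \text{s.t.} \quad \sum_{i=1}^{n} y_i = N \ \text{ and } \ y_i \geq m_i \ \ \forall i \in \{1, \ldots, n\}</tex-math></disp-formula></p>
        <p>Taking logarithms, the multiplicative objective in (6) can be transformed into that of a convex optimization problem:
<disp-formula><tex-math>\max_{\mathbf{y}} \Big\{ \sum_{i=1}^{n} \log (y_i - m_i + 1) \Big\},</tex-math></disp-formula>
which is computationally tractable and solvable with standard convex optimization solvers.</p>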
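        <p>As a sanity check on the formulation in (6), offered here as an observation rather than as part of the method: writing y_i for the post-redistribution staff level, m_i for the minimum and N for the total staff, the substitution z_i = y_i - m_i turns the log objective into a symmetric, strictly concave function of the z_i, so the maximiser spreads the surplus N - sum(m) equally, giving y_i = m_i + (N - sum(m)) / n. The sketch below compares this candidate with random feasible allocations; all function names and data are illustrative.</p>
        <preformat>
```python
import math
import random


def nwo_objective(y, m):
    """Log form of the Nash welfare: sum of log(y_i - m_i + 1)."""
    return sum(math.log(yi - mi + 1.0) for yi, mi in zip(y, m))


def nwo_allocation(m, N):
    """Equal split of the surplus N - sum(m) above each minimum."""
    if sum(m) > N:
        raise ValueError("infeasible: minimum requirements exceed N")
    share = (N - sum(m)) / len(m)
    return [mi + share for mi in m]


def random_feasible(m, N, rng):
    """Random allocation with y_i >= m_i and sum(y) == N (comparison only)."""
    weights = [rng.random() for _ in m]
    scale = (N - sum(m)) / sum(weights)
    return [mi + w * scale for mi, w in zip(m, weights)]


# Hypothetical instance: three hospitals, 40 doctors in total.
m = [5, 8, 12]
N = 40
best = nwo_allocation(m, N)

rng = random.Random(0)
trials = [random_feasible(m, N, rng) for _ in range(1000)]
# No random feasible allocation should score higher than the equal split.
never_beaten = all(
    nwo_objective(best, m) + 1e-12 >= nwo_objective(y, m) for y in trials
)
```
        </preformat>
        <p>Under this reading, NWO with the utility used in (6) depends only on the minimum requirements and the total staff, which is consistent with its tendency to equalise staff levels rather than to track targets.</p>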
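        <p>The choice of <inline-formula><tex-math>y_i - m_i + 1</tex-math></inline-formula> (rather than <inline-formula><tex-math>y_i - t_i</tex-math></inline-formula>) in the utility function is critical. In fact, if we used
<inline-formula><tex-math>y_i - t_i</tex-math></inline-formula>, the objective function to be maximised would be the product of terms of the type <inline-formula><tex-math>y_i - t_i</tex-math></inline-formula>, and
each of these terms must be strictly positive (if this were not the case, the utility of some agents would
be negative or equal to zero and the logarithm would make no sense). Thus, the condition <inline-formula><tex-math>y_i - t_i &gt; 0</tex-math></inline-formula>
is verified if we assume that the number of available doctors is greater than the sum of the targets <inline-formula><tex-math>t_i</tex-math></inline-formula>,
i.e. <inline-formula><tex-math>\sum_{i=1}^{n} t_i \leq N</tex-math></inline-formula>. In this case, each hospital could be assigned the target number of doctors and the
surplus could be distributed randomly. In practice, we can assume that <inline-formula><tex-math>\sum_{i=1}^{n} m_i \leq N</tex-math></inline-formula>, but in general
we expect <inline-formula><tex-math>\sum_{i=1}^{n} c_i</tex-math></inline-formula> to be considerably less than <inline-formula><tex-math>\sum_{i=1}^{n} t_i</tex-math></inline-formula>. In this case, if the NWO algorithm tried to
allocate each hospital a number of doctors equal to (or greater than) its target, the remaining doctors
would not be sufficient to meet the minimum requirements of other hospitals, with the consequence
that the constraint <inline-formula><tex-math>y_i \geq m_i</tex-math></inline-formula> would be violated for some hospitals.</p>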
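        <p>In what follows, we will also consider a hybrid optimisation strategy where the cost function is a
linear combination of the quadratic programming and NWO ones:
<disp-formula><label>(7)</label><tex-math>\min_{\mathbf{y}} \Big\{ \alpha \sum_{i=1}^{n} (y_i - t_i)^2 - (1 - \alpha) \sum_{i=1}^{n} \log (y_i - m_i + 1) \Big\} \quad \text{s.t.} \quad \sum_{i=1}^{n} y_i = N \ \text{ and } \ y_i \geq m_i \ \ \forall i \in \{1, \ldots, n\}</tex-math></disp-formula></p>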
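        <p>The objective function in (7) is convex, as it is a non-negative combination of convex functions (for <inline-formula><tex-math>\alpha \in [0, 1]</tex-math></inline-formula>); thus, efficient
solvers are available. Note that the parameter <inline-formula><tex-math>\alpha</tex-math></inline-formula> controls the relative contribution of the cost functions considered
in the quadratic-programming and NWO cases. So, as <inline-formula><tex-math>\alpha \to 0</tex-math></inline-formula> the cost function converges to
that of NWO, whereas for <inline-formula><tex-math>\alpha \to 1</tex-math></inline-formula> it converges to the QP formulation instead.</p>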
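        <p>To make the role of the mixing parameter concrete, the following sketch evaluates the hybrid cost for a fixed allocation at a few parameter values. It only evaluates the objective and does not solve the optimisation problem; the name <monospace>hybrid_objective</monospace>, the use of <monospace>alpha</monospace> for the mixing parameter, and the sample data are assumptions made for illustration.</p>
        <preformat>
```python
import math


def hybrid_objective(y, t, m, alpha):
    """Hybrid cost: alpha * squared target error - (1 - alpha) * log welfare.

    y: staff levels after redistribution, t: targets, m: minimum levels.
    Requires y_i >= m_i so that every logarithm is defined.
    """
    qp_term = sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    nwo_term = sum(math.log(yi - mi + 1.0) for yi, mi in zip(y, m))
    return alpha * qp_term - (1.0 - alpha) * nwo_term


# Hypothetical allocation for three hospitals.
t = [10, 20, 30]
m = [5, 5, 5]
y = [8, 15, 22]

for alpha in (0.0, 0.5, 1.0):
    print(alpha, hybrid_objective(y, t, m, alpha))
```
        </preformat>
        <p>At the two extremes the function reduces to the negated log welfare of the NWO problem and to the squared target error of the QP problem, respectively.</p>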
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <p>We evaluate the effectiveness of our approach through experiments designed to address how the
proposed NWO, Progressive Taxation, QP and hybrid strategies perform in practice. Specifically, we
are interested in assessing how effectively our approaches redistribute doctors to minimize inequality
while ensuring each hospital meets its staffing target.</p>
      <p>We adopt the following metrics to evaluate the performance of our approaches.</p>
      <p>1. Target Deviation (or Mean Absolute Error, MAE), which is defined as the average absolute difference
between actual staff levels and targets:
<disp-formula><label>(8)</label><tex-math>\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} | y_i - t_i |</tex-math></disp-formula>
Ideally, we would like each hospital to have as close to the target number of doctors as possible;
thus, the lower the MAE, the more effective the approach to redistributing doctors.</p>
      <p>
        2. Gini Inequality Index (<inline-formula><tex-math>G</tex-math></inline-formula>) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], computed on the staff sizes across all hospitals.
The Gini Index is one of the most commonly used inequality measures (e.g., for income inequality or
inequality in life expectancy). It ranges from 0 (perfect equality) to 1 (perfect inequality). In our
case, a <inline-formula><tex-math>G</tex-math></inline-formula> close to 1 indicates that the available doctors are concentrated in a few hospitals, which
is undesirable. Thus, ideally, low values of <inline-formula><tex-math>G</tex-math></inline-formula> should be sought.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Results</title>
        <p>Our experiments aim to compare the effectiveness of medical staff reallocation strategies in a
static scenario, i.e., where hospital staffing requirements do not change over time.</p>
        <p>To do so, we generated a large test instance made up of 150 randomly generated hospitals; i.e., for
each hospital, the minimum number of staff needed to function properly, the current number of doctors
on the roster, and the target number were assigned at random.</p>
        <p>We ran the QP, Progressive Taxation, and NWO strategies of Sec. 2 on the random instance and
computed the Gini Index and Target Deviation achieved by each method. To ensure statistical robustness,
we repeated the experiment 20 times and calculated the mean and standard deviation of both measures.
The results are shown in Table 1. We can observe that QP and NWO exhibit different behaviours. In
fact, QP tends to redistribute resources to ensure that the number of doctors actually allocated to a
hospital is as close to the target as possible. As a result, some hospitals may receive many more doctors
than others, leading to a more unequal distribution of human resources and a corresponding increase
in the Gini Index. NWO follows the opposite logic, as its redistribution aims to ensure that all hospitals
end up with a comparable number of staff, which can have a negative impact on the Target Deviation.</p>
        <p>Progressive Taxation behaves similarly to NWO and tends to smooth out inequalities. This is
witnessed by a Gini Index value that gets close to that of NWO and by a drop in Target Deviation.</p>
        <p>We then focus on the hybrid strategy and examine how the Target Deviation and the Gini Index vary
as the parameter <inline-formula><tex-math>\alpha</tex-math></inline-formula> changes. We have divided the range of variation of <inline-formula><tex-math>\alpha</tex-math></inline-formula> (i.e., the segment from zero to
one) into 35 intervals of equal size, and each division point corresponds to a value of <inline-formula><tex-math>\alpha</tex-math></inline-formula> used to compute
the objective function of the hybrid strategy. Figure 1 shows the mean and standard deviation of the Target
Deviation and Gini Index obtained by the hybrid strategy. As <inline-formula><tex-math>\alpha</tex-math></inline-formula> approaches zero, the objective function
of the hybrid strategy coincides with that of NWO, which explains why we obtain increasingly higher
values of the Target Deviation, accompanied by smaller values of the Gini Index.</p>
        <p>In the health sector, advanced therapies and innovative drugs are expensive and often require a
large and well-trained workforce; consequently, some treatments should be concentrated in a few
highly specialised hospitals, which should also have significant staffing levels. We therefore believe
that minimising the Target Deviation is at least as important as minimising the Gini Index.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Related Work</title>
      <p>
        Multiagent resource allocation (MARA) constitutes a fundamental challenge in multiagent systems,
requiring autonomous agents to distribute limited resources in ways that balance efficiency and fairness.
Such problems arise ubiquitously [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. As systems grow in scale and complexity, designing mechanisms
that reconcile individual agent incentives with collective welfare becomes increasingly critical.
      </p>
      <p>
        At its core, MARA involves agents negotiating resource distributions, often encountering dilemmas
where self-interest conflicts with group optimality. Central here is the concept of fairness, a principle
vital to both human societies and artificial systems [
        <xref ref-type="bibr" rid="ref18 ref19 ref20">18, 19, 20</xref>
        ].
      </p>
      <p>Our work can be positioned under the broad umbrella of MARA. Specifically, we propose novel
strategies for the re-allocation of resources in a multi-agent, cooperative setting.</p>
      <p>
        Centralized vs. decentralized approaches. Traditional centralized methods, such as the Hungarian
and Gale-Shapley algorithms, rely on a central authority with full system knowledge to compute optimal
allocations [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Nash Welfare Optimization (NWO) offers a principled framework for fair resource
distribution by maximizing the geometric mean of agent utilities. NWO balances efficiency and equity,
prioritizing improvements for agents with lower initial utility [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. NWO occupies a middle ground
between utilitarian welfare (maximizing total utility) and egalitarian welfare (maximizing minimum
utility) and satisfies axiomatic properties such as scale invariance, Pareto efficiency, and independence of
irrelevant alternatives [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ]. Due to these strengths, NWO has been applied successfully in collective
decision-making, project funding allocation, and fair division problems [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>In this work, we provide a novel contextualization of NWO to the multi-agent, cooperative resource
re-allocation setting, based on quadratic programming.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>Our work addresses cooperative resource re-allocation in multi-agent environments, focusing on the
important case of redistribution of hospital staff across a regional health system such as the British NHS.
We devise three principled approaches to this problem, which are based on quadratic programming (QP),
Progressive Taxation, and Nash Welfare Optimization (NWO), respectively. We conduct experiments
whose main results attest to the effectiveness of the proposed approaches. In the future, we plan to
investigate the problem in a dynamic scenario, where the demand for staff varies over time.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>Research partially supported by the PNRR Project CUP E13C24000430006 “Enhanced Network of
intelligent Agents for Building Livable Environments - ENABLE”, and by PRIN 2022 CUP E53D23007850001
Project “TrustPACTX - Design of the Hybrid Society Humans-Autonomous Systems: Architecture,
Trustworthiness, Trust, EthiCs, and EXplainability (the case of Patient Care)”.
During the preparation of this work, the author(s) used Grammarly, solely to spell check and improve
the grammar. The tool was not used to alter, generate, or influence the semantic contents of the paper.
The authors retain full responsibility for the accuracy, originality, and integrity of the entire work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Jong</surname>
          </string-name>
          , P. Stone,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taylor</surname>
          </string-name>
          ,
          <article-title>Multiagent resource allocation: A review of mechanisms and applications</article-title>
          ,
          <source>Autonomous Agents and Multi-Agent Systems</source>
          <volume>22</volume>
          (
          <year>2008</year>
          )
          <fpage>1</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Behari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Hughes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nagaraj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Tuyls</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Taneja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tambe</surname>
          </string-name>
          ,
          <article-title>Towards a pretrained model for restless bandits via multi-arm generalization</article-title>
          ,
          <source>in: Proc. of the International Joint Conference on Artificial Intelligence (IJCAI 2024)</source>
          , Jeju, South Korea,
          <year>2024</year>
          , pp.
          <fpage>321</fpage>
          -
          <lpage>329</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nallanathan</surname>
          </string-name>
          ,
          <article-title>Multi-agent reinforcement learning-based resource allocation for UAV networks</article-title>
          ,
          <source>IEEE Transactions on Wireless Communications</source>
          <volume>19</volume>
          (
          <year>2020</year>
          )
          <fpage>729</fpage>
          -
          <lpage>743</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Jiang</surname>
          </string-name>
          , T. Guo,
          <article-title>CE-NAS: an end-to-end carbon-efficient neural architecture search framework</article-title>
          ,
          <source>in: Proc. of the International Conference on Advances in Neural Information Processing Systems 38 (NIPS</source>
          <year>2024</year>
          ), Vancouver, BC, Canada,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Battiston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Puliga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kaushik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tasca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Caldarelli</surname>
          </string-name>
          ,
          <article-title>DebtRank: Too central to fail? Financial networks, the FED and systemic risk</article-title>
          ,
          <source>Scientific Reports</source>
          <volume>2</volume>
          (
          <year>2012</year>
          )
          <elocation-id>541</elocation-id>
          . URL: https://doi.org/10.1038/srep00541. doi:10.1038/srep00541.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tong</surname>
          </string-name>
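          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>de Keijzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ventre</surname>
          </string-name>
          ,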
          <article-title>Reducing systemic risk in financial networks through donations</article-title>
          ,
          <source>in: Proc. of the European Conference on Artificial Intelligence (ECAI 2024)</source>
          , volume
          <volume>392</volume>
          , IOS Press, Santiago de Compostela, Spain,
          <year>2024</year>
          , pp.
          <fpage>3405</fpage>
          -
          <lpage>3412</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
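          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>de Jong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Uyttendaele</surname>
          </string-name>
          ,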
          <string-name>
            <given-names>K.</given-names>
            <surname>Tuyls</surname>
          </string-name>
          ,
          <article-title>Learning to reach agreement in a continuous ultimatum game</article-title>
          ,
          <source>J. Artif. Intell. Res.</source>
          <volume>33</volume>
          (
          <year>2008</year>
          )
          <fpage>551</fpage>
          -
          <lpage>574</lpage>
          . URL: https://api.semanticscholar.org/CorpusID:13248455.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Online Nash social welfare maximization in multi-agent systems</article-title>
          ,
          <source>in: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI)</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nongaillard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Sohier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hilaire</surname>
          </string-name>
          ,
          <article-title>Centralized and distributed approaches for resource allocation: A comparative study</article-title>
          ,
          <source>Journal of Intelligent Manufacturing</source>
          <volume>27</volume>
          (
          <year>2016</year>
          )
          <fpage>789</fpage>
          -
          <lpage>803</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Delemazure</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Durand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mathieu</surname>
          </string-name>
          ,
          <article-title>Aggregating correlated estimations with (almost) no training</article-title>
          ,
          <source>in: Proc. of the 26th European Conference on Artificial Intelligence (ECAI 2023)</source>
          , volume
          <volume>372</volume>
          of Frontiers in Artificial Intelligence and Applications
          , IOS Press, Krakow, Poland,
          <year>2023</year>
          , pp.
          <fpage>541</fpage>
          -
          <lpage>548</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hossain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Micha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>Fair algorithms for multi-agent multi-armed bandits</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>34</volume>
          (
          <year>2021</year>
          )
          <fpage>24005</fpage>
          -
          <lpage>24017</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>I.</given-names>
            <surname>Papanicolas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Woskie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Jha</surname>
          </string-name>
          ,
          <article-title>Health care spending in the United States and other high-income countries</article-title>
          ,
          <source>JAMA</source>
          <volume>319</volume>
          (
          <year>2018</year>
          )
          <fpage>1024</fpage>
          -
          <lpage>1039</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Boyd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Vandenberghe</surname>
          </string-name>
          ,
          <source>Convex Optimization</source>
          , Cambridge University Press,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Stiglitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Rosengard</surname>
          </string-name>
          ,
          <article-title>Economics of the public sector: Fourth international student edition</article-title>
          , W. W. Norton &amp; Company,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Farris</surname>
          </string-name>
          ,
          <article-title>The Gini index and measures of inequality</article-title>
          ,
          <source>The American Mathematical Monthly</source>
          <volume>117</volume>
          (
          <year>2010</year>
          )
          <fpage>851</fpage>
          -
          <lpage>864</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Moulin</surname>
          </string-name>
          ,
          <article-title>Fair division and collective welfare</article-title>
          , MIT Press,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Nash</surname>
          </string-name>
          ,
          <article-title>The bargaining problem</article-title>
          ,
          <source>Econometrica</source>
          <volume>18</volume>
          (
          <year>1950</year>
          )
          <fpage>155</fpage>
          -
          <lpage>162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D.</given-names>
            <surname>Ceragioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Venable</surname>
          </string-name>
          ,
          <article-title>Fairness-aware distributed planning for resource allocation in multiagent systems</article-title>
          ,
          <source>in: Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI)</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Leyton-Brown</surname>
          </string-name>
          ,
          <article-title>Fairness in multi-agent systems with reinforcement learning</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>275</volume>
          (
          <year>2019</year>
          )
          <fpage>25</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Parkes</surname>
          </string-name>
          ,
          <article-title>Envy-free allocation in combinatorial auctions</article-title>
          ,
          <source>Games and Economic Behavior</source>
          <volume>70</volume>
          (
          <year>2010</year>
          )
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>