<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Paolo Pagliuca</string-name>
          <email>paolo.pagliuca@istc.cnr.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giuseppe Trivisano</string-name>
          <email>g.trivisano@alumni.uniba.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandra Vitanza</string-name>
          <email>alessandra.vitanza@istc.cnr.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Cognitive Sciences and Technologies, National Research Council (CNR-ISTC)</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Università degli Studi di Bari Aldo Moro</institution>
          ,
          <addr-line>Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Multi-Agent Systems (MASs) are characterized by multiple agents interacting to solve tasks that may be difficult, or even impossible, for a single agent. While discovering solutions to problems with a single objective might be relatively straightforward, the picture changes when coping with Multi-Objective Optimization (MOO), where problems require the simultaneous optimization of multiple objectives that potentially conflict with each other. This is particularly relevant in MASs, since each agent's behavior affects the overall system performance. For example, the capability of a system, composed of many robots, to both locomote and aggregate simultaneously requires the definition of appropriate fitness measures and the usage of suitable algorithms. In this work, we investigate the conditions necessary to promote aggregation in a robotic MAS, with a particular focus on how conflicting objectives can hinder the learning of effective behaviors. Specifically, we designed a novel fitness function and tested it in a relatively simple aggregation scenario. Furthermore, we considered a recently introduced MOO problem, in which a MAS of five robots must develop the ability to aggregate while in motion. Our outcomes show that, despite the challenges in designing effective fitness functions, the proposed formulation successfully supports aggregation in the simpler scenario and enhances aggregation capabilities in the more complex one.</p>
      </abstract>
      <kwd-group>
        <kwd>Multi-Agent Systems</kwd>
        <kwd>Multi-Objective Optimization</kwd>
        <kwd>Evolutionary Algorithms</kwd>
        <kwd>OpenAI-ES</kwd>
        <kwd>Aggregation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Multi-Agent Systems (MASs) [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1–3</xref>
        ] are characterized by the presence of multiple intelligent entities,
termed “agents”, that coexist in a shared environment and can interact, thereby influencing one another.
As a result of these local interactions, MASs are capable of developing very complex behaviors and can
solve problems that are not feasible for a single agent. In particular, a MAS requires the presence of
at least two agents (i.e., a two-agent system or dyad). Thanks to the interaction between agents, the
whole system can give rise to sophisticated behaviors. Examples of complex strategies discovered in
robotic MASs include foraging [
        <xref ref-type="bibr" rid="ref10 ref4 ref5 ref6 ref7 ref8 ref9">4–11</xref>
        ], aggregation [12–18] and predator avoidance [19–23]. A subfield
of MASs is swarm robotics [24–26], where a group of agents (e.g., robots or drones) performs tasks not
feasible for a single element. The idea is to mimic biological systems such as colonies of ants, herds of
sheep, schools of fish or flocks of birds [27, 28].
      </p>
      <p>
        Generally, the development of the agents’ skills in a MAS is often achieved through Evolutionary
Algorithms (EAs) [29], i.e., optimization methods that are inspired by biological evolution and are
capable of providing solutions to a broad set of problems, such as classic control [30–32], robot navigation
[33–35], foraging [
        <xref ref-type="bibr" rid="ref4">4, 36</xref>
        ], function optimization [37, 38], or competitive co-evolution [39, 40].
      </p>
      <p>EAs have proven to be valuable tools to cope with Multi-Objective Optimization (MOO) problems
[41–44], in which multiple conflicting objectives must be optimized simultaneously [45]. In such
scenarios, finding solutions that satisfy all objectives is very difficult. Therefore, Pareto optimality
[46] is used to identify a set of valuable solutions from which the experimenter can select the best.
MOO in a MAS is a significantly complex scenario, where multiple agents must interact in order to
optimize different, often conflicting, goals. Finding solutions to this kind of problem is far from trivial.
In [47], the authors employed OpenAI-ES (OpenAI Evolutionary Strategy) [48] to evolve
two behaviors, locomotion and aggregation, in a swarm of five Ant Pybullet robots [49]. Although the
evolved agents exhibited good locomotion capability, they failed to develop any form of aggregation.
This underscores the difficulty of simultaneously optimizing multiple objectives, especially in MASs.
Moreover, the study emphasizes that the definition of the fitness function is paramount in these kinds
of problems.</p>
      <p>Building on these insights and taking inspiration from this previous study, this work explores how
these conflicting objectives interact to shape the evolution of distinct behaviors in a MAS. To this
end, we first introduce a new fitness function specifically tailored to promote aggregation and test
it in a simple robot aggregation scenario. Next, we adapt the fitness function originally proposed
in [47] by replacing only its aggregation reward component with our new formulation. Our results
show that (a) in the simple robotic aggregation scenario, the new fitness function efficiently evolves
effective aggregation behaviors; (b) in the AntBullet Swarm scenario, the adapted function shows
improvements in aggregation, although the swarm does not yet display fully coordinated aggregation.</p>
      <p>The rest of the paper starts with an overview of related work in the fields of MASs, EAs and MOO
(Section 2), with a specific focus on aggregation. Then, the problems addressed and the experimental
settings are described (Section 3). In Section 4, we present the quantitative and qualitative outcomes of
our experiments. Finally, Section 5 contains our final remarks and possible future research directions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>
        Aggregation is a process in which individuals gather in groups for specific purposes [50], such as
protection against predators [51] or speeding up foraging [52]. This phenomenon is frequently observed
in nature, both in microorganisms (e.g., Dictyostelium discoideum [53], Capsaspora owczarzaki [54]
and Brachionus calyciflorus [55]) and in complex animals (e.g., flocks of birds [56] and schools of
fish [
        <xref ref-type="bibr" rid="ref11 ref12">57, 58</xref>
        ]). Aggregation represents one of the fundamental collective behaviors, as it gives rise to
various cooperative behaviors [18], such as coordinated movement [
        <xref ref-type="bibr" rid="ref13">59</xref>
        ], self-assembly [
        <xref ref-type="bibr" rid="ref14">60</xref>
        ] or collective
transport [
        <xref ref-type="bibr" rid="ref15">61</xref>
        ]. A specific body of research focuses on self-organized aggregation [18], in which the
individuals of the group aggregate autonomously, without any central control. This kind of aggregation
allows the development of control systems that are robust to partial failure, flexible and scalable, and can
be performed using only local interactions between individuals. In biological systems, self-organized
aggregation manifests through either positive or negative feedback mechanisms [18]: the former can
occur as an attraction force towards a given signal source, while the latter acts as a regulatory or
repulsive mechanism between individuals.
      </p>
      <p>
        The effectiveness and evolutionary importance of this behavior have prompted researchers to replicate
it in simulated environments, following the principles of swarm robotics [25, 26]. Several studies
have used evolutionary algorithms [29], and specifically evolutionary strategies [
        <xref ref-type="bibr" rid="ref16 ref17">62, 63</xref>
        ], to develop
aggregation behaviors in MASs or robot swarms. Evolutionary algorithms, which translate the principles
of natural evolution into computational procedures, are promising solutions for the automatic learning of complex
control policies that are difficult to design manually from scratch. In [
        <xref ref-type="bibr" rid="ref18">64</xref>
        ], diferent algorithms, including
CMA-ES (Covariance Matrix Adaptation Evolution Strategy) [
        <xref ref-type="bibr" rid="ref19">65</xref>
        ], GA (Genetic Algorithm) [
        <xref ref-type="bibr" rid="ref20">66</xref>
        ] and
OpenAI-ES [48], were compared on an aggregation task. The study evaluates convergence time, policy
quality, scalability and generalization for swarms of different sizes. In general, all the algorithms
proved to be effective in completing the task, with differences in the stability of the aggregated cluster
and the aggregation times, especially for smaller swarms (5, 10 and 20 robots). In [15], CMA-ES and
xNES (Exponential Natural Evolution Strategies) [
        <xref ref-type="bibr" rid="ref21">67</xref>
        ] were compared on a specific aggregation task,
highlighting the importance of communication. Similarly, a comparison of the different aggregation
behaviors evolved by CMA-ES, xNES and OpenAI-ES under different experimental conditions is proposed
in [16]. Finally, the paper [
        <xref ref-type="bibr" rid="ref22">68</xref>
        ] demonstrates the validity of an automatic approach based on evolutionary
optimization via PSO (Particle Swarm Optimization) [
        <xref ref-type="bibr" rid="ref23">69</xref>
        ] to design interpretable and scalable PFSM
(Probabilistic Finite State Machines) controllers for the fundamental task of aggregation in swarm
robotics. The use of this type of controller offers greater policy interpretability and potentially better
transferability to real-world robotic environments compared to the typically used neural network
controllers.
      </p>
      <p>
        The works discussed so far focus on the optimization of a single objective, known as Mono-Objective
Optimization. In such problems, there exists at least one optimal solution, and possibly multiple equivalent
ones. Instead, when the problem involves multiple and potentially conflicting objectives, as
previously mentioned, we speak of Multi-Objective Optimization (MOO) [45]. Compared to the previous
scenario, solving a MOO problem poses significant challenges, since the aim is to obtain the optimal
solution for all the objectives simultaneously, which is generally not feasible. In this case, the possible solutions,
called Pareto-optimal [46], are potentially infinite and lie on the Pareto front. The choice of one of these
solutions depends on the subjective preferences of a human decision maker. Besides the problem
presented in [47], another approach investigating the development of aggregation, in conjunction
with other tasks, is the one illustrated in [
        <xref ref-type="bibr" rid="ref24">70</xref>
        ], where a decentralized algorithm for robotic swarms
based on limited local interactions is introduced and applied to a MOO problem in which agents are
rewarded for their ability to aggregate and avoid obstacles. The approach, tested both in simulation and
in real-world settings, has proven to be effective in developing cohesive behavior and dynamic
obstacle avoidance.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Materials and Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Robot Aggregation problem</title>
        <p>
          The first scenario involves a group of 5 MarXbot robots [
          <xref ref-type="bibr" rid="ref25">71</xref>
          ] that are placed in a square-walled arena
of 5 × 5 m. The initial configuration of the robots is shown in Fig. 1: the central robot is located at
the center of a theoretical square, whose vertices represent the initial positions of the other robots,
forming a cross. The radius of each MarXbot robot is 8.5 cm. The central robot is labeled with index 1 and its
position is (x<sub>1</sub>, y<sub>1</sub>) = (2.5, 2.5). The initial locations of the other robots, whose indices are increased
in a counter-clockwise direction starting from the robot placed in the upper right corner, are determined
according to Eq. 1. Each robot is equipped with infrared sensors and an omni-directional camera, whose view
consists of 4 sectors of 90∘ each. The camera enables robots to detect colored objects in the
environment. Additionally, each robot can turn on/off both frontal and rear LEDs (red and blue, respectively)
in order to signal its presence to the other robots.</p>
        <p>The robots’ goal is to aggregate within the arena. To this end, we defined the fitness function as
reported in Eqs. 2 - 4:</p>
        <p>d̂<sub>i</sub> = (1 / (N − 1)) ∑<sub>j=1, j≠i</sub><sup>N</sup> |T<sub>d</sub> − d<sub>i,j</sub>| (2)</p>
        <p>R<sub>d,i</sub> = M × e<sup>−d̂<sub>i</sub>² / σ²</sup> (3)</p>
        <p>F = (1 / N) ∑<sub>i=1</sub><sup>N</sup> R<sub>d,i</sub> (4)</p>
        <p>In Eqs. 2 - 4, the symbol d<sub>i,j</sub> indicates the distance between the agents i and j (with i ≠ j), d̂<sub>i</sub> is the
resulting average distance for the i-th agent and N is the number of robots. The T<sub>d</sub> parameter
represents the target distance considered sufficient to solve the problem (in our experiments, we set
T<sub>d</sub> = 1.5 m). The symbols M and σ denote, respectively, the maximum value achieved by the function
R<sub>d</sub> when the target distance is reached and the standard deviation of the Gaussian function. As regards
the experiments reported here, we used the values M = 7 and σ = 8. Finally, the symbol R<sub>d</sub> represents
the distance reward (see Fig. 2). This function was chosen empirically after a preliminary investigation
phase (see Fig. 3), during which we observed that R<sub>d</sub> is characterized by a more gradual slope than
alternative functions, while still maintaining the desired maximization.</p>
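        <p>A minimal sketch of the distance reward and fitness, as reconstructed above, is given below. The function names and array layout are assumptions, as is the σ² (rather than 2σ²) denominator of the Gaussian; the default constants (T<sub>d</sub> = 1.5, M = 7, σ = 8) come from the text.</p>

```python
import numpy as np

def distance_reward(positions, target=1.5, M=7.0, sigma=8.0):
    """Per-robot distance reward (sketch of reconstructed Eqs. 2-3):
    a Gaussian of the average absolute deviation from the target distance."""
    p = np.asarray(positions, dtype=float)
    n = len(p)
    # pairwise Euclidean distances d_ij
    d = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
    dev = np.abs(target - d)
    np.fill_diagonal(dev, 0.0)          # exclude j == i from the average
    d_hat = dev.sum(axis=1) / (n - 1)   # Eq. 2
    return M * np.exp(-d_hat**2 / sigma**2)  # Eq. 3

def fitness(positions, **kw):
    """Swarm fitness: mean distance reward over robots (Eq. 4)."""
    return float(distance_reward(positions, **kw).mean())
```

        <p>For instance, two robots placed exactly at the target distance receive the maximum reward M, while dispersed robots receive less.</p>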
        <p>A feed-forward neural network controls each robot. The network has 12 inputs, one layer of 10
hidden neurons and 4 outputs. The hidden and output neurons have biases, and their activation function
is the tanh. The infrared sensors and the camera provide input data feeding the neural network, which
performs its computation and produces two outputs to control the wheel speeds, and two outputs to
control the frontal and rear LEDs.</p>
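        <p>The controller just described reduces to a plain two-layer forward pass. The sketch below assumes row-major weight matrices and is not the evorobotpy3 implementation; the input/hidden/output sizes and the tanh activations are those stated in the text.</p>

```python
import numpy as np

def controller(inputs, w1, b1, w2, b2):
    """Sketch of the described controller: 12 inputs (infrared sensors and
    camera), 10 tanh hidden neurons with biases, 4 tanh outputs
    (2 wheel speeds, 2 LED commands). Assumed shapes: w1 (10, 12),
    b1 (10,), w2 (4, 10), b2 (4,)."""
    hidden = np.tanh(w1 @ np.asarray(inputs, dtype=float) + b1)
    return np.tanh(w2 @ hidden + b2)
```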
      </sec>
      <sec id="sec-3-2">
        <title>3.2. AntBullet Swarm problem</title>
        <p>
          provided in Fig. 4.
we applied the following modifications:
The second scenario is notably more challenging and requires dealing with a MOO problem. In particular,
we consider the problem introduced in [47], which involves a MAS composed of 5 AntBullet robots
that must aggregate while locomoting in an unbounded environment. A snapshot of the problem is
The agents’ goal is to learn how to aggregate and locomote. Due to the pitfalls highlighted in [47],
• we split the evolutionary process into two phases and adopted an incremental evolution approach
[
          <xref ref-type="bibr" rid="ref26">72</xref>
          ]. During the first phase, agents learn only to locomote; in the second phase, instead, they
have to aggregate while locomoting;
i 4
D
        </p>
        <p>• we modified the aggregation reward function by using the one defined in Eq. 3;
• during the second phase, the reward for locomotion is notably reduced — but not removed — in
order to make the agents exploit the skills acquired in the first phase.</p>
        <p>
          This latter decision was made because existing studies [
          <xref ref-type="bibr" rid="ref27">34, 73</xref>
          ] have shown that training agents
to learn two different skills sequentially may lead to a “vanishing” phenomenon, where previously
acquired capabilities are forgotten.
        </p>
        <p>Therefore, with the aim of promoting the evolution of successful strategies, the fitness function has
been designed according to Eqs. 5 - 7:</p>
        <p>F = F<sub>1</sub> if t &lt; T/2; F = F<sub>2</sub> if t ≥ T/2 (5)</p>
        <p>F<sub>1</sub> = (1 / N) ∑<sub>i=1</sub><sup>N</sup> (P<sub>i</sub> + S<sub>i</sub> + J<sub>i</sub>) (6)</p>
        <p>F<sub>2</sub> = (1 / N) ∑<sub>i=1</sub><sup>N</sup> (0.1 × P<sub>i</sub> + S<sub>i</sub> + J<sub>i</sub> + R<sub>d,i</sub>) (7)</p>
        <p>The symbol T in Eq. 5 represents the total number of evaluation steps performed during evolution.
The symbols P<sub>i</sub>, S<sub>i</sub> and J<sub>i</sub> in Eqs. 6 - 7 indicate, respectively, the progress reward (i.e., how much the
agent i locomotes), the stall cost related to the magnitude of the actions performed by agent i, and the
number of joints extended at their limits. In Eq. 7, R<sub>d,i</sub> is the distance reward computed according to
Eq. 3 (differently from the previous scenario, here we set the target distance T<sub>d</sub> to 1.5 meters, coherently
with [47]). We modified this component to improve the performance of the aggregation shown in
[47], aiming to address the challenges of MOO more effectively. In fact, as pointed out by the authors,
the original fitness function (see [47], Eqs. 5-6) was ineffective due to the different magnitudes of
its components. Moreover, locomotion is easier to achieve than aggregation: a robot can
locomote regardless of the others, whereas aggregation involves the need to interact and coordinate
with others. Consequently, the outcomes presented in [47] show that agents, evolved with OpenAI-ES,
are characterized by good locomotion capabilities, but do not aggregate.</p>
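        <p>The phase switch can be sketched as follows. The per-robot arrays correspond to the terms of the reconstructed Eqs. 6 - 7; the averaging over robots and the signs of the cost terms are assumptions, and the function name is hypothetical.</p>

```python
import numpy as np

def swarm_fitness(step, total_steps, progress, stall, joints, dist_reward):
    """Phase-switching fitness (sketch of reconstructed Eqs. 5-7):
    per-robot arrays for the progress reward P_i, stall cost S_i,
    joints-at-limit term J_i and distance reward R_d,i."""
    progress = np.asarray(progress, dtype=float)
    if step >= total_steps / 2:
        # second phase: locomotion down-weighted, aggregation reward added (Eq. 7)
        return float(np.mean(0.1 * progress + stall + joints + dist_reward))
    # first phase: locomotion only (Eq. 6)
    return float(np.mean(progress + stall + joints))
```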
        <p>In addition to the fitness modification, we endowed the robots with an omni-directional camera
capable of detecting other robots within an 8-meter range. The camera view is split into 6 sectors of 60∘
each. This sensor has been designed to make each robot capable of perceiving the presence of nearby
mates, hence encouraging the evolution of more effective aggregation behaviors. The information
provided by the camera input for a generic robot i is defined as follows: if robot i detects a teammate j,
the corresponding sector s is activated and returns a value defined according to Eq. 8:</p>
        <p>c<sub>s</sub> = 1.0 − d<sub>i,j</sub> / D<sub>max</sub> (8)</p>
        <p>where the symbol d<sub>i,j</sub> indicates the distance between the robots i and j and D<sub>max</sub> is the maximum
detection distance (with D<sub>max</sub> = 8 m). A detailed list of the robot’s equipment is provided in Table 1.</p>
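        <p>A sketch of this sensor model is given below. The sector-indexing convention (bearing relative to the robot’s heading, counter-clockwise) is an assumption, as the text does not specify how sectors are ordered; Eq. 8 itself is applied as reconstructed above.</p>

```python
import math

def camera_sectors(robot_pos, robot_heading, mates, d_max=8.0, n_sectors=6):
    """Omni-directional camera input (Eq. 8): each 60-degree sector
    reports 1 - d/d_max for the nearest detected teammate, 0 otherwise."""
    sectors = [0.0] * n_sectors
    width = 2 * math.pi / n_sectors
    for mx, my in mates:
        dx, dy = mx - robot_pos[0], my - robot_pos[1]
        d = math.hypot(dx, dy)
        if d > d_max:
            continue  # teammate outside the 8-meter detection range
        # relative bearing folded into [0, 2*pi), then binned into a sector
        bearing = (math.atan2(dy, dx) - robot_heading) % (2 * math.pi)
        s = min(int(bearing // width), n_sectors - 1)
        sectors[s] = max(sectors[s], 1.0 - d / d_max)  # Eq. 8
    return sectors
```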
        <p>The robot controller is a feed-forward neural network with 34 input neurons, an internal layer
containing 20 hidden neurons, and 8 output neurons. All neurons have associated biases. The activation
function of hidden neurons is the tanh function, while the activation function of output neurons is the
linear function.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Experimental setting</title>
        <p>
          For both scenarios, the OpenAI Evolutionary Strategy (OpenAI-ES) [48] was employed, as it represents a
modern and quite sophisticated algorithm for evolving successful locomotion [
          <xref ref-type="bibr" rid="ref28 ref29">48, 74, 75</xref>
          ] and aggregation
[
          <xref ref-type="bibr" rid="ref18">14, 16, 64</xref>
          ] behaviors. OpenAI-ES works by initializing a single solution, called centroid, which encodes
the connection weights of the neural network controller determining the robot’s behaviors.
        </p>
        <p>The environment is unbounded, allowing the agents to move freely.</p>
        <p>Table 1 reports the robot’s equipment (i.e., sensors and motors): the agent’s height and initial height,
the relative angle between the robot and the target (for locomotion), the velocity components along the
three axes, the robot’s orientation, the position and velocity of each joint, a flag per foot indicating
whether the foot touches the ground, the inputs from the camera sectors, and the torque applied to
each joint.</p>
        <sec id="sec-3-3-2">
          <p>
            The centroid
is iteratively updated through an advanced process consisting of mirrored sampling [
            <xref ref-type="bibr" rid="ref30">76</xref>
            ], gradient
estimation, and optimization using the Adam optimizer [
            <xref ref-type="bibr" rid="ref31">77</xref>
            ]. The algorithm is illustrated in Fig. 5.
          </p>
          <p>
            In more detail, OpenAI-ES seeks to identify the most promising areas of the solution space (i.e., the
connection weights corresponding to better displayed behaviors), so that evolution is targeted toward
those regions, increasing the chance to discover efective strategies for the considered problem. To
this end, OpenAI-ES evaluates samples in both directions (i.e., mirrored sampling [
            <xref ref-type="bibr" rid="ref30">76</xref>
            ]) and ranks them
based on fitness. This process aims to reveal the existence of relationships between the weights and the
final performance. Lastly, OpenAI-ES performs a gradient estimation based on the fitness ranking and
updates the centroid using the Adam optimizer [
            <xref ref-type="bibr" rid="ref31">77</xref>
            ], which retains historical data through a pair of momentum vectors (mean and variance).
          </p>
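          <p>The loop just described can be condensed into a minimal sketch. The hyperparameters below are illustrative, not those of our experiments, and the rank-shaping details follow the general OpenAI-ES recipe rather than the evorobotpy3 implementation.</p>

```python
import numpy as np

def openai_es(fitness, dim, pop_size=20, sigma=0.1, lr=0.05, gens=200, seed=0):
    """Sketch of OpenAI-ES: mirrored sampling, rank-based fitness shaping,
    gradient estimation and an Adam update of the centroid."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)                  # the centroid
    m, v = np.zeros(dim), np.zeros(dim)    # Adam momentum vectors (mean, variance)
    b1, b2, eps = 0.9, 0.999, 1e-8
    for t in range(1, gens + 1):
        noise = rng.standard_normal((pop_size // 2, dim))
        eps_all = np.concatenate([noise, -noise])       # mirrored sampling
        fits = np.array([fitness(theta + sigma * e) for e in eps_all])
        # centered rank transform in [-0.5, 0.5]
        shaped = fits.argsort().argsort() / (len(fits) - 1) - 0.5
        grad = shaped @ eps_all / (len(fits) * sigma)   # gradient estimate
        m = b1 * m + (1 - b1) * grad
        v = b2 * v + (1 - b2) * grad**2
        theta += lr * (m / (1 - b1**t)) / (np.sqrt(v / (1 - b2**t)) + eps)
    return theta
```

          <p>For example, maximizing a simple quadratic fitness moves the centroid toward the optimum over generations.</p>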
          <p>
            The experiments were conducted using evorobotpy3 [
            <xref ref-type="bibr" rid="ref32">78</xref>
            ], a modern simulation tool that contains
the implementations of a variety of EAs and some predefined problems. Moreover, it is integrated
with libraries such as Gymnasium [
            <xref ref-type="bibr" rid="ref33">79</xref>
            ] and Pybullet [49], enabling users to easily create customized
simulations.
          </p>
          <p>A detailed list of the parameters used for both scenarios is provided in Table 2.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <sec id="sec-4-1">
        <title>4.1. Robot Aggregation problem</title>
        <p>In the first scenario, we aim to test whether the distance reward R<sub>d</sub>, defined in Eq. 3, enables the
discovery of effective aggregation behaviors. By examining the performance of the MAS at the end of
the evolutionary process, we can see that the average fitness F is 1.190 (see Fig. 6), with a standard
deviation of 0.028. During the evolutionary process, the analysis of the fitness curve reveals that
OpenAI-ES rapidly improves performance in the initial steps and stabilizes at around 5 × 10<sup>8</sup> steps
(see Fig. 6). This implies that the algorithm quickly finds good solutions, whereas the refinement of
the discovered strategies requires a longer evolutionary process. If we analyze the final aggregation
achieved by the agents, we can observe that the robots manage to discover strategies that ultimately
lead to swarm aggregation (see Fig. 7). In particular, the MAS forms a more compact group, located
around the center of the arena. This underscores that the distance reward R<sub>d</sub> fosters the development of
effective aggregation behaviors.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. AntBullet Swarm problem</title>
        <p>Based on the considerations reported in the previous section, we investigated whether the fitness
function defined in Eq. 5 allows the OpenAI-ES algorithm to evolve strategies in which robots aggregate
during locomotion. As we already pointed out in Section 3, we divided the evolutionary process into two
separate phases and exploited incremental evolution to enhance agents’ performance and capabilities.
The mean fitness at the end of evolution is 4444.679 (standard deviation 230.541) and indicates that
OpenAI-ES discovers behavioral strategies in which agents exhibit a good locomotion behavior (see the
first half of the evolutionary process in Fig. 8). The average fitness obtained before the switch (indicated
by the vertical line in Fig. 8) is 1655.366, with a standard deviation of 832.893. These results align with
those reported in [47], although the different number of inputs and hidden neurons prevents a direct
comparison.</p>
        <p>It is worth noting that the ability to locomote is independent of other agents. However, being
able to locomote is crucial for the development of aggregation behaviors. In the second phase of the
evolutionary process, thanks to the use of the reward function defined in Eq. 3, the agents increase
their performance soon after the switch (see the second half of the evolutionary process in Fig. 8),
although the improvements in aggregation capability are quite limited. This underscores the difficulty
of designing adequate and effective reward functions for MOO problems.</p>
        <p>Furthermore, if we examine the final positions of the agents (Fig. 9), we can observe that the
aggregation reached by the MAS here is less effective than in the previous scenario. In fact, agents
fail to move close to one another. Sometimes, the majority of agents succeed in approaching each
other (Fig. 9-(a)), while in other cases the MAS disperses (Fig. 9-(b)). Finally, in most cases, the final
configuration of the MAS looks similar to the initial cross formation (Fig. 9-(c)). Overall, the modified
fitness function and the added camera sensor, useful to perceive the others, allow slight improvements
compared to the results reported in [47], where the authors underline the complete absence of such
a capability. Nonetheless, the function defined in Eq. 3 does not lead to further enhancements with
respect to the aggregation capability.</p>
        <p>To reinforce the aggregation analysis, we calculated the dispersion metric (see [82]), which assesses
swarm cohesion and is defined in Eq. 9. This metric was also employed in our previous studies [14, 15]:</p>
        <p>D = (1 / (4r<sup>2</sup>)) ∑<sub>i=1</sub><sup>N</sup> ||p<sub>i</sub> − p̄||<sup>2</sup> (9)</p>
        <p>Eq. 9 takes into account the final spatial arrangement of the whole group, where p<sub>i</sub> denotes the
position of agent i, p̄ indicates the center of gravity (COG) of the swarm, and r refers to the robot radius.
The notation || ⋅ || denotes the Euclidean norm.</p>
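        <p>The metric, as reconstructed above, can be computed in a few lines; the normalization by 4r² follows the reconstructed Eq. 9 and the function name is hypothetical.</p>

```python
import numpy as np

def dispersion(positions, radius):
    """Dispersion metric (sketch of reconstructed Eq. 9): sum of squared
    distances of the agents from the swarm's center of gravity, normalized
    by 4 * radius**2. Lower values indicate a more cohesive swarm."""
    p = np.asarray(positions, dtype=float)
    cog = p.mean(axis=0)  # center of gravity of the swarm (p-bar)
    return float(np.sum(np.linalg.norm(p - cog, axis=1) ** 2) / (4 * radius ** 2))
```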
        <p>Thus, to analyze the dynamics of the solutions, Fig. 10 illustrates how dispersion varies during the
evaluation of the swarm. As can be seen, the dispersion generally increases on average, reaching a final
value of 222.868. This implies that the swarm is dispersing, as the agents have a higher propensity to
locomote and are unable to aggregate properly. Examining the dispersion values achieved by the swarm
reported in Fig. 11, we observe that the group tends to increase its cohesion, with final dispersion values
of 85.621 (Fig. 11-(a)) and 97.025 (Fig. 11-(b)). Therefore, the MOO problem can be addressed more
effectively in the best cases, with the agents capable of achieving a higher level of aggregation.</p>
        <p>Interestingly, the adoption of the incremental evolution paradigm improves locomotion. Specifically,
some agents exhibit refined gaits that make use of all their legs to move (see behavior at https://youtu.
be/gW_LdjBOIbs). This allows overcoming the local minima reported in [47], where at least one leg
remained extended to maintain stability by avoiding falling. Another discovered locomotion strategy
resembles a horse’s gallop (see behavior at https://youtu.be/MRquB4HhEFo); indeed, moving quickly
clearly helps to maximize the progress component in Eqs. 6 - 7. However, this type of locomotion conflicts
with the objective of aggregation with others, as the rapid movement tends to reduce coordination
among agents.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>In a Multi-Agent System (MAS), agents interact with other entities in order to solve problems that may
be very difficult, if not impossible, for a single agent to address. While solving problems consisting of a
single objective might be relatively trivial for a MAS, dealing with Multi-Objective Optimization (MOO)
is more challenging. It requires the design of appropriate fitness functions and/or the usage of suitable
methods in order to make agents able to evolve effective behaviors. In fact, MOO is characterized by
conflicting objectives that should be optimized simultaneously, which requires the identification of
compromise solutions that perform well across all objectives. For example, evolving both locomotion
and aggregation behaviors is particularly challenging, as demonstrated in [47].</p>
      <p>In this work, we delve into the analysis of methods that allow the evolution of aggregation in a
robotic MAS, particularly focusing on how conflicting objectives can interfere with the evolution
of collective behaviors. In more detail, we design a new function   (Eq. 3), specifically tailored to
promote aggregation, and we test it in two diferent scenarios. The first one is a Mono-Objective
Steps</p>
      <p>Optimization problem where a MAS of 5 MarXbot robots must develop an aggregation capability.
The second scenario involves a MOO problem, the AntBullet Swarm problem introduced in [47], in
which 5 AntBullet robots must evolve the ability to aggregate while locomoting. In order to foster
the development of aggregation behaviors in the latter scenario, we adopted an incremental evolution
framework by splitting evolution into two distinct phases. In the first phase, agents must evolve
only locomotion capabilities, which are a prerequisite for aggregating with others. In the second
phase, agents must evolve both aggregation and locomotion simultaneously. For this purpose, we
modified the fitness function defined in [47] by using the new function introduced here and endowing
agents with an omni-directional camera that allows them to perceive others. The results indicate
that the proposed function (Eq. 3) successfully promotes aggregation in the Mono-Objective Optimization
scenario. Moreover, it slightly improves the aggregation outcomes with respect to [47], although the
agents fail to discover behavioral strategies that optimize both objectives (i.e., aggregation and
locomotion). Finally, the modifications to the AntBullet Swarm problem allow the evolution of improved
locomotion capabilities, which resemble strategies observed in natural organisms.</p>
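      <p>The two-phase incremental scheme described above can be sketched as follows. This is an illustrative toy example: a simple (1+1)-ES with stand-in fitness functions replaces the OpenAI-ES and the simulator-based locomotion and aggregation objectives used in the experiments, and the equal-weight scalarization in phase 2 is a hypothetical choice made only to keep the sketch concrete.</p>

```python
import numpy as np

# Illustrative sketch of two-phase (incremental) evolution -- NOT the paper's
# implementation: a toy (1+1)-ES with stand-in objectives.

rng = np.random.default_rng(0)

def locomotion(theta):
    # toy stand-in objective: best at theta = 1
    return -np.sum((theta - 1.0) ** 2)

def aggregation(theta):
    # toy stand-in objective: best at theta = -1, so the objectives conflict
    return -np.sum((theta + 1.0) ** 2)

def evolve(fitness, theta, generations=300, sigma=0.1):
    """Elitist (1+1)-ES: keep a Gaussian mutation only if it is not worse."""
    best = fitness(theta)
    for _ in range(generations):
        candidate = theta + sigma * rng.standard_normal(theta.shape)
        f = fitness(candidate)
        if f >= best:
            theta, best = candidate, f
    return theta

# Phase 1: evolve locomotion alone (a prerequisite for aggregating with others).
theta1 = evolve(locomotion, np.zeros(4))

# Phase 2: evolve both objectives simultaneously (equal-weight scalarization,
# a hypothetical choice for this sketch).
def both(theta):
    return 0.5 * locomotion(theta) + 0.5 * aggregation(theta)

theta2 = evolve(both, theta1)
```

      <p>In the actual experiments the objectives are computed in simulation; the sketch above conveys only the two-phase control flow.</p>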
      <p>For future research, we plan to further investigate the design of functions that enable the evolution of
effective aggregation behavioral strategies in the AntBullet Swarm problem, as well as the adoption of
different frameworks, like ontogenetic approaches [83], to optimize the agent’s neural network controller.
In this respect, using learning methods, like back-propagation [84] or Spike-Timing-Dependent Plasticity
(STDP) [85, 86], could promote the differentiation and/or specialization of agents, ultimately leading
to better aggregation capabilities. In addition, employing groups of heterogeneous agents [22] may
be valuable for promoting diversity within the MAS. Moreover, further studies will focus on all aspects
that could affect the generalizability of learned behaviors. Investigating how performance varies in
response to changes in initial conditions, sensory inputs or parameters will reinforce the validity of our
approach.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The work of A.V. was supported by the National Recovery and Resilience Plan (PNRR)-Ministry of
University and Research (MUR) Project through FAIR–Future Artificial Intelligence Research under
Grant PE0000013-CUP B53D22000980006.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
    <sec id="sec-8">
      <title>References</title>
      <p>[10] P. Pagliuca, M. Favia, S. Livi, A. Vitanza, Conceptualizing evolving interdependence in groups:
Insights from the analysis of two-agent systems, in: Proceedings of the 21st International Conference
on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT), 2025.
[11] P. Pagliuca, M. Favia, S. Livi, A. Vitanza, Interdipendenza nei gruppi: esperimenti con robot sociali,</p>
      <p>Sistemi intelligenti 2 (2025) 335–355.
[12] E. Bahçeci, E. Şahin, Evolving aggregation behaviors for swarm robotic systems: A systematic
case study, in: Proc. IEEE Swarm Intelligence Symposium, 2005, pp. 333–340.
[13] M. Hiraga, Y. Wei, K. Ohkura, Evolving collective cognition for object identification in foraging
robotic swarms, Artificial Life and Robotics 26 (2021) 21–28.
[14] P. Pagliuca, A. Vitanza, Self-organized aggregation in group of robots with OpenAI-ES, in: Int.</p>
      <p>Conf. on Soft Computing and Pattern Recognition, Springer, 2022, pp. 770–780.
[15] P. Pagliuca, A. Vitanza, Evolving aggregation behaviors in swarms from an evolutionary algorithms
point of view, in: Applications of Artificial Intelligence and Neural Systems to Data Science,
Springer, 2023, pp. 317–328.
[16] P. Pagliuca, A. Vitanza, A comparative study of evolutionary strategies for aggregation tasks in
robot swarms: Macro- and micro-level behavioral analysis, IEEE Access 13 (2025) 72721–72735.
[17] D. H. Stolfi, G. Danoy, Evolutionary swarm formation: From simulations to real world robots,</p>
      <p>Engineering Applications of Artificial Intelligence 128 (2024) 107501.
[18] V. Trianni, R. Groß, T. H. Labella, E. Şahin, M. Dorigo, Evolving aggregation behaviors in a swarm
of robots, in: European Conference on Artificial Life, Springer, 2003, pp. 865–874.
[19] H. M. La, R. S. Lim, W. Sheng, J. Chen, Cooperative flocking and learning in multi-robot systems for
predator avoidance, in: 2013 IEEE International Conference on Cyber Technology in Automation,
Control and Intelligent Systems, IEEE, 2013, pp. 337–342.
[20] H. M. La, R. Lim, W. Sheng, Multirobot cooperative learning for predator avoidance, IEEE</p>
      <p>Transactions on Control Systems Technology 23 (2014) 52–63.
[21] J. Li, S. X. Yang, Intelligent collective escape of swarm robots based on a novel fish-inspired
self-adaptive approach with neurodynamic models, IEEE Transactions on Industrial Electronics
(2024).
[22] P. Pagliuca, A. Vitanza, N-mates evaluation: a new method to improve the performance of
genetic algorithms in heterogeneous multi-agent systems, Proceedings of the 24th Edition of the
Workshop From Object to Agents (WOA23) 3579 (2023) 123–137.
[23] P. Pagliuca, A. Vitanza, The role of n in the n-mates evaluation method: a quantitative analysis,
in: 2024 Artificial Life Conference (ALIFE 2024), MIT Press, 2024, pp. 812–814.
[24] M. Brambilla, E. Ferrante, M. Birattari, M. Dorigo, Swarm robotics: a review from the swarm
engineering perspective, Swarm Intelligence 7 (2013) 1–41.
[25] E. Şahin, Swarm robotics: From sources of inspiration to domains of application, in: International
workshop on swarm robotics, Springer, 2004, pp. 10–20.
[26] H. Hamann, Swarm robotics: A formal approach, volume 221, Springer, 2018.
[27] M. Dorigo, G. Theraulaz, V. Trianni, Swarm robotics: Past, present, and future [point of view],</p>
      <p>Proceedings of the IEEE 109 (2021) 1152–1165.
[28] N. Horsevad, H. L. Kwa, R. Bouffanais, Beyond bio-inspired robotics: how multi-robot systems
can support research on collective animal behavior, Frontiers in Robotics and AI 9 (2022) 865414.
[29] T. Bäck, Evolutionary algorithms in theory and practice: evolution strategies, evolutionary
programming, genetic algorithms, Oxford University Press, 1996.
[30] F. J. Gomez, R. Miikkulainen, Solving non-Markovian control tasks with neuroevolution, in:
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), volume 99,
1999, pp. 1356–1361.
[31] C. Igel, Neuroevolution for reinforcement learning using evolution strategies, in: The 2003</p>
      <p>Congress on Evolutionary Computation, 2003. CEC’03., volume 4, IEEE, 2003, pp. 2588–2595.
[32] P. Pagliuca, N. Milano, S. Nolfi, Maximizing adaptive power in neuroevolution, PLoS ONE 13 (2018)
e0198788.
[33] C. Lamini, S. Benhlima, A. Elbekri, Genetic algorithm based approach for autonomous mobile
robot path planning, Procedia Computer Science 127 (2018) 180–189.
[34] P. Pagliuca, S. Nolfi, Integrating learning by experience and demonstration in autonomous robots,</p>
      <p>Adaptive Behavior 23 (2015) 300–314.
[35] A. Ram, G. Boone, R. Arkin, M. Pearce, Using genetic algorithms to learn reactive control
parameters for autonomous robotic navigation, Adaptive behavior 2 (1994) 277–305.
[36] P. Pagliuca, D. Y. Inglese, The importance of functionality over complexity: A preliminary study
on feed-forward neural networks, in: Advanced Neural Artificial Intelligence: Theories and
Applications, Springer, 2025, pp. 447–458.
[37] S. M. Elsayed, R. A. Sarker, D. L. Essam, A new genetic algorithm for solving optimization problems,</p>
      <p>Engineering Applications of Artificial Intelligence 27 (2014) 57–69.
[38] P. Pagliuca, Analysis of the exploration-exploitation dilemma in neutral problems with evolutionary
algorithms, Journal of Artificial Intelligence and Autonomous Intelligence 1 (2024) 8.
[39] S. Nolfi, D. Floreano, Coevolving predator and prey robots: Do “arms races” arise in artificial
evolution?, Artificial life 4 (1998) 311–335.
[40] S. Nolfi, P. Pagliuca, Global progress in competitive co-evolution: a systematic comparison of
alternative methods, Frontiers in Robotics and AI 11 (2025) 1470886.
[41] K. Deb, Multi-objective optimisation using evolutionary algorithms: an introduction, in:
Multiobjective evolutionary optimisation for product design and manufacturing, Springer, 2011, pp.
3–34.
[42] C. M. Fonseca, P. J. Fleming, An overview of evolutionary algorithms in multiobjective optimization,</p>
      <p>Evolutionary computation 3 (1995) 1–16.
[43] J. Horn, N. Nafpliotis, D. E. Goldberg, A niched Pareto genetic algorithm for multiobjective
optimization, in: Proceedings of the First IEEE Conference on Evolutionary Computation, IEEE
World Congress on Computational Intelligence, IEEE, 1994, pp. 82–87.
[44] E. Zitzler, Evolutionary algorithms for multiobjective optimization: Methods and applications,
volume 63, Shaker Ithaca, 1999.
[45] K. Deb, K. Sindhya, J. Hakanen, Multi-objective optimization, in: Decision sciences, CRC Press,
2016, pp. 161–200.
[46] Y. Censor, Pareto optimality in multiobjective problems, Applied Mathematics and Optimization 4
(1977) 41–59.
[47] P. Pagliuca, A. Vitanza, Enhancing aggregation in locomotor multi-agent systems: a theoretical
framework, Proceedings of the 25th Edition of the Workshop From Object to Agents (WOA24)
3735 (2024) 42–57.
[48] T. Salimans, J. Ho, X. Chen, S. Sidor, I. Sutskever, Evolution strategies as a scalable alternative to
reinforcement learning, arXiv preprint arXiv:1703.03864 (2017).
[49] E. Coumans, Y. Bai, Pybullet, a python module for physics simulation for games, robotics and
machine learning, 2016.
[50] Z. Firat, E. Ferrante, Y. Gillet, E. Tuci, On self-organised aggregation dynamics in swarms of robots
with informed robots, Neural Computing and Applications 32 (2020) 13825–13841.
[51] J. Menezes, E. Rangel, B. Moura, Aggregation as an antipredator strategy in the rock-paper-scissors
model, Ecological Informatics 69 (2022) 101606.
[52] J. Nauta, P. Simoens, Y. Khaluf, Memory induced aggregation in collective foraging, in:
International conference on swarm intelligence, Springer, 2020, pp. 176–189.
[53] T. M. Konijn, K. B. Raper, Cell aggregation in Dictyostelium discoideum, Developmental Biology 3
(1961) 725–756.
[54] R. Q. Kidner, E. B. Goldstone, H. J. Rodefeld, L. P. Brokaw, A. M. Gonzalez, N. Ros-Rocher, J. P. Gerdt,
Exogenous lipid vesicles induce endocytosis-mediated cellular aggregation in a close unicellular
relative of animals, bioRxiv (2024) 2024–05.
[55] S.-H. Cheng, H.-Y. Zhang, M.-Y. Zhu, L. M. Zhou, G.-H. Yi, X.-W. He, J.-Y. Wu, J.-L. Sui, H. Wu, S.-J.</p>
      <p>Yan, et al., Observations of linear aggregation behavior in rotifers (brachionus calyciflorus), PLoS
One 16 (2021) e0256387.
[56] A. Cavagna, A. Cimarelli, I. Giardina, G. Parisi, R. Santagati, F. Stefanini, M. Viale, Scale-free
correlations in starling flocks, Proceedings of the National Academy of Sciences 107 (2010)
11865–11870.
[80] X. Glorot, Y. Bengio, Understanding the difficulty of training deep feedforward neural networks,
in: Proceedings of the thirteenth international conference on artificial intelligence and statistics,
JMLR Workshop and Conference Proceedings, 2010, pp. 249–256.
[81] S. Ioffe, C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal
covariate shift, arXiv preprint arXiv:1502.03167 (2015).
[82] M. Gauci, J. Chen, W. Li, T. J. Dodd, R. Groß, Self-organized aggregation without computation,</p>
      <p>The International Journal of Robotics Research 33 (2014) 1145–1161.
[83] D. Floreano, P. Husbands, S. Nolfi, Evolutionary robotics, Handbook of robotics (2008).
[84] D. E. Rumelhart, G. E. Hinton, R. J. Williams, Learning representations by back-propagating errors,</p>
      <p>Nature 323 (1986) 533–536.
[85] S. Song, K. D. Miller, L. F. Abbott, Competitive hebbian learning through spike-timing-dependent
synaptic plasticity, Nature neuroscience 3 (2000) 919–926.
[86] A. Vitanza, L. Patané, P. Arena, Spiking neural controllers in multi-agent competitive systems for
adaptive targeted motor learning, Journal of the Franklin Institute 352 (2015) 3122–3143.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dorri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Kanhere</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jurdak</surname>
          </string-name>
          <article-title>, Multi-agent systems: A survey</article-title>
          ,
          <source>IEEE Access 6</source>
          (
          <year>2018</year>
          )
          <fpage>28573</fpage>
          -
          <lpage>28593</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>V.</given-names>
            <surname>Julian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Botti</surname>
          </string-name>
          ,
          <article-title>Multi-agent systems</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>9</volume>
          (
          <year>2019</year>
          )
          <fpage>1402</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>W.</given-names>
            <surname>Van der Hoek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wooldridge</surname>
          </string-name>
          ,
          <article-title>Multi-agent systems</article-title>
          ,
          <source>Foundations of Artificial Intelligence</source>
          <volume>3</volume>
          (
          <year>2008</year>
          )
          <fpage>887</fpage>
          -
          <lpage>928</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Aldana-Franco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Montes-González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nolfi</surname>
          </string-name>
          ,
          <article-title>The improvement of signal communication for a foraging task using evolutionary robotics</article-title>
          ,
          <source>Journal of Applied Research and Technology</source>
          <volume>22</volume>
          (
          <year>2024</year>
          )
          <fpage>90</fpage>
          -
          <lpage>101</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Calvez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hutzler</surname>
          </string-name>
          ,
          <article-title>Automatic tuning of agent-based models using genetic algorithms</article-title>
          , in: International workshop on multi-
          <source>agent systems and agent-based simulation</source>
          , Springer,
          <year>2005</year>
          , pp.
          <fpage>41</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hiraga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ohkura</surname>
          </string-name>
          ,
          <article-title>Evolving collective cognition of robotic swarms in the foraging task with poison</article-title>
          ,
          <source>in: 2019 IEEE Congress on Evolutionary Computation (CEC)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>3205</fpage>
          -
          <lpage>3212</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pagliuca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nolfi</surname>
          </string-name>
          ,
          <article-title>Robust optimization through neuroevolution</article-title>
          ,
          <source>PLOS ONE 14</source>
          (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pagliuca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Inglese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vitanza</surname>
          </string-name>
          ,
          <article-title>Measuring emergent behaviors in a mixed competitivecooperative environment</article-title>
          ,
          <source>International Journal of Computer Information Systems and Industrial Management Applications</source>
          <volume>15</volume>
          (
          <year>2023</year>
          )
          <fpage>69</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pagliuca</surname>
          </string-name>
          ,
          <article-title>Learning and evolution: factors influencing an effective combination</article-title>
          ,
          <source>AI</source>
          <volume>5</volume>
          (
          <year>2024</year>
          )
          <fpage>2393</fpage>
          -
          <lpage>2432</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pagliuca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Favia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Livi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vitanza</surname>
          </string-name>
          ,
          <article-title>Conceptualizing evolving interdependence in groups: Insights from the analysis of two-agent systems</article-title>
          ,
          <source>in: Proceedings of the 21st International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT), 2025.</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [57]
          <string-name>
            <given-names>B. L.</given-names>
            <surname>Partridge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. J.</given-names>
            <surname>Pitcher</surname>
          </string-name>
          ,
          <article-title>The sensory basis of fish schools: relative roles of lateral line and vision</article-title>
          ,
          <source>Journal of comparative physiology 135</source>
          (
          <year>1980</year>
          )
          <fpage>315</fpage>
          -
          <lpage>325</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [58]
          <string-name>
            <given-names>T. J.</given-names>
            <surname>Pitcher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. L.</given-names>
            <surname>Partridge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wardle</surname>
          </string-name>
          ,
          <article-title>A blind fish can school</article-title>
          ,
          <source>Science</source>
          <volume>194</volume>
          (
          <year>1976</year>
          )
          <fpage>963</fpage>
          -
          <lpage>965</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [59]
          <string-name>
            <given-names>T.</given-names>
            <surname>Shibata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Fukuda</surname>
          </string-name>
          ,
          <article-title>Coordinative behavior in evolutionary multi-agent system by genetic algorithm</article-title>
          ,
          <source>in: IEEE International Conference on Neural Networks, IEEE</source>
          ,
          <year>1993</year>
          , pp.
          <fpage>209</fpage>
          -
          <lpage>214</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [60]
          <string-name>
            <given-names>R.</given-names>
            <surname>O'Grady</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Groß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Christensen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dorigo</surname>
          </string-name>
          ,
          <article-title>Self-assembly strategies in a group of autonomous mobile robots</article-title>
          ,
          <source>Autonomous Robots</source>
          <volume>28</volume>
          (
          <year>2010</year>
          )
          <fpage>439</fpage>
          -
          <lpage>455</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [61]
          <string-name>
            <given-names>R.</given-names>
            <surname>Asad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hayakawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Yasuda</surname>
          </string-name>
          ,
          <article-title>Evolutionary design of cooperative transport behavior for a heterogeneous robotic swarm</article-title>
          ,
          <source>Journal of Robotics and Mechatronics</source>
          <volume>35</volume>
          (
          <year>2023</year>
          )
          <fpage>1007</fpage>
          -
          <lpage>1015</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [62]
          <string-name>
            <given-names>I.</given-names>
            <surname>Rechenberg</surname>
          </string-name>
          ,
          <article-title>Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution</article-title>
          , Frommann-Holzboog, Stuttgart-Bad Cannstatt
          (
          <year>1973</year>
          )
          <fpage>47</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [63]
          <string-name>
            <given-names>H.-P.</given-names>
            <surname>Schwefel</surname>
          </string-name>
          ,
          <article-title>Numerische Optimierung von Computer-Modellen mittels der Evolutionsstrategie: mit einer vergleichenden Einführung in die Hill-Climbing-und Zufallsstrategie</article-title>
          , volume
          <volume>1</volume>
          , Springer,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [64]
          <string-name>
            <given-names>J. Rais</given-names>
            <surname>Martínez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Aznar Gregori</surname>
          </string-name>
          ,
          <article-title>Comparison of evolutionary strategies for reinforcement learning in a swarm aggregation behaviour</article-title>
          ,
          <source>in: Proceedings of the 2020 3rd International Conference on Machine Learning and Machine Intelligence</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [65]
          <string-name>
            <given-names>N.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ostermeier</surname>
          </string-name>
          ,
          <article-title>Completely derandomized self-adaptation in evolution strategies</article-title>
          ,
          <source>Evolutionary computation 9</source>
          (
          <year>2001</year>
          )
          <fpage>159</fpage>
          -
          <lpage>195</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [66]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Holland</surname>
          </string-name>
          ,
          <source>Adaptation in natural and artificial systems</source>
          , University Michigan Press,
          <year>1975</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [67]
          <string-name>
            <given-names>D.</given-names>
            <surname>Wierstra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schaul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Glasmachers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Natural evolution strategies</article-title>
          ,
          <source>The Journal of Machine Learning Research</source>
          <volume>15</volume>
          (
          <year>2014</year>
          )
          <fpage>949</fpage>
          -
          <lpage>980</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [68]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Katada</surname>
          </string-name>
          ,
          <article-title>Evolutionary design method of probabilistic finite state machine for swarm robots aggregation</article-title>
          ,
          <source>Artificial Life and Robotics</source>
          <volume>23</volume>
          (
          <year>2018</year>
          )
          <fpage>600</fpage>
          -
          <lpage>608</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [69]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kennedy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Eberhart</surname>
          </string-name>
          ,
          <article-title>Particle swarm optimization</article-title>
          ,
          <source>in: Proceedings of ICNN'95-international conference on neural networks</source>
          , volume
          <volume>4</volume>
          , IEEE,
          <year>1995</year>
          , pp.
          <fpage>1942</fpage>
          -
          <lpage>1948</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [70]
          <string-name>
            <given-names>A.</given-names>
            <surname>Leccese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gasparri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Priolo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Oriolo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ulivi</surname>
          </string-name>
          ,
          <article-title>A swarm aggregation algorithm based on local interaction with actuator saturations and integrated obstacle avoidance</article-title>
          ,
          <source>in: 2013 IEEE International Conference on Robotics and Automation</source>
          , IEEE,
          <year>2013</year>
          , pp.
          <fpage>1865</fpage>
          -
          <lpage>1870</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [71]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bonani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Longchamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Magnenat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rétornaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Burnier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Roulet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Vaussard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bleuler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mondada</surname>
          </string-name>
          ,
          <article-title>The marXbot, a miniature mobile robot opening new perspectives for the collective-robotic research</article-title>
          ,
          <source>in: 2010 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems</source>
          , IEEE,
          <year>2010</year>
          , pp.
          <fpage>4187</fpage>
          -
          <lpage>4193</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [72]
          <string-name>
            <given-names>F.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Miikkulainen</surname>
          </string-name>
          ,
          <article-title>Incremental evolution of complex general behavior</article-title>
          ,
          <source>Adaptive Behavior</source>
          <volume>5</volume>
          (
          <year>1997</year>
          )
          <fpage>317</fpage>
          -
          <lpage>342</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [73]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kober</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Bagnell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <article-title>Reinforcement learning in robotics: A survey</article-title>
          ,
          <source>The International Journal of Robotics Research</source>
          <volume>32</volume>
          (
          <year>2013</year>
          )
          <fpage>1238</fpage>
          -
          <lpage>1274</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [74]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pagliuca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Milano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nolfi</surname>
          </string-name>
          ,
          <article-title>Efficacy of modern neuro-evolutionary strategies for continuous control optimization</article-title>
          ,
          <source>Frontiers in Robotics and AI</source>
          <volume>7</volume>
          (
          <year>2020</year>
          )
          <fpage>98</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [75]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pagliuca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nolfi</surname>
          </string-name>
          ,
          <article-title>The dynamic of body and brain co-evolution</article-title>
          ,
          <source>Adaptive Behavior</source>
          <volume>30</volume>
          (
          <year>2022</year>
          )
          <fpage>245</fpage>
          -
          <lpage>255</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [76]
          <string-name>
            <given-names>D.</given-names>
            <surname>Brockhoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Auger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. V.</given-names>
            <surname>Arnold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hohm</surname>
          </string-name>
          ,
          <article-title>Mirrored sampling and sequential selection for evolution strategies</article-title>
          ,
          <source>in: International Conference on Parallel Problem Solving from Nature</source>
          , Springer,
          <year>2010</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [77]
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ba</surname>
          </string-name>
          ,
          <article-title>Adam: A method for stochastic optimization</article-title>
          ,
          <source>preprint arXiv:1412.6980</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [78]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pagliuca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nolfi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vitanza</surname>
          </string-name>
          ,
          <article-title>Evorobotpy3: a flexible and easy-to-use simulation tool for evolutionary robotics</article-title>
          ,
          <source>in: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2025 Companion)</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [79]
          <string-name>
            <given-names>M.</given-names>
            <surname>Towers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kwiatkowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Terry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. U.</given-names>
            <surname>Balis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>De Cola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Deleu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Goulão</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kallinteris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krimmel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>KG</surname>
          </string-name>
          , et al.,
          <article-title>Gymnasium: A standard interface for reinforcement learning environments</article-title>
          ,
          <source>preprint arXiv:2407.17032</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>