<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>munication</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anastasios N. Kontogiorgis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Melanie Bouroche</string-name>
          <email>melanie.bouroche@tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ATT'24: Workshop Agents in Traffic and Transportation</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science and Statistics, Trinity College Dublin</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Reliable communication is crucial for effective and safe coordination among connected autonomous vehicles (CAVs), especially in complex scenarios such as roundabouts or intersections. This work investigates the effect of unreliable communication on emergent communication (EC) for coordinating autonomous vehicles at non-signalised intersections. Existing EC solutions typically assume reliable - error- and noise-free - communication, which is usually not the case in realistic scenarios. We evaluate how communication limitations such as message noise and partial and whole message loss affect the performance of four state-of-the-art models, namely CommNet, TarMac, IC3Net and GA-Comm, in a non-signalised intersection task of increasing difficulty and reduced visibility. We investigate each model's resilience to these communication disturbances, additionally analysing the comparative impact of each disturbance type and intensity on model success rates.</p>
      </abstract>
      <kwd-group>
        <kwd>noise</kwd>
        <kwd>message drops</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Intelligent transportation systems (ITS) aim to enable safer and more sustainable transportation by
alleviating current mobility issues such as accidents, pollution and inefficient utilisation of resources [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Among the ITS technologies, vehicle-to-vehicle (V2V) communication has the potential to revolutionise
the transportation sector and enable improvements in safety, energy efficiency, and infrastructure
utilisation [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. More specifically, information sharing among vehicles through vehicular ad hoc
networks (VANETs) in which Connected Autonomous Vehicles (CAVs) communicate through V2V
communication, and road-side infrastructure through vehicle-to-infrastructure communication (V2I),
can extend situational awareness and assist in building a richer representation of the CAVs’ extended
neighbourhood [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        An especially challenging subset of traffic scenarios arises when vehicles need to coordinate in order
to use a common resource, such as a roundabout or an intersection. Intersections are among the most
complex and inefficient elements of current traffic systems [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] in which a disproportionate number of
accidents, injuries and fatalities occur [7, 8]. Traditional solutions in autonomous driving manually
define which actions to perform in specific situations using techniques such as behaviour-trees or finite
state machines [9, 10]. While successful, these approaches lack the ability to generalise and cater for
unexpected situations [11]. Designing ‘universal’ hand-crafted rules that can handle the complexity of
all possible scenarios accounting for uncertainty and the intricate relations that emerge between road
actors becomes a daunting task. Indeed, specifying a-priori intelligent behaviour in complex systems is
considered challenging, if not impossible [12, 13].
      </p>
      <p>Multi-agent reinforcement learning (MARL) provides a powerful framework for tackling problems
in which the joint actions of multiple decision-making agents influence a shared environment. By
modelling the interactions between agents, MARL has the potential to develop autonomous systems
capable of navigating complex environments and collaborating to solve challenging tasks [14]. In
particular, MARL has been proposed as a promising solution for coordinating connected autonomous
vehicles (CAVs) in complex traffic scenarios [15, 16]. A key advantage of MARL is its ability to learn
from experience and generalise to unexpected situations, rather than relying on predetermined rules.
This flexibility allows for the development of strategies that can better handle complex and dynamic
traffic scenarios.</p>
      <p>https://github.com/AnastasiosKo/NoisyComm (A. N. Kontogiorgis); https://www.scss.tcd.ie/Melanie.Bouroche/ (M. Bouroche). CEUR Workshop Proceedings (ceur-ws.org).</p>
      <p>Recently, emergent communication (EC) between reinforcement learning (RL) agents has gathered
significant interest in the research community since the pioneering works of [17, 18]. In this approach,
agents learn to coordinate through a shared channel, allowing for the discovery of communication
protocols based on task requirements. Learned communication tends to be more flexible and
goal-oriented, leading to improvements in coordination and task success [19, 20]. While significant advances
have been made in the field, showing promise for solving real-world problems, achieving robust
performance requires addressing communication reliability. In realistic environments the quality of
communication is subject to changes due to interference such as noise, message jumbling, information
congestion, message delays and losses [21]. While real-world constraints, namely limited bandwidth,
communication bottlenecks and efficiency issues, have been addressed in the literature [22, 23, 24, 25],
communication itself is largely assumed to be error- and noise-free [26, 27].</p>
      <p>To address this limitation, this work investigates the behaviour of existing EC solutions - typically
assuming reliable communication - in the presence of noise and message drops. We aim to address this
gap by methodically testing the following state-of-the-art EC solutions, CommNet [18], IC3Net [24],
TARMAC [22] and GA-Comm [28], analysing how communication constraints affect task performance
in a non-signalised intersection environment. We investigate model robustness in the presence of noise
and the impact of each type of communication disturbance on task performance.</p>
      <p>This work makes the following contributions:
• A review and critique of existing literature in emergent communication tackling noise.
• A systematic investigation of the effects of unreliable communication on agent performance in
the widely used non-signalised intersection environment first introduced by [18].
• A comparative analysis of model resilience to communication disturbances - disturbances not
present during training - detailing the comparative impact of each disturbance type and intensity
on model performance.</p>
      <p>The rest of this paper is organised as follows. Section 2 provides a review of emergent communication
applied in non-signalised intersections, and of the current literature on emergent communication in the
presence of noise. Section 3 details the experimental setup including the non-signalised intersection
environment used for evaluation and the noise models that emulate communication disturbances. In
Section 4, the findings of training and testing with noise and message drops are presented and analysed,
along with a discussion on the comparative impact of noise type and intensity on the performance
of the models tested. Finally, Section 5 summarises the key findings and concludes the paper with a
discussion of future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>Emergent communication among reinforcement learning agents is an evolving field that aims to develop
communication protocols without prior explicit instructions or predefined language rules, allowing
agents to collaborate towards a common goal. This approach enables the development of communication
strategies based on the requirements of the task, providing increased flexibility. A question that naturally
arises is whether (and how well) agents can learn a ’language’ over a joint communication channel,
allowing them to maximise their utility [29].</p>
      <p>A large body of work in the field has been directed toward ‘language’-based coordination of deep
RL agents in complex tasks. Early works [30, 31] utilised predefined communication protocols to
facilitate information exchange and collaboration; however, such approaches may prove rigid. In recent
years learnable communication has been widely explored with the emerging communication protocols
collaboratively solving tasks such as riddles [17], navigation in complex 3D environments [22] and
agent coordination in non-signalised intersection environments [18, 24, 22, 28, 23].</p>
      <p>The following sections explore the use of emergent communication in non-signalised intersections,
highlighting state-of-the-art approaches that enhance agent coordination and communication efficiency.
We further discuss the impact of communication constraints, and the concept of noise in emergent
communication, broadly categorising relevant literature based on the type of noise it addresses.</p>
      <sec id="sec-2-1">
        <title>2.1. Non-signalised Intersection Environment</title>
        <p>For training and validation, we employ the non-signalised intersection first introduced by [18], a
widely used [24, 22, 28, 23] environment for benchmarking emergent communication models. It comprises
intersecting pathways and agents with a default vision of the surrounding 3×3 grid. Limiting the visual range
impairs an agent's ability to navigate the intersection, necessitating communication to avoid collisions.
Agents enter the intersection with probability <italic>p</italic><sub>arrive</sub>, and the maximum number of cars present at any moment
is given by <italic>N</italic><sub>max</sub>, which changes according to the level of difficulty. Each agent occupies a single grid
cell per time-step and has an available action space of ‘accelerate’ (advancing one cell) or ‘brake’ (no
move). The reward consists of a time penalty <italic>r</italic><sub>time</sub> = −0.01 that accumulates linearly over time and a collision
penalty <italic>r</italic><sub>collision</sub> = −10, which classifies the episode as a failure. Success rate, used as the evaluation
metric, is the fraction of episodes completed without a collision. The total reward at time <italic>t</italic> is given by:
<disp-formula><tex-math>r(t) = C^{t}\, r_{\mathrm{collision}} + \sum_{i=1}^{N^{t}} \tau_{i}\, r_{\mathrm{time}},</tex-math></disp-formula>
where <italic>C</italic><sup><italic>t</italic></sup> is the number of collisions at time <italic>t</italic>, <italic>N</italic><sup><italic>t</italic></sup> is the number of cars present, and <italic>τ</italic><sub><italic>i</italic></sub> is the number of time-steps since car <italic>i</italic> entered the junction.</p>
        <p>We focus on the easy level, shown in Figure 1, which features two one-way lanes within a
7 × 7 grid, accommodating a maximum of five vehicles (<italic>N</italic><sub>max</sub> = 5, <italic>p</italic><sub>arrive</sub> = 0.3).</p>
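        <p>The reward described above can be illustrated with a minimal sketch. The function and argument names are hypothetical, and the per-car counter (time-steps since the car entered the junction) is assumed from the environment's original description in [18]; this is not the authors' implementation.</p>
        <preformat>
```python
R_COLLISION = -10.0   # collision penalty stated in the text
R_TIME = -0.01        # per-step time penalty stated in the text


def total_reward(num_collisions, steps_in_junction):
    """Return r(t) = C^t * r_collision + sum_i tau_i * r_time.

    num_collisions    -- C^t, collisions at the current time-step
    steps_in_junction -- one tau_i per car present (assumed: time-steps
                         since car i entered the junction)
    """
    time_penalty = sum(tau * R_TIME for tau in steps_in_junction)
    return num_collisions * R_COLLISION + time_penalty


# Three cars that have been present for 1, 2 and 3 time-steps, no collision.
no_crash = total_reward(0, [1, 2, 3])
# The same cars with a single collision at this time-step.
one_crash = total_reward(1, [1, 2, 3])
```
        </preformat>
        <p>With these inputs the time penalty alone is about −0.06, and a single collision lowers the total to about −10.06, which shows why one collision dominates the reward of an episode.</p>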
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Emergent Communication for non-signalised Intersection Crossing</title>
        <p>Communication is an important aspect in intersection environments where agents need to coordinate
their actions to avoid collisions and optimise traffic flow. Various techniques have been proposed to
enable agents to learn when, what, and to whom to communicate, enhancing cooperation, communication
efficiency and task performance.</p>
        <p>One of the pioneering works in the field, CommNet [18], introduced a multi-pass communication
framework for exchanging continuous-valued messages containing averaged transmissions of encoded
hidden states, either by broadcasting or to agents within a certain range; communication in the latter
case can be represented as a dynamic graph. Testing in the non-signalised intersection with agent
visibility set to zero, CommNet achieved a 90% average success rate of no collisions within an episode
indicating that agents successfully use the emerging communication to coordinate. In IC3Net [24], a
gating mechanism to control the communication action (i.e. whether the agent will communicate or
not at the next step) and individualised rewards are additionally introduced, extending applicability to
non-cooperative scenarios and improving scalability. Focusing on selective communication, TARMAC
[22] uses attention to determine the recipients of goal-specific messages and enables dynamic team
sizes inside which agents can communicate. Leveraging the representation power of graphs, [28]
models agent relationships using a complete graph and two-stage — hard and soft — attention to detect
and assess the importance of interactions between agents, allowing the model to dynamically adapt
to the complexities of the environment by focusing on relevant interactions. The authors further
extend this approach to a communication model (GA-COMM) by allowing each agent to attend to the
messages of others when making decisions. Similarly, MAGIC [23] employs graph-attention to target
communication and for message processing. Evaluation in the non-signalised intersection showed high
success rates in difficult settings with reduced visibility for both approaches.</p>
        <p>These approaches assume reliable communication, which is not realistic for practical scenarios.
Messages can be limited in size, due to bandwidth limitations, and range. Additionally, unreliable
connections can introduce various forms of interference such as noise corrupting message content,
delays, message jumbling (messages reach the agents mixed-up in content) and losses, disrupting
information exchange and accuracy. These constraints can impact the learning performance of agents
and task success, presenting a challenge that needs to be addressed. Several studies have proposed
techniques that optimise the communication process by either addressing specific agents [22] or utilising
Networked MARL (NMARL) [32], allowing communication with neighbouring agents only. Other studies
focus on reducing communication overhead by enabling agents to choose whether to communicate or
not [24, 33, 34]. While such work addresses communication quality from the perspective of efficiency,
it similarly assumes limited-capacity yet reliable, error-free communication and does not consider
underlying communication channel characteristics such as noise.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Learning Communication in the Presence of Noise</title>
        <p>When considering noise in emergent communication, studies can be broadly categorised into two
groups: (1) those leveraging communication to minimise the impact of noise in agent observations: for
example by sharing information, agents can reach an agreement on the state of a complex environment
[35]; and (2) those that address communication channel noise which affects the reliability of exchanged
messages. We will focus on the latter category.</p>
        <p>This group of studies relates to Levels A and C of Shannon and Weaver [36], referred to as the
‘technical’ and ‘effectiveness’ problems, focusing on how agents can achieve coordination by learning to
communicate over a noisy channel. In particular, [37] address a cooperative task involving two agents
communicating over a noisy link. The authors employ a joint strategy for simultaneously learning
both communication and action selection, leading to improved performance and resulting in a learnt
communication scheme that incorporates both data compression and error protection. The coordination
task in this setup, however, does not explicitly depend on communication, as agents can independently
be trained to navigate to the goal, which is fixed to a specific position across all episodes. Similarly, [38]
explore the problem of a guide coordinating a scout over a noisy communication link. Here, the optimal
policy is learnt by taking into account channel limitations instead of assuming perfect communication
and subsequently employing a communication protocol to convey the actions of the guide. While this
approach demonstrates effective learning under noise, its applicability to larger environments involving
more complex tasks is not described by the authors. Scaling to a larger number of agents would require
innovations in the structure of the learnt communication model [39].</p>
        <p>The literature so far has shown that agents can effectively coordinate in the presence of noisy
communication. Notably, the ‘language’ that emerges in such conditions is distinct from that which
arises in scenarios with reliable communication [26] and outperforms conventional communication
[40]. Nonetheless, these studies typically focused on relatively simple tasks involving two agents and a
single form of noise. A question that arises is how these adaptive communication strategies scale to
complex scenarios of increasing difficulty with multiple agents coordinating while communication is
affected by a range of interferences.</p>
        <p>In the next part of this paper, we investigate how emergent communication models applied in a
non-signalised intersection environment perform in the presence of unreliable communication.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>For a comprehensive evaluation of how different emergent communication mechanisms cope with
unreliable communication, we compare the state-of-the-art EC models CommNet, TarMac, IC3Net and
GA-Comm in multi-agent cooperative scenarios on the commonly used traffic junction environment
[18], introducing noise and random message drops to address the following questions:
• What are the effects of unreliable communication on task performance for the above-mentioned
models in a cooperative intersection scenario? Essentially, how do models cope in the presence
of noise?
• What is the impact of each type of communication disturbance on model performance?
We further discuss the design and experimental setup for evaluating in the presence of unreliable
communication. This includes detailing the training and testing settings, the noise models employed to
simulate communication disturbances, and the evaluation metrics used to assess model performance.</p>
      <sec id="sec-3-1">
        <title>3.1. Design</title>
        <p>
To maintain consistency we have adopted parameter values and a training method that align with
the original studies. Models are trained using multi-threaded synchronous policy gradient [24]; each
thread runs batch learning with a batch size of 500 and performs 10 weight updates per epoch. The
RMSProp optimizer is employed with a learning rate of 1 × 10<sup>−3</sup>, a discount factor of 1.0, and a value
coefficient of 1 × 10<sup>−2</sup>. More specifically, an additional value head is used in the policy network to
estimate a value function at each agent’s observation. Along with optimising the discounted
total rewards, the training process also minimises the squared error of the estimated value, which acts
as a baseline, against the Monte Carlo predicted value. This process is balanced by the value coefficient. The
overall loss function and the policy function share most of their parameters, except
those in the policy and value heads. Each agent’s LSTM hidden state size is set to 128 dimensions and,
accordingly, the message size is also 128 dimensions. We use two rounds of communication, as empirically
this has been shown to provide better performance and training speed in reliable communication conditions
[41]. Finally, we run the traffic junction experiment for 2000 epochs with a single seed. Post-training,
we assess how each trained model generalises in the presence of unreliable communication for 1000
epochs.</p>
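        <p>As a rough illustration of the objective described above, the following NumPy sketch combines a policy-gradient term, using the value estimate as a baseline, with a squared-error value term weighted by the value coefficient. All function names and the exact form of the loss are assumptions made for illustration; a real implementation would use an automatic differentiation framework and would not back-propagate through the advantage.</p>
        <preformat>
```python
import numpy as np


def monte_carlo_returns(rewards, gamma=1.0):
    """Discounted return from each time-step to the end of the episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def policy_value_loss(log_probs, values, rewards, gamma=1.0, value_coeff=1e-2):
    """Policy-gradient loss with a value baseline plus a weighted value loss."""
    returns = monte_carlo_returns(rewards, gamma)
    advantages = returns - values            # value estimate acts as the baseline
    policy_loss = -np.sum(log_probs * advantages)
    value_loss = np.sum((values - returns) ** 2)
    return policy_loss + value_coeff * value_loss


# Toy episode of three steps with made-up numbers.
loss = policy_value_loss(
    log_probs=np.array([-1.0, -1.0, -1.0]),
    values=np.zeros(3),
    rewards=[1.0, 2.0, 3.0],
)
```
        </preformat>
        <p>The defaults mirror the discount factor of 1.0 and the value coefficient of 1 × 10<sup>−2</sup> reported in the text; everything else in the sketch is illustrative.</p>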
        <p>We use success rate as an evaluation metric, signifying no collisions within an episode, and limit agent
vision to size 1 to increase difficulty and promote communication among the agents. Each experiment
employs curriculum learning to facilitate training. More specifically, <italic>p</italic><sub>arrive</sub> is maintained at its initial
value for the first 250 epochs, after which it is gradually increased between epochs 250 and 1250 to its
final value of 0.3, where it is maintained until the end, as shown in Table 1.</p>
        <p>For validation, we employ 1000 epochs under a difficulty setting analogous to training, featuring two
one-way lanes and an identical vision range. The difficulty increase through <italic>p</italic><sub>arrive</sub> is proportionally
aligned with the training phase to focus on each model’s ability to cope in the presence of communication
noise. Specifically, we schedule <italic>p</italic><sub>arrive</sub> so that the initial 125 epochs are easy, the next 500 epochs are harder,
and the final 375 epochs are the hardest. This scaling preserves the ratio of difficulty progression
used during training, with the Easy phase representing 12.5% of the total epochs, the Harder phase 50%, and
the Hardest phase 37.5%. These settings are designed to isolate the impact of communication noise
on performance by maintaining a consistent difficulty increase; any observed performance
degradation can therefore be attributed to the introduction of noise rather than to the difficulty levels.</p>
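        <p>The curriculum and its rescaling for validation can be sketched as follows. The linear ramp and the initial arrival probability of 0.1 are assumptions made for illustration (the text only says the value is "gradually increased" from an initial value), and the function names are hypothetical.</p>
        <preformat>
```python
def p_arrive_at(epoch, hold_until, ramp_until, p_initial, p_final):
    """Hold p_initial, ramp linearly to p_final, then hold p_final."""
    if hold_until >= epoch:
        return p_initial
    if epoch >= ramp_until:
        return p_final
    fraction = (epoch - hold_until) / (ramp_until - hold_until)
    return p_initial + fraction * (p_final - p_initial)


def scale_boundaries(boundaries, source_epochs, target_epochs):
    """Rescale epoch boundaries so each phase keeps its share of the run."""
    return [b * target_epochs // source_epochs for b in boundaries]


TRAIN_EPOCHS, TEST_EPOCHS = 2000, 1000
train_bounds = [250, 1250]          # hold until 250, ramp until 1250 (from the text)
test_bounds = scale_boundaries(train_bounds, TRAIN_EPOCHS, TEST_EPOCHS)

phase_lengths = [
    test_bounds[0],
    test_bounds[1] - test_bounds[0],
    TEST_EPOCHS - test_bounds[1],
]
phase_shares = [100 * n / TEST_EPOCHS for n in phase_lengths]

# Hypothetical initial value of 0.1; the final value of 0.3 is from the text.
midway = p_arrive_at(750, 250, 1250, p_initial=0.1, p_final=0.3)
```
        </preformat>
        <p>Rescaling the training boundaries (250, 1250) from 2000 to 1000 epochs gives boundaries at 125 and 625, that is, phases of 125, 500 and 375 epochs, which matches the 12.5%, 50% and 37.5% shares stated above.</p>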
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Noise Models</title>
        <p>As discussed in Section 2.2, messages can become corrupted by noise, lost, jumbled in content or delayed.
We address the first two cases, where noise is introduced into the communication tensor, altering message
values, and where entire messages or parts of them are lost. Noise is applied to the communication
tensor, and the effects on model success rates are measured against noiseless communication. The
specific parameters and probabilities associated with noise and message drops are detailed in Table 2,
which provides an overview of the different noise models and their impact on communication.
• Gaussian noise is added to the signal with a mean of μ and a controlled intensity level represented
by the standard deviation σ, denoted as N(μ = 0, σ). Noise intensity is regulated by adjusting σ,
which affects the spread or ‘width’ of the noise injected into the signal. This type of noise is used
to simulate noise that comes from natural sources, such as thermal noise or interference.
• Uniform Noise, also known as random valued noise, distributes noise uniformly across a given
range, creating a uniform distribution of noise values. The noise distribution is generated within
the range [low, high]. This noise model is used to simulate random events,
such as random bit errors in a communication system.
• Partial Message Drop simulates the scenario where parts of the messages are lost during
transmission with probability <italic>p</italic><sub>partial</sub>. A mask is generated using a Bernoulli distribution with a
probability equal to <italic>p</italic><sub>partial</sub> for each element of the communication tensor, c. The mask is then
applied to the communication tensor, resulting in a partial loss of message parts.
• Whole Message Drop simulates the condition where all messages to one or more agents are
dropped entirely with probability <italic>p</italic><sub>whole</sub>. A mask is created with probabilities equal to <italic>p</italic><sub>whole</sub> for
each agent and applied to the communication tensor c. This results in the potential loss of whole
messages to an agent or agents.
• Combined Partial and Total Message Loss simulates the scenario where both partial and total
message drops may occur, denoted by <italic>p</italic><sub>both</sub>, which is the product of <italic>p</italic><sub>whole</sub> and <italic>p</italic><sub>partial</sub>:
<italic>p</italic><sub>both</sub> = <italic>p</italic><sub>whole</sub> × <italic>p</italic><sub>partial</sub>
We test with a wide range of noise intensities and message loss frequencies, as shown in Table 2,
to evaluate and compare the impact of each noise type on task performance and model robustness.
Disturbances are tested independently, meaning that Gaussian noise, uniform noise and message loss
are not introduced simultaneously.</p>
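        <p>A minimal NumPy sketch of the four basic disturbances is given below. It assumes the communication tensor has shape (agents, message size), that uniform noise is additive, and that dropped entries are zeroed; the function names are hypothetical and this is not the authors' implementation.</p>
        <preformat>
```python
import numpy as np


def add_gaussian_noise(c, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return c + rng.normal(loc=0.0, scale=sigma, size=c.shape)


def add_uniform_noise(c, low, high, rng):
    """Add noise drawn uniformly from [low, high) (additive use is assumed)."""
    return c + rng.uniform(low, high, size=c.shape)


def partial_message_drop(c, p_partial, rng):
    """Zero each element of c independently with probability p_partial."""
    keep = rng.random(c.shape) >= p_partial
    return c * keep


def whole_message_drop(c, p_whole, rng):
    """Zero the entire message row of each agent with probability p_whole."""
    keep = rng.random((c.shape[0], 1)) >= p_whole
    return c * keep


rng = np.random.default_rng(0)
c = rng.standard_normal((5, 128))   # hypothetical: 5 agents, 128-dim messages

noisy = add_gaussian_noise(c, sigma=0.3, rng=rng)
partly_lost = partial_message_drop(c, p_partial=0.4, rng=rng)
fully_lost = whole_message_drop(c, p_whole=0.3, rng=rng)
```
        </preformat>
        <p>Under the product rule stated above, a whole-drop probability of 0.3 and a partial-drop probability of 0.4 give a combined probability of about 0.12.</p>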
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Analysis</title>
      <p>In this section, we first evaluate the training performance of CommNet [18], TarMac [22], IC3Net [24],
and GA-Comm [28] in the traffic junction task under ideal communication conditions. Subsequently,
model performance is benchmarked in scenarios with and without the presence of unreliable
communication. We investigate the models’ ability to overcome noise and identify which models are more
robust to various communication disturbances. Finally, we analyse the effect of different noise types on
task success, particularly focusing on the comparative impact of each noise type on model performance.</p>
      <sec id="sec-4-1">
        <title>4.1. Training and Validation</title>
        <p>Training and testing on the traffic junction task at the first level of difficulty and with a maximum of
five agents produced high average success rates for most of the models. Notably, TARMAC shows some
performance degradation after epoch 250, the point at which difficulty progressively increases. Figure
2a illustrates the success rates across the first 1000 epochs of training.</p>
        <p>Figure 2: (a) Success rates during training for the traffic junction task. (b) Success rates during testing assuming reliable
communication.</p>
        <p>High average success rates were also maintained throughout the testing phase including during
intervals of increased difficulty. From epochs 125 to 625 there is a gradual increase in vehicle add
rates, and from epoch 625 until the end, vehicle add rates reach their maximum. The narrow range
between minimum and maximum success rates throughout all models during testing indicates consistent
performance across epochs. CommNet and GA-Comm showed the best performance in both training
and testing. In contrast, TARMAC exhibits the lowest average success rate in training and the largest
range between minimum and maximum in both training and testing, see Figure 2b. This variation
suggests that TARMAC might be more sensitive to changes in <italic>p</italic><sub>arrive</sub> or that more epochs are required
to stabilise its performance, which will be explored in future work. The success rate values obtained
from testing the trained models with reliable communication will serve as a benchmark to quantify the
impact of communication disturbances. This will help identify each model’s ability to generalise in the
presence of unreliable communication.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Unreliable Communication</title>
        <p>Following the evaluation of model performance in ideal communication conditions, we introduce noise
and message drops of various intensities (as detailed in Table 2) into the communication tensor. These
values, designed to simulate low to high levels of interference, were chosen to provide a balance between
realistic noise levels and challenging the model’s ability to adapt. Beginning with CommNet, which
previously demonstrated the highest average success rate in testing without noise, we identify the
level of performance degradation caused by the different types of unreliable communication and their
comparative impact on model performance. This analysis will assess the model’s ability to generalise
under adverse communication conditions, providing a basis for comparing success rate degradation
among all models tested.</p>
        <p>Testing CommNet with varying intensities of Gaussian noise revealed an increasing, yet modest,
performance degradation as noise intensity rises, see Table 3. At a noise intensity of 0.3, the success
rate slightly dips, indicating a minor degradation of 0.04%. Increasing noise to 0.5 leads to a more
noticeable 0.1% degradation and at 0.8 the success rate further decreases, reflecting a 0.33% degradation
and highlighting a non-linear relationship with increasing noise intensity. This is an expected behaviour
as higher noise intensities can have a compounding effect leading to more significant performance
degradation. These results further highlight the resilience of CommNet to Gaussian noise, maintaining
high average success rates even at increased noise levels. Uniform noise presents a different impact
pattern. The degradation caused by uniform noise is consistently less than that of Gaussian noise at
equivalent intensities. This suggests that CommNet is more robust to uniform noise even at higher
intensities, possibly due to its uniformity which allows the model to better adjust, whereas Gaussian
noise introduces more uncertainty.</p>
        <p>Message loss significantly impacts performance, shown in Figure 3a. Partial message drops result in
a degradation substantially higher compared to uniform noise at close intensity, with smaller increases
in intensity leading to a significant rise in performance degradation. Whole message loss is even more
detrimental. These results indicate that CommNet is significantly more sensitive to message drops than
noise, with whole message loss being particularly detrimental, suggesting that message integrity is
more crucial for model success given that even small intensity can lead to a considerable performance
decline. Co-occurring partial and whole message loss at a combined probability of 0.12 resulted in a
performance degradation comparable to a 0.3 probability of whole message loss, suggesting that even
a relatively low combined probability can have a substantial impact. A larger combined drop (at 0.42
probability) causes a degradation higher than any individual noise or message drop scenario. This
indicates that the compound effect of partial and whole message loss presents a significant challenge
to model performance. Overall, CommNet remains robust, maintaining high success rates even with
high-intensity Gaussian and uniform noise introduced to the communication signal.</p>
        <p>Under reliable communication conditions, CommNet and IC3Net perform comparably. With the
introduction of noise, IC3Net’s performance shows a slightly steeper decline, with the most notable
difference in the case of combined message loss, which led to a less pronounced decrease compared to
CommNet, see Table 4. IC3Net’s hard attention, allowing agents to decide whether to communicate,
possibly makes the model more resilient to ‘empty’ message tensors (masked by zeros in both hard
attention and message drops). Similar to CommNet, IC3Net is generally robust to noise but highly
vulnerable to information loss.</p>
        <p>GA-Comm also demonstrates resilience to noise, showing only slight performance losses. However,
with combined message loss, GA-Comm’s performance significantly drops, see Figure 3b. This drastic
decrease contrasts with CommNet and IC3Net which maintain higher success rates under the same
conditions. TARMAC reports a lower success rate under ideal communication conditions; however,
degradation from noise and message drops is less pronounced. Even in combined message loss scenarios
the degradation is almost negligible. TARMAC’s communication mechanism architecture possibly
makes it robust against all the tested noise types, maintaining consistent performance where other
models exhibit more significant losses.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Noise Comparative Impact</title>
        <p>Comparing the impact of noise across models (see Figure 3b) reveals distinct patterns of robustness
and vulnerability. Gaussian noise, with its inherent unpredictability, has a gradual and slightly more
pronounced impact on performance than uniform noise as intensity increases. Models possibly adjust
better to uniform noise because of its uniformity, which leads to slightly lower degradation than
Gaussian noise at the same intensity (with the exception of IC3Net at 0.5 intensity). However, both noise
types cause only mild degradation in all models (Table 4). Message loss presents a different picture.
Partial message loss at a medium intensity of 0.4 causes a smaller degradation than whole
message loss at a slightly lower intensity of 0.3, which led to significant performance drops. Increasing
the combined message loss intensity from 0.12 to 0.42 (a 250% increase) resulted in a disproportionate
increase in degradation. For instance, in CommNet, degradation rose from 4.87% to 8.095%, an increase
of approximately 66%. The relative increase in degradation is therefore only about 26.5% of the
relative increase in intensity. These observations suggest that the impact of a disturbance
depends more on its specific characteristics than on a simple linear progression with intensity. TARMAC’s
resilience (0.013% degradation) further indicates that architecture also plays a crucial role in how a
model responds to different noise types and intensities.</p>
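        <p>The CommNet figures quoted above can be checked with a short calculation. The sketch below uses only the values stated in the text; the helper name relative_increase is hypothetical and introduced purely for illustration.</p>
        <preformat preformat-type="code">
```python
def relative_increase(old, new):
    """Return the relative change from old to new as a fraction of old."""
    return (new - old) / old


# Values quoted in the text for CommNet under combined message loss.
intensity_low, intensity_high = 0.12, 0.42
degradation_low, degradation_high = 4.87, 8.095  # percent degradation

intensity_growth = relative_increase(intensity_low, intensity_high)
degradation_growth = relative_increase(degradation_low, degradation_high)

# How much of the relative growth in intensity shows up as growth in degradation.
growth_ratio = degradation_growth / intensity_growth

print(f"intensity increase:   {intensity_growth:.1%}")
print(f"degradation increase: {degradation_growth:.1%}")
print(f"ratio of the two:     {growth_ratio:.1%}")
```
        </preformat>
        <p>With these figures the ratio is roughly a quarter, so for CommNet the degradation grew more slowly than the combined drop intensity over this range. Two data points cannot establish the shape of the relationship, only that it is not proportional here.</p>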
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This work explored the impact of different types of unreliable communication, namely noise introduced
to the communication tensor and message loss, on the performance of emergent communication models
in a non-signalised intersection environment. The results indicate that models are able to generalise to
both Gaussian and uniform noise at low to medium intensities, suggesting that ‘simple’ noise can be
filtered out by the inherent mechanisms of reinforcement learning. Message loss had a disproportionately
greater impact on success rate than noise that alters the conveyed information. The
relationship between disturbance intensity and performance degradation does not appear to be linear,
especially considering the disproportionate increase in degradation with whole and combined message
drops. Different disturbances and intensities affect model performance to varying degrees, indicating
complex dynamics that are likely further influenced by the specific model architecture and task
difficulty.</p>
      <p>Future work will explore the influence of training duration and investigate the effects of
message jumbling and delays, covering the medium and hard levels of the intersection environment. This
will help us, first, to identify the extent to which additional training stabilises performance, especially in
the presence of noise, and second, to understand the compound impact of increased task complexity and
communication disruptions. Given the detrimental effects of reduced message integrity on task success,
delays and jumbling could present similar challenges and potentially cause even greater performance
degradation.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Acknowledgements</title>
      <p>This work was supported by the Science Foundation Ireland Centre for Research Training in Advanced
Networks for Sustainable Societies (ADVANCE CRT), under the Grant number 18/CRT/6222, and by the
Science Foundation Ireland CONNECT Research centre Phase 2, Grant 13/RC/2077_P2. For the purpose
of Open Access, the author has applied a CC BY public copyright license to any Author Accepted
Manuscript version arising from this submission.</p>
    </sec>
  </body>
  <back>
  </back>
</article>