<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title/>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1080/00207721.2022</article-id>
      <title-group>
        <article-title>Introducing Multiagent Systems to AV Visual Perception Sub-tasks: A proof-of-concept implementation for bounding-box improvement</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alaa Daoud</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Corentin Bunel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maxime Guériau</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INSA Rouen Normandie, Univ Rouen Normandie</institution>
          ,
          <addr-line>Univ Le Havre Normandie, Normandie Univ, LITIS UR 4108, F-76000 Rouen</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>2</volume>
      <fpage>2800</fpage>
      <lpage>2813</lpage>
      <abstract>
        <p>Object detection is a pivotal task in computer vision, with applications spanning from autonomous driving to surveillance. Traditionally, methods like Non-Maximum Suppression (NMS) and its variants have been used to refine object detection outputs. Fusing predictions from different object detection models, using confidence scores to average overlapping bounding boxes, has demonstrated superior performance over such conventional methods. In this work, we employ multiple agents, each responsible for handling an individual bounding box, to generate an improved fused prediction. This agent-based adaptation aims to leverage decentralized processing to potentially increase the system's efficiency and adaptability across various object detection scenarios, particularly in autonomous vehicle (AV) perception systems. We develop two distinct behaviors for the bounding box agents: one replicating the state-of-the-art Weighted Boxes Fusion (WBF) method in a decentralized manner, and the other introducing competitive behavior in which agents interact based on Intersection over Union (IoU) and confidence values. We evaluate the performance of our approach using the COCO dataset, demonstrating the flexibility and potential of integrating MAS into object detection workflows, including those for AV perception systems.</p>
      </abstract>
      <kwd-group>
        <kwd>Autonomous driving</kwd>
        <kwd>perception systems</kwd>
        <kwd>bounding-box refinement</kwd>
        <kwd>Multiagent Systems</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Autonomous vehicles and intelligent transport systems depend on advanced computer vision
technologies, with object detection being a critical task. It enables vehicles to recognize and respond
to surrounding objects effectively; region proposal identifies potential object locations early in
the detection process, which is crucial for timely responses in autonomous driving [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Traditional techniques
like Non-Maximum Suppression (NMS) often struggle to balance precision and recall, especially in
dynamic environments. Solovyev et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] introduced Weighted Boxes Fusion (WBF), using confidence
scores to average overlapping bounding boxes from multiple detection models, demonstrating superior
performance over conventional methods.
      </p>
      <p>
        The integration of Multiagent Systems (MAS) into object detection workflows offers new perspectives
to address traditional challenges [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. MAS provide dynamic and adaptable decision-making capabilities,
enhancing autonomous vehicles’ ability to handle complex, unpredictable road conditions. MAS support
distributed and adaptive processing [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], complementing modern GPU-based computer vision. By
distributing tasks across agents, MAS enhances system flexibility and resilience, especially in dynamic
environments like autonomous driving or video surveillance [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6, 7, 8</xref>
        ]. Each agent manages a subset of
tasks, improving resilience to errors [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ].
      </p>
      <p>
        MAS can adjust strategies based on scenarios [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ], adapting parameters for bounding box fusion
based on context, scene complexity, or environmental changes [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Agents operate independently on
different hardware, optimizing processing power and allowing system scalability [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Local decisions
are combined through a global process, enhancing accuracy [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. MAS can continually learn from
their environment and from the interactions between agents [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. This potential for adaptive learning
motivates the agentification approach, as it opens the possibility for future enhancements. By achieving
an agentified method, we can later integrate learning capabilities to further improve adaptability
and performance in evolving object detection scenarios. Agent-based approaches are well-suited for
integrating diverse models and data sources [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], which is essential for the ensemble approaches used
in WBF, where predictions from different models are combined.
      </p>
      <p>Agentifying output refinement methods such as NMS or WBF involves assigning individual agents to
handle specific bounding boxes, enabling dynamic adjustment based on individual box characteristics.
This approach addresses real-time processing requirements and improves scalability and fault tolerance
by decentralizing the decision-making process [18, 19, 20]. In this work, we aim to design and implement
a proof-of-concept system integrating MASs into the process of improving bounding boxes in object
detection. We develop two behaviors for the bounding box agents: one replicating the
state-of-the-art Weighted Boxes Fusion (WBF) method in a decentralized manner, and the other introducing
competitive behavior in which agents interact based on Intersection over Union (IoU) and confidence
values. Finally, we will deploy the system and assess its performance using the COCO dataset, testing
various levels of competition and cooperation between agents. The remainder of this paper is structured
as follows: Section 2 presents the related work in object detection, multiagent systems, and their
integration. Section 3 details the system architecture and design principles of the AWBF method.
Section 4 describes the implementation of the proof-of-concept system and the development of agent
behaviors. Section 5 discusses the experimental setup and evaluation using the COCO dataset. Section
6 presents the results and analysis of the experimental evaluation. Section 7 concludes the paper with a
summary of findings and future work directions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>Object detection is a fundamental task in computer vision, critical for intelligent transportation systems
(ITS) applications such as autonomous driving, traffic monitoring, and surveillance. The integration
of MASs into object detection workflows offers significant potential to enhance system efficiency,
robustness, and adaptability. This section reviews recent advancements in object detection techniques
relevant to ITS, with a focus on bounding box fusion and the role of MASs.</p>
      <sec id="sec-2-1">
        <title>2.1. Bounding Box Improvement Techniques</title>
        <p>Wang et al. (2019) presented the Multi-Stage Complementary Fusion (MCF3D) network, an
end-to-end architecture for 3D object detection that integrates LiDAR and RGB data. This network employs
attention mechanisms and prior knowledge to achieve state-of-the-art results, enhancing the detection
accuracy necessary for autonomous driving applications [21].</p>
        <p>Qian et al. (2020) proposed an improved object detection method for remote sensing images,
incorporating a novel bounding box regression loss and a multi-level features fusion module. This method
enhances the precision of object localization, which is crucial for applications such as traffic monitoring
and vehicle detection [22].</p>
        <p>
          Solovyev et al. (2021) introduced the Weighted Boxes Fusion (WBF) method, which averages
overlapping bounding boxes from multiple detection models using confidence scores. This approach
demonstrated superior performance over traditional techniques, highlighting the effectiveness of fusion
methods in improving object detection accuracy [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. This method is particularly relevant for ITS
applications where robust and accurate object detection is paramount for safety and efficiency.
        </p>
        <p>Zhang and Wu (2022) proposed a multi-view feature adaptive fusion framework that enhances 3D
object detection by optimizing depth feature fusion and loss function design. This approach improves
the regression accuracy of bounding boxes, which is essential for ITS applications where precise object
localization is critical [23].</p>
        <p>Liu et al. (2023) developed the Fusion network by Box Matching (FBMNet) for multi-modal 3D
detection. This method aligns features at the bounding box level, providing stability in challenging
scenarios such as asynchronous sensors and misaligned sensor placements, common issues in ITS
applications [24].</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Multiagent Systems in Object Detection</title>
        <p>Introducing MAS to object detection and computer vision systems is not a new idea. For example,
Choksuriwong et al. (2005) developed a MAS for image understanding that localizes and recognizes
objects using a distributed system implemented on a cluster computer. This approach leverages invariant
features and supervised classification to improve object recognition accuracy, which is vital for traffic
monitoring systems [25]. However, the application of MAS in these areas has decreased recently with
the advancements in machine learning techniques and their improved performance in handling object
detection tasks. Despite this shift, some researchers have continued to explore the potential of MAS in
object detection through various approaches.</p>
        <p>Jiang et al. (2019) proposed a multi-agent deep reinforcement learning (MADRL) approach for
multi-object tracking, using YOLO V3 for object detection and Independent Q-Learners (IQL) for policy
learning. This method achieves better performance in precision, accuracy, and robustness compared
to other state-of-the-art methods, which is particularly beneficial for real-time traffic monitoring and
surveillance [26].</p>
        <p>Fekir and Benamrane (2015) introduced a MAS for boundary detection and object tracking using
active contours and multi-resolution treatment. This system improves object boundary detection and
tracking through cooperative agent strategies, enhancing the accuracy and efficiency of ITS applications
such as vehicle and pedestrian tracking [27].</p>
        <p>Vincent et al. (2022) described a MAS using stereovision for perception, enabling agents to collaborate
and enhance scene understanding through graph matching algorithms. This approach addresses
challenges in correspondence identification and non-covisibility, critical for ITS applications such as
multi-vehicle coordination and traffic management [28].</p>
        <p>Mahmoudi et al. (2013) utilized a MAS for object recognition in complex urban areas, leveraging
WorldView-2 satellite imagery and digital surface models. This system improves object recognition
accuracy through knowledge-based reasoning and cooperative agent capabilities, essential for urban
traffic monitoring and smart city applications [29].</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Positioning Our Proposal</title>
        <p>In light of the existing work, our proposal integrates the strengths of bounding box fusion
techniques and MASs to develop a more robust and efficient object detection framework tailored for ITS
applications. Our approach leverages the distributed processing capabilities of MASs to enhance the
accuracy and scalability of bounding box fusion methods. By incorporating advanced fusion techniques
and adaptive agent strategies, our system aims to address the limitations of existing methods, such as
handling dynamic environments and improving detection precision. Our contributions include:
1. A multi-agent based framework for bounding box improvement that dynamically assigns agents
to handle specific bounding boxes.
2. Integration of advanced fusion techniques, such as Weighted Boxes Fusion (WBF) and
Non-Maximum Suppression (NMS), to enhance detection accuracy in various ITS scenarios.
3. Implementation of adaptive agent strategies and behaviors that allow switching between cooperation
and competition dynamically, ensuring robust performance in real-world ITS applications.</p>
        <p>To the best of our knowledge, we are among the first to propose integrating MAS into specific
computer vision sub-tasks such as bounding box filtering and fusion. This approach aims to exploit the
advantages of MAS to enhance the accuracy, efficiency, and adaptability of object detection systems in
ITS applications.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. System Architecture for AWBF</title>
      <p>The agentified Weighted Boxes Fusion (WBF) system integrates multiple agents, each handling individual
bounding boxes from various detection models. This Multiagent System (MAS) enhances the efficiency
and accuracy of bounding box fusion through distributed processing and specialized agent roles. A
central blackboard mechanism facilitates information sharing and coordination.</p>
      <p>MAS offers decentralized decision-making and dynamic adaptability, enhancing resilience and
flexibility in handling varied scenarios [30].</p>
      <p>(Figure 1: System architecture. Three detection models post their proposals (BB1-BB5) to a shared blackboard; a model-specific agent interfaces with each model, bounding-box agents (BB Agent 1-5, e.g. bus, car, people) each handle one box, and a coordinator agent and a data processing agent complete the loop.)</p>
      <p>The blackboard acts as a global communication hub, simplifying data interactions and providing a
robust framework for synchronized information exchange among agents [31]. Specific agent roles, from
bounding box processing to model-specific adaptations, optimize performance and accuracy by
leveraging domain-specific knowledge and algorithms [32]. Feedback mechanisms enable dynamic adaptation,
allowing agents to adjust strategies based on performance and data input changes, maintaining high
accuracy in dynamic environments [33].</p>
      <sec id="sec-3-1">
        <title>3.1. Overview of Agent Roles</title>
        <p>The system includes various agents with specific responsibilities:
• Bounding-Box Agents: Handle individual bounding boxes, analyze, and propose fusions with
overlapping boxes.
• Model-specific Agents: Manage bounding boxes from specific detection models. Can be seen as
interfaces between the MAS and CV models. Each agent extracts bounding box proposals from
its respective model to ensure compatibility and apply model-specific behaviors and adjustments.
• Coordinator Agents: Oversee the fusion process, resolve conflicts between bounding-box agents,
and make final decisions on merged bounding boxes.</p>
        <p>• Data Processing Agents: Optionally handle image preprocessing and result postprocessing.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Blackboard Information Sharing System</title>
        <p>The blackboard serves as a shared information space for communication and data exchange:
• Data Repository: Central storage for bounding box data, including coordinates, confidence
scores, and model origins.
• Communication Medium: Allows agents to read and write data, maintaining system modularity
and scalability.
• Coordination Facilitator: Coordinates actions among agents, especially in resolving fusion
conflicts.</p>
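As a minimal sketch of these three roles (the class and field names are illustrative, not taken from the authors' code), the blackboard can be modeled as a lock-protected repository of box entries that agents post to and read from:

```python
from dataclasses import dataclass
from threading import Lock

@dataclass
class BoxEntry:
    box: list        # normalized corners [x1, y1, x2, y2]
    score: float     # detection confidence
    label: int       # object class
    model_id: str    # detection model the box originated from

class Blackboard:
    """Central data repository and communication medium for the agents."""
    def __init__(self):
        self._lock = Lock()
        self._entries = []

    def post(self, entry):
        # Agents write their findings (boxes, fusion proposals) here.
        with self._lock:
            self._entries.append(entry)

    def read(self, label=None):
        # Agents typically only consult boxes that share their class label.
        with self._lock:
            return [e for e in self._entries
                    if label is None or e.label == label]
```

The lock is one simple way to keep concurrent agent reads and writes consistent; coordination beyond that (conflict resolution) is left to the coordinator agents.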
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Processing Workflow</title>
        <p>The workflow involves:
1. Data Input and Distribution: Model-specific Agents extract bounding boxes from different
models and transfer them to Data processing agents who prepare and distribute bounding boxes
to bounding box agents.
2. Bounding Box Analysis and Posting: Bounding box agents analyze and post findings to the
blackboard, proposing fusions.
3. Review and Fusion: Coordinator agents review and finalize fusion decisions, consulting
model-specific agents as needed.
4. Final Processing and Output: Data processing agents optimize the fused bounding boxes for
downstream applications.
5. Feedback and Adaptation: The system adapts to changes by updating agent strategies or
parameters based on performance metrics.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Implementation and Development of Agent Behaviors</title>
      <p>Our implementation is developed in Python, utilizing the existing WBF codebase to maintain consistency
in data processing. By forking the original WBF repository, we leverage its libraries, utilities,
and functions, ensuring the exact same data-processing logic. This allowed us to focus
on integrating MAS features without reinventing the core bounding box fusion logic. We built an
ad-hoc MAS framework tailored to our requirements. The agents interact via a shared blackboard for
communication, and the system supports both centralized and decentralized processing. Following
the system architecture described in the previous section, one can implement diverse behaviors and
solution methods by changing only the decision logic of the bounding box agent and adjusting the
coordination mechanism.</p>
      <p>Model-specific Agents interact with existing object detection models (e.g., YOLO, Faster R-CNN) to
receive and process bounding boxes. Model-specific agents convert detection outputs into a standard
format used by the system.</p>
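A hypothetical adapter illustrates this conversion, assuming a YOLO-style tuple of (center-x, center-y, width, height, score, label) in pixels; the tuple layout and function name are our assumptions for illustration, and real model outputs differ:

```python
def to_standard(detections, img_w, img_h):
    """Convert model-specific detections into the system's shared format:
    normalized [x1, y1, x2, y2] corners plus score and label.

    `detections` is assumed to hold (cx, cy, w, h, score, label) tuples in
    pixels, a YOLO-like convention; a real adapter is written per model."""
    out = []
    for cx, cy, w, h, score, label in detections:
        out.append({
            "box": [(cx - w / 2) / img_w, (cy - h / 2) / img_h,
                    (cx + w / 2) / img_w, (cy + h / 2) / img_h],
            "score": float(score),
            "label": int(label),
        })
    return out
```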
      <p>The main implementation challenges included managing computation time, communication overhead,
and integrating the MAS with existing computer vision models. Future improvements will focus on
developing a variety of agent behaviors with parameters optimized for computational and accuracy
performance, enhancing the system's scalability, robustness, and adaptability, and exploring further
integration with advanced machine learning models and real-world deployment scenarios.</p>
      <sec id="sec-4-1">
        <title>4.1. Agent Behaviors</title>
        <p>We developed two distinct agent behaviors to demonstrate the versatility and potential of MAS in object
detection. The first behavior replicates the Weighted Boxes Fusion (WBF) in a decentralized manner,
while the second introduces a competitive interaction among agents.</p>
        <sec id="sec-4-1-1">
          <title>4.1.1. Behavior 1: Decentralized Weighted Boxes Fusion (WBF)</title>
          <p>This behavior replicates the state-of-the-art WBF method in a decentralized manner. Each agent
processes bounding boxes independently and posts results to a shared blackboard (see Algorithm 1),
improving system resilience. The agent determines overlapping boxes as candidates for fusion by
calculating the Intersection over Union (IoU). Boxes are considered for fusion if their IoU exceeds a
certain threshold. The IoU calculation is given by:
IoU(B1, B2) = area(B1 ∩ B2) / area(B1 ∪ B2)
Algorithm 1: Decentralized WBF Algorithm (AWBF) - BoundingBox Agent behavior
1: Input: Bounding boxes B, confidence scores C, labels L
2: Read B, C, L from the blackboard
3: Determine overlapping boxes as candidates for fusion
4: Filter candidates using the IoU metric
5: Apply WBF on the final set of candidates
6: Post fused boxes to the blackboard
7: Output: Fused bounding boxes
Algorithm 2: Competitive Interaction Algorithm - BoundingBox Agent behavior (Section 4.1.2)</p>
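A self-contained sketch of this behavior follows; the helper names are ours (the actual implementation reuses the WBF repository's utilities), and we read the WBF fusion step as the confidence-weighted average of coordinates with the mean confidence as the fused score:

```python
def iou(a, b):
    # a, b: [x1, y1, x2, y2]; returns area(a ∩ b) / area(a ∪ b).
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def wbf_fuse(boxes, scores):
    # Confidence-weighted average of coordinates; fused score is the mean.
    total = sum(scores)
    fused = [sum(s * b[i] for b, s in zip(boxes, scores)) / total
             for i in range(4)]
    return fused, sum(scores) / len(scores)

def agent_step(own, candidates, iou_thr=0.55):
    """One bounding-box agent pass (Algorithm 1, sketched): read candidates
    from the blackboard, keep same-label overlaps above the IoU threshold,
    fuse them with WBF, and return the fused box to post back."""
    cluster = [own] + [c for c in candidates
                       if c["label"] == own["label"]
                       and iou(own["box"], c["box"]) > iou_thr]
    boxes = [c["box"] for c in cluster]
    scores = [c["score"] for c in cluster]
    box, score = wbf_fuse(boxes, scores)
    return {"box": box, "score": score, "label": own["label"]}
```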
        </sec>
        <sec id="sec-4-1-2">
          <title>4.1.2. Behavior 2: Competitive Interaction</title>
          <p>In this behavior, agents compete based on a new metric that we introduce as Intersection over Box
area (IoB). The IoBs for two boxes, A and B, are calculated separately as:
IoB_A = area(A ∩ B) / area(A),  IoB_B = area(A ∩ B) / area(B)
An agent attacks or cooperates with other agents depending on the calculated strengths. The strength of an
attack of A on B and of the defense of B against A are defined by:</p>
          <p>attack(A, B) = confidence_A × IoB_A,  defense(B, A) = confidence_B × IoB_B
The decision rule is based on the difference between the attack and defense strengths and a decision
threshold (T): Result(A, B) = attack(A, B) − defense(B, A)</p>
          <p>Winner determination:
Result(A, B) &gt; T : A wins and B is removed
Result(A, B) &lt; −T : B wins and A is removed
otherwise : A and B fuse using WBF</p>
          <p>The last case represents the region where agents can cooperate, as their strengths are close. The threshold T
determines the level of cooperativeness, and thus the value (1 − T) refers to the competitiveness
level. T = 1 indicates a fully cooperative setting, reverting to AWBF. Conversely, T = 0 indicates
full competitiveness unless the attack and defense strengths are equal.</p>
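The decision rule can be sketched directly; the function names and the 'a'/'b'/'fuse' return codes are ours, standing in for remove-B, remove-A, and WBF fusion respectively:

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def intersection(a, b):
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return ix * iy

def duel(a, b, t):
    """Competitive interaction between box agents a and b, each a dict with
    'box' ([x1, y1, x2, y2]) and 'score'. Returns which agent survives,
    or 'fuse' when the strengths are within the threshold t."""
    inter = intersection(a["box"], b["box"])
    attack = a["score"] * inter / area(a["box"])    # confidence_A x IoB_A
    defense = b["score"] * inter / area(b["box"])   # confidence_B x IoB_B
    result = attack - defense
    if result > t:
        return "a"      # a wins, b is removed
    if result < -t:
        return "b"      # b wins, a is removed
    return "fuse"       # close strengths: cooperate via WBF
```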
        </sec>
        <sec id="sec-4-1-3">
          <title>Illustrative Example: Bounding Box Fusion for Bicycle Detection</title>
          <p>To illustrate the AWBF and competitive behavior in action, we consider the detection of bicycles in
image "138639" from the COCO dataset using two ad-hoc models (see Figure 2: (a) Model 1 Proposals,
(b) Model 2 Proposals, (c) Combined Proposals, (d) AWBF Result, (e) Competitive Results).
The bounding boxes from the two models are as follows:
• Model 1: {‘box’: [0.192, 0.752, 0.312, 0.873], ‘score’: 0.9, ‘label’: 2}
• Model 2: {‘box’: [0.203, 0.756, 0.314, 0.875], ‘score’: 0.5, ‘label’: 2}
The Intersection over Union for the bounding boxes is: IoU = area of overlap / area of union ≈ 0.85
If we apply the WBF method:</p>
          <p>WBF_box = (0.9 × box_1 + 0.5 × box_2) / (0.9 + 0.5) ≈ [0.196, 0.753, 0.313, 0.874]
Under the competitive behavior, Result(1, 2) = attack(1, 2) − defense(2, 1) ≈ 0.31. Given T = 0.3, agent 1 wins
and box 2 is removed, as Result &gt; T. Increasing the T value to 0.4, the conflict result falls into the
cooperation range, and we thus return to AWBF.</p>
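The example's numbers can be verified in a few lines; we read WBF's fused box as the confidence-weighted average of the coordinates, and the competitive outcome follows the attack/defense rule of Section 4.1.2:

```python
b1, s1 = [0.192, 0.752, 0.312, 0.873], 0.9   # Model 1 proposal
b2, s2 = [0.203, 0.756, 0.314, 0.875], 0.5   # Model 2 proposal

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

# Overlap rectangle and IoU
ix = min(b1[2], b2[2]) - max(b1[0], b2[0])
iy = min(b1[3], b2[3]) - max(b1[1], b2[1])
inter = ix * iy
iou = inter / (area(b1) + area(b2) - inter)             # close to 0.85

# WBF fused box: confidence-weighted average of coordinates
wbf = [(s1 * b1[i] + s2 * b2[i]) / (s1 + s2) for i in range(4)]

# Competitive outcome: attack of agent 1 minus defense of agent 2
result = s1 * inter / area(b1) - s2 * inter / area(b2)  # close to 0.31
```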
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Evaluation</title>
      <p>To evaluate our methods, we conducted extensive tests using the COCO dataset. Our primary objective
was to demonstrate the proof of concept without optimizing parameters or model weights beyond the
default settings provided by the WBF code. Therefore, our results focus on comparing performance
metrics rather than optimizing for maximum accuracy.</p>
      <p>The evaluation metrics used were those recommended and specified by the COCO dataset,
namely Average Precision and Average Recall:
- Average Precision (AP): reveals the model's ability to make accurate positive predictions. It is
calculated at different Intersection over Union (IoU) thresholds.
• AP@[IoU=0.50:0.95]: the average AP over ten IoU thresholds (0.50 to 0.95 with a step size of 0.05).
• AP[small]: AP for small objects (area &lt; 32² pixels).
• AP[medium]: AP for medium-sized objects (32² ≤ area ≤ 96² pixels).
• AP[large]: AP for large objects (area ≥ 96² pixels).
- Average Recall (AR): measures sensitivity by focusing on the model's ability to correctly
identify positive samples from the entire pool of positive instances.
• AR@[IoU=0.50:0.95]: the average recall over ten IoU thresholds (0.50 to 0.95 with a step size of 0.05).
• AR@0.50: the average recall at an IoU threshold of 0.5.
• AR@0.75: the average recall at an IoU threshold of 0.75.
• AR[small]: AR for small objects.
• AR[medium]: AR for medium objects.
• AR[large]: AR for large objects.</p>
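For intuition only, a toy version of the recall side of these metrics can be written down; the reported numbers come from the standard COCO evaluation tools, not from this sketch, and the greedy matching here is a simplification:

```python
def iou(a, b):
    # a, b: [x1, y1, x2, y2]; returns area(a ∩ b) / area(a ∪ b).
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def recall_at(gts, dets, thr):
    # Greedy matching: a ground-truth box counts as found if some unused
    # detection overlaps it with IoU >= thr.
    used, hits = set(), 0
    for g in gts:
        for i, d in enumerate(dets):
            if i not in used and iou(g, d) >= thr:
                used.add(i)
                hits += 1
                break
    return hits / len(gts) if gts else 0.0

# Averaging recall over the ten COCO thresholds 0.50:0.05:0.95:
THRESHOLDS = [0.5 + 0.05 * k for k in range(10)]

def avg_recall(gts, dets):
    return sum(recall_at(gts, dets, t) for t in THRESHOLDS) / len(THRESHOLDS)
```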
      <p>The results from test runs over the entire dataset are shown in Table 1. Notably, the results
demonstrate that AWBF outperforms the individual models whose outputs were used in the fusion process.
Although our results did not surpass those of the centralized WBF, they were mostly comparable.</p>
      <p>Specifically, our approach performed better than WBF on AP-small and AR@10 at an IoU of 0.5.</p>
      <p>(Table 1: Benchmarking on the COCO dataset; results are broken down by box size: small, medium, large.)</p>
      <sec id="sec-5-1">
        <title>5.1. WBF Performance on Different Dataset Sizes</title>
        <p>When running experiments on subsets of the COCO dataset with different sizes, we observed that the
centralized WBF method performs better with larger datasets but shows reduced efficiency on smaller
datasets (see Figure 3). This can be explained by several factors:
• Law of Large Numbers: As the dataset size increases, the averaging process tends to smooth
out random errors and fluctuations, leading to improved performance for the centralized WBF
method.
• Error Compensation: With more data points, errors in individual detections can compensate
for each other, leading to more accurate fusion results.
• Increased Data Redundancy: Larger datasets contain more redundant information, reinforcing
correct detections and diluting the impact of incorrect ones.</p>
        <p>(Figure 3: Average precision and average recall of AWBF vs. WBF as a function of the number of COCO examples (0-600). Panels show AP and AR at IoU thresholds [0.5:0.95], 0.5, and 0.75, broken down by box area: all sizes, small, medium, and large.)</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. AWBF Performance on Different Dataset Sizes</title>
        <p>The AWBF method exhibited more robust and stable performance across varying dataset sizes, as
Figure 3 shows, which can be attributed to distributed processing and redundancy: each agent
processes bounding boxes independently and in parallel, reducing the impact of individual errors
and improving overall robustness. In addition, each agent's localized decision-making can lead to better
performance, especially on smaller datasets, where individual detections have a higher impact.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Competitive Behavior Experiments</title>
        <p>We also evaluated the competitive behavior using the default parameters. While the initial results did
not match the quality of WBF, they demonstrated the potential for diverse agent behaviors. We conducted
multiple tests on a subset of 500 COCO images, varying the competitiveness level by adjusting the value
of T, which controls the level of cooperativeness (1 − competitiveness). We observed an interesting
trend: the results showed that increasing competitiveness improved precision (see Figure 4).</p>
        <p>AP increased with higher competitiveness, likely because competition removed lower-scoring boxes,
reducing false positives and improving precision. Recall remained stable as even with fewer boxes,
suficient accurate boxes were retained.</p>
        <p>To summarize, these evaluations demonstrated the flexibility and potential of integrating MAS into
object detection workflows. While the competitive agent behavior requires further optimization, the
initial results validate our approach and open avenues for more sophisticated multi-agent behaviors in
future work.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this work, we presented a proof-of-concept implementation integrating MAS into object detection
workflows, specifically focusing on improving bounding box predictions, an essential component of
autonomous vehicle perception systems. By leveraging the decentralized processing capabilities of MAS,
we demonstrated two distinct agent behaviors: Decentralized (Agentified) Weighted Boxes Fusion and
Competitive Interaction. Our experimental evaluation using the COCO dataset showed that while the
decentralized WBF approach performed comparably to the centralized WBF, the competitive behavior
illustrated the potential for further optimization and innovation in agent-based object detection systems.
The results indicate that MAS can offer robust and adaptable solutions for object detection tasks,
particularly in dynamic and complex environments like AV perception and intelligent transportation systems.
Future work will focus on refining agent behaviors, enhancing system scalability, and integrating more
advanced machine learning models to further improve performance and adaptability for AV applications.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work is funded by the French National Research Agency as part of the MultiTrans project under
reference ANR-21-CE23-0032.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] <string-name><given-names>S.</given-names> <surname>Ren</surname></string-name>, <string-name><given-names>K.</given-names> <surname>He</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Girshick</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Sun</surname></string-name>, <article-title>Faster R-CNN: Towards real-time object detection with region proposal networks</article-title>, <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source> <volume>39</volume> (<year>2017</year>) <fpage>1137</fpage>-<lpage>1149</lpage>. doi:10.1109/TPAMI.2016.2577031.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Solovyev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gabruseva</surname>
          </string-name>
          ,
          <article-title>Weighted boxes fusion: Ensembling boxes from diferent object detection models</article-title>
          ,
          <source>Image and Vision Computing</source>
          <volume>107</volume>
          (
          <year>2021</year>
          )
          <article-title>104117</article-title>
          . URL: https://www. sciencedirect.com/science/article/pii/S0262885621000226. doi:https://doi.org/10.1016/j. imavis.
          <year>2021</year>
          .
          <volume>104117</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>O.</given-names>
            <surname>Vinyals</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Babuschkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mathieu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jaderberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>Czarnecki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dudzik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Georgiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Powell</surname>
          </string-name>
          , et al.,
          <article-title>Alphastar: Mastering the real-time strategy game starcraft ii</article-title>
          ,
          <source>DeepMind blog 2</source>
          (
          <year>2019</year>
          )
          <fpage>20</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.-Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          , H.-Y. Cheng, C.-
          <string-name>
            <surname>C. Yu</surname>
          </string-name>
          ,
          <article-title>Real-time object detection and tracking for unmanned aerial vehicles based on convolutional neural networks</article-title>
          ,
          <source>Electronics</source>
          <volume>12</volume>
          (
          <year>2023</year>
          )
          <article-title>4928</article-title>
          . doi:
          <volume>10</volume>
          .3390/ electronics12244928.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G.</given-names>
            <surname>Weiß</surname>
          </string-name>
          ,
          <article-title>A multiagent perspective of parallel and distributed machine learning</article-title>
          ,
          <source>in: Proceedings of the Second International Conference on Autonomous Agents, AGENTS '98</source>
          ,
          <string-name>
            <surname>Association</surname>
          </string-name>
          for Computing Machinery, New York, NY, USA,
          <year>1998</year>
          , p.
          <fpage>226</fpage>
          -
          <lpage>230</lpage>
          . URL: https://doi.org/10.1145/280765. 280806. doi:
          <volume>10</volume>
          .1145/280765.280806.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J. E. D.</given-names>
            <surname>Narinx</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A</given-names>
            <surname>Real-Time Multi-Camera Depth Estimation ASIC with Custom On-Chip Embedded</surname>
          </string-name>
          <string-name>
            <given-names>DRAM</given-names>
            ,
            <surname>Ph</surname>
          </string-name>
          .D. thesis, École Polytechnique Fédérale de Lausanne, Lausanne,
          <year>2019</year>
          . URL: http: //infoscience.epfl.ch/record/273168. doi: https://doi.org/10.5075/epfl-thesis-
          <volume>7163</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lyu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gong</surname>
          </string-name>
          ,
          <article-title>Vehicle video surveillance system based on image fusion and parallel computing</article-title>
          ,
          <source>International Journal of Circuit Theory and Applications</source>
          <volume>49</volume>
          (
          <year>2020</year>
          )
          <fpage>1532</fpage>
          -
          <lpage>1547</lpage>
          . doi:
          <volume>10</volume>
          .1002/cta.2907.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Akil</surname>
          </string-name>
          , L. Perroton,
          <article-title>Special issue on parallel computing for real-time image processing</article-title>
          ,
          <source>Journal of Real-Time Image Processing</source>
          <volume>6</volume>
          (
          <year>2011</year>
          )
          <fpage>1</fpage>
          -
          <lpage>2</lpage>
          . doi:
          <volume>10</volume>
          .1007/s11554-011-0192-y.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ma</surname>
          </string-name>
          , K. Xue,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Distributed multiagent control approach for multitarget tracking</article-title>
          ,
          <source>Mathematical Problems in Engineering</source>
          <year>2015</year>
          (
          <year>2015</year>
          )
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          . doi:
          <volume>10</volume>
          .1155/
          <year>2015</year>
          /903682.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Caballero-Águila</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hermoso-Carazo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Linares-Pérez</surname>
          </string-name>
          ,
          <article-title>Networked distributed fusion estimation under uncertain outputs with random transmission delays, packet losses and multi-packet processing</article-title>
          ,
          <source>Signal Processing 156</source>
          (
          <year>2019</year>
          )
          <fpage>71</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>Dynamic task allocation method for heterogenous multiagent system in uncertain scenarios of agricultural field operation</article-title>
          ,
          <source>Journal of Physics: Conference Series</source>
          <volume>2356</volume>
          (
          <year>2022</year>
          ). doi:
          <volume>10</volume>
          .1088/
          <fpage>1742</fpage>
          -6596/2356/1/012049.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <article-title>Neural networks-based distributed adaptive control of nonlinear multiagent systems</article-title>
          ,
          <source>IEEE Transactions on Neural Networks and Learning Systems</source>
          <volume>31</volume>
          (
          <year>2020</year>
          )
          <fpage>1010</fpage>
          -
          <lpage>1021</lpage>
          . doi:
          <volume>10</volume>
          .1109/TNNLS.
          <year>2019</year>
          .
          <volume>2915376</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. H.</given-names>
            <surname>Soon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. K.</given-names>
            <surname>Quah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Tandianus</surname>
          </string-name>
          ,
          <article-title>Multi-view fusion-based 3d object detection for robot indoor scene perception</article-title>
          ,
          <source>Sensors</source>
          (Basel, Switzerland)
          <volume>19</volume>
          (
          <year>2019</year>
          ). doi:
          <volume>10</volume>
          .3390/s19194092.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>R. de Azevedo</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Cintuglu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <string-name>
            <surname>Mohammed</surname>
          </string-name>
          ,
          <article-title>Multiagent-based optimal microgrid control using fully distributed difusion strategy</article-title>
          ,
          <source>IEEE Transactions on Smart Grid</source>
          <volume>8</volume>
          (
          <year>2017</year>
          )
          <fpage>1997</fpage>
          -
          <lpage>2008</lpage>
          . doi:
          <volume>10</volume>
          .1109/TSG.
          <year>2016</year>
          .
          <volume>2587741</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <article-title>Behavior prediction for unmanned driving based on dual fusions of feature and decision</article-title>
          ,
          <source>IEEE Transactions on Intelligent Transportation Systems</source>
          <volume>22</volume>
          (
          <year>2021</year>
          )
          <fpage>3687</fpage>
          -
          <lpage>3696</lpage>
          . doi:
          <volume>10</volume>
          .1109/TITS.
          <year>2020</year>
          .
          <volume>3037926</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>N.</given-names>
            <surname>Kaabouch</surname>
          </string-name>
          , W.-C. Hu,
          <string-name>
            <given-names>T.</given-names>
            <surname>Niemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kommeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-P.</given-names>
            <surname>Hameri</surname>
          </string-name>
          ,
          <article-title>Improving energy-eficiency of scientific computing clusters, in: Energy-Aware Systems and Networking for Sustainable Initiatives</article-title>
          ,
          <source>IGI Global</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          . doi:
          <volume>10</volume>
          .4018/978-1-
          <fpage>4666</fpage>
          -1842-
          <volume>8</volume>
          .
          <year>ch001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>P.</given-names>
            <surname>Korus</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          <article-title>Huang, Multi-scale analysis strategies in prnu-based tampering localization</article-title>
          ,
          <source>IEEE Transactions on Information Forensics and Security</source>
          <volume>12</volume>
          (
          <year>2017</year>
          )
          <fpage>809</fpage>
          -
          <lpage>824</lpage>
          . doi:
          <volume>10</volume>
          .1109/TIFS.
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>