<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Introducing Multiagent Systems to AV Visual Perception Sub-tasks: A proof-of-concept implementation for bounding-box improvement</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Alaa</forename><surname>Daoud</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">INSA Rouen Normandie</orgName>
								<orgName type="institution" key="instit2">Univ Rouen Normandie</orgName>
								<orgName type="institution" key="instit3">Univ Le Havre Normandie</orgName>
								<orgName type="institution" key="instit4">Normandie Univ</orgName>
								<address>
									<addrLine>LITIS UR 4108</addrLine>
									<postCode>F-76000</postCode>
									<settlement>Rouen</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Corentin</forename><surname>Bunel</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">INSA Rouen Normandie</orgName>
								<orgName type="institution" key="instit2">Univ Rouen Normandie</orgName>
								<orgName type="institution" key="instit3">Univ Le Havre Normandie</orgName>
								<orgName type="institution" key="instit4">Normandie Univ</orgName>
								<address>
									<addrLine>LITIS UR 4108</addrLine>
									<postCode>F-76000</postCode>
									<settlement>Rouen</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Maxime</forename><surname>Guériau</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">INSA Rouen Normandie</orgName>
								<orgName type="institution" key="instit2">Univ Rouen Normandie</orgName>
								<orgName type="institution" key="instit3">Univ Le Havre Normandie</orgName>
								<orgName type="institution" key="instit4">Normandie Univ</orgName>
								<address>
									<addrLine>LITIS UR 4108</addrLine>
									<postCode>F-76000</postCode>
									<settlement>Rouen</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Introducing Multiagent Systems to AV Visual Perception Sub-tasks: A proof-of-concept implementation for bounding-box improvement</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">27A196BD2326C1D729C924DA5526DEE7</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Autonomous driving</term>
					<term>perception systems</term>
					<term>bounding-box refinement</term>
					<term>Multiagent Systems</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Object detection is a pivotal task in computer vision, with applications spanning from autonomous driving to surveillance. Traditionally, methods like Non-Maximum Suppression (NMS) and its variants have been used to refine object detection outputs. Fusing predictions from multiple detection models, using confidence scores to average overlapping bounding boxes, has demonstrated superior performance over conventional methods. In this work, we employ multiple agents, each responsible for handling an individual bounding box, to generate an improved fused prediction. This agent-based adaptation leverages decentralized processing to potentially increase the system's efficiency and adaptability across various object detection scenarios, particularly in autonomous vehicle (AV) perception systems. We develop two distinct behaviors for the bounding box agents: one replicating the state-of-the-art Weighted Boxes Fusion (WBF) method in a decentralized manner, and the other introducing competitive behavior where agents interact based on Intersection over Union (IoU) and confidence values. We evaluate the performance of our approach on the COCO dataset, demonstrating the flexibility and potential of integrating MAS into object detection workflows, including those for AV perception systems.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Autonomous vehicles and intelligent transport systems depend on advanced computer vision technologies, with object detection being a critical task. This enables vehicles to recognize and respond to surrounding objects effectively, with region proposal identifying potential object locations early in the detection process, crucial for timely responses in autonomous driving <ref type="bibr" target="#b0">[1]</ref>. Traditional techniques like Non-Maximum Suppression (NMS) often struggle to balance precision and recall, especially in dynamic environments. Solovyev et al. <ref type="bibr" target="#b1">[2]</ref> introduced Weighted Boxes Fusion (WBF), using confidence scores to average overlapping bounding boxes from multiple detection models, demonstrating superior performance over conventional methods.</p><p>The integration of Multiagent Systems (MAS) into object detection workflows offers new perspectives to address traditional challenges <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4]</ref>. MAS provide dynamic and adaptable decision-making capabilities, enhancing autonomous vehicles' ability to handle complex, unpredictable road conditions. MAS support distributed and adaptive processing <ref type="bibr" target="#b4">[5]</ref>, complementing modern GPU-based computer vision. By distributing tasks across agents, MAS enhances system flexibility and resilience, especially in dynamic environments like autonomous driving or video surveillance <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8]</ref>. 
Each agent manages a subset of tasks, improving resilience to errors <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b9">10]</ref>.</p><p>MAS can adjust strategies based on scenarios <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12]</ref>, adapting parameters for bounding box fusion based on context, scene complexity, or environmental changes <ref type="bibr" target="#b12">[13]</ref>. Agents operate independently on different hardware, optimizing processing power and allowing system scalability <ref type="bibr" target="#b13">[14]</ref>. Local decisions are combined through a global process, enhancing accuracy <ref type="bibr" target="#b14">[15]</ref>. MAS can continually learn from their environment and from the interactions between agents <ref type="bibr" target="#b15">[16]</ref>.<note>Paper presented at the 13th International Workshop on Agents in Traffic and Transportation (ATT 2024), held in conjunction with ECAI 2024. Contact: alaa.daoud@insa-rouen.fr (A. Daoud), corentin.bunel@insa-rouen.fr (C. Bunel), maxime.gueriau@insa-rouen.fr (M. Guériau). ORCID: 0000-0002-3640-327X (A. Daoud), 0000-0001-6637-9795 (C. Bunel), 0000-0002-8742-6623 (M. Guériau).</note> This potential for adaptive learning motivates the agentification approach, as it opens the possibility for future enhancements. By achieving an agentified method, we can later integrate learning capabilities to further improve adaptability and performance in evolving object detection scenarios. Agent-based approaches are well-suited for integrating diverse models and data sources <ref type="bibr" target="#b16">[17]</ref>, which is essential for the ensemble approaches used in WBF, where predictions from different models are combined.</p><p>Agentifying output refinement methods such as NMS or WBF involves assigning individual agents to handle specific bounding boxes, enabling dynamic adjustment based on individual box characteristics. 
This approach addresses real-time processing requirements and improves scalability and fault tolerance by decentralizing the decision-making process <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b18">19,</ref><ref type="bibr" target="#b19">20]</ref>. In this work, we aim to design and implement a proof-of-concept system integrating MASs into the process of improving bounding boxes in object detection. We develop two behaviors for the bounding box agents: one replicating the state-of-the-art Weighted Boxes Fusion (WBF) method in a decentralized manner, and the other introducing competitive behavior where agents interact based on Intersection over Union (IoU) and confidence values. Finally, we deploy the system and assess its performance using the COCO dataset, testing various levels of competition and cooperation between agents. The remainder of this paper is structured as follows: Section 2 presents the related work in object detection, multiagent systems, and their integration. Section 3 details the system architecture and design principles of the AWBF method. Section 4 describes the implementation of the proof-of-concept system and the development of agent behaviors. Section 5 presents the experimental evaluation on the COCO dataset and discusses the results. Section 6 concludes the paper with a summary of findings and future work directions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Object detection is a fundamental task in computer vision, critical for intelligent transportation systems (ITS) applications such as autonomous driving, traffic monitoring, and surveillance. The integration of MASs into object detection workflows offers significant potential to enhance system efficiency, robustness, and adaptability. This section reviews recent advancements in object detection techniques relevant to ITS, with a focus on bounding box fusion and the role of MASs. <ref type="bibr">Wang et al. (2019)</ref> presented the Multi-Stage Complementary Fusion (MCF3D) network, an end-to-end architecture for 3D object detection that integrates LiDAR and RGB data. This network employs attention mechanisms and prior knowledge to achieve state-of-the-art results, enhancing the detection accuracy necessary for autonomous driving applications <ref type="bibr" target="#b20">[21]</ref>. <ref type="bibr" target="#b21">Qian et al. (2020)</ref> proposed an improved object detection method for remote sensing images, incorporating a novel bounding box regression loss and a multi-level features fusion module. This method enhances the precision of object localization, which is crucial for applications such as traffic monitoring and vehicle detection <ref type="bibr" target="#b21">[22]</ref>. <ref type="bibr" target="#b1">Solovyev et al. (2021)</ref> introduced the Weighted Boxes Fusion (WBF) method, which averages overlapping bounding boxes from multiple detection models using confidence scores. This approach demonstrated superior performance over traditional techniques, highlighting the effectiveness of fusion methods in improving object detection accuracy <ref type="bibr" target="#b1">[2]</ref>. This method is particularly relevant for ITS applications, where robust and accurate object detection is paramount for safety and efficiency.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Bounding Box Improvement Techniques</head><p>Zhang and Wu (2022) proposed a multi-view feature adaptive fusion framework that enhances 3D object detection by optimizing depth feature fusion and loss function design. This approach improves the regression accuracy of bounding boxes, which is essential for ITS applications where precise object localization is critical <ref type="bibr" target="#b22">[23]</ref>. <ref type="bibr" target="#b23">Liu et al. (2023)</ref> developed the Fusion network by Box Matching (FBMNet) for multi-modal 3D detection. This method aligns features at the bounding box level, providing stability in challenging scenarios such as asynchronous sensors and misaligned sensor placements, common issues in ITS applications <ref type="bibr" target="#b23">[24]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Multiagent Systems in Object Detection</head><p>Introducing MAS to object detection and computer vision systems is not a new idea. For example, <ref type="bibr" target="#b24">Choksuriwong et al. (2005)</ref> developed a MAS for image understanding that localizes and recognizes objects using a distributed system implemented on a cluster computer. This approach leverages invariant features and supervised classification to improve object recognition accuracy, which is vital for traffic monitoring systems <ref type="bibr" target="#b24">[25]</ref>. However, the application of MAS in these areas has decreased recently with the advancements in machine learning techniques and their improved performance in handling object detection tasks. Despite this shift, some researchers have continued to explore the potential of MAS in object detection through various approaches.</p><p>Jiang et al. (2019) proposed a multi-agent deep reinforcement learning (MADRL) approach for multi-object tracking, using YOLO V3 for object detection and Independent Q-Learners (IQL) for policy learning. This method achieves better performance in precision, accuracy, and robustness compared to other state-of-the-art methods, which is particularly beneficial for real-time traffic monitoring and surveillance <ref type="bibr" target="#b25">[26]</ref>.</p><p>Fekir and Benamrane (2015) introduced a MAS for boundary detection and object tracking using active contours and multi-resolution treatment. This system improves object boundary detection and tracking through cooperative agent strategies, enhancing the accuracy and efficiency of ITS applications such as vehicle and pedestrian tracking <ref type="bibr" target="#b26">[27]</ref>.</p><p>Vincent et al. 
(2022) described a MAS using stereovision for perception, enabling agents to collaborate and enhance scene understanding through graph matching algorithms. This approach addresses challenges in correspondence identification and non-covisibility, critical for ITS applications such as multi-vehicle coordination and traffic management <ref type="bibr" target="#b27">[28]</ref>.</p><p>Mahmoudi et al. (2013) utilized a MAS for object recognition in complex urban areas, leveraging WorldView-2 satellite imagery and digital surface models. This system improves object recognition accuracy through knowledge-based reasoning and cooperative agent capabilities, essential for urban traffic monitoring and smart city applications <ref type="bibr" target="#b28">[29]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Positioning Our Proposal</head><p>In light of the existing work, our proposal aims to integrate the strengths of both bounding box fusion techniques and MASs to develop a more robust and efficient object detection framework tailored for ITS applications. Our approach leverages the distributed processing capabilities of MASs to enhance the accuracy and scalability of bounding box fusion methods. By incorporating advanced fusion techniques and adaptive agent strategies, our system aims to address the limitations of existing methods, such as handling dynamic environments and improving detection precision. Our contributions include:</p><p>1. A multi-agent based framework for bounding box improvement that dynamically assigns agents to handle specific bounding boxes. 2. Integration of advanced fusion techniques, such as Weighted Boxes Fusion (WBF) and Non-Maximum Suppression (NMS), to enhance detection accuracy in various ITS scenarios. 3. Implementation of adaptive agent strategies and behaviors that switch dynamically between cooperation and competition, ensuring robust performance in real-world ITS applications.</p><p>To the best of our knowledge, we are among the first to propose integrating MAS into specific computer vision sub-tasks such as bounding box filtering and fusion. This approach aims to exploit the advantages of MAS to enhance the accuracy, efficiency, and adaptability of object detection systems in ITS applications.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">System Architecture for AWBF</head><p>The agentified Weighted Boxes Fusion (WBF) system integrates multiple agents, each handling individual bounding boxes from various detection models. This Multiagent System (MAS) enhances the efficiency and accuracy of bounding box fusion through distributed processing and specialized agent roles. A central blackboard mechanism facilitates information sharing and coordination.</p><p>MAS offers decentralized decision-making and dynamic adaptability, enhancing resilience and flexibility in handling varied scenarios <ref type="bibr" target="#b29">[30]</ref>. The blackboard acts as a global communication hub, simplifying data interactions and providing a robust framework for synchronized information exchange among agents <ref type="bibr" target="#b30">[31]</ref>. Specific agent roles, from bounding box processing to model-specific adaptations, optimize performance and accuracy by leveraging domain-specific knowledge and algorithms <ref type="bibr" target="#b31">[32]</ref>. Feedback mechanisms enable dynamic adaptation, allowing agents to adjust strategies based on performance and data input changes, maintaining high accuracy in dynamic environments <ref type="bibr" target="#b32">[33]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Overview of Agent Roles</head><p>The system includes various agents with specific responsibilities:</p><p>• Bounding-Box Agents: Handle individual bounding boxes, analyze, and propose fusions with overlapping boxes. • Model-specific Agents: Manage bounding boxes from specific detection models. Can be seen as interfaces between the MAS and CV models. Each agent extracts bounding box proposals from its respective model to ensure compatibility and apply model-specific behaviors and adjustments. • Coordinator Agents: Oversee the fusion process, resolve conflicts between bounding-box agents, and make final decisions on merged bounding boxes. • Data Processing Agents: Optionally handle image preprocessing and result postprocessing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Blackboard Information Sharing System</head><p>The blackboard serves as a shared information space for communication and data exchange:</p><p>• Data Repository: Central storage for bounding box data, including coordinates, confidence scores, and model origins. • Communication Medium: Allows agents to read and write data, maintaining system modularity and scalability. • Coordination Facilitator: Coordinates actions among agents, especially in resolving fusion conflicts.</p></div>
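To make the blackboard's three roles concrete, here is a minimal Python sketch (the paper's implementation language). The class and field names (`Blackboard`, `BoxEntry`) are our own illustration, not the authors' code; a lock stands in for the synchronized access the coordination role requires.

```python
from dataclasses import dataclass
from threading import Lock

@dataclass
class BoxEntry:
    box: list     # [x1, y1, x2, y2], normalized coordinates
    score: float  # detection confidence
    label: int    # class id
    model: str    # originating detection model (data repository role)

class Blackboard:
    """Shared information space: agents post and read bounding-box entries."""
    def __init__(self):
        self._entries = []
        self._lock = Lock()  # synchronized exchange among concurrent agents

    def post(self, entry: BoxEntry) -> None:
        with self._lock:
            self._entries.append(entry)

    def read_all(self) -> list:
        with self._lock:
            return list(self._entries)  # snapshot, keeps agents decoupled
```

Agents never address each other directly; they only read and write the shared store, which is what keeps the system modular and lets agent behaviors be swapped without changing the communication layer.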
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Processing Workflow</head><p>The workflow involves: </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Implementation and Development of Agent Behaviors</head><p>Our implementation is developed in Python, utilizing the existing WBF codebase to maintain consistency in data processing. By forking the original WBF repository, we leverage its libraries, utilities, and functions, ensuring the use of the exact same data-processing logic. This allowed us to focus on integrating MAS features without reinventing the core bounding box fusion logic. We built an ad-hoc MAS framework tailored to our requirements. The agents interact via a shared blackboard for communication, and the system supports both centralized and decentralized processing. Following the system architecture described in the previous section, one can implement diverse behaviors and a variety of solution methods by changing only the decision logic of the bounding box agent and adjusting the coordination mechanism. Model-specific Agents interact with existing object detection models (e.g., YOLO, Faster R-CNN) to receive and process bounding boxes, converting detection outputs into a standard format used by the system.</p><p>The main implementation challenges included managing computation time, communication overhead, and integrating the MAS with existing computer vision models. Future improvements will focus on developing a variety of agent behaviors with parameters optimized for computational and accuracy performance, enhancing the system's scalability, robustness, and adaptability, and exploring further integration with advanced machine learning models and real-world deployment scenarios.</p></div>
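The format-conversion role of a Model-specific Agent can be sketched as follows. This is our own illustration, not the authors' code: `ModelAgent` and `yolo_to_corners` are hypothetical names, and the YOLO-style `(cx, cy, w, h)` input format is an assumed example of one model's raw output.

```python
def yolo_to_corners(box):
    """Convert a YOLO-style (cx, cy, w, h) box, normalized to [0, 1],
    into the corner format [x1, y1, x2, y2] used system-wide."""
    cx, cy, w, h = box
    return [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]

class ModelAgent:
    """Interface between one detection model and the MAS: takes the model's
    raw detections and emits them in the system's standard format."""
    def __init__(self, name, converter):
        self.name = name            # model origin, recorded with each box
        self.converter = converter  # model-specific coordinate conversion

    def standardize(self, raw_detections):
        return [{"box": self.converter(d["box"]),
                 "score": d["score"],
                 "label": d["label"],
                 "model": self.name}
                for d in raw_detections]

# usage: a (cx, cy, w, h) detection becomes a corner-format entry
agent = ModelAgent("yolo", yolo_to_corners)
out = agent.standardize([{"box": [0.25, 0.25, 0.1, 0.1],
                          "score": 0.8, "label": 2}])
```

Supporting a new detector then only requires supplying a new converter function, leaving the rest of the MAS untouched.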
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Agent Behaviors</head><p>We developed two distinct agent behaviors to demonstrate the versatility and potential of MAS in object detection. The first behavior replicates the Weighted Boxes Fusion (WBF) in a decentralized manner, while the second introduces a competitive interaction among agents.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.1.">Behavior 1: Decentralized Weighted Boxes Fusion (WBF)</head><p>This behavior replicates the state-of-the-art WBF method in a decentralized manner. Each agent processes bounding boxes independently and posts results to a shared blackboard (see Algorithm 1), improving system resilience. The agent determines overlapping boxes as candidates for fusion by calculating the Intersection over Union (IoU). Boxes are considered for fusion if their IoU exceeds a certain threshold. The IoU calculation is given by:</p><formula xml:id="formula_0">𝐼𝑜𝑈(𝐵𝑜𝑥 1 , 𝐵𝑜𝑥 2 ) = 𝑎𝑟𝑒𝑎(𝐵𝑜𝑥 1 ∩ 𝐵𝑜𝑥 2 ) / 𝑎𝑟𝑒𝑎(𝐵𝑜𝑥 1 ∪ 𝐵𝑜𝑥 2 )</formula><p>In Algorithm 1 (Decentralized WBF Algorithm, AWBF; BoundingBox Agent behavior), each agent reads the bounding boxes 𝐵, confidence scores 𝑆, and labels 𝐿 from the blackboard, determines overlapping boxes as candidates for fusion, filters the candidates using the IoU metric, applies WBF on the final set of candidates, and posts the fused boxes back to the blackboard.</p></div>
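The IoU test and the confidence-weighted fusion step at the heart of this behavior can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation (which reuses the original WBF codebase); in particular, taking the mean confidence as the fused score is one simple choice we assume here for brevity.

```python
def iou(a, b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def wbf_fuse(boxes, scores):
    """Confidence-weighted average of a cluster of overlapping boxes."""
    total = sum(scores)
    fused = [sum(s * b[k] for b, s in zip(boxes, scores)) / total
             for k in range(4)]
    return fused, sum(scores) / len(scores)  # mean confidence as fused score
```

An agent would call `iou` against each box read from the blackboard, collect those above the threshold, and post the result of `wbf_fuse` back.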
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.2.">Behavior 2: Competitive Interaction</head><p>In this behavior, agents compete based on a new metric that we introduce as Intersection over Box area (IoB). The IoBs for two boxes, 𝐴 and 𝐵, are calculated separately as:</p><formula xml:id="formula_1">𝐼𝑜𝐵 𝐴|𝐵 = 𝑎𝑟𝑒𝑎(𝐴 ∩ 𝐵) / 𝑎𝑟𝑒𝑎(𝐴) , 𝐼𝑜𝐵 𝐵|𝐴 = 𝑎𝑟𝑒𝑎(𝐴 ∩ 𝐵) / 𝑎𝑟𝑒𝑎(𝐵)</formula><p>An agent attacks or cooperates with other agents depending on the calculated strengths. The strength of an attack of 𝐴 on 𝐵 and of the defense of 𝐵 against 𝐴 are defined by:</p><formula xml:id="formula_2">𝑆 attack (𝐴, 𝐵) = confidence 𝐴 × 𝐼𝑜𝐵 𝐵|𝐴 , 𝑆 defense (𝐵, 𝐴) = confidence 𝐵 × 𝐼𝑜𝐵 𝐴|𝐵</formula><p>The decision rule is based on the difference between attack and defense strengths and a decision threshold 𝑇.</p></div>
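The attack/defense decision rule can be sketched directly from these definitions. This is our own illustrative code, not the authors' implementation; the function names (`iob`, `duel`) are ours.

```python
def iob(a, b):
    """Intersection over the area of box a: area(a ∩ b) / area(a)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    return inter / ((a[2] - a[0]) * (a[3] - a[1]))

def duel(box_a, conf_a, box_b, conf_b, t):
    """Competitive decision rule with threshold t.
    Returns 'A' or 'B' (the other box is removed), or 'fuse'
    when strengths are close (the cooperation range)."""
    attack = conf_a * iob(box_b, box_a)   # S_attack(A, B) = conf_A * IoB_{B|A}
    defense = conf_b * iob(box_a, box_b)  # S_defense(B, A) = conf_B * IoB_{A|B}
    result = attack - defense
    if result > t:
        return "A"
    if result < -t:
        return "B"
    return "fuse"  # cooperate: fall back to WBF-style fusion
```

Raising `t` widens the cooperation band (more fusion), while `t` near zero makes almost every overlap a winner-takes-all duel.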
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Illustrative Example: Bounding Box Fusion for Bicycle Detection</head><p>To illustrate the AWBF and competitive behaviors in action, we consider the detection of bicycles in image "138639" from the COCO dataset using two ad-hoc models (see Figure <ref type="figure" target="#fig_3">2</ref>). The bounding boxes from the two models are as follows:</p><p>• For the competitive behavior, the attack and defense strengths rely on the calculation of the Intersection over Box values:</p><formula xml:id="formula_3">IoB 𝐵1|𝐵2 = area(𝐵1 ∩ 𝐵2) / area(𝐵1) ≈ 0.89 , IoB 𝐵2|𝐵1 = area(𝐵1 ∩ 𝐵2) / area(𝐵2) ≈ 0.91 ; Attack 𝐵1|𝐵2 = 0.9 ⋅ 0.91 ≈ 0.819 , Defense 𝐵2|𝐵1 = 0.5 ⋅ 0.89 ≈ 0.445 ; Result = Attack 𝐵1|𝐵2 − Defense 𝐵2|𝐵1 = 0.374</formula><p>Given 𝑇 = 0.3, agent 𝐵1 wins and 𝐵2 is removed, since Result &gt; 𝑇. Increasing the 𝑇 value to 0.4, the conflict result falls into the cooperation range and we revert to AWBF.</p></div>
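The example's numbers can be checked programmatically. This is a hedged recomputation from the two boxes and scores given in Figure 2's data: the IoU and the WBF-fused box reproduce the reported values, while our unrounded IoB-based strengths differ slightly from the rounded figures in the text but lead to the same outcome (𝐵1 wins at 𝑇 = 0.3).

```python
# The two bicycle detections from the illustrative example
b1, s1 = [0.192, 0.752, 0.312, 0.873], 0.9  # model 1
b2, s2 = [0.203, 0.756, 0.314, 0.875], 0.5  # model 2

def inter_area(a, b):
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return w * h

area1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
area2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
inter = inter_area(b1, b2)

iou_val = inter / (area1 + area2 - inter)  # ≈ 0.85, as in the text

# WBF: confidence-weighted average ≈ [0.196, 0.754, 0.313, 0.874]
fused = [(s1 * b1[k] + s2 * b2[k]) / (s1 + s2) for k in range(4)]

# Competitive behavior: positive result means B1 wins for T = 0.3
attack = s1 * (inter / area2)   # S_attack(B1, B2)
defense = s2 * (inter / area1)  # S_defense(B2, B1)
result = attack - defense
```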
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experimental Evaluation</head><p>To evaluate our methods, we conducted extensive tests using the COCO dataset. Our primary objective was to demonstrate the proof of concept without optimizing parameters or model weights beyond the default settings provided by the WBF code. Therefore, our results focus on comparing performance metrics rather than optimizing for maximum accuracy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Evaluation Metrics</head><p>The evaluation metrics used were those recommended and specified by the COCO dataset, namely Average Precision and Average Recall:</p><p>-Average Precision (AP) reveals the model's ability to make accurate positive predictions. It is calculated at different Intersection over Union (IoU) thresholds.</p><p>• AP@[IoU=0.50:0.95]: the average AP over ten IoU thresholds (0.50 to 0.95 with a step size of 0.05). • AP@0.50: the AP at an IoU threshold of 0.5.</p><p>• AP@0.75: the AP at an IoU threshold of 0.75.</p><p>• AP[small]: AP for small objects (area &lt; 32² pixels).</p><p>• AP[medium]: AP for medium-sized objects (32² ≤ area ≤ 96² pixels).</p><p>• AP[large]: AP for large objects (area ≥ 96² pixels).</p><p>-Average Recall (AR) measures sensitivity by focusing on the model's ability to correctly identify positive samples from the entire pool of positive instances.</p><p>• AR@[IoU=0.50:0.95]: the average recall over ten IoU thresholds (0.50 to 0.95 with a step size of 0.05). • AR@0.50: the average recall at an IoU threshold of 0.5.</p><p>• AR@0.75: the average recall at an IoU threshold of 0.75.</p><p>• AR[small]: AR for small objects.</p><p>• AR[medium]: AR for medium objects.</p><p>• AR[large]: AR for large objects.</p><p>The results from test runs over the entire dataset are shown in Table <ref type="table" target="#tab_0">1</ref>. Notably, the results demonstrate that AWBF outperforms the individual models whose outputs were used in the fusion process. Although our results did not surpass those of the centralized WBF, they were mostly comparable. Specifically, our approach performed better than WBF on AP-small and on AR@10 at an IoU of 0.5.</p></div>
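For intuition about how AP at a fixed IoU threshold is built from ranked detections, here is a deliberately simplified single-image, single-class sketch. It is our own illustration, not the evaluation code used for the reported numbers (those come from the official COCO tooling, which additionally averages over thresholds, classes, and area ranges).

```python
def iou(a, b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def average_precision(gts, dets, iou_thr=0.5):
    """All-point AP at one IoU threshold; dets are (box, score) pairs."""
    dets = sorted(dets, key=lambda d: -d[1])  # rank by confidence
    matched, tps = set(), []
    for box, _ in dets:
        best, best_iou = None, iou_thr
        for j, g in enumerate(gts):          # greedy match to unused GT
            v = iou(box, g)
            if j not in matched and v >= best_iou:
                best, best_iou = j, v
        tps.append(best is not None)
        if best is not None:
            matched.add(best)
    ap, tp, prev_rec = 0.0, 0, 0.0
    for k, is_tp in enumerate(tps, start=1):
        if is_tp:
            tp += 1
            rec = tp / len(gts)
            ap += (rec - prev_rec) * (tp / k)  # area under the P-R curve
            prev_rec = rec
    return ap
```

A high-confidence false positive lowers the precision of every later true positive, which is exactly the effect the competitive behavior exploits by removing low-quality boxes.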
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Centralized WBF Performance on Different Dataset Sizes</head><p>When running experiments on subsets of the COCO dataset with different sizes, we observed that the centralized WBF method performs better with larger datasets but shows reduced efficiency on smaller datasets (see Figure <ref type="figure" target="#fig_5">3</ref>). This can be explained by several factors:</p><p>• Law of Large Numbers: As the dataset size increases, the averaging process tends to smooth out random errors and fluctuations, leading to improved performance for the centralized WBF method. • Error Compensation: With more data points, errors in individual detections can compensate for each other, leading to more accurate fusion results. • Increased Data Redundancy: Larger datasets contain more redundant information, reinforcing correct detections and diluting the impact of incorrect ones.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">AWBF Performance on Different Dataset Sizes</head><p>The AWBF method exhibited more robust and stable performance across varying dataset sizes, as Figure <ref type="figure" target="#fig_5">3</ref> shows. This can be attributed to distributed processing and redundancy: each agent processes bounding boxes independently and in parallel, reducing the impact of individual errors and improving overall robustness. In addition, each agent's localized decision-making can lead to better performance, especially on smaller datasets where individual detections have a higher impact.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.">Competitive Behavior Experiments</head><p>We also evaluated the competitive behavior using the default parameters. While the initial results did not match the quality of WBF, they demonstrated the potential for diverse agent behaviors by adjusting the value of 𝑇, which controls the level of cooperativeness (with 1 − 𝑇 the corresponding competitiveness level). We conducted multiple tests on a subset of 500 COCO images, varying the competitiveness level, and observed an interesting trend: increasing competitiveness improved precision (see Figure <ref type="figure" target="#fig_6">4</ref>). AP increased with higher competitiveness, likely because competition removed lower-scoring boxes, reducing false positives and improving precision. Recall remained stable, as sufficient accurate boxes were retained even with fewer boxes overall.</p><p>To summarize, these evaluations demonstrated the flexibility and potential of integrating MAS into object detection workflows. While the competitive agent behavior requires further optimization, the initial results validate our approach and open avenues for more sophisticated multi-agent behaviors in future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>In this work, we presented a proof-of-concept implementation integrating MAS into object detection workflows, specifically focusing on improving bounding box predictions, an essential component of autonomous vehicle perception systems. By leveraging the decentralized processing capabilities of MAS, we demonstrated two distinct agent behaviors: Decentralized (Agentified) Weighted Boxes Fusion and Competitive Interaction. Our experimental evaluation using the COCO dataset showed that while the decentralized WBF approach performed comparably to the centralized WBF, the competitive behavior illustrated the potential for further optimization and innovation in agent-based object detection systems. The results indicate that MAS can offer robust and adaptable solutions for object detection tasks, particularly in dynamic and complex environments like AV perception and intelligent transportation systems. Future work will focus on refining agent behaviors, enhancing system scalability, and integrating more advanced machine learning models to further improve performance and adaptability for AV applications.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: AWBF Agents; colors correspond to the models that generated the initial bounding boxes</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>Decision rule with threshold 𝑇: Result(𝐴, 𝐵) = 𝑆 attack (𝐴, 𝐵) − 𝑆 defense (𝐵, 𝐴). If Result(𝐴, 𝐵) &gt; 𝑇: 𝐴 wins and 𝐵 is removed; if Result(𝐴, 𝐵) &lt; −𝑇: 𝐵 wins and 𝐴 is removed; otherwise: 𝐴 and 𝐵 fuse using WBF. The last case represents the range where agents can cooperate, as their strengths are close. The threshold 𝑇 determines the level of cooperativeness, and thus the value (1 − 𝑇) refers to the competitiveness level. 𝑇 = 1 indicates a fully cooperative setting, reverting to AWBF; conversely, 𝑇 = 0 indicates full competitiveness unless attack and defense strengths are equal.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Visualization of Bounding Box Proposals and Fusion Results on COCO#138639 (images are cropped to emphasize the area of interest)</figDesc><graphic coords="7,325.23,232.29,135.39,88.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Model 1 :</head><label>1</label><figDesc>{'box': [0.192, 0.752, 0.312, 0.873], 'score': 0.9, 'label': 2} • Model 2: {'box': [0.203, 0.756, 0.314, 0.875], 'score': 0.5, 'label': 2}. The Intersection over Union of the two bounding boxes is IoU = (area of overlap) / (area of union) ≈ 0.85. Applying the WBF method: WBF box = (0.9 • [0.192, 0.752, 0.312, 0.873] + 0.5 • [0.203, 0.756, 0.314, 0.875]) / (0.9 + 0.5) ≈ [0.196, 0.754, 0.313, 0.874], WBF score = (0.9 + 0.5) / 2 = 0.7</figDesc></figure>
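The fused box and score in the worked example above follow directly from a confidence-weighted average of coordinates; a minimal Python sketch (function names are ours, not from the paper's implementation) reproduces the computation:

```python
def iou(a, b):
    # intersection-over-union of two [x1, y1, x2, y2] boxes
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def wbf(boxes, scores):
    # coordinates are averaged weighted by confidence;
    # the fused score is the plain mean of the input scores
    total = sum(scores)
    fused = [sum(s * b[i] for b, s in zip(boxes, scores)) / total
             for i in range(4)]
    return fused, sum(scores) / len(scores)

b1 = [0.192, 0.752, 0.312, 0.873]  # Model 1, score 0.9
b2 = [0.203, 0.756, 0.314, 0.875]  # Model 2, score 0.5
fused_box, fused_score = wbf([b1, b2], [0.9, 0.5])
```

Running this yields IoU ≈ 0.85 and a fused box matching the ≈[0.196, 0.754, 0.313, 0.874] of the example, with fused score 0.7.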
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Evolution of the COCO evaluation metrics as a function of test-set size, evaluating the AWBF method on subsets of the COCO dataset</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Evolution of precision and recall as the competition level increases (i.e., as the cooperation threshold T decreases) in the agent behavior</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>1 .</head><label>1</label><figDesc>Data Input and Distribution: Model-specific agents extract bounding boxes from the different models and transfer them to data-processing agents, who prepare and distribute the bounding boxes to bounding-box agents. 2. Bounding Box Analysis and Posting: Bounding-box agents analyze their boxes and post findings to the blackboard, proposing fusions. 3. Review and Fusion: Coordinator agents review and finalize fusion decisions, consulting model-specific agents as needed. 4. Final Processing and Output: Data-processing agents optimize the fused bounding boxes for downstream applications. 5. Feedback and Adaptation: The system adapts to changes by updating agent strategies or parameters based on performance metrics.</figDesc><table /></figure>
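The workflow above revolves around a shared blackboard on which agents post and read bounding-box proposals. A minimal sketch of such a store follows; the class and method names are illustrative assumptions, not the paper's actual implementation:

```python
class Blackboard:
    """Minimal shared workspace: agents post bounding-box proposals
    and other agents read them back, optionally filtered by class label."""

    def __init__(self):
        self.entries = []

    def post(self, agent_id, box, score, label):
        # a model-specific or bounding-box agent publishes a proposal
        self.entries.append({"agent": agent_id, "box": box,
                             "score": score, "label": label})

    def read(self, label=None):
        # bounding-box or coordinator agents retrieve candidates;
        # filtering by label keeps fusion within one object class
        return [e for e in self.entries
                if label is None or e["label"] == label]
```

In this sketch, step 2 of the workflow corresponds to `post`, and steps 3 to 4 to `read` followed by fusion of the returned candidates.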
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head></head><label></label><figDesc>Algorithm 1 (AWBF) and Algorithm 2 (Competitive Interaction): BoundingBox agent behaviors</figDesc><table><row><cell cols="2">Algorithm 1 AWBF -BoundingBox Agent behavior</cell></row><row><cell cols="2">1: Input: Bounding boxes 𝐵, confidence scores 𝑆, labels 𝐿</cell></row><row><cell cols="2">2: Read 𝐵, 𝑆, 𝐿 from the blackboard</cell></row><row><cell cols="2">3: Determine overlapping boxes as candidates for fusion</cell></row><row><cell cols="2">4: Filter candidates using the IoU metric</cell></row><row><cell cols="2">5: Apply WBF on the final set of candidates</cell></row><row><cell cols="2">6: Post fused boxes to the blackboard</cell></row><row><cell cols="2">7: Output: Fused bounding boxes</cell></row><row><cell cols="2">Algorithm 2 Competitive Interaction Algorithm -BoundingBox Agent 𝐴 𝑖 behavior</cell></row><row><cell cols="2">1: Input: Bounding boxes 𝐵, confidence scores 𝑆, labels 𝐿</cell></row><row><cell cols="2">2: 𝐴 𝑖 reads overlapping boxes from the blackboard</cell></row><row><cell cols="2">3: for each overlapping box 𝐵 𝑗 do</cell></row><row><cell>4:</cell><cell>Calculate 𝐼𝑜𝑈 and 𝐼𝑜𝐵 between 𝐴 𝑖 and 𝐵 𝑗</cell></row><row><cell>5:</cell><cell>Calculate attack strength 𝑆 attack = confidence 𝐴 𝑖 × 𝐼𝑜𝐵 𝐵 𝑗</cell></row><row><cell>6:</cell><cell>Calculate defense strength 𝑆 defense = confidence 𝐵 𝑗 × 𝐼𝑜𝐵 𝐴 𝑖</cell></row><row><cell>7:</cell><cell>Calculate result 𝑅 = 𝑆 attack − 𝑆 defense</cell></row><row><cell cols="2">8: if 𝑅 &gt; 𝑇 then</cell></row><row><cell>9:</cell><cell>𝐴 𝑖 wins and 𝐵 𝑗 is removed</cell></row><row><cell cols="2">10: else if 𝑅 &lt; −𝑇 then</cell></row><row><cell>11:</cell><cell>𝐵 𝑗 wins and 𝐴 𝑖 is removed</cell></row><row><cell cols="2">12: else</cell></row><row><cell>13:</cell><cell>𝐴 𝑖 and 𝐵 𝑗 fuse using WBF</cell></row></table></figure>
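The decision rule in Algorithm 2 reduces to comparing the attack/defense difference against the cooperation threshold 𝑇. A small sketch, with hypothetical function and parameter names (the paper does not publish this code):

```python
def compete(conf_a, iob_b, conf_a_vs, conf_b, iob_a, threshold):
    """Competitive interaction between agents A_i and B_j.

    conf_a / conf_b : confidence scores of the two boxes
    iob_b / iob_a   : intersection-over-box ratios used as leverage
    threshold       : cooperation threshold T in [0, 1]
    Returns which box survives, or 'fuse' when strengths are close.
    """
    s_attack = conf_a * iob_b     # A_i's attack strength
    s_defense = conf_b * iob_a    # B_j's defense strength
    result = s_attack - s_defense
    if result > threshold:
        return "A wins"           # B_j is removed
    if result < -threshold:
        return "B wins"           # A_i is removed
    return "fuse"                 # cooperate: merge via WBF
```

With `threshold = 1` the first two branches can never fire (strengths lie in [0, 1]), so every pair fuses, which is exactly the fully cooperative AWBF setting described in the text; lowering the threshold widens the competitive regime.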
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>WBF Performance on Different Dataset Sizes</head><label>1</label><figDesc>Table 1: Benchmarking on the COCO dataset</figDesc><table><row><cell>Model</cell><cell>AP@[0.5, 0.95]</cell><cell>AP@[0.5]</cell><cell>AP@[0.75]</cell><cell>AP Small</cell><cell>AP Medium</cell><cell>AP Large</cell><cell>AR@1</cell><cell>AR@10</cell><cell>AR@100</cell><cell>AR Small</cell><cell>AR Medium</cell><cell>AR Large</cell></row><row><cell>EffNetB0</cell><cell>0.336</cell><cell>0.515</cell><cell>0.354</cell><cell>0.125</cell><cell>0.388</cell><cell>0.528</cell><cell>0.288</cell><cell>0.44</cell><cell>0.467</cell><cell>0.193</cell><cell>0.55</cell><cell>0.688</cell></row><row><cell>EffNetB0-m</cell><cell>0.335</cell><cell>0.516</cell><cell>0.351</cell><cell>0.129</cell><cell>0.389</cell><cell>0.524</cell><cell>0.288</cell><cell>0.441</cell><cell>0.467</cell><cell>0.198</cell><cell>0.55</cell><cell>0.687</cell></row><row><cell>EffNetB1</cell><cell>0.392</cell><cell>0.581</cell><cell>0.418</cell><cell>0.186</cell><cell>0.447</cell><cell>0.571</cell><cell>0.322</cell><cell>0.501</cell><cell>0.532</cell><cell>0.294</cell><cell>0.599</cell><cell>0.735</cell></row><row><cell>EffNetB1-m</cell><cell>0.392</cell><cell>0.581</cell><cell>0.417</cell><cell>0.184</cell><cell>0.447</cell><cell>0.571</cell><cell>0.323</cell><cell>0.502</cell><cell>0.531</cell><cell>0.279</cell><cell>0.602</cell><cell>0.735</cell></row><row><cell>EffNetB2</cell><cell>0.425</cell><cell>0.617</cell><cell>0.453</cell><cell>0.238</cell><cell>0.479</cell><cell>0.591</cell><cell>0.34</cell><cell>0.537</cell><cell>0.569</cell><cell>0.347</cell><cell>0.632</cell><cell>0.75</cell></row><row><cell>EffNetB2-m</cell><cell>0.426</cell><cell>0.617</cell><cell>0.454</cell><cell>0.24</cell><cell>0.481</cell><cell>0.593</cell><cell>0.341</cell><cell>0.537</cell><cell>0.569</cell><cell>0.358</cell><cell>0.634</cell><cell>0.748</cell></row><row><cell>EffNetB3</cell><cell>0.459</cell><cell>0.65</cell><cell>0.491</cell><cell>0.28</cell><cell>0.503</cell><cell>0.616</cell><cell>0.359</cell><cell>0.569</cell><cell>0.604</cell><cell>0.404</cell><cell>0.654</cell><cell>0.77</cell></row><row><cell>EffNetB3-m</cell><cell>0.455</cell><cell>0.646</cell><cell>0.487</cell><cell>0.282</cell><cell>0.494</cell><cell>0.618</cell><cell>0.357</cell><cell>0.566</cell><cell>0.6</cell><cell>0.412</cell><cell>0.65</cell><cell>0.766</cell></row><row><cell>EffNetB4</cell><cell>0.49</cell><cell>0.685</cell><cell>0.529</cell><cell>0.334</cell><cell>0.538</cell><cell>0.64</cell><cell>0.375</cell><cell>0.598</cell><cell>0.634</cell><cell>0.464</cell><cell>0.682</cell><cell>0.782</cell></row><row><cell>EffNetB4-m</cell><cell>0.488</cell><cell>0.684</cell><cell>0.524</cell><cell>0.33</cell><cell>0.533</cell><cell>0.642</cell><cell>0.373</cell><cell>0.596</cell><cell>0.633</cell><cell>0.468</cell><cell>0.68</cell><cell>0.783</cell></row><row><cell>EffNetB5</cell><cell>0.505</cell><cell>0.7</cell><cell>0.544</cell><cell>0.343</cell><cell>0.549</cell><cell>0.646</cell><cell>0.383</cell><cell>0.619</cell><cell>0.656</cell><cell>0.5</cell><cell>0.698</cell><cell>0.791</cell></row><row><cell>EffNetB5-m</cell><cell>0.502</cell><cell>0.696</cell><cell>0.539</cell><cell>0.335</cell><cell>0.546</cell><cell>0.645</cell><cell>0.379</cell><cell>0.614</cell><cell>0.651</cell><cell>0.484</cell><cell>0.692</cell><cell>0.789</cell></row><row><cell>EffNetB6</cell><cell>0.513</cell><cell>0.705</cell><cell>0.555</cell><cell>0.352</cell><cell>0.556</cell><cell>0.652</cell><cell>0.387</cell><cell>0.626</cell><cell>0.664</cell><cell>0.505</cell><cell>0.703</cell><cell>0.795</cell></row><row><cell>EffNetB6-m</cell><cell>0.511</cell><cell>0.701</cell><cell>0.551</cell><cell>0.341</cell><cell>0.555</cell><cell>0.654</cell><cell>0.384</cell><cell>0.623</cell><cell>0.66</cell><cell>0.489</cell><cell>0.704</cell><cell>0.805</cell></row><row><cell>EffNetB7</cell><cell>0.521</cell><cell>0.71</cell><cell>0.562</cell><cell>0.37</cell><cell>0.562</cell><cell>0.66</cell><cell>0.39</cell><cell>0.633</cell><cell>0.671</cell><cell>0.517</cell><cell>0.711</cell><cell>0.801</cell></row><row><cell>EffNetB7-m</cell><cell>0.519</cell><cell>0.71</cell><cell>0.558</cell><cell>0.364</cell><cell>0.562</cell><cell>0.659</cell><cell>0.388</cell><cell>0.63</cell><cell>0.668</cell><cell>0.509</cell><cell>0.71</cell><cell>0.803</cell></row><row><cell>DetRS</cell><cell>0.515</cell><cell>0.71</cell><cell>0.654</cell><cell>0.318</cell><cell>0.565</cell><cell>0.676</cell><cell>0.384</cell><cell>0.628</cell><cell>0.671</cell><cell>0.479</cell><cell>0.723</cell><cell>0.828</cell></row><row><cell>DetRS-m</cell><cell>0.515</cell><cell>0.707</cell><cell>0.564</cell><cell>0.316</cell><cell>0.563</cell><cell>0.677</cell><cell>0.384</cell><cell>0.629</cell><cell>0.673</cell><cell>0.486</cell><cell>0.721</cell><cell>0.834</cell></row><row><cell>resnet50</cell><cell>0.496</cell><cell>0.697</cell><cell>0.538</cell><cell>0.299</cell><cell>0.543</cell><cell>0.656</cell><cell>0.378</cell><cell>0.607</cell><cell>0.64</cell><cell>0.457</cell><cell>0.686</cell><cell>0.8</cell></row><row><cell>resnet50-m</cell><cell>0.496</cell><cell>0.694</cell><cell>0.535</cell><cell>0.296</cell><cell>0.545</cell><cell>0.657</cell><cell>0.379</cell><cell>0.61</cell><cell>0.642</cell><cell>0.464</cell><cell>0.689</cell><cell>0.799</cell></row><row><cell>yolo</cell><cell>0.5</cell><cell>0.678</cell><cell>0.546</cell><cell>0.336</cell><cell>0.544</cell><cell>0.644</cell><cell>0.381</cell><cell>0.628</cell><cell>0.688</cell><cell>0.533</cell><cell>0.734</cell><cell>0.826</cell></row><row><cell>WBF</cell><cell>0.673</cell><cell>0.894</cell><cell>0.709</cell><cell>0.605</cell><cell>0.731</cell><cell>0.846</cell><cell>0.471</cell><cell>0.627</cell><cell>0.846</cell><cell>0.8</cell><cell>0.85</cell><cell>0.867</cell></row><row><cell>AWBF</cell><cell>0.61</cell><cell>0.66</cell><cell>0.625</cell><cell>0.61</cell><cell>0.766</cell><cell>0.675</cell><cell>0.395</cell><cell>0.676</cell><cell>0.745</cell><cell>0.664</cell><cell>0.706</cell><cell>0.819</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work is funded by the French National Research Agency as part of the MultiTrans project under reference ANR-21-CE23-0032.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Faster r-cnn: Towards real-time object detection with region proposal networks</title>
		<author>
			<persName><forename type="first">S</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Girshick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<idno type="DOI">10.1109/TPAMI.2016.2577031</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Pattern Analysis and Machine Intelligence</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="page" from="1137" to="1149" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Weighted boxes fusion: Ensembling boxes from different object detection models</title>
		<author>
			<persName><forename type="first">R</forename><surname>Solovyev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gabruseva</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.imavis.2021.104117</idno>
		<ptr target="https://doi.org/10.1016/j.imavis.2021.104117" />
	</analytic>
	<monogr>
		<title level="j">Image and Vision Computing</title>
		<imprint>
			<biblScope unit="volume">107</biblScope>
			<biblScope unit="page">104117</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">AlphaStar: Mastering the real-time strategy game StarCraft II</title>
		<author>
			<persName><forename type="first">O</forename><surname>Vinyals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Babuschkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mathieu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jaderberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M</forename><surname>Czarnecki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Dudzik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Georgiev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Powell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">DeepMind blog</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page">20</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Real-time object detection and tracking for unmanned aerial vehicles based on convolutional neural networks</title>
		<author>
			<persName><forename type="first">S.-Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H.-Y</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C.-C</forename><surname>Yu</surname></persName>
		</author>
		<idno type="DOI">10.3390/electronics12244928</idno>
	</analytic>
	<monogr>
		<title level="j">Electronics</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page">4928</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A multiagent perspective of parallel and distributed machine learning</title>
		<author>
			<persName><forename type="first">G</forename><surname>Weiß</surname></persName>
		</author>
		<idno type="DOI">10.1145/280765.280806</idno>
		<ptr target="https://doi.org/10.1145/280765.280806" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second International Conference on Autonomous Agents, AGENTS &apos;98</title>
				<meeting>the Second International Conference on Autonomous Agents, AGENTS &apos;98<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="226" to="230" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">A Real-Time Multi-Camera Depth Estimation ASIC with Custom On-Chip Embedded DRAM</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E D</forename><surname>Narinx</surname></persName>
		</author>
		<idno type="DOI">10.5075/epfl-thesis-7163</idno>
		<ptr target="https://doi.org/10.5075/epfl-thesis-7163" />
		<imprint>
			<date type="published" when="2019">2019</date>
			<pubPlace>Lausanne</pubPlace>
		</imprint>
		<respStmt>
			<orgName>École Polytechnique Fédérale de Lausanne</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Ph.D. thesis</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Vehicle video surveillance system based on image fusion and parallel computing</title>
		<author>
			<persName><forename type="first">S</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Lyu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Gong</surname></persName>
		</author>
		<idno type="DOI">10.1002/cta.2907</idno>
	</analytic>
	<monogr>
		<title level="j">International Journal of Circuit Theory and Applications</title>
		<imprint>
			<biblScope unit="volume">49</biblScope>
			<biblScope unit="page" from="1532" to="1547" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Special issue on parallel computing for real-time image processing</title>
		<author>
			<persName><forename type="first">M</forename><surname>Akil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Perroton</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11554-011-0192-y</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Real-Time Image Processing</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="1" to="2" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Distributed multiagent control approach for multitarget tracking</title>
		<author>
			<persName><forename type="first">L</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Xue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.1155/2015/903682</idno>
	</analytic>
	<monogr>
		<title level="j">Mathematical Problems in Engineering</title>
		<imprint>
			<biblScope unit="volume">2015</biblScope>
			<biblScope unit="page" from="1" to="10" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Networked distributed fusion estimation under uncertain outputs with random transmission delays, packet losses and multi-packet processing</title>
		<author>
			<persName><forename type="first">R</forename><surname>Caballero-Águila</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hermoso-Carazo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Linares-Pérez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Signal Processing</title>
		<imprint>
			<biblScope unit="volume">156</biblScope>
			<biblScope unit="page" from="71" to="83" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Dynamic task allocation method for heterogenous multiagent system in uncertain scenarios of agricultural field operation</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wu</surname></persName>
		</author>
		<idno type="DOI">10.1088/1742-6596/2356/1/012049</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Physics: Conference Series</title>
		<imprint>
			<biblScope unit="volume">2356</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Neural networks-based distributed adaptive control of nonlinear multiagent systems</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Shi</surname></persName>
		</author>
		<idno type="DOI">10.1109/TNNLS.2019.2915376</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Neural Networks and Learning Systems</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="page" from="1010" to="1021" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Multi-view fusion-based 3d object detection for robot indoor scene perception</title>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">H</forename><surname>Soon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">K</forename><surname>Quah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Tandianus</surname></persName>
		</author>
		<idno type="DOI">10.3390/s19194092</idno>
	</analytic>
	<monogr>
		<title level="j">Sensors</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Multiagent-based optimal microgrid control using fully distributed diffusion strategy</title>
		<author>
			<persName><forename type="first">R</forename><surname>Azevedo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cintuglu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Mohammed</surname></persName>
		</author>
		<idno type="DOI">10.1109/TSG.2016.2587741</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Smart Grid</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="1997" to="2008" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Behavior prediction for unmanned driving based on dual fusions of feature and decision</title>
		<author>
			<persName><forename type="first">S</forename><surname>Zhong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Xia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yin</surname></persName>
		</author>
		<idno type="DOI">10.1109/TITS.2020.3037926</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Intelligent Transportation Systems</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="page" from="3687" to="3696" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Improving energy-efficiency of scientific computing clusters</title>
		<author>
			<persName><forename type="first">N</forename><surname>Kaabouch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-C</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Niemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kommeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A.-P</forename><surname>Hameri</surname></persName>
		</author>
		<idno type="DOI">10.4018/978-1-4666-1842-8.ch001</idno>
	</analytic>
	<monogr>
		<title level="m">Energy-Aware Systems and Networking for Sustainable Initiatives</title>
				<imprint>
			<publisher>IGI Global</publisher>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="1" to="19" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Multi-scale analysis strategies in prnu-based tampering localization</title>
		<author>
			<persName><forename type="first">P</forename><surname>Korus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Huang</surname></persName>
		</author>
		<idno type="DOI">10.1109/TIFS.2016.2636089</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Information Forensics and Security</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="809" to="824" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Multiagent systems: A survey from a machine learning perspective</title>
		<author>
			<persName><forename type="first">P</forename><surname>Stone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Veloso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Autonomous Robots</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="345" to="383" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A survey on fault-tolerant consensus control of multi-agent systems: trends, methodologies and prospects</title>
		<author>
			<persName><forename type="first">C</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lyu</surname></persName>
		</author>
		<idno type="DOI">10.1080/00207721.2022.2056772</idno>
		<ptr target="https://doi.org/10.1080/00207721.2022.2056772" />
	</analytic>
	<monogr>
		<title level="j">International Journal of Systems Science</title>
		<imprint>
			<biblScope unit="volume">53</biblScope>
			<biblScope unit="page" from="2800" to="2813" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Scalable distributed decision-making and coordination in large and complex systems: Methods, techniques, and models</title>
		<author>
			<persName><forename type="first">M</forename><surname>Lujak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Giordani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Omicini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ossowski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Complexity</title>
		<imprint>
			<biblScope unit="page" from="1" to="3" />
			<date type="published" when="2020">2020. 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Mcf3d: Multi-stage complementary fusion for multi-sensor 3d object detection</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wei</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2019.2927012</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="90801" to="90814" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Object detection in remote sensing images based on improved bounding box regression and multi-level features fusion</title>
		<author>
			<persName><forename type="first">X</forename><surname>Qian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.3390/rs12010143</idno>
		<ptr target="https://www.mdpi.com/2072-4292/12/1/143" />
	</analytic>
	<monogr>
		<title level="j">Remote Sensing</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">3d object detection based on multi-view adaptive fusion</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wu</surname></persName>
		</author>
		<idno type="DOI">10.1109/IPEC54454.2022.9777488</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC)</title>
				<imprint>
			<date type="published" when="2022">2022. 2022</date>
			<biblScope unit="page" from="743" to="748" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Bai</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2305.07713</idno>
		<title level="m">Multi-modal 3d object detection by box matching</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Multi-agents system for image understanding</title>
		<author>
			<persName><forename type="first">A</forename><surname>Choksuriwong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Rosenberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Smari</surname></persName>
		</author>
		<idno type="DOI">10.1109/KIMAS.2005.1427070</idno>
	</analytic>
	<monogr>
		<title level="m">International Conference on Integration of Knowledge Intensive Multi-Agent Systems</title>
		<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="149" to="154" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Multi-agent deep reinforcement learning for multi-object tracker</title>
		<author>
			<persName><forename type="first">M</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Deng</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2019.2901300</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="32400" to="32407" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Multi agent system for boundary detection and object tracking in image sequence based on active contours</title>
		<author>
			<persName><forename type="first">A</forename><surname>Fekir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Benamrane</surname></persName>
		</author>
		<idno type="DOI">10.3233/MGS-150230</idno>
		<ptr target="https://doi.org/10.3233/MGS-150230" />
	</analytic>
	<monogr>
		<title level="j">Multiagent Grid Syst</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="81" to="93" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Multi-agent system perception with stereovision</title>
		<author>
			<persName><forename type="first">G</forename><surname>Vincent</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Patten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ohmes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Couch</surname></persName>
		</author>
		<idno type="DOI">10.1145/3545947.3573289</idno>
		<ptr target="https://doi.org/10.1145/3545947.3573289" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 2, SIGCSE 2023</title>
		<meeting>the 54th ACM Technical Symposium on Computer Science Education V. 2, SIGCSE 2023<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page">1235</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Object oriented image analysis based on multi-agent recognition system</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">Tabib</forename><surname>Mahmoudi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Samadzadegan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Reinartz</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.cageo.2012.12.007</idno>
		<ptr target="https://doi.org/10.1016/j.cageo.2012.12.007" />
	</analytic>
	<monogr>
		<title level="j">Computers &amp; Geosciences</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="page" from="219" to="230" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<author>
			<persName><forename type="first">G</forename><surname>Coulouris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dollimore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Kindberg</surname></persName>
		</author>
		<title level="m">Distributed Systems: Concepts and Design</title>
		<imprint>
			<publisher>Addison-Wesley</publisher>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
	<note>5th ed.</note>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">The blackboard model of problem solving and the evolution of blackboard architectures</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">P</forename><surname>Nii</surname></persName>
		</author>
		<idno type="DOI">10.1609/aimag.v7i2.537</idno>
		<ptr target="https://ojs.aaai.org/aimagazine/index.php/aimagazine/article/view/537" />
	</analytic>
	<monogr>
		<title level="j">AI Magazine</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page">38</biblScope>
			<date type="published" when="1986">1986</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">An Introduction to MultiAgent Systems</title>
		<author>
			<persName><forename type="first">M</forename><surname>Wooldridge</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
			<publisher>John Wiley &amp; Sons</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Russell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Norvig</surname></persName>
		</author>
		<title level="m">Artificial Intelligence: A Modern Approach</title>
		<imprint>
			<publisher>Prentice Hall</publisher>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
	<note>3rd ed.</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
