<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Developing Targeted Communication through a Trust Factor in Multi-Agent Reinforcement Learning</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Simone</forename><surname>Di Rienzo</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Sapienza University of Rome</orgName>
								<address>
									<addrLine>Via Ariosto, 25</addrLine>
									<postCode>00185</postCode>
									<settlement>Roma RM</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Francesco</forename><surname>Frattolillo</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Sapienza University of Rome</orgName>
								<address>
									<addrLine>Via Ariosto, 25</addrLine>
									<postCode>00185</postCode>
									<settlement>Roma RM</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Roberto</forename><surname>Cipollone</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Sapienza University of Rome</orgName>
								<address>
									<addrLine>Via Ariosto, 25</addrLine>
									<postCode>00185</postCode>
									<settlement>Roma RM</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Andrea</forename><surname>Fanti</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Sapienza University of Rome</orgName>
								<address>
									<addrLine>Via Ariosto, 25</addrLine>
									<postCode>00185</postCode>
									<settlement>Roma RM</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Nicolo'</forename><surname>Brandizzi</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Sapienza University of Rome</orgName>
								<address>
									<addrLine>Via Ariosto, 25</addrLine>
									<postCode>00185</postCode>
									<settlement>Roma RM</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Fraunhofer IAIS</orgName>
								<address>
									<addrLine>Schloss Birlinghoven, 1</addrLine>
									<postCode>53757</postCode>
									<settlement>Sankt Augustin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Luca</forename><surname>Iocchi</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Sapienza University of Rome</orgName>
								<address>
									<addrLine>Via Ariosto, 25</addrLine>
									<postCode>00185</postCode>
									<settlement>Roma RM</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Developing Targeted Communication through a Trust Factor in Multi-Agent Reinforcement Learning</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">E459D9D959CBC6E3743BF00829044D78</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:13+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Multi-Agent Systems</term>
					<term>Reinforcement Learning</term>
					<term>Trust Factor</term>
					<term>Computational Modeling</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The concept of trust has long been studied, initially in the context of human interactions and, more recently, in human-machine or human-agent interactions. Despite extensive studies, defining trust remains challenging due to its inherent complexities and the diverse factors that influence its dynamics in multi-agent environments. This paper focuses on a specific formalization of a trust factor: predictive reliability, defined as the ability of agents to accurately forecast the actions of their peers in a shared environment. By realizing this trust factor within the framework of multi-agent reinforcement learning (MARL), we integrate it as a criterion for agents to assess and select collaborators. This approach enhances the functionality of MARL systems, promoting improved cooperation and overall effectiveness.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction and Background</head><p>With the advent of artificial intelligence, a growing number of applications require intelligent agents and humans to coexist and interact. Such applications include autonomous vehicles <ref type="bibr" target="#b0">[1]</ref>, industrial robotics <ref type="bibr" target="#b1">[2]</ref>, healthcare robotics <ref type="bibr" target="#b2">[3]</ref>, service robotics <ref type="bibr" target="#b3">[4]</ref>, agricultural robotics <ref type="bibr" target="#b4">[5]</ref>, and many more <ref type="bibr" target="#b5">[6]</ref>. In this context, the concept of trust becomes essential, as it fosters cooperation and collaboration between humans and robots, enhancing efficiency and user satisfaction. It instills confidence in the reliability and predictability of robotic systems, which is crucial for their acceptance and adoption.</p><p>Trust is a concept that has been defined numerous times in the literature <ref type="bibr" target="#b6">[7]</ref>, yet there is still no single universally accepted definition. However, the numerous factors that influence trust are much easier to study when analyzed separately. Such factors may be associated with the trustor (the party that trusts), with the trustee (the party being trusted), or with the context <ref type="bibr" target="#b7">[8]</ref>. In this article, to formalize one such "trust factor", we take inspiration from the definition given in Gambetta <ref type="bibr" target="#b8">[9]</ref>:</p><p>"trust (or, symmetrically, distrust) is a particular level of the subjective probability with which an agent assesses that another agent or group of agents will perform a particular action, both before he can monitor such action (or independently of his capacity ever to be able to monitor it) and in a context in which it affects his own action"</p><p>According to this notion, the trust between two agents can be correlated to the trustor's expectations about the choices made by the trustee in a context of mutual interaction. We formalize this definition in a Multi-Agent Reinforcement Learning (MARL) setting. Here, autonomous agents can benefit from reasoning about other agents' intentions, and they can use this information to improve their performance and to select which agent to communicate with.</p><p>Problem and Solution Formulation. We consider the common scenario in which agents do not have complete knowledge of the environment, which is formalized by the Decentralized Partially Observable Markov Decision Process (DEC-POMDP) <ref type="bibr" target="#b9">[10]</ref> framework. A DEC-POMDP is defined as a tuple $\langle D, \mathcal{S}, \mathcal{A}, T, R, \Omega, O \rangle$, where $D$ is the number of agents; $\mathcal{S}$ is the set of environment states shared by all agents; $\mathcal{A}$ is the set of joint actions; $T : \mathcal{S} \times \mathcal{A} \times \mathcal{S} \to [0, 1]$ is the transition function; $R : \mathcal{S} \times \mathcal{A} \to \mathbb{R}$ is the reward function; $\Omega$ is the set of joint observations; and $O : \mathcal{S} \times \Omega \to [0, 1]$ is a set of observation probabilities returning the probability of a joint observation.</p><p>A MARL solution for a DEC-POMDP is a set of $D$ functions, called policies, $\pi_i : \Omega_i \to \mathcal{A}_i$, which map the local observations of each agent to its actions so as to maximize the expected joint sum of discounted rewards $\sum_{k=t}^{T} \gamma^{k} R(s_k, a_k)$, where $0 \le \gamma &lt; 1$.</p><note place="foot">MultiTTrust: 3rd Workshop on Multidisciplinary Perspectives on Human-AI Team, June 11, 2024, Malmo, Sweden. Email: dirienzo.1844531@studenti.uniroma1.it (S. Di Rienzo); frattolillo@diag.uniroma1.it (F. Frattolillo); cipollone@diag.uniroma1.it (R. Cipollone); fanti@diag.uniroma1.it (A. Fanti); brandizzi@diag.uniroma1.it (N. Brandizzi); iocchi@diag.uniroma1.it (L. Iocchi). ORCID: 0000-0002-2040-3355 (F. Frattolillo); 0000-0002-0421-5792 (R. Cipollone); 0009-0003-0764-3965 (A. Fanti); 0000-0002-3191-6623 (N. Brandizzi); 0000-0001-9057-8946 (L. Iocchi).</note></div>
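<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the objective stated above, the following minimal Python sketch evaluates the discounted sum for a given sequence of joint rewards. The function name discounted_return and the sample rewards are hypothetical and are not taken from the paper; the sketch follows the sum as written in the text, with the exponent equal to the absolute step index k.</p>
<p>
```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float, t: int = 0) -> float:
    """Sum gamma**k * rewards[k] for k = t, ..., T, with T = len(rewards) - 1.

    Hypothetical helper: the exponent is the absolute step index k,
    exactly as the sum is written in the text.
    """
    if gamma >= 1.0 or 0.0 > gamma:
        raise ValueError("gamma must lie in the interval [0, 1)")
    return sum(gamma ** k * rewards[k] for k in range(t, len(rewards)))


# Illustrative joint rewards R(s_k, a_k) for k = 0..3 (made-up values).
joint_rewards = [0.0, 0.0, 1.0, 0.5]
value = discounted_return(joint_rewards, gamma=0.5)
```
</p>
<p>Because the discount factor is strictly smaller than one, rewards collected later contribute less to the sum, so a set of policies that collects the same reward earlier obtains a higher value.</p></div>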
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Trust factors</head><p>We refer to Frattolillo et al. <ref type="bibr" target="#b10">[11]</ref> for a general definition of trust factors in MARL systems. Specifically, any trust factor is computed with respect to a specific trustor $X$, a trustee $Y$, and a task $\Gamma$:</p><formula xml:id="formula_0">$\mathrm{TrustFactor}(X \mid Y, \Gamma) = f(o_X, a_Y, r, [b_{X \to Y}], [c_{X \to Y}], [c_{Y \to X}])$<label>(1)</label></formula><p>In this template, $o_X$ denotes the observation of the trustor, $a_Y$ is the action of the trustee, $r$ is the immediate reward, and $b_{X \to Y}$ is the belief that the trustor currently maintains about the trustee. Finally, $c_{X \to Y}$ and $c_{Y \to X}$ represent known facts that result from communication from trustor to trustee and vice versa. The brackets denote that these are optional components.</p></div>
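<div xmlns="http://www.tei-c.org/ns/1.0"><p>One possible way to read the template in Eq. (1) is as a function signature with three required and three optional arguments. The Python sketch below is only an illustration under that reading: the names TrustFactorInputs, TrustFactor and action_match are hypothetical, and the toy function is not the trust factor used in this paper.</p>
<p>
```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class TrustFactorInputs:
    """Arguments of f in Eq. (1); the last three fields are optional."""
    trustor_observation: Any                    # o_X
    trustee_action: Any                         # a_Y
    reward: float                               # r
    belief: Optional[Any] = None                # b_{X->Y}
    message_to_trustee: Optional[Any] = None    # c_{X->Y}
    message_from_trustee: Optional[Any] = None  # c_{Y->X}


# Any trust factor is then a map from these inputs to a real number.
TrustFactor = Callable[[TrustFactorInputs], float]


def action_match(inputs: TrustFactorInputs) -> float:
    """Toy trust factor (assumption): 1.0 if the believed action equals the observed one."""
    if inputs.belief is None:
        return 0.0
    return 1.0 if inputs.belief == inputs.trustee_action else 0.0


sample = TrustFactorInputs(trustor_observation=(0, 1), trustee_action="north",
                           reward=0.0, belief="north")
score = action_match(sample)
```
</p>
<p>Leaving the bracketed components as optional fields keeps the same signature usable for trust factors that do not rely on beliefs or communication.</p></div>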
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Method</head><p>In this work, we propose a specific instantiation of the function $f$ in Eq. (<ref type="formula" target="#formula_0">1</ref>) to capture the dependency identified above between the actions of the trustee and the trustor's expectations. Specifically, we model one of these trust factors as the ability of one agent, acting as trustor, to predict the actions of another agent, acting as trustee. Therefore, among all agents, we select one as the primary agent, which, in addition to learning its own policy, learns to predict the actions chosen by the other agents. For each trustee $i$, the primary agent estimates the Trust Score, defined as the ratio of the number of correctly predicted actions to the number of true actions. To predict the others' actions, we adopt a simple neural network, which we call PredNet, trained in a supervised fashion; it takes as input the observations of the other agents and returns a prediction of their actions. The action predicted by PredNet is concatenated to the state of the primary agent, which allows the primary agent's decisions to be influenced by the other agents' intentions. All agents used in the experiments are trained in a decentralized way with the MARL algorithm Independent PPO <ref type="bibr" target="#b11">[12]</ref>.</p></div>
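<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Trust Score described above is a running accuracy of the action predictions for each trustee. A minimal Python sketch of such a counter follows; the class name TrustScoreTracker, the integer action encoding, and the choice to return 0 for a trustee that has not been observed yet are assumptions made for illustration, and the predictions themselves (the output of PredNet in the paper) are simply passed in as numbers.</p>
<p>
```python
from collections import defaultdict
from typing import Dict, Hashable


class TrustScoreTracker:
    """Running ratio of correctly predicted actions per trustee (hypothetical helper)."""

    def __init__(self) -> None:
        self._correct: Dict[Hashable, int] = defaultdict(int)
        self._total: Dict[Hashable, int] = defaultdict(int)

    def update(self, trustee: Hashable, predicted_action: int, true_action: int) -> None:
        """Record one prediction and whether it matched the action actually taken."""
        self._total[trustee] += 1
        if predicted_action == true_action:
            self._correct[trustee] += 1

    def score(self, trustee: Hashable) -> float:
        """Correct predictions divided by observed actions."""
        total = self._total.get(trustee, 0)
        # Assumption: a trustee that was never observed gets a score of 0.
        return self._correct.get(trustee, 0) / total if total else 0.0


tracker = TrustScoreTracker()
# Made-up (predicted, true) action pairs for two trustees.
for predicted, actual in [(1, 1), (2, 2), (0, 3), (4, 4)]:
    tracker.update("trustable", predicted, actual)
for predicted, actual in [(1, 0), (2, 2), (0, 3), (4, 1)]:
    tracker.update("unreliable", predicted, actual)
```
</p>
<p>With these made-up pairs, three of four predictions match for the first trustee and one of four for the second, so the two scores differ even though the same predictor is used for both.</p></div>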
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experiments</head><p>The environment used is a customized version of Level-based Foraging (LBF) <ref type="bibr" target="#b12">[13]</ref>, a grid-world multi-agent environment in which agents must navigate and cooperate to collect food. A food item can be collected only if the sum of the levels of the agents is equal to or higher than the level of the food. In our experiments, the primary agent selects which agent to communicate with, among the agents in its field of view, based on the trust scores learned during training. We ran experiments in a three-agent environment in which the other two agents are designated as trustable and unreliable, respectively. The trustable agent executes its actions according to its learned policy, and its goal is to cooperate with the primary agent. The unreliable agent, on the other hand, acts according to a poor policy that, with a certain probability, selects an incorrect action. The results of the experiment are shown in Figure <ref type="figure" target="#fig_0">1</ref>. Here, the trust score with respect to the trustable agent (b) is much higher than the one with respect to the unreliable agent (c), and the average return of the primary agent is substantially higher when it relies on the former (a). In conclusion, we showed that using the trust score to select which agent to communicate with improves performance when one of the agents is not reliable.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: (a) Comparison of the mean score of the primary agent when it uses the predictions from the trustable agent and from the unreliable one. Trust score with respect to the (b) trustable agent and (c) unreliable agent.</figDesc><graphic coords="3,107.63,84.19,125.01,93.76" type="bitmap" /></figure>
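<div xmlns="http://www.tei-c.org/ns/1.0"><p>The selection step used in the experiments, where the primary agent chooses a communication partner among the agents in its field of view according to their trust scores, can be sketched as a simple arg-max. The function name select_partner, the default score of 0 for unknown agents, the tie-breaking rule, and the numeric scores below are assumptions for illustration only and do not reproduce the paper's implementation or results.</p>
<p>
```python
from typing import Hashable, Iterable, Mapping, Optional


def select_partner(visible_agents: Iterable[Hashable],
                   trust_scores: Mapping[Hashable, float]) -> Optional[Hashable]:
    """Return the visible agent with the highest trust score, or None if nobody is visible."""
    candidates = list(visible_agents)
    if not candidates:
        return None
    # Assumption: agents without a recorded score default to 0.0,
    # and ties are broken by order of appearance (max keeps the first maximum).
    return max(candidates, key=lambda agent: trust_scores.get(agent, 0.0))


# Made-up trust scores, not values reported in the paper.
scores = {"trustable": 0.9, "unreliable": 0.3}
partner = select_partner(["unreliable", "trustable"], scores)
```
</p>
<p>When only one agent is in view, that agent is returned regardless of its score; a threshold below which the primary agent declines to communicate would be a natural extension, but it is not described in the text.</p></div>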
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Acknowledgments</head><p>This work is supported by the Air Force Office of Scientific Research under award number FA8655-23-1-7257.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A review on autonomous vehicles: Progress, methods and challenges</title>
		<author>
			<persName><forename type="first">D</forename><surname>Parekh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Poddar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rajpurkar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chahal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">P</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Cho</surname></persName>
		</author>
		<idno type="DOI">10.3390/electronics11142162</idno>
		<ptr target="https://www.mdpi.com/2079-9292/11/14/2162" />
	</analytic>
	<monogr>
		<title level="j">Electronics</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Survey on human-robot collaboration in industrial settings: Safety, intuitive interfaces and applications</title>
		<author>
			<persName><forename type="first">V</forename><surname>Villani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Pini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Leali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Secchi</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.mechatronics.2018.02.009</idno>
		<ptr target="https://doi.org/10.1016/j.mechatronics.2018.02.009" />
	</analytic>
	<monogr>
		<title level="j">Mechatronics</title>
		<imprint>
			<biblScope unit="volume">55</biblScope>
			<biblScope unit="page" from="248" to="266" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A survey of robots in healthcare</title>
		<author>
			<persName><forename type="first">M</forename><surname>Kyrarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Lygerakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rajavenkatanarayanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sevastopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">R</forename><surname>Nambiappan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">K</forename><surname>Chaitanya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">R</forename><surname>Babu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mathew</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Makedon</surname></persName>
		</author>
		<idno type="DOI">10.3390/technologies9010008</idno>
		<ptr target="https://www.mdpi.com/2227-7080/9/1/8" />
	</analytic>
	<monogr>
		<title level="j">Technologies</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A survey on the application trends of home service robotics</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">A</forename><surname>Zachiotis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Andrikopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Nakamura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Nikolakopoulos</surname></persName>
		</author>
		<idno type="DOI">10.1109/ROBIO.2018.8665127</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Robotics and Biomimetics (ROBIO)</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1999" to="2006" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Human-robot interaction in agriculture: A survey and current challenges</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>Vasconez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">A</forename><surname>Kantor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">A</forename><surname>Auat Cheein</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.biosystemseng.2018.12.005</idno>
		<ptr target="https://doi.org/10.1016/j.biosystemseng.2018.12.005" />
	</analytic>
	<monogr>
		<title level="j">Biosystems Engineering</title>
		<imprint>
			<biblScope unit="volume">179</biblScope>
			<biblScope unit="page" from="35" to="48" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A survey of multi-agent human-robot interaction systems</title>
		<author>
			<persName><forename type="first">A</forename><surname>Dahiya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Aroyo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Dautenhahn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">L</forename><surname>Smith</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.robot.2022.104335</idno>
		<ptr target="https://doi.org/10.1016/j.robot.2022.104335" />
	</analytic>
	<monogr>
		<title level="j">Robotics and Autonomous Systems</title>
		<imprint>
			<biblScope unit="volume">161</biblScope>
			<biblScope unit="page">104335</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A survey on trust in autonomous systems</title>
		<author>
			<persName><forename type="first">S</forename><surname>Shahrdar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Menezes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Nojoumian</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-01177-2_27</idno>
		<ptr target="https://link.springer.com/chapter/10.1007/978-3-030-01177-2_27" />
	</analytic>
	<monogr>
		<title level="j">Advances in Intelligent Systems and Computing</title>
		<imprint>
			<biblScope unit="volume">857</biblScope>
			<biblScope unit="page" from="368" to="386" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">How and why humans trust: A meta-analysis and elaborated model</title>
		<author>
			<persName><forename type="first">P</forename><surname>Hancock</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">T</forename><surname>Kessler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">D</forename><surname>Kaplan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Stowers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Brill</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">R</forename><surname>Billings</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">E</forename><surname>Schaefer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Szalma</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Frontiers in Psychology</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<author>
			<persName><forename type="first">D</forename><surname>Gambetta</surname></persName>
		</author>
		<title level="a" type="main">Can We Trust Trust?</title>
	</analytic>
	<monogr>
		<title level="m">Trust: Making and Breaking Cooperative Relations</title>
				<imprint>
			<date type="published" when="2000">2000</date>
			<biblScope unit="page" from="213" to="237" />
		</imprint>
		<respStmt>
			<orgName>Department of Sociology, University of Oxford</orgName>
		</respStmt>
	</monogr>
	<note>electronic edition</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">The Complexity of Decentralized Control of Markov Decision Processes</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Bernstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Givan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Immerman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zilberstein</surname></persName>
		</author>
		<idno type="DOI">10.1287/moor.27.4.819.297</idno>
		<ptr target="https://doi.org/10.1287/moor.27.4.819.297" />
	</analytic>
	<monogr>
		<title level="j">Mathematics of Operations Research</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="page" from="819" to="840" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Towards computational models for reinforcement learning in human-AI teams</title>
		<author>
			<persName><forename type="first">F</forename><surname>Frattolillo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Brandizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Cipollone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Iocchi</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3634/paper9.pdf" />
	</analytic>
	<monogr>
		<title level="m">2nd International Workshop on Multidisciplinary Perspectives on Human-AI Team Trust</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Is independent learning all you need in the StarCraft multi-agent challenge?</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">S</forename><surname>De Witt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Makoviichuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Makoviychuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">H S</forename><surname>Torr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Whiteson</surname></persName>
		</author>
		<idno>CoRR abs/2011.09533</idno>
		<ptr target="https://arxiv.org/abs/2011.09533" />
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Shared experience actor-critic for multi-agent reinforcement learning</title>
		<author>
			<persName><forename type="first">F</forename><surname>Christianos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Schäfer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">V</forename><surname>Albrecht</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems (NeurIPS)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
