<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Rule-based Shield Synthesis for Partially Observable Monte Carlo Planning</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Giulio</forename><surname>Mazzi</surname></persName>
							<email>giulio.mazzi@univr.it</email>
							<affiliation key="aff0">
								<orgName type="institution">Università degli Studi di Verona</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alberto</forename><surname>Castellini</surname></persName>
							<email>alberto.castellini@univr.it</email>
							<affiliation key="aff0">
								<orgName type="institution">Università degli Studi di Verona</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alessandro</forename><surname>Farinelli</surname></persName>
							<email>alessandro.farinelli@univr.it</email>
							<affiliation key="aff0">
								<orgName type="institution">Università degli Studi di Verona</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Rule-based Shield Synthesis for Partially Observable Monte Carlo Planning</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">4C8996540FE4F5DB37A17C947EC968A6</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T06:56+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>POMCP</term>
					<term>SMT</term>
					<term>Shielding</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Partially Observable Monte-Carlo Planning (POMCP) is a powerful online algorithm able to generate approximate policies for large Partially Observable Markov Decision Processes. The online nature of this method supports scalability by avoiding a complete policy representation. The lack of an explicit representation, however, hinders policy interpretability and makes policy verification very complex. In this work, we propose two contributions. The first is a method for identifying actions selected by POMCP that are unexpected with respect to expert prior knowledge of the task. The second is a shielding approach that prevents POMCP from selecting such unexpected actions. The first method is based on Maximum Satisfiability Modulo Theory (MAX-SMT). It inspects traces (i.e., sequences of belief-action-observation triplets) generated by POMCP to compute the parameters of logical formulas about policy properties defined by the expert. The second contribution is a module that uses the logical formulas online to identify anomalous actions selected by POMCP and substitutes them with actions that satisfy the formulas, thus fulfilling the expert knowledge. We evaluate our approach in two domains. Results show that the shielded POMCP outperforms standard POMCP in a case study in which a wrongly set POMCP parameter occasionally causes wrong actions to be selected.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Planning in partially observable environments while satisfying safety guarantees is a challenging problem. Partially Observable Markov Decision Processes (POMDPs) <ref type="bibr" target="#b0">[1]</ref> are a popular framework for modeling systems with uncertainty. Computing an optimal solution for POMDPs is hard <ref type="bibr" target="#b1">[2]</ref>. However, it is possible to compute an approximate solution, and state-of-the-art algorithms achieve great performance on real-world instances of POMDPs. A pioneering algorithm for this purpose is Partially Observable Monte-Carlo Planning (POMCP) <ref type="bibr" target="#b2">[3]</ref>, which uses a particle filter to represent the belief and a Monte-Carlo Tree Search based strategy to compute the policy online. The online nature of the policy, however, makes analyzing the decisions taken by POMCP very difficult <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5,</ref><ref type="bibr" target="#b5">6]</ref>. In general, with a high number of particles POMCP yields great performance, but sometimes the simulation does not properly assess the risk of certain actions, especially if the number of particles used in the simulation is limited due to engineering constraints. Moreover, in POMCP the policy is never fully computed or stored; hence it is very difficult to identify the reasons for possible unexpected decisions of the system.
However, explainability <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref> is becoming a key feature of artificial intelligence systems, since humans need to understand why specific decisions are taken by the agent.</p><p>In this work, we present a methodology for generating a safety mechanism from high-level descriptions of the desired behavior of a POMCP-generated policy. In this approach, a human expert provides qualitative information on a property of the system, enriched with an indication of the expected behavior that the system should have in specific situations (e.g., "the robot should move fast if it is highly confident that the path is not cluttered"). With this information, our methodology analyzes a set of execution traces of the system and makes these statements quantitative (e.g., "the robot moves fast if its confidence of being in an uncluttered segment is at least 93.4%"). The proposed approach formalizes the problem of parameter computation as a MAX-SMT problem, which makes it possible to express complex logical formulas and to compute optimal assignments even when the template is not fully satisfiable (as happens in the majority of real policy analyses). This quantitative answer is then used to synthesize a shield, namely a safety mechanism that forces POMCP to satisfy the constraints expressed by the expert. The shield works alongside the Monte Carlo Tree Search (MCTS) by preemptively blocking actions that violate the rules.</p><p>In summary, we propose an SMT-based methodology that combines a logic-based description of a system with the real execution traces of a POMCP policy to create a set of rules describing the behaviors of an agent. This description can be used to synthesize a shield.
We empirically evaluate the shielding mechanism in two domains, namely the well-known Tiger problem and a robotic navigation problem, showing that it can exploit the knowledge provided by the expert to achieve higher performance than standard POMCP when the POMCP parameters are imprecise.</p></div>
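A quantitative rule of the kind sketched above (e.g., "move fast only if the confidence of being in an uncluttered segment is at least 93.4%") can be checked directly against a POMCP-style particle-filter belief. The following sketch is purely illustrative: the function names (`confidence`, `rule_allows`) and state labels are hypothetical, not taken from the paper's code.

```python
# Illustrative sketch: evaluating an expert rule with a learned threshold
# against a particle-filter belief (a list of sampled states).

def confidence(particles, predicate):
    """Fraction of belief particles satisfying a predicate, i.e. the
    belief probability POMCP would estimate for that condition."""
    return sum(predicate(s) for s in particles) / len(particles)

def rule_allows(action, particles, threshold=0.934):
    """Rule from the text: 'move fast' is allowed only when the belief
    probability of an uncluttered segment reaches the threshold."""
    if action != "move_fast":
        return True
    return confidence(particles, lambda s: s == "uncluttered") >= threshold

# A belief of 1000 particles, 95% of which say the segment is uncluttered:
belief = ["uncluttered"] * 950 + ["cluttered"] * 50
print(rule_allows("move_fast", belief))  # True: 0.95 >= 0.934
```

With only 90% of the particles in the uncluttered state the same call would return False, i.e., the rule would veto the fast move.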
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Methodology Overview</head><p>The proposed methodology is summarized in Figure <ref type="figure" target="#fig_0">1</ref>. It leverages the expressiveness of logical formulas to represent specific properties of the system under investigation, and this representation is used to automatically generate a shield, a safety mechanism that forces the POMCP system to satisfy a set of high-level requirements. As a first step, a logical formula with free variables is defined (see box 2 in Figure <ref type="figure" target="#fig_0">1</ref>) to describe a property of interest of the policy under investigation. This formula, called rule template, defines a relationship between some properties of the belief (e.g., the probability of being in a specific state) and an action. Free variables in the formula allow the expert to avoid quantifying the limits of this relationship. These limits are then computed by analyzing a set of observed traces (see box 1). For instance, to describe the behavior of the Tiger problem we can use the rule templates presented in Figure <ref type="figure" target="#fig_1">2</ref>. The first template (𝑟 𝐿 ) says that we must listen when the confidence in finding a treasure is below a certain threshold for both doors (i.e., left or right).
The other two (𝑟 𝑂𝑅 , 𝑟 𝑂𝐿 ) say that we must open the proper door when the confidence of finding the treasure is above a certain threshold. By defining a rule template, the expert provides useful prior knowledge about the structure of the investigated property. This is combined with the real execution of a POMCP system, collected into a trace. The methodology computes a rule (i.e., a rule template with all the free variables instantiated) using a MAX-SMT based algorithm. This algorithm finds a model for the free variables that explains as many of the decisions taken by POMCP as possible while satisfying the requirements defined in the template (box 3 of Figure <ref type="figure" target="#fig_0">1</ref>). A set of rules is then used to create a shield, a safety mechanism that we integrate into POMCP to preemptively block actions that do not respect the constraints defined by the expert in the template (box 4).</p></div>
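The instantiation step can be illustrated on the Tiger template 𝑟 𝐿 ("listen iff the confidence for both doors is below a threshold"). The sketch below is a simplified stand-in, assuming a single free threshold variable: it replaces the MAX-SMT solver with a brute-force search over candidate thresholds, but it maximizes the same objective, namely the number of trace steps the instantiated rule explains (the soft constraints). All names are hypothetical.

```python
# Simplified stand-in for the MAX-SMT step: pick the threshold theta that
# makes the rule "listen iff max door confidence <= theta" explain as many
# observed belief-action pairs as possible.

def fit_listen_threshold(trace):
    """trace: list of (p_left, action) pairs, where p_left is the belief
    that the tiger is behind the left door and action is 'listen'/'open'."""
    def explained(theta):
        ok = 0
        for p_left, action in trace:
            conf = max(p_left, 1.0 - p_left)   # confidence in the best door
            should_listen = conf <= theta      # what the rule prescribes
            ok += (action == "listen") == should_listen
        return ok
    # Candidate thresholds: the door confidences observed in the trace.
    candidates = sorted({max(p, 1.0 - p) for p, _ in trace})
    return max(candidates, key=explained)

# Even if the trace contained wrong POMCP decisions, the search (like
# MAX-SMT) would still return the threshold explaining the most steps.
trace = [(0.5, "listen"), (0.25, "listen"), (0.125, "listen"),
         (0.9375, "open"), (0.0625, "open")]
print(fit_listen_threshold(trace))  # 0.875: listen iff confidence <= 0.875
```

A real MAX-SMT encoding would instead declare theta as a free variable, assert the template structure as hard constraints and one soft constraint per trace step, and let the solver maximize the number of satisfied soft constraints.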
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Results</head><p>We test our methodology in two domains, namely the standard POMDP domain Tiger <ref type="bibr" target="#b9">[10]</ref> and a robotics-inspired problem (velocity regulation) in which a robot travels a pre-specified path divided into segments with a (hidden) difficulty. The goal is to travel as fast as possible while avoiding collisions: the higher the speed, the higher the reward, but also the greater the risk of collision. A full description of the problem is presented in <ref type="bibr" target="#b10">[11]</ref>. To test the robustness of the shield in different scenarios, we inject an error into the POMCP implementation of the two domains by modifying the RewardRange parameter (called 𝑐 in the following). This parameter is used by UCT to balance exploration and exploitation. If its value is lower than the correct one, the algorithm can encounter a reward that exceeds the assumed maximum, leading to a wrong state: the agent believes it has identified the best possible action and stops exploring new actions. This is an interesting error because it is hard to detect: it randomly affects the exploration-exploitation trade-off without introducing any systematic mistake. The code of the shielding mechanism is available at https://github.com/GiuMaz/XPOMCP.</p><p>In Tiger, the average return achieved using the shield is the same in all four cases, and it is also identical to the return achieved by the correct policy. This is because in Tiger we can write a shield that perfectly recreates the behavior of the correct policy, a goal that is difficult to achieve in real-world problems. This is particularly interesting because the shields in the cases of 𝑐 ∈ {80, 60, 40} are obtained from traces generated by a POMCP implementation that does make some mistakes. As a consequence, the execution traces contain wrong decisions.
However, the combination of the insight provided by the expert with the MAX-SMT-based analysis of the traces results in a shield with extremely good performance.</p><p>In velocity regulation, the first row shows that using a shield can improve performance even when 𝑐 is correct (i.e., 𝑐 = 103). In this case, the shield intervenes only 7 times (over the 3500 analyzed steps), yielding a 5.38% increase in the return. This happens because the shield blocks the rare cases in which the POMCP simulations are not enough to properly assess the risk of moving at high speed. When 𝑐 decreases, the shield intervenes more often (see column #SA), since the error due to the limited number of simulations is combined with the errors generated by an incorrect value of 𝑐. Table <ref type="table" target="#tab_0">1</ref>.b also shows that a higher number of interventions leads to a bigger relative increase in performance (column RI). The difference is statistically significant for 𝑐 ∈ {103, 90, 70}, and shows that the introduction of the shield improves performance by up to 81%, even in cases in which the shield is trained on traces generated by a POMCP process that makes some mistakes. In the case of 𝑐 = 50 the return increases, but the difference is not statistically significant. The shield intervenes 171 times by blocking risky high-speed moves; however, unlike in Tiger, where we use a rule for every possible action, here POMCP also makes many wrong decisions when moving at low or medium speed (for example, moving slowly when the path is clear), which the shield does not cover.</p></div>
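At run time the shield acts as a filter between MCTS and the environment: if the action chosen by POMCP violates a rule, it is replaced by an action that satisfies the rules. A minimal sketch follows; the function names, the fallback-to-next-best-action strategy, and the example threshold of 0.875 are illustrative assumptions, not the paper's exact mechanism.

```python
# Illustrative sketch of the online shield: veto rule-violating actions
# and substitute the best-ranked action that satisfies the rules.

def shield_action(chosen, ranked_actions, belief, rules):
    """chosen: action selected by MCTS; ranked_actions: all legal actions
    in decreasing order of estimated value; rules: predicates over
    (action, belief) that a permitted action must satisfy."""
    def allowed(a):
        return all(rule(a, belief) for rule in rules)
    if allowed(chosen):
        return chosen                # shield does not intervene
    for a in ranked_actions:         # fall back to the next-best action
        if allowed(a):
            return a
    return chosen                    # no compliant action: keep the original

# Example Tiger-style rule: open a door only when the confidence in the
# best door exceeds an (illustrative) fitted threshold of 0.875.
def open_rule(action, belief):
    return action == "listen" or max(belief, 1 - belief) > 0.875

# A POMCP run with a wrong RewardRange wants to open at confidence 0.75;
# the shield substitutes the compliant action instead:
print(shield_action("open_left", ["open_left", "open_right", "listen"],
                    0.75, [open_rule]))  # "listen"
```

With a belief confidence above the threshold (say 0.95) the same call would leave the chosen open action untouched, matching the "#SA = 0" behavior observed when POMCP already acts correctly.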
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusions and Future Work</head><p>In this work, we present a methodology that generates a shielding mechanism for POMCP by exploiting a high-level representation of the expected policy behavior provided by human experts. The shielding mechanism preemptively blocks unexpected actions. We aim to further improve the integration between POMCP and the shielding mechanism (e.g., by considering the effect of shielding on actions beyond the first one of the simulation) and to develop an approach for synthesizing logical rules online, i.e., while the POMCP algorithm is running.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Methodology overview.</figDesc><graphic coords="2,89.29,84.19,208.35,113.04" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Tiger rule template</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Experimental results. The first column shows the different values of the RewardRange 𝑐. The second (third) column shows the average return (time) achieved by the original POMCP and the relative standard deviation. The Shield section shows the average return and time achieved by POMCP using a shield (columns four and six); values in bold show a statistically significant difference with respect to the unshielded counterpart (according to a paired t-test with 95% confidence level). Column RI shows the relative increase in performance between the original and shielded POMCP. Finally, column #SA shows how many times the shield alters the decision during the execution.</figDesc><table><row><cell></cell><cell cols="2">No Shield</cell><cell cols="4">Shield</cell></row><row><cell>c</cell><cell>return</cell><cell>time (s)</cell><cell>return</cell><cell>RI</cell><cell>time (s)</cell><cell>#SA</cell></row><row><cell>110</cell><cell>3.702(±0.623)</cell><cell>0.066(±0.027)</cell><cell>3.702(±0.623)</cell><cell>0.00%</cell><cell>0.065(±0.029)</cell><cell>0</cell></row><row><cell>80</cell><cell>3.593(±0.632)</cell><cell>0.067(±0.030)</cell><cell>3.702(±0.623)</cell><cell>3.03%</cell><cell>0.061(±0.027)</cell><cell>4</cell></row><row><cell>60</cell><cell>3.088(±0.673)</cell><cell>0.060(±0.025)</cell><cell>3.702(±0.623)</cell><cell>19.88%</cell><cell>0.061(±0.027)</cell><cell>121</cell></row><row><cell>40</cell><cell>−4.173(±1.101)</cell><cell>0.035(±0.017)</cell><cell>3.702(±0.623)</cell><cell>188.71%</cell><cell>0.052(±0.023)</cell><cell>647</cell></row><row><cell cols="7">a) Tiger</cell></row><row><cell></cell><cell cols="2">No Shield</cell><cell cols="4">Shield</cell></row><row><cell>c</cell><cell>return</cell><cell>time (s)</cell><cell>return</cell><cell>RI</cell><cell>time (s)</cell><cell>#SA</cell></row><row><cell>103</cell><cell>24.716(±3.497)</cell><cell>10.166(±0.682)</cell><cell>26.045(±3.640)</cell><cell>5.38%</cell><cell>10.118(±0.238)</cell><cell>7</cell></row><row><cell>90</cell><cell>18.030(±3.794)</cell><cell>10.173(±0.234)</cell><cell>22.680(±3.524)</cell><cell>25.79%</cell><cell>10.166(±0.241)</cell><cell>12</cell></row><row><cell>70</cell><cell>4.943(±5.260)</cell><cell>10.278(±0.234)</cell><cell>8.970(±4.556)</cell><cell>81.46%</cell><cell>10.377(±0.230)</cell><cell>51</cell></row><row><cell>50</cell><cell>0.692(±5.051)</cell><cell>10.374(±0.230)</cell><cell>1.638(±4.525)</cell><cell>136.53%</cell><cell>10.435(±0.336)</cell><cell>171</cell></row><row><cell cols="7">b) Velocity Regulation</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes</title>
		<author>
			<persName><forename type="first">A</forename><surname>Cassandra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Littman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">L</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence</title>
				<meeting>the Thirteenth Conference on Uncertainty in Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="1997">1997</date>
			<biblScope unit="page" from="54" to="61" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The Complexity of Markov Decision Processes</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">H</forename><surname>Papadimitriou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">N</forename><surname>Tsitsiklis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Math. Oper. Res</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="441" to="450" />
			<date type="published" when="1987">1987</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Monte-Carlo Planning in large POMDPs</title>
		<author>
			<persName><forename type="first">D</forename><surname>Silver</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Veness</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems 23</title>
				<editor>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Lafferty</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><forename type="middle">K I</forename><surname>Williams</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Taylor</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><forename type="middle">S</forename><surname>Zemel</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Culotta</surname></persName>
		</editor>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="2164" to="2172" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Explaining the influence of prior knowledge on POMCP policies</title>
		<author>
			<persName><forename type="first">A</forename><surname>Castellini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Marchesini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Mazzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Farinelli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th European Conference on Multi-Agents Systems</title>
		<title level="s">Lecture Notes in Artificial Intelligence</title>
		<meeting>the 17th European Conference on Multi-Agents Systems</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">12520</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Online monte carlo planning for autonomous robots: Exploiting prior knowledge on task similarities</title>
		<author>
			<persName><forename type="first">A</forename><surname>Castellini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Marchesini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Farinelli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 6th Italian Workshop on Artificial Intelligence and Robotics (AIRO 2019@AI*IA2019)</title>
		<title level="s">CEUR Workshop Proceedings, CEUR-WS</title>
		<meeting>the 6th Italian Workshop on Artificial Intelligence and Robotics (AIRO 2019@AI*IA2019)</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">2594</biblScope>
			<biblScope unit="page" from="25" to="32" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Influence of State-Variable Constraints on Partially Observable Monte Carlo Planning</title>
		<author>
			<persName><forename type="first">A</forename><surname>Castellini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Chalkiadakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Farinelli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. 28-th International Joint Conference on Artificial Intelligence, IJCAI-19</title>
				<meeting>28-th International Joint Conference on Artificial Intelligence, IJCAI-19</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="5540" to="5546" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Gunning</surname></persName>
		</author>
		<title level="m">DARPA&apos;s Explainable Artificial Intelligence (XAI) Program</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note>ii-ii</note>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Fox</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Long</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Magazzeni</surname></persName>
		</author>
		<idno>CoRR abs/1709.10256</idno>
		<title level="m">Explainable Planning</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Towards Explainable AI Planning as a Service</title>
		<author>
			<persName><forename type="first">M</forename><surname>Cashmore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Collins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Krarup</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Krivic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Magazzeni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Smith</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2nd ICAPS Workshop on Explainable Planning</title>
				<meeting><address><addrLine>XAIP</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Planning and Acting in Partially Observable Stochastic Domains</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">P</forename><surname>Kaelbling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Littman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">R</forename><surname>Cassandra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artif. Intell</title>
		<imprint>
			<biblScope unit="volume">101</biblScope>
			<biblScope unit="page" from="99" to="134" />
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Identification of Unexpected Decisions in Partially Observable Monte Carlo Planning: A Rule-Based Approach</title>
		<author>
			<persName><forename type="first">G</forename><surname>Mazzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Castellini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Farinelli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, AAMAS &apos;21</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
