=Paper=
{{Paper
|id=Vol-3284/1530
|storemode=property
|title=LTLf Synthesis Under Environment Specifications
|pdfUrl=https://ceur-ws.org/Vol-3284/1530.pdf
|volume=Vol-3284
|authors=Antonio Di Stasio
|dblpUrl=https://dblp.org/rec/conf/ictcs/000122
}}
==LTLf Synthesis Under Environment Specifications==
LTLf Synthesis Under Environment Specifications
(Short Paper)
Antonio Di Stasio
Sapienza University of Rome, Italy
Abstract
In this communication we present recent advances in the field of synthesis for agent goals specified in
Linear Temporal Logic on finite traces (LTLf) under environment specifications. In synthesis, environment
specifications are constraints on the environment that rule out certain environment behaviors. To solve
synthesis of LTLf goals under environment specifications, we could reduce the problem to LTL synthesis.
Unfortunately, while synthesis in LTLf and in LTL have the same worst-case complexity (both are
2EXPTIME-complete), the algorithms available for LTL synthesis perform much worse in practice than those
for LTLf synthesis. We report recent results showing that when the environment specifications are in the
form of fairness, stability, or GR(1) formulas, we can avoid this detour to LTL and keep the simplicity
of LTLf synthesis. Furthermore, even when the environment specifications are expressed in full LTL we
can partially avoid this detour.
Keywords
Linear Temporal Logic on Finite Traces, Synthesis, Automata-Theoretic Approach
1. Introduction
Program synthesis is one of the most ambitious and challenging problems in CS and AI. Reactive
synthesis is a class of program synthesis problems that aims at synthesizing a program for
interactive/reactive ongoing computations [1, 2]. We have two sets of Boolean variables 𝒳 and
𝒴, controlled by the environment and the agent, respectively, and a specification φ over 𝒳 ∪ 𝒴
in terms of Linear Temporal Logic (ltl) [3]. The synthesis task is to generate a program, aka a
strategy, for the agent such that for every environment strategy the simultaneous execution of
the two strategies generates a trace that satisfies φ [2, 4, 5].
Recently, the problem of reactive synthesis has been studied in the case where the specification is
expressed in ltl over finite traces (ltlf) and its variants [6]. This logic is particularly well suited
for expressing temporally extended goals in Planning, since it retains the fact that ultimately the
goal must be achieved and the plan terminated [7].
In the classical formulation of the problem of reactive synthesis the environment is free to
choose an arbitrary move at each step, but in AI we typically have a model of the world, i.e., of the
environment behavior, e.g., encoded in a planning domain [8, 9, 10], or more generally directly in
temporal logic [11, 12, 13, 14]. In other words, we are interested in synthesis under environment
specifications [15, 16, 17, 18, 19, 20, 21], which can be reduced to standard synthesis of the
implication Env → Goal, where Env is the specification of the environment (the environment
specification) and Goal is the specification of the task of the agent [13, 16]. The agent has
to realize its task Goal only on those traces that satisfy the environment specification Env.
Specifically, we focus on synthesis under environment specifications for ltlf goals. However,
while it is natural to consider the task specification Goal as an ltlf formula, requiring that Env
also be an ltlf formula is often too strong, since accomplishing the agent task may require
an unbounded number of environment moves, and that number is decided by the agent, which
determines when its task is finished. As a result, Env typically needs to be expressed in ltl,
not ltlf [17, 21]. So, even when focusing on ltlf, what we need to study is the case where we
have the task Goal expressed in ltlf and the environment specification Env expressed in ltl.
One way to handle this case is to translate Goal into ltl [22] and then do ltl synthesis for
Env → Goal, see e.g. [17]. But, while synthesis in ltlf and in ltl have the same worst-case
complexity, both being 2exptime-complete [2, 6], the algorithms available for ltl synthesis
perform much worse in practice than those for ltlf synthesis, as experimentally shown [23, 24, 25].
For these reasons, several specific forms of ltl environment specifications have been considered.
In this communication we report recent results showing how to avoid the detour to ltl
synthesis in cases where the environment specifications are in the form of fairness or stability
[21], or gr(1) formulas [20], or are specified in both ltlf (e.g., for representing nondeterministic
planning domains or other safety properties) and ltl (e.g., for liveness/fairness), separating
the contributions of the two by limiting the second one as much as possible [18].
2. ltl and ltlf
ltl is one of the most popular logics for temporal properties [3]. Given a set of propositions 𝒫,
the formulas of ltl are generated by the following grammar:

φ ::= a | (φ₁ ∧ φ₂) | (¬φ) | (○φ) | (φ₁ 𝒰 φ₂)

where a ∈ 𝒫. We use the common abbreviations for eventually, ◇φ ≡ true 𝒰 φ, and always,
□φ ≡ ¬◇¬φ.
ltl formulas are interpreted over infinite traces σ ∈ (2^𝒫)^ω. A trace σ = σ₀σ₁. . . is a
sequence of propositional interpretations (sets), where for every i ≥ 0, σᵢ ∈ 2^𝒫 is the i-th
interpretation of σ. Intuitively, σᵢ is interpreted as the set of propositions that are true at instant
i. Given σ, we define when an ltl formula φ holds at position i, written as σ, i |= φ, inductively
on the structure of φ, as follows:
• σ, i |= a iff a ∈ σᵢ (for a ∈ 𝒫);
• σ, i |= φ₁ ∧ φ₂ iff σ, i |= φ₁ and σ, i |= φ₂;
• σ, i |= ¬φ iff σ, i ⊭ φ;
• σ, i |= ○φ iff σ, i + 1 |= φ;
• σ, i |= φ₁ 𝒰 φ₂ iff there exists j ≥ i such that σ, j |= φ₂, and for all k, i ≤ k < j, we
have that σ, k |= φ₁.
We say σ satisfies φ, written as σ |= φ, if σ, 0 |= φ.
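As a small worked example (ours, for illustration), consider the infinite trace that starts with {p} and then repeats {p, q} forever:

```latex
\sigma = \{p\}\,\{p,q\}\,\{p,q\}\,\{p,q\}\cdots \qquad
\sigma \models \Diamond q, \quad
\sigma \models \Box \Diamond p, \quad
\sigma \not\models \Box q .
```

Indeed, q holds from position 1 onwards, p holds at every position, and q fails at position 0.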
ltlf is a variant of ltl interpreted over finite traces instead of infinite traces [22]. The syntax
of ltlf is the same as the syntax of ltl. We define σ, i |= φ, stating that φ holds at position i,
as for ltl, except that for the temporal operators we have:
• σ, i |= ○φ iff i < last(σ) and σ, i + 1 |= φ;
• σ, i |= φ₁ 𝒰 φ₂ iff there exists j such that i ≤ j ≤ last(σ) and σ, j |= φ₂, and for all
k, i ≤ k < j, we have that σ, k |= φ₁;
where last(σ) denotes the last position (i.e., index) in the finite trace σ. In addition, we define
the weak next operator ◦ as the abbreviation ◦φ ≡ ¬○¬φ. Note that, over finite traces,
¬○φ ≢ ○¬φ; instead, ¬○φ ≡ ◦¬φ. We say that a finite trace satisfies an ltlf
formula φ, written as σ |= φ, if σ, 0 |= φ.
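To make these finite-trace semantics concrete, here is a minimal evaluator sketch in Python (ours, not part of any of the cited tools), with traces and formulas encoded in a representation of our own design:

```python
# Minimal sketch: recursive evaluation of LTLf formulas over finite traces,
# following the semantics above. A trace is a list of sets of atoms; a
# formula is a nested tuple such as ("until", ("atom", "p"), ("atom", "q")).

def holds(trace, i, phi):
    """Check whether trace, i |= phi."""
    op = phi[0]
    if op == "atom":
        return phi[1] in trace[i]
    if op == "not":
        return not holds(trace, i, phi[1])
    if op == "and":
        return holds(trace, i, phi[1]) and holds(trace, i, phi[2])
    if op == "next":  # strong next: requires a successor position
        return i < len(trace) - 1 and holds(trace, i + 1, phi[1])
    if op == "until":
        return any(holds(trace, j, phi[2])
                   and all(holds(trace, k, phi[1]) for k in range(i, j))
                   for j in range(i, len(trace)))
    raise ValueError(f"unknown operator: {op}")

trace = [{"p"}, {"p"}, {"q"}]
# p U q holds at position 0: q holds at position 2, p at positions 0 and 1.
assert holds(trace, 0, ("until", ("atom", "p"), ("atom", "q")))
# Weak next is the derived operator ¬○¬φ: it holds at the last position.
weak_next_p = ("not", ("next", ("not", ("atom", "p"))))
assert holds(trace, 2, weak_next_p)
```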
3. ltlf Synthesis Under Environment Specifications
Let 𝒳 and 𝒴 be two sets of Boolean variables, controlled by the environment and the agent, respectively.
An agent strategy is a function σ_ag : (2^𝒳)* → 2^𝒴, and an environment strategy is a function
σ_env : (2^𝒴)⁺ → 2^𝒳. A trace is a sequence (X₀ ∪ Y₀)(X₁ ∪ Y₁) ··· ∈ (2^(𝒳∪𝒴))^ω. A trace
(Xᵢ ∪ Yᵢ)^ω is consistent with an agent strategy σ_ag if σ_ag(ε) = Y₀ and σ_ag(X₀X₁···Xᵢ) =
Yᵢ₊₁ for every i ≥ 0. A trace (Xᵢ ∪ Yᵢ)^ω is consistent with an environment strategy σ_env if
σ_env(Y₀Y₁···Yᵢ) = Xᵢ for every i ≥ 0. For an agent strategy σ_ag and an environment
strategy σ_env, let play(σ_ag, σ_env) denote the unique trace consistent with both σ_ag and σ_env, and
playₖ(σ_ag, σ_env) = (X₀ ∪ Y₀), . . . , (Xₖ ∪ Yₖ) the finite prefix of this trace up to position k.
Let Goal be an ltlf formula over 𝒳 ∪ 𝒴. An agent strategy σ_ag realizes Goal if for every
environment strategy σ_env there exists k ≥ 0, chosen by the agent, such that the finite trace
playₖ(σ_ag, σ_env) |= Goal; in this case we say that Goal is agent realizable.
In standard synthesis the environment is free to choose an arbitrary move at each step, but
in AI typically the agent has some knowledge of how the environment works, which it can
exploit in order to enforce its goal, specified as an ltlf formula Goal. Here, we specify the
environment behaviour by an ltl formula Env and call it the environment specification. In particular,
Env denotes the set of environment strategies that enforce Env [16]. Moreover, we require
that Env be environment realizable, i.e., that the set of environment strategies that enforce Env
is not empty. Formally, given an ltl formula φ, we say that an environment strategy σ_env enforces φ,
written σ_env ▷ φ, if for every agent strategy σ_ag we have play(σ_ag, σ_env) |= φ.
The problem of ltlf synthesis under environment specifications is to find an agent strategy
σ_ag such that

∀σ_env ▷ Env : ∃k. playₖ(σ_ag, σ_env) |= Goal.

As shown in [16], this can be reduced to solving the synthesis problem for the implication
Env → LTL(Goal), with LTL(Goal) being a suitable ltlf-to-ltl translation of Goal [22]; this
synthesis problem is 2exptime-complete [2].
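As a tiny example of the role of the environment specification (ours, not from the cited papers), let 𝒳 = {x}, 𝒴 = {y} and

```latex
Goal = \Diamond x, \qquad Env = \Box \Diamond x .
```

Without the environment specification, Goal is not agent realizable: the environment strategy that keeps x false forever defeats every agent strategy. Under Env, instead, every enforcing environment strategy makes x true at some finite instant, so the agent realizes Goal by stopping at the first instant where x holds.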
3.1. GR(1) Environment Specifications
There have been two success stories with ltl synthesis, both having to do with the form of the
specification. The first is the GR(1) approach: use safety conditions to determine the possible
transitions in a game between the environment and the agent, plus one powerful notion of
fairness, Generalized Reactivity(1), or GR(1). The second, inspired by AI planning, focuses
on finite-trace temporal synthesis, with ltlf as the specification language. In [20] we take
these two lines of work and bring them together, devising an approach to solve ltlf synthesis
under gr(1) environment specifications. In more detail, to solve the problem we observe that
the agent's goal is to satisfy ¬Env_GR(1) ∨ Goal, while the environment's goal is to satisfy
Env_GR(1) ∧ ¬Goal. Moreover, Goal can be represented by a deterministic finite automaton (dfa)
[6]. Then, focusing on the environment point of view, we show that the problem can be reduced
to a gr(1) game in which the game arena is the complement of the dfa for Goal and Env_GR(1)
is the gr(1) winning condition. Since we want a winning strategy for the agent, we need to deal
with the complement of the gr(1) game to obtain a winning strategy for the antagonist.
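A small illustration of one ingredient of this reduction is dfa complementation: for a complete dfa (total transition function), it amounts to flipping the accepting-state set. The Python sketch below uses our own representation and names, not the data structures of the actual tools.

```python
# Minimal sketch (our own encoding): complementing a complete DFA by
# flipping its accepting states. The complement accepts exactly the finite
# words rejected by the original automaton, e.g. those violating the goal.
from dataclasses import dataclass

@dataclass
class DFA:
    states: frozenset    # state names
    alphabet: frozenset  # letters; here, joint env/agent moves
    delta: dict          # total transition map: (state, letter) -> state
    initial: object
    accepting: frozenset

def complement(aut: DFA) -> DFA:
    # Correct only because `delta` is total (the DFA is complete).
    return DFA(aut.states, aut.alphabet, aut.delta,
               aut.initial, aut.states - aut.accepting)
```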
This framework can be enriched by adding safety conditions for both the environment and
the agent, obtaining a highly expressive yet still scalable form of ltl synthesis. These two kinds
of safety conditions differ, since the environment needs to maintain its safety indefinitely (as
usual for safety), while the agent has to maintain its safety conditions only until it fulfills its
ltlf goal, i.e., within a finite horizon. Again, the problem can be reduced to a gr(1) game in
which the game arena is the product of the dfa for the environment safety condition and the
complement of the dfa obtained as the product of the deterministic automaton for the agent
safety condition and the one for the agent task.
Tool. The two approaches are implemented in the tool GFSynth, which is based on two
existing tools: Syft [25] for the construction of the corresponding dfas and Slugs [26] to solve
the gr(1) game and compute a strategy.
3.2. Fairness and Stability Environment Specifications
We now consider two different basic forms of environment specifications: a basic form of
fairness, □◇α (always eventually α), and a basic form of stability, ◇□α (eventually always α),
where in both cases the truth value of α is under the control of the environment, and hence the
environment specifications are trivially realizable by the environment. Note that, due to the
presence of ltlf goals, synthesis under both kinds of environment specifications does not fall
under known easy forms of synthesis, such as gr(1) formulas [27]. For these kinds of environment
specifications, [21] develops a specific algorithm that uses the dfa for the ltlf goal as the arena
and then computes 2-nested fixpoint properties over this arena.
The algorithm proceeds as follows. First, translate the ltlf formula Goal into a dfa 𝒢. Then, in
the case of fairness environment specifications, solve the fair dfa game over 𝒢, i.e., a game over
the dfa 𝒢 in which the environment (resp. the agent) winning condition is to remain in (resp.
to avoid) a region where α holds infinitely often while the accepting states remain forever
avoidable, by applying a nested fixpoint computation on 𝒢.
Likewise, for stability environment specifications, solve the stable dfa game over 𝒢, in which
the environment (resp. the agent) winning condition is to reach (resp. to avoid) a region where
α holds forever while the accepting states remain forever avoidable, again by applying a nested
fixpoint computation on 𝒢.
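As a schematic view of such computations, the sketch below gives a generic 2-nested fixpoint schema in Python. It is ours, not the algorithm of [21]: the concrete `step` operator, built from controllable-predecessor operations over the dfa arena, is the part defined in that paper.

```python
# Generic schema (ours) for 2-nested fixpoint computations over subsets of
# the DFA state space, as used to solve fair/stable DFA games.

def lfp(f):
    """Least fixpoint of a monotone operator f on finite sets of states."""
    z = frozenset()
    while True:
        nz = frozenset(f(z))
        if nz == z:
            return z
        z = nz

def gfp(f, universe):
    """Greatest fixpoint of a monotone operator f on subsets of `universe`."""
    z = frozenset(universe)
    while True:
        nz = frozenset(f(z))
        if nz == z:
            return z
        z = nz

def nested_fixpoint(universe, step):
    """Compute nu Z. mu Y. step(Z, Y): for every outer candidate Z the inner
    least fixpoint is recomputed, giving the alternation needed for conditions
    such as 'alpha holds infinitely often while accepting states stay avoidable'."""
    return gfp(lambda Z: lfp(lambda Y: step(Z, Y)), universe)
```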
Tool. The fixpoint-based techniques for solving ltlf synthesis under fairness and stability
environment specifications are implemented in two tools called FSyft and StSyft, both based on
Syft, a tool for symbolic ltlf synthesis.
3.3. General ltl Environment Specifications
We now consider the general case where the environment specifications are expressed in both
ltlf and ltl. For this case, in [18] we develop a two-stage technique to effectively handle general
ltl environment specifications. This technique takes advantage of the simpler way of handling
ltlf goals in the first stage and confines the difficulty of handling the ltl environment specification
to the bare minimum in stage 2. In particular, given an ltl environment specification Env_LTL and an
ltlf formula Goal that specifies the agent goal, the problem is to find a strategy σ_ag for the agent
that realizes Goal under the ltl environment specification Env_LTL. The algorithm proceeds
through the following stages.
• Stage 1: Build the corresponding deterministic finite automaton 𝒜_Goal of Goal and solve
the reachability game for the agent over 𝒜_Goal (a sketch of this step follows the list). If
the agent has a winning strategy in 𝒜_Goal, then return it.
• Stage 2: If not, perform the following steps:
  1. remove from 𝒜_Goal the agent winning region, obtaining 𝒜′_Goal;
  2. take the product of 𝒜′_Goal with the corresponding deterministic parity automaton
     (dpa) 𝒟 of Env_LTL, obtaining ℬ = 𝒜′_Goal × 𝒟, and solve the parity game for the
     environment over it [28, 29, 30];
  3. if the agent has a winning strategy in ℬ, then the synthesis problem is realizable,
     and hence return the agent winning strategy as a combination of the agent winning
     strategies in the two stages.
An interesting observation in [18] concerns the case where part of the environment specification
can be expressed in ltlf, i.e., the environment specification has the form Env = Env∞ ∧ Env_f,
where Env∞ is an ltl formula and Env_f an ltlf formula. In this case the
synthesis problem Env → Goal becomes (Env∞ ∧ Env_f) → Goal, which is equivalent to
Env∞ → (Env_f → Goal), where (Env_f → Goal) is expressible in ltlf. In this way Env_f
does not contribute to the resulting dpa and can be handled during stage 1 instead of stage 2 of our
technique. Specifically, the technique builds a dfa as the union of the complement of the
dfa for Env_f and the dfa 𝒜_Goal for the goal.
Tool. The two-stage technique is implemented in the tool 2SLS, which is based on two tools:
Syft [25] for building the corresponding dfas and OWL [31], a tool for translating LTL into
different types of automata, and thus into dpas.
Acknowledgments
We thank our co-authors on the publications mentioned in this communication: Giuseppe De
Giacomo, Lucas Tabajara, Moshe Y. Vardi and Shufang Zhu. This work is partially supported
by the ERC Advanced Grant WhiteMech (No. 834228), by the EU ICT-48 2020 project TAILOR
(No. 952215), by the PRIN project RIPER (No. 20203FFYLK), and by the JPMorgan AI Faculty
Research Award "Resilience-based Generalized Planning and Strategic Reasoning".
References
[1] A. Church, Logic, arithmetic, and automata, in: Proc. Int. Congress of Mathematicians, 1963.
[2] A. Pnueli, R. Rosner, On the synthesis of a reactive module, in: POPL, 1989.
[3] A. Pnueli, The temporal logic of programs, in: FOCS, 1977, pp. 46–57.
[4] R. Ehlers, S. Lafortune, S. Tripakis, M. Y. Vardi, Supervisory control and reactive synthesis: a comparative introduction, Discrete Event Dynamic Systems 27 (2017) 209–260.
[5] B. Finkbeiner, Synthesis of Reactive Systems, Dependable Software Systems Eng. 45 (2016) 72–98.
[6] G. De Giacomo, M. Y. Vardi, Synthesis for LTL and LDL on finite traces, in: IJCAI, 2015.
[7] J. A. Baier, S. A. McIlraith, Planning with first-order temporally extended goals using heuristic search, in: AAAI, 2006, pp. 788–795.
[8] G. De Giacomo, S. Rubin, Automata-theoretic foundations of FOND planning for LTLf and LDLf goals, in: IJCAI, 2018, pp. 4729–4735.
[9] H. Geffner, B. Bonet, A Concise Introduction to Models and Methods for Automated Planning, Morgan & Claypool, 2013.
[10] C. Green, Theorem proving by resolution as basis for question-answering systems, in: Machine Intelligence, volume 4, American Elsevier, 1969, pp. 183–205.
[11] R. Bloem, R. Ehlers, S. Jacobs, R. Könighofer, How to handle assumptions in synthesis, in: SYNT, 2014.
[12] B. Bonet, G. De Giacomo, H. Geffner, S. Rubin, Generalized planning: Non-deterministic abstractions and trajectory constraints, in: IJCAI, 2017, pp. 873–879.
[13] K. Chatterjee, T. A. Henzinger, B. Jobstmann, Environment assumptions for synthesis, in: CONCUR, 2008, pp. 147–161.
[14] N. D'Ippolito, N. Rodríguez, S. Sardiña, Fully observable non-deterministic planning as assumption-based reactive synthesis, J. Artif. Intell. Res. 61 (2018) 593–621.
[15] B. Aminof, G. De Giacomo, A. Murano, S. Rubin, Synthesis under assumptions, in: KR, 2018, pp. 615–616.
[16] B. Aminof, G. De Giacomo, A. Murano, S. Rubin, Planning under LTL environment specifications, in: ICAPS, 2019, pp. 31–39.
[17] A. Camacho, M. Bienvenu, S. A. McIlraith, Finite LTL synthesis with environment assumptions and quality measures, in: KR, 2018, pp. 454–463.
[18] G. De Giacomo, A. Di Stasio, M. Y. Vardi, S. Zhu, Two-stage technique for LTLf synthesis under LTL assumptions, in: KR, 2020, pp. 304–314.
[19] G. De Giacomo, A. Di Stasio, G. Perelli, S. Zhu, Synthesis with mandatory stop actions, in: KR, 2021, pp. 237–246.
[20] G. De Giacomo, A. Di Stasio, L. M. Tabajara, M. Y. Vardi, S. Zhu, Finite-trace and generalized-reactivity specifications in temporal synthesis, in: IJCAI, 2021, pp. 1852–1858.
[21] S. Zhu, G. De Giacomo, G. Pu, M. Y. Vardi, LTLf synthesis with fairness and stability assumptions, in: AAAI, 2020.
[22] G. De Giacomo, M. Y. Vardi, Linear temporal logic and linear dynamic logic on finite traces, in: IJCAI, 2013, pp. 854–860.
[23] S. Bansal, Y. Li, L. M. Tabajara, M. Y. Vardi, Hybrid compositional reasoning for reactive synthesis from finite-horizon specifications, in: AAAI, 2020.
[24] A. Camacho, J. A. Baier, C. J. Muise, S. A. McIlraith, Finite LTL synthesis as planning, in: ICAPS, 2018, pp. 29–38.
[25] S. Zhu, L. M. Tabajara, J. Li, G. Pu, M. Y. Vardi, Symbolic LTLf synthesis, in: IJCAI, 2017, pp. 1362–1369.
[26] R. Ehlers, V. Raman, Slugs: Extensible GR(1) synthesis, in: CAV, 2016, pp. 333–339.
[27] R. Bloem, B. Jobstmann, N. Piterman, A. Pnueli, Y. Sa'ar, Synthesis of reactive(1) designs, J. Comput. Syst. Sci. 78 (2012) 911–938.
[28] A. Di Stasio, A. Murano, G. Perelli, M. Y. Vardi, Solving parity games using an automata-based algorithm, in: CIAA, 2016, pp. 64–76.
[29] M. Jurdziński, Small progress measures for solving parity games, in: STACS, 2000, pp. 290–301.
[30] A. Di Stasio, A. Murano, M. Y. Vardi, Solving parity games: Explicit vs symbolic, in: CIAA, 2018, pp. 159–172.
[31] J. Křetínský, T. Meggendorfer, S. Sickert, Owl: A library for ω-words, automata, and LTL, in: ATVA, 2018, pp. 543–550.