Bounded-Monitor Placement in Normative Environments

Guilherme Krzisch†, Nir Oren‡, and Felipe Meneguzzi†
† Pontifical Catholic University of Rio Grande do Sul, Brazil
‡ University of Aberdeen, Scotland, UK
guilherme.krzisch@acad.pucrs.br, n.oren@abdn.ac.uk, felipe.meneguzzi@pucrs.br

Abstract. In order to sanction non-compliant agents, norm violations must be detected, which in turn requires norm monitoring. This paper examines the problem of monitor placement within a normative multi-agent system under budgetary constraints. More specifically, we consider a system containing (1) a set of possible monitors, each able to determine the state of a subset of the domain; (2) costs associated with deploying the monitors; and (3) a set of norms for which compliance must be monitored, and which, if violated, result in a penalty. We seek to identify which combination of monitors maximizes the system's utility. We formalize the problem and evaluate approximate solutions using several heuristics, empirically demonstrating their efficiency.

1 Introduction

Since agents within open multi-agent systems cannot be assumed to share goals, obtaining desirable outcomes requires a coordination mechanism, and norms [5] are often used to perform this coordination. While regimented norms, which prevent an agent by design from undertaking undesirable behavior, are often assumed, a substantial body of work has shown the advantages of using approaches based on enforcement [8, 10]. These latter approaches, which allow norms to be violated, require a mechanism that monitors for norm compliance or violation and applies sanctions when appropriate [11]. In turn, norm enforcement can be performed either by some organization or by other autonomous agents in the system [14]. Norm enforcement assumes that violation and compliance can always be detected [6, 7, 11]. Such an assumption is, however, clearly unrealistic. First, in a large system, the cost of monitoring is very high [15].
Second, there are often portions of the environment that are not fully observable, and which monitors cannot access. There is therefore a need to investigate how norms should be monitored. Previous work on the monitoring of norms has considered how norms should be modified so as to be monitorable [2], and whether related states can be observed which may indicate upcoming norm violations [1]. In this paper, we consider instead how monitors should be deployed within a system, assuming that such deployments have a cost, so as to monitor the most important norms. Our approach builds on ideas from the planning literature, and in Section 2, we introduce the necessary concepts from this domain and formalize the notion of a norm. Section 3 introduces monitors and formally describes the problem we are addressing. We describe several approaches to addressing the problem in Section 4. Finally, Section 5 evaluates our solutions, and we conclude by considering related work (Section 6) and directions for future research (Section 7).

2 Background

In this section we introduce concepts from the planning literature, used to formalise the system and our solution. Following this, we describe norms within our system.

2.1 Planning

We build on classical planning, which assumes finite, fully observable and deterministic systems, and adapt the definitions from Ghallab et al. [9, Ch. 2]. Systems that follow classical planning semantics assume that actions in the domain cause transitions between states, and are specified in terms of sets of predicates.

Definition 1 (Predicate and State). A predicate in a first-order language L is composed of a symbol and zero or more terms. Each term can be either a constant or a variable; a predicate is ground if it does not contain variables. We denote as |L| the number of ground predicates in this language. A state is a set of ground literals (i.e., positive or negative predicates) in L.
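To make these definitions concrete, the following minimal Python sketch represents states as sets of ground predicates and implements the satisfaction test s |= P used below, together with the action-update rule of Definition 2. All predicate names here are our own illustrations, not examples from the paper.

```python
# States (Definition 1) as sets of ground predicates; the closed-world
# satisfaction test s |= P; and the update rule of Definition 2:
# s' = (s \ eff-(a)) U eff+(a).  Predicate names are illustrative.

def satisfies(state, predicates):
    """s |= P: every ground predicate in P is present in the state."""
    return set(predicates) <= set(state)

def apply_action(state, pre, eff_neg, eff_pos):
    """Apply an action to a state; None if its preconditions fail."""
    if not satisfies(state, pre):
        return None
    return (set(state) - set(eff_neg)) | set(eff_pos)

s = {"at(England)", "sober"}
satisfies(s, {"at(England)"})                       # True
apply_action(s, pre={"at(England)"},
             eff_neg={"sober"}, eff_pos={"drunk"})  # {"at(England)", "drunk"}
```

Under the closed-world assumption, any predicate absent from the set simply does not hold, so no explicit negative literals are needed in this sketch.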
We use the operator |= to specify that a state (a set of predicates) satisfies a logical formula, i.e., that the state is a model of the formula. Classical planning makes a closed-world assumption: if a state does not specify a predicate, then this predicate does not hold in that state. We write s |= P, where P is a set of ground predicates, if the conjunction of these ground predicates is satisfied by s. We formalize action execution, and its effects on the environment, as follows.

Definition 2 (Planning Operator and Actions). A planning operator is represented by a triple o = ⟨name(o), pre(o), eff(o)⟩, where name(o) is the description of o, and pre(o) and eff(o) are sets of predicates representing the planning operator's preconditions and effects. An action a is a ground instance of a planning operator, and is applicable in state s only if s |= pre(a). The result of applying action a to state s is a new state s′ such that s′ = (s \ eff−(a)) ∪ eff+(a), where eff = eff+ ∪ eff− (with eff+ ∩ eff− = ∅), eff− being the set of negated predicates and eff+ the set of positive predicates.

We specify the dynamics of our multiagent system in terms of a transition function following classical planning semantics in Definition 3, and the initial states of the system in terms of planning problem instances, as per the following definitions.

Definition 3 (Planning Domain). If L is a first-order language with finite sets of predicates and constants, a planning domain in L is a state-transition system Σ = ⟨S, A, γ⟩, where S ⊆ 2^|L| is a subset of all possible states; A is the set of all ground instances of planning operators; and γ(s, a) is a state-transition function defined as follows: if a ∈ A and a is applicable to s ∈ S, it returns the next state s′ ∈ S, which is the result of applying action a to state s. Note that S is closed under γ, i.e., given a state s ∈ S, all states reachable by applying an action a in s are also in S.

Definition 4 (Planning Problem).
A planning problem is defined as a triple P = ⟨Σ, s0, g⟩, where Σ is the state-transition system, s0 ∈ S is the initial state of the problem, and g, the goal, is a set of ground predicates.

2.2 Norms

In open and dynamic societies, self-interested agents cannot be assumed to share the same set of goals. In this context, norms can be used to regulate and coordinate behavior [13, Ch. 14]. For our work, we adapt the definition of a norm from [12]. We consider two norm types: obligations and prohibitions; the former specifies behavior that must be followed by agents, while the latter specifies behavior that must be avoided.

Definition 5 (Norm). A norm is a tuple n = ⟨µ, χ, ρ, C⟩, where:
– µ ∈ {obligation, prohibition} represents the norm's modality;
– χ is a set of ground predicates that represents the context to which the norm applies, i.e., a norm is applicable in state s if s |= χ;
– ρ ∈ A represents the object of the norm's modality;
– C is the cost or penalty to the society which occurs if the norm is violated.

Example 1. The following norm requires an agent to drive on the left side of the road if they are in England; a violation causes harm to the society worth 20 units of utility.

n0 = ⟨obligation, at(England), driveLeft(a, b), 20⟩

We now describe when a norm is considered to be violated by an agent.

Definition 6 (Norm Violation). A norm n = ⟨µ, χ, ρ, C⟩ is violated in state s by an agent a iff:
– s |= χ; and
– agent a either: executes action ρ in state s and µ = prohibition; or does not execute action ρ in state s and µ = obligation.

A violated norm has an undesirable impact on the society, as encoded by its cost C. To dissuade agents from violating norms, when such a violation is detected, an enforcer applies a penalty to the agent. In this work, we do not consider the nature of this penalty, assuming instead that it is sufficiently large to prevent the agent from violating a norm if such a violation can be detected.
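The violation test of Definition 6 is direct to implement. The sketch below assumes norms encoded as Python tuples and a flag recording whether the agent executed the norm's action; this encoding is ours, not the paper's.

```python
# Sketch of the norm-violation test of Definition 6.  A norm is a tuple
# (modality, context, rho, cost); `executed` records whether agent a
# performed action rho in state s.  Names are illustrative.

def violated(norm, state, executed):
    modality, context, rho, cost = norm
    if not context <= state:       # the norm only applies if s |= chi
        return False
    if modality == "prohibition":
        return executed            # forbidden action was executed
    return not executed            # obliged action was not executed

# Example 1: obligation to drive on the left while in England.
n0 = ("obligation", frozenset({"at(England)"}), "driveLeft(a,b)", 20)
violated(n0, {"at(England)"}, executed=False)  # True: obligation ignored
violated(n0, {"at(France)"}, executed=False)   # False: context not met
```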
We must therefore consider how to place monitors so that violations are detected, while minimizing monitor costs. We refer to this as the bounded-monitor placement problem, and describe it in more detail in the next section.

3 Bounded-Monitor Placement Problem

We consider a set of monitors able to determine whether some combination of predicates is, or is not, satisfied. Formally, we define a monitor as follows.

Definition 7 (Monitor). A monitor m = ⟨P, D⟩ consists of a set of predicates P, and a deployment cost D ∈ R. We refer to the predicates of a monitor m as Pm, and to its cost as Dm.

A set of monitors can be used to monitor more complex combinations of predicates. Some monitors can conceivably detect the status of a predicate related to multiple norms, or, when combined, can be used to determine the status of a norm that individual monitors cannot. We formalize the combination of monitors aiming to cover sets of norms as a monitor placement.

Definition 8 (Monitor Placement). A monitor placement MP is a tuple ⟨M1, M2⟩, where M1 and M2 are sets of monitors, representing the capability of observing predicates in the current and in the next state, assuming an action was performed. A monitor placement detects a norm violation ⟨µ, χ, ρ, C⟩ iff, given a state s and s′ such that s′ is the result of applying an action ρ at s, and s |= χ, one of the following holds:
1. µ = prohibition and s |= ⋀{Pm1 | m1 ∈ M1} and s′ |= ⋀{Pm2 | m2 ∈ M2}; or
2. µ = obligation and s |= ⋀{Pm1 | m1 ∈ M1} and s′ ⊭ ⋀{Pm2 | m2 ∈ M2}.

The cost C of a monitor placement is Σ_{m ∈ M1 ∪ M2} Dm, while its utility U is Σ_{n ∈ N} Cn, where N is the set of norms detected by this placement.

By introducing the concept of an available budget, we define our problem of placing monitors in a system as follows.

Definition 9 (Bounded-Monitor Placement Problem). A bounded-monitor placement problem is encoded as a triple ⟨M, N, B⟩, where M is a set of monitors, N is a set of norms, and B ∈ R+ is a budget.
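The cost and utility of Definition 8 can be computed directly. The sketch below assumes monitors encoded as (predicates, cost) pairs and a given set of detected norms; establishing detection is exactly conditions 1 and 2 above, which we do not re-implement here. Summing over M1 ∪ M2 means a monitor deployed in both sets is paid for once, which is our reading of the cost formula.

```python
# Sketch of monitor-placement cost and utility (Definition 8).
# A monitor is a (frozenset_of_predicates, deployment_cost) pair;
# a placement is a pair (M1, M2) of monitor sets.

def placement_cost(m1, m2):
    """Sum D_m over every monitor deployed in M1 or M2 (counted once)."""
    return sum(d for (_, d) in m1 | m2)

def placement_utility(detected_norms):
    """Sum the penalty C_n over the norms the placement detects."""
    return sum(c for (*_, c) in detected_norms)

# Illustrative data: one camera-like and one GPS-like monitor.
cam = (frozenset({"at(England)"}), 3.0)
gps = (frozenset({"driving"}), 2.0)
n0 = ("obligation", frozenset({"at(England)"}), "driveLeft(a,b)", 20)

placement_cost({cam}, {cam, gps})  # 5.0: cam is counted only once
placement_utility({n0})            # 20
```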
A solution to the problem is a monitor placement MP such that its cost is smaller than, or equal to, the budget. The set of possible solutions for a monitor placement problem is exponential in the worst case, and in the next section, we suggest several approaches to finding solutions.

4 Solution

4.1 Brute-force

A brute-force approach is a trivial solution to this problem. It considers all possible combinations of available monitors and returns the best one. This is clearly impractical for large problems, as its complexity increases exponentially with the size of the set of possible monitors: specifically, it has a time complexity of O(2^(2|M|)). We use this approach as a baseline against which we compare the remaining heuristics.

4.2 Mapping from norms to monitors

The main drawback of the brute-force approach is that it includes a large number of irrelevant solutions while enumerating all possible solutions. We can reduce this overhead by computing a mapping from norms to monitor placements, i.e., by finding the set of all monitor placements capable of monitoring each norm. With this mapping, we can compute possible solutions by choosing one monitor placement for each norm. Generalizing, we have Π_{i=1}^{|N|} (|MP_{ni}| + 1) possible solutions, where MP_{ni} is the set of monitor placements able to monitor norm ni; since we also need to consider whether it is practical to monitor a given norm (e.g., when there is no available budget to do so), we add the empty monitor placement to each set. In the best case we have only one possible monitor placement for each norm, and in the worst case we have all possible combinations of available monitors for M1 and M2, for each norm. Therefore, the number of solutions ranges from 2^|N| to 4^(|M||N|). Given this exponential complexity, it is clear that neither the brute-force nor the monitor-mapping approach can scale up to larger problem sets.
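The brute-force enumeration can be sketched as follows. Here `detects`, `cost`, and `utility` are passed in as stand-ins for the definitions of Section 3, so the snippet only illustrates the structure of the search, not the paper's implementation.

```python
from itertools import chain, combinations

# Brute-force sketch: try every pair (M1, M2) of monitor subsets and
# keep the best placement whose cost respects the budget.

def powerset(items):
    xs = list(items)
    return chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))

def brute_force(monitors, norms, budget, detects, cost, utility):
    best, best_u = (frozenset(), frozenset()), 0
    for m1 in powerset(monitors):
        for m2 in powerset(monitors):
            m1s, m2s = frozenset(m1), frozenset(m2)
            if cost(m1s, m2s) > budget:
                continue  # infeasible placement
            u = utility({n for n in norms if detects(m1s, m2s, n)})
            if u > best_u:
                best, best_u = (m1s, m2s), u
    return best, best_u
```

With |M| monitors this visits 2^|M| · 2^|M| = 4^|M| candidate placements, matching the O(2^(2|M|)) bound above and showing why the approach serves only as a baseline.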
We must therefore consider heuristics for addressing the problem, which we describe next.

4.3 Naive Approximate Solution

To improve performance compared to the brute-force approach we introduce a simple approximate solution whose purpose is to serve as a baseline for comparing the accuracy of the other solutions. This solution iterates over monitors ranked by their expected probability of detecting norm violations. To rank monitors, we consider the number of norms that a single monitor can partially detect: a monitor can partially detect a norm if it has at least one predicate of the norm's context or of the preconditions of the norm's action ρ. The intuition here is that choosing monitors that can partially observe several norms leads to a final monitor placement that can detect many existing norms. This approach, however, does not capture essential parts of the problem. First, it does not consider the norms' penalties and the monitors' costs. Second, it makes the overly strong assumption that joining monitors that can partially detect norms will lead to a monitor placement that can actually detect them. We can therefore enhance this approach by using the mapping described in Section 4.2 together with a greedy search.

4.4 Greedy Solution

We propose two approaches with different heuristics using the mapping structure introduced in Section 4.2. By using a heuristic to rank the best monitor placements, we can avoid searching through an exponential solution search space. More specifically, we select the best monitor placement candidate for each norm. The resulting heuristic sacrifices optimality for efficiency, running in linear time. Our base algorithm (used for both of our heuristics) is described in Algorithm 1. It starts by creating a mapping between norms and monitor placements, as described in Section 4.2. After this, it adds monitors to an initially empty monitor placement currentMP while budget is available.
Within each iteration, it selects one norm, gets a monitor placement able to monitor this norm, and adds this placement's monitors to the already selected monitors in currentMP.

Algorithm 1 Greedy algorithm
1: procedure FindApproximateSolution(N: Norms)
2:   build mapping from norms to monitor placements
3:   currentMP ← {}
4:   while hasBudget do
5:     n ← extractMaxNorm(N)
6:     mp ← getMinMP(n)
7:     currentMP ← currentMP ∪ mp
8:   return currentMP

We now describe two heuristics to speed up this algorithm.

4.4.1 Norm Independence Heuristic Our first heuristic considers norms to be monitored completely independently of each other when choosing monitors, in order to substantially prune the search space of the problem. We first need to define which norm extractMaxNorm(N) in line 5 of Algorithm 1 chooses. Here, we select the norm with the highest penalty, in order to increase the value of U (the sum of each individual norm penalty), i.e., extractMaxNorm(N) = arg max_{n ∈ N} Cn. The other decision required is which monitor placement is selected by the getMinMP(n) method (line 6). For this, we select the placement with the lowest cost among those able to monitor norm n. Thus, getMinMP(n) = arg min_{MP ∈ MPn} CMP.

This approach does not consider the intersection of monitors that are able to monitor multiple norms; it selects each monitor placement independently of the others. Consequently, we improve this solution in what follows by introducing the concept of current cost to try to find a better approximate solution for our problem.

4.4.2 Add and Update Heuristic The structure of this heuristic is similar to the previous approach. However, instead of using the cost of each monitor placement as the metric to select the cheapest one, we use its current cost.
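Our reading of Algorithm 1 with current-cost-based selection can be sketched as follows. The norm-to-placements mapping is taken as given, monitors are (name, cost) pairs, and a norm's penalty is its last tuple element; all of this encoding is illustrative, not the paper's implementation.

```python
# Sketch of the Add and Update greedy heuristic.  `mapping` maps each
# norm to the monitor placements (sets of (name, cost) monitors) able
# to detect it; a norm's penalty C_n is its last tuple element.

def current_cost(mp, chosen):
    """Cost of the monitors in MP that are not already selected."""
    return sum(cost for (name, cost) in mp if (name, cost) not in chosen)

def add_and_update(norms, mapping, budget):
    chosen, spent = set(), 0.0      # currentMP and its accumulated cost
    remaining = set(norms)
    while remaining:
        def score(n):               # C_n / CC of n's cheapest placement
            mp = min(mapping[n], key=lambda p: current_cost(p, chosen))
            cc = current_cost(mp, chosen)
            return (n[-1] / cc if cc > 0 else float("inf")), mp
        n = max(remaining, key=lambda n: score(n)[0])
        _, mp = score(n)
        cc = current_cost(mp, chosen)
        remaining.discard(n)
        if spent + cc <= budget:    # only deploy while budget allows
            chosen |= set(mp)
            spent += cc
    return chosen
```

In this sketch a norm whose cheapest placement exceeds the remaining budget is simply skipped; the `hasBudget` condition in Algorithm 1 leaves this choice open.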
The current cost of a given monitor placement MP is computed as per Formula 1 below, and considers only the cost of monitors that were not already selected in a previous iteration (and thus are not in the set of monitors of currentMP).

CC_MP = Σ_{m ∈ M_MP ∧ m ∉ M_currentMP} Dm    (1)

getMinMP(n) = arg min_{MP ∈ MPn} CC_MP    (2)

extractMaxNorm(N) = arg max_{n ∈ N} Cn / CC_getMinMP(n)    (3)

The getMinMP(n) method is implemented as per Formula 2, which we use to compute the next norm to be monitored using the extractMaxNorm(N) method, implemented in Formula 3. This heuristic chooses the norm with the highest ratio between its penalty and the lowest current cost over its set of monitor placements. By using the current cost, we disregard the costs of monitors that have already been chosen in past iterations, yielding better estimates. In the next section we empirically evaluate the different approaches proposed in this section.

5 Experiments and Results

To empirically evaluate our approaches, we automatically generate sets of norms and sets of monitors of increasing complexity. In our experiments, we assumed the set of monitors M2 of a monitor placement can in effect see all actions that were executed from the states visible by M1; i.e., we assume that our monitor placement is always able to check the next state after applying a given action. While this is a strong assumption and makes our problem easier to solve, it still captures the exponential nature of the bounded-monitor placement problem. Our experiments are based on standard planning problems, and our main domain is the drink-driving domain, where agents are able to drive between cities, and there is a norm stating that it is forbidden to drive while drunk; we also tested with the blocksworld, depots, dwr, easy ipc grid, gripper, logistics and robby domains. There are two metrics to consider when analyzing results: time performance and accuracy.
When comparing time performance we use the brute-force approach as a baseline for the approximate approaches. In Figure 1 we can see that the brute-force approach becomes intractable for a relatively small number of norms, while the approximate approaches remain linear as the number of norms increases.

[Fig. 1. Time efficiency (ms) of the brute-force and approximate approaches against the number of norms, with a timeout of 60 seconds.]

[Fig. 2. Time efficiency (ms) of both greedy approaches against the number of norms, smoothed using a sample of 100 data points and interpolated using splines.]

Figure 2 shows the time performance of both greedy solutions. The Add and Update solution is worse in this metric than the Norm Independence solution because it needs to recompute the current cost at each iteration. It is still fast compared to the brute-force approach, being able to find a solution for a problem with almost 300 norms in under one second.

We perform two experiments to compare the accuracy of the results of the approximate approaches. First, in Figure 4, we show the relative accuracy compared with an optimal solution to the problem using the brute-force approach. Note that, as we are comparing with the brute-force approach, these results are limited to small problems that this approach can solve¹. In this experiment, both greedy approaches outperform the naive solution; between the two greedy approaches, the second one (Add and Update Heuristic) has, in almost all domains, greater accuracy, and, in all cases, a smaller standard deviation. For the depots, gripper and logistics domains, the second solution achieves optimal accuracy; this can be explained as these domains become intractable even for a small number of norms, and thus the number of problems in this experiment, for these domains, is also small. The increase in accuracy from the first to the second greedy approach is relatively small for these domains, while it is relatively large for the drink-driving and robby domains.

[Fig. 3. Accuracy of the approximate approaches per domain (blocksworld, depots, drink-driving, dwr, easy ipc grid, gripper, logistics, robby), compared to the maximum U possible; error bars represent one standard deviation of uncertainty.]

[Fig. 4. Accuracy of the approximate approaches per domain, compared to a brute-force solution; error bars represent one standard deviation of uncertainty.]

To investigate whether the relationships found in the first experiment hold for large problems, we perform additional experiments, shown in Figure 3. As these problems cannot be solved in a timely manner using a brute-force approach, in order to calculate their accuracy we compare them with a perfect solution that could monitor all norms, but that does not necessarily need to respect the available budget. This perfect solution has the maximum value of U, which can be unattainable for actual solutions to these problems; therefore, in this experiment we are interested in the relative accuracy between the approximate approaches, and not in how close they are to the perfect solution. We can see the same pattern in this experiment: the greedy approaches perform better than the naive solution, with a slight advantage for the Add and Update Heuristic. The increase in performance from the first to the second greedy approach remains large for the drink-driving domain, while for other domains it is relatively small.
From the experiments we conclude that the brute-force solution is intractable for all but small problems, while the approximate approaches can solve large problems. The accuracy of the greedy approaches is better than that of the naive solution, with a slight advantage for the second greedy approach. This advantage is more noticeable for small problems; for large problems, as we do not have the value of the optimal solution, the difference is smaller.

¹ We set a timeout of 30 seconds for this experiment.

6 Related Work

Other authors have also dropped the assumption that a monitor has full observability of the system. Criado's approach [4] is similar to our work; their norms are more expressive, and their solution uses the CEF algorithm, which takes a greedy approach. However, they do not perform an empirical evaluation, or consider whether their approach is able to actually monitor any norm, as there are no guarantees proven for their proposed algorithm.

Alechina et al. [2] model a set of queries that a monitor can ask in a state, i.e., a monitor may not be able to distinguish between two states. They then modify the set of norms to a new set of approximate norms that can be optimally monitored given a set of monitors and queries, therefore approaching the problem from another perspective.

Alechina et al. [1] define norms using LTL (linear temporal logic) formulas. They introduce the concept of a guard, which uses lookahead mechanisms to detect future norm violations. The size of the lookahead window is bounded to reduce the amount of computation about the future (they have complete knowledge of the past), and this bound has similarities with the concept of monitor cost in our work. The main difference is that, while we use a combination of monitors to be able to detect a norm violation, they increase their lookahead window size to increase their monitoring capabilities, also increasing the computational cost.

Finally, our work is also similar to that of Bulling et al.
[3]; both include the concept of monitors and combinations of monitors. While we use a relatively simple norm formalism and optimize the cost of monitoring these norms, their specification uses LTL formulas, focusing on their properties and relations. Their current framework does not include norm or monitor costs.

7 Conclusion and Future Work

In this paper we extend the state of the art in norm monitoring by dropping the assumption that a monitoring system has full observability, i.e., that monitors can observe all actions performed. Adding the notion of a set of available monitors and an associated cost results in the problem of finding a monitor placement that maximizes the number of detectable violations. While brute-force solutions are impractical because the possible solution search space is exponential in the size of the input, we propose heuristics that use a mapping between norms and monitors to find approximate solutions. Our empirical evaluation of runtime performance and accuracy shows that these heuristics are both practical in computational terms and approach optimal performance for many realistic domains from the planning literature.

We aim to extend the work in at least three ways. The first is to allow monitors to dynamically modify their placement during execution. In this setting, monitors would be able to observe agent actions and move to locations where more norm violations occur. The second extension would be to consider different agents being able to perform concurrent actions, and how to build monitors able to correctly identify which agent violated a given norm. The third is to allow more complex expressions representing both what monitors can observe and how monitors can be combined, as currently we only consider conjunctions of monitors. This can increase the richness of our approach, and we intend to investigate heuristics for the use of such more expressive monitors.
Acknowledgments

This work is partially supported by grants from CNPq/Brazil, numbers 132339/2016-1 and 305969/2016-1.

References

1. Alechina, N., Bulling, N., Dastani, M., Logan, B.: Practical run-time norm enforcement with bounded lookahead. In: Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems. pp. 443–451 (2015)
2. Alechina, N., Dastani, M., Logan, B.: Norm approximation for imperfect monitors. In: Proceedings of the 2014 International Conference on Autonomous Agents and Multiagent Systems. pp. 117–124 (2014)
3. Bulling, N., Dastani, M., Knobbout, M.: Monitoring norm violations in multi-agent systems. In: Proceedings of the 2013 International Conference on Autonomous Agents and Multiagent Systems. pp. 491–498 (2013)
4. Criado, N.: A practical resource-constrained norm monitor. In: Proceedings of the 2017 International Conference on Autonomous Agents and Multiagent Systems. pp. 1508–1510 (2017)
5. Dignum, F.: Autonomous agents with norms. Artificial Intelligence and Law 7(1), 69–79 (1999)
6. Esteva, M., Rosell, B., Rodríguez-Aguilar, J.A., Arcos, J.L.: AMELI: An agent-based middleware for electronic institutions. In: Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems. pp. 236–243 (2004)
7. García-Camino, A., Noriega, P., Rodríguez-Aguilar, J.A.: Implementing norms in electronic institutions. In: Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems. pp. 667–673 (2005)
8. García-Camino, A., Rodríguez-Aguilar, J.A., Sierra, C., Vasconcelos, W.: Constraint rule-based programming of norms for electronic institutions. In: Proceedings of the 2009 International Conference on Autonomous Agents and Multiagent Systems. pp. 186–217 (2009)
9. Ghallab, M., Nau, D., Traverso, P.: Automated Planning: Theory & Practice. Elsevier (2004)
10. Luck, M., d'Inverno, M.: Constraining autonomy through norms.
In: Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems. pp. 674–681 (2002)
11. Modgil, S., Faci, N., Meneguzzi, F., Oren, N., Miles, S., Luck, M.: A framework for monitoring agent-based normative systems. In: Proceedings of the 2009 International Conference on Autonomous Agents and Multiagent Systems. pp. 153–160 (2009)
12. Oren, N., Panagiotidi, S., Vázquez-Salceda, J., Modgil, S., Luck, M., Miles, S.: Towards a formalisation of electronic contracting environments. In: Coordination, Organizations, Institutions and Norms in Agent Systems IV, pp. 156–171 (2009)
13. Ossowski, S.: Agreement Technologies, vol. 8. Springer Science & Business Media (2012)
14. Savarimuthu, B.T.R., Cranefield, S.: Norm creation, spreading and emergence: A survey of simulation models of norms in multi-agent systems. Multiagent and Grid Systems 7(1), 21–54 (2011)
15. Sutinen, J.G., Andersen, P.: The economics of fisheries law enforcement. Land Economics 61(4), 387–397 (1985)