Now or Never: Negotiating Efficiently with Unknown Counterparts Toni Mancini Computer Science Department, Sapienza University of Rome, Italy http://tmancini.di.uniroma1.it Abstract. We define a new protocol rule, Now or Never (NoN), for bilateral negotiation processes which allows self-motivated competitive agents to efficiently carry out multi-variable negotiations with remote untrusted parties, where privacy is a major concern and agents know nothing about their opponent. By building on the geometric concepts of convexity and convex hull, NoN ensures a continuous progress of the negotiation, thus neutralising malicious or inefficient opponents. In par- ticular, NoN allows an agent to derive in a finite number of steps, and independently of the behaviour of the opponent, that there is no hope to find an agreement. To be able to make such an inference, the interested agent may rely on herself only, still keeping the highest freedom in the choice of her strategy. We also propose an actual NoN-compliant strategy for an automated agent and evaluate the computational feasibility of the overall approach on instances of practical size. 1 Introduction Automated negotiation among rational agents is crucial in Distributed Arti- ficial Intelligence domains as, e.g., resource allocation [3], scheduling [16], e- business [7], and applications where: (i) no agent can achieve her own goals without interaction with the others (or she is expected to achieve more utility with interaction), and (ii) constraints of various kinds (e.g., security or privacy) forbid the parties to communicate their desiderata to others (the opponent or a trusted authority), hence centralised approaches cannot be used. We present a framework which allows two self-motivated, competitive agents to negotiate efficiently and find a mutually satisfactory agreement in a particu- larly hostile environment, where each party has no information on constraints, preferences, and willingness to collaborate of the opponent. This means that also the bounds of the domains of the negotiation variables are not common knowl- edge. Our framework deals with negotiations over multiple constrained variables over the type of real numbers, regarding integer or categorical variables as special cases. The present setting is very different from what is often assumed in the liter- ature: the set of possible agreements is infinite and agents do not even know (or probabilistically estimate) possible opponent’s types, variable domain bounds or most preferred values. It is not a split-the-pie game as, e.g., in [6] although with incomplete information, as in [11], and computing equilibrium or evaluating Pareto-optimality is not possible. 47 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts A major problem in our setting is that even termination of the negotiation process is not granted: it is in general impossible for the single agent to recognise whether the negotiation is making some progress, or if the opponent is just wasting time or arbitrarily delaying the negotiation outcome. We solve this problem by proposing a new protocol rule, Now or Never (NoN) (Section 3), explicitly designed as to ensure a continuous progress of the nego- tiation. The rule (whose fulfilment can be assessed independently by each party using only the exchanged information) forces the agents to never reconsider already taken decisions, thus injecting a minimum, but sufficient amount of effi- ciency in the process. This leads to the monotonic shrinking of the set of possible agreements, which in turn allows each agent to derive in a finite number of steps, independently of the behaviour of the opponent, that there is no hope to find an agreement. Furthermore, we discuss the notion of non-obstructionist agents, i.e., agents who genuinely aim at efficiently finding an agreement, even sacrificing their pref- erences (among the agreements they would accept). If both agents are non- obstructionist, the NoN rule guarantees that, whenever the termination con- dition arises, then no agreement actually exists. Hence, in presence of non- obstructionist agents, our approach is both complete and terminates. We also propose (Section 4) a full NoN-compliant strategy for an agent which ensures termination independently of the behaviour of the opponent. The strat- egy, which takes into full account the presence of a utility function on the set of acceptable deals, is inspired to the well-known mechanism of Monotonic Conces- sions (MC) [13] and allows the agent to perform a sophisticated reasoning, based on the evidence collected so far on the behaviour of the opponent, to select the best deals to offer at each step and keep the process as efficient as possible. Section 5 specialises NoN to discrete and categorical variables and Section 6 presents experimental results showing that enforcing the NoN rule in practical negotiation instances is computationally feasible. 2 Preliminaries and Negotiation Framework In the following, we denote with R the set of real numbers and with N+ the set of strictly positive integers. Our framework deals with (possibly multi-deal) negotiations between two agents (agent 0 and agent 1) over multiple constrained variables. Agents do not have any information about constraints, goals, preferences, reasoning capabili- ties, willingness to collaborate, and strategy of the opponent. The only knowledge common to both agents is the set of negotiation variables and the protocol rules. Definition 1 introduces the main concepts of our framework. Some of them are standard in the literature and are adapted to our framework to ease presentation of the following definitions and results. Definition 1 (Negotiation process). A negotiation process is a tuple π = hV, s, k, Ri where V is a finite set of negotiation variables, s ∈ {0, 1} is the agent starting the negotiation, and R is the set of protocol rules. The negotiation space is the multi-dimensional real vector space R|V| . Each point D ∈ R|V| is a deal. A proposal for π is a set of at most k deals or the 48 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts distinguished element ⊥. Value k ∈ N+ is the maximum number of deals that can be included in a single proposal. Negotiation proceeds in steps (starting from step 1) with agents (starting from agent s) alternately exchanging proposals. The proposal exchanged at any step t ≥ 1 is sent by agent ag(t), defined as s if t is odd and 1 − s if t is even. The status of negotiation process π at step t ≥ 1 is the sequence P = P1 , P2 , . . . Pt of proposals exchanged up to step t. At each step, the status of π must satisfy the set R of procotol rules, a set of boolean conditions on sequences of proposals. A strategy for agent A ∈ {0, 1} for π is a function σA that, for each step t such that ag(t) = A and each status P = P1 , P2 , . . . Pt−1 of π at step t − 1, returns the proposal Pt to be sent by agent A at step t, given the sequence of proposals already exchanged (σA is constant for t = 1 and A = s). Our alternating offers [14] based framework primarily focuses on real vari- ables. In Section 5 we discuss how more specialised domains (e.g., integers, cate- gories) can be handled as special cases, and which is the added value of primarily dealing with real variables. Also, as each proposal can contain up to k deals, our framework supports multi-deal negotiations when k > 1. Section 4 discusses the added value given by the possibility of exchanging multi-deal proposals. Protocol rules are important to prevent malicious or inefficient behaviour. Well-designed rules are of paramount importance when the process involves self- motivated and/or unknown/untrusted opponents. For protocol rules to be effec- tive, agents must be able at any time to verify them using the current negotiation status only. We will use Example 1 as a running example throughout the paper. Example 1 (Alice vs. Bob). Alice wants to negotiate with her supervisor Bob to schedule a meeting. At the beginning, agents agree on the relevant variables V: (i) the start day/time t; (ii) the meeting duration d. Deals are assignments of values to variables, as, e.g., D = ht = “Mon 11 am”, d = “30 min”i. Deals can be easily encoded as points in R2 . Definition 2 (Negotiation outcomes). Let π = hV, s, k, Ri be a negotiation process whose status at step T > 1 is P = P1 , P2 , . . . PT . We say that π ter- minates at step T if and only if T is the smallest value such that one of the following two cases holds: – success: PT = {D} ⊆ PT −1 (ag(T ) accepts deal D proposed by ag(T − 1) at step T − 1) – opt-out: PT =⊥ (ag(T ) opts-out). If no such a T exists, then π is non-terminating ( non-term). Success, opt-out and non-term are the possible negotiation outcomes. A negotiation process can be infinite (case non-term) or terminate in a finite number of steps, either with an agreement found (case success, where point D is the agreement) or with a failure (case opt-out, where one of the agents proposes ⊥, which aborts the process). For a deal to be acceptable to an agent, some constraints must be satis- fied. Such constraints, which are private information of the single agent, are formalised by Definition 3. 49 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts Definition 3 (Feasibility region). Let π = hV, s, k, Ri be a negotiation pro- cess. The feasibility region of agent A ∈ {0, 1}, denoted by RA , is the subset of the negotiation space R|V| of deals acceptable to A. For agent A, any deal in RA is better than failure. An agreement is thus any deal D ∈ R0 ∩ R1 . Example 2 (Alice vs. Bob (cont.)). Alice wants the meeting no later than Wednes- day. Normally she needs at least 30 minutes and does not want the meeting to last more than one hour; however, if she has to wait until Wednesday, she would have time during her Tuesday’s trip to prepare new material to show; in this case she wants the meeting to last at least one hour, but no more than 75 minutes. Conversely, Bob has his own, private, constraints. Fig. 1a shows Alice’s feasibility region in a 2D space, as the areas delimited by the three polygons. The region takes into account duties in her agenda (e.g., Alice is busy on Monday from 1pm to 4pm). Agents may have preferences on the deals in their feasibility region. Such preferences are often represented by a private utility function. Fig. 1a shows that, e.g., Alice prefers a long meeting on Monday. We will handle the agent utility function in Section 4 when we present a full strategy for an agent. We assume (as typically done, see, e.g., [5,12]) that agents offer only deals in their feasibility region (i.e., agents do not offer deals they are not willing to accept). This does not limit our approach, as suitable ex-post measures (e.g., penalties) can be set up to cope with the case where a deal offered by an agent (but not acceptable to her) is accepted by the other. 3 Now or Never We are interested in negotiations which are guaranteed to terminate in a finite number of steps (note that, being negotiation variables real-valued, the set of po- tential agreements is infinite), so we want to avoid case non-term of Definition 2. In this section we define a protocol rule, the Now or Never (NoN) rule, which is our key to drive a negotiation process towards termination, avoiding malicious or inefficient agents behaviour. The rule relies on the notions of Definition 4. S Definition 4 (Convex region, convex hull, operator ♦ ). Let Rn be the n-dimensional real vector space (for any n > 0). Region R ⊆ Rn is convex if, for any two points D1 and D2 in R, the straight segment D1 D2 is entirely in R. Given a finite set of points D ⊂ Rn , the convex hull of D, conv(D), is the smallest convex region of Rn containing D. S S finiteSsets of points D, ♦ D is the union of the convex Given a collection of hulls of all sets in D: ♦ D = D∈D {conv(D)}. Convexity arises often in feasibility regions of agents involved in negotiations. An agent feasibility region is convex if, for any two acceptable deals D1 and D2 , all intermediate deals (i.e., those lying on D1 D2 ) are acceptable as well. In some cases [2,12,4] the feasibility region of an agent is entirely convex (consider, e.g., a negotiation instance over a single variable, the price of a good). In other cases 50 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts this does not hold. However, a feasibility region may always be considered as the union of a number of convex sub-regions. Furthermore, in most real cases, this number is finite and small. Also, in most practical situations, the closer two acceptable deals D1 and D2 , the higher the likelihood that intermediate deals are acceptable as well. Example 3 (Alice vs. Bob (cont.)). Knowing that deals ht = “Mon at 11am”, d = “30 min”i and ht = “Wed at 3pm”, d = “1 hour”i are both acceptable to Bob would not be a strong support for Alice to assume that also ht = “Tue at 1pm”, d = “45 min”i would be acceptable to him. On the other hand, if ht = “Mon at 9am”, d = “40 min”i and ht = “Mon at 9.30am”, d = “20 min”i are both acceptable to Bob, it would not be surprising if also ht = “Mon at 9.15am”, d = “30 min”i is acceptable. Before formalising the NoN rule (Definition 6), we introduce it using our example. Example 4 (Alice vs. Bob (cont.)). Steps below are shown in Fig. 1. Steps 1 and 2. Alice starts the negotiation by sending proposal P1 = {Aa1 , Ab1 }. As a reply, she receives P2 = {B2a , B2b } (see Fig. 1b). As none of Bob’s counteroffers, B2a and B2b , belong to conv({Aa1 , Ab1 }) = Aa1 Ab1 , all such deals are removed from further consideration (by exploiting NoN). The rationale is as follows: (a) Bob had no evidence that conv({Aa1 , Ab1 }) includes deals outside RAlice (i.e., at the end of step 1 Bob had no evidence that this portion of RAlice is not convex). (b) Given that Bob has not proposed any such deal therein, then either RBob ∩conv({Aa1 , Ab1 }) = ∅ (in which case, Bob has no interest at all in proposing there), or Bob has chosen not to go for any such a deal now (as, e.g., he currently aims at higher utility). (c) In the latter case, NoN forbids Bob to reconsider that decision anymore (never ). Step 3. Alice, having no evidence that conv({B2a , B2b }) includes deals outside RBob , proposes P3 containing deal Aa3 ∈ conv({B2a , B2b }) ∩ RAlice (see Fig. 1c): by proposing Aa3 she aims at closing the negotiation successfully now, believing that such a deal (intermediate to B2a and B2b ) is likely to be acceptable also to Bob. Alice also includes in P3 deal Ab3 . Step 4. It’s Bob’s turn again. By receiving P3 = {Aa3 , Ab3 }, Bob knows that such deals belong to RAlice . Assume that Bob rejects P3 by sending a counteroffer. As there is no evidence that conv({Aa1 , Aa3 , Ab3 }), conv({Ab1 , Ab3 }), or conv({Ab1 , Aa3 }) (the 3 light-grey areas in Fig. 1c) include deals outside RAlice , NoN forces him to take a decision: either his counteroffer P4 contains some deals in one of such regions, or he must forget those regions forever. Note that NoN does not apply to, e.g., conv({Ab1 , Aa3 , Ab3 }), as this region contains B2b , which was part of a Bob’s proposal already rejected by Alice. Hence, there is already evidence that some of the deals in conv({Ab1 , Aa3 , Ab3 }) are acceptable to Bob and NoN does not forbid agents to further explore that region. 51 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts S S S Legend: ♦ ♦ ♦ Legend: RAlice SNoN (1) = Never (2) Legend: SNever (3) ♦ NoN (2) ♦ NoN (3) u=20 u=5 90' 90' 90' u= 0 40 u=5 A 1b A1b 60' 60' 60' A 1a B2b A1a B2b A 3b 30' 0 30' 30' A3a u=2 u= 10 B2a B 2a m 1pm m m 7pm m m m m 1pm m m 7pm m m m m 1pm m m 7pm m m m 9a p 4p 6p 3p 6p 9a p 4p 6p 3p 6p 9a p 4p 6p 3p 6p 12 12 12 Mon Tue Wed Mon Tue Wed Mon Tue Wed (a) Alice’s region and utility (b) End of step 2 (c) End of step 3 Fig. 1: Alice vs. Bob (Example 4) Definition 5 (Sets Never and NoN ). Let π = hV, s, k, Ri be a negotiation process and P = P1 , P2 , . . . PT its status at step T ≥ 1. For each agent A ∈ {0, 1} and each step 1 ≤ t ≤ T , let dealsA (t) be the set of all the deals in P proposed by A up to step t (included). Sets Never (t) and NoN (t) are defined inductively for each t ≥ 1 as follows: t=1: Never (1) = ∅, NoN (1) = {P1 } t>1: ( S Never (t−2)∪NoN (t−1) if Pt ∩ ♦ NoN (t−1)=∅ Never (t) = Never (t−2)∪{{D} | D∈NoN (t−1)} otherwise  S NoN (t) = D ⊆ dealsag(t) (t)| conv(D) ∩ ♦ Never (t) = ∅ where Pt is the proposal sent by ag(t) at step t and Never (0) = ∅. S At each step t, ♦ NoN (t) represents the region, defined by ag(t)’s deals, for which the other agent 1 − ag(t) needs, in the next step (t + 1) to take a NoN decision: to offer a deal therein (showing to ag(t) that she is S potentially interested to that region) or to neglect that region forever. Similarly, ♦ Never (t) represents the region, defined by (1 − ag(t))’s deals, for which ag(t) has taken S a never decision. Deals therein cannot be offered any more. Note that ♦ Never (t) ⊇ S ♦ Never (t − 2) for all t ≥ 2 (i.e., sequences Never (t) for odd and even values of t are monotonically non-decreasing). Fig. 1 shows NoN and Never regions at all steps of the previous example. Definition 6 formalises our NoN protocol rule, which forbids agents to recon- sider never decisions already taken. Definition 6 (Now or Never rule). Status P = P1 , P2 , . . . PT of negotiation S π = hV, s, k, Ri satisfies the NoN protocol rule if, for all steps 2 ≤ t ≤ T , process Pt ∩ ♦ Never (t − 2) = ∅. Proposition 1 shows that the NoN rule of Definition 6 allows agents to infer when no further agreements are possible. All proofs are omitted for lack of space. Proposition 1 (Termination condition). Let π = hV, s, k, Ri be a nego- tiation process where the NoN rule is enforced and let P = P1 , P2 , . . . PT be the S T ≥ 2. status of π at step S If Rag(T ) ⊆ ♦ Never (T − 1) ∪ ♦ Never (T − 2) and PT is not a singleton {D} ⊆ PT −1 , then: 52 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts (a) there exists no extension P 0 = P1 , P2 , . . . , PT −1 , PT , . . . , PT 0 of P to step T 0 > T such that PT 0 ={D}⊆PT 0 −1 S S for all D ∈ R0 ∩ R1 , there exists 1 < tD < T such that D ∈ ♦ NoN (tD − (b) 1) ∩ ♦ Never (tD ). S S A consequence of (a) is that, if at step T ≥ 2, Rag(T ) ⊆ ♦ Never (T − 1) ∪ ♦ Never (T − 2) and agent ag(T ) cannot or does not want to accept a deal offered in the last incoming proposal PT −1 , she can safely opt-out by proposing PT =⊥, as she has no hope to reach an agreement in the future. Also, from (b), for every mutually acceptable agreement D, there was a step tD < T in which agent ag(tD ) took a never decision on a NoN region containing D. This means that ag(tD ), although knowing that D ∈ Rag(tD ) was likely to be acceptable also to the opponent (because she had no evidence, at that time, that the portions of the opponent region defined by deals in NoN (tD − 1) were not convex), explicitly decided not to take that chance and proposed elsewhere. As a matter of fact, NoN can be thought as a deterrent, for each agent, to de- lay the negotiation by ignoring plausible agreements which, although acceptable to her, do not grant herself the utility she currently aims at. As NoN forbids the agents to propose such deals in the future, any such “obstructionist” behaviour has a price in terms of opportunities that must be sacrificed forever. Definition 7 defines non-obstructionist agents. Definition 7 (Non-obstructionist agent). Let π = hV, s, k, Ri be a negotia- tion process where the NoN rule is enforced. Agent A ∈ {0, 1} is non-obstructionist if her strategy satisfies the following conditions for all t ≥ 2 such that ag(t) = A: 1. if Pt−1S∩ RA 6= ∅, then Pt = {D} ⊆ Pt−1 S 2. else if ♦ NoN (t−1)∩RA 6= ∅, then Pt ∩ ♦ NoN (t−1)6= ∅. A non-obstructionist agent A accepts S any acceptable deal D ∈ RA and takes a now decision at all steps t when ♦ NoN (t−1) intersects RA . Non-obstructionist agents genuinely aim at finding an agreement efficiently, even sacrificing their preferences among deals they would accept. However, they are not necessarily collaborative, as they do not disclose to the opponent their constraints and preferences. Proposition 2 shows that, in a negotiation process between two non-obstruct- ionist agents, if one of the parties reaches the termination condition of Proposi- tion 1, then no agreement exists (i.e., R0 ∩ R1 = ∅). Proposition 2 (Completeness). Let π = hV, s, k, Ri be a negotiation process between two non-obstructionist agents where the NoN rule is enforced. S If π reaches, atS step T − 1 ≥ 2, status P = P1 , P2 , . . . PT −1 s.t. Rag(T ) ⊆ ♦ Never (T − 1) ∪ ♦ Never (T − 2), then R0 ∩ R1 = ∅. 4 A Terminating Strategy Based on Monotonic Concessions Propositions 1 and 2 show that the Now or Never (NoN) rule allows each agent to detect when the negotiation process can be safely terminated, as no agree- ment can be found in the sequel. However, still the termination condition may 53 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts not arise in a finite number of steps. In this section we show that, with NoN, termination can be enforced by any agent alone, without relying on the willing- ness to terminate of the counterpart. To this end, from now on we focus on one agent only, which we call agent A (A can be either 0 or 1). To ease presentation, the other agent, agent 1 − A, will be called agent B. We make some assumptions on the feasibility region of agent A: (a) RA is bounded and defined as the union P1 ∪ · · · ∪ Pq of a finite number q of convex sub-regions; (b) each convex sub-region Pi (1 ≤ i ≤ q) of RA is defined by linear constraints, hence is a (bounded) polyhedron in R|V| . Any bounded feasibility region can be approximated arbitrarily well with a (sufficiently large) union of bounded polyhedra. However, in many practical cases, a finite and small number of polyhedra suffices. Deals in RA may not be equally worth for agent A, who may have a (again, private) utility function uA to maximize. We assume that uA is piecewise-linear and defined (without loss of generality) by a linear function uiA for each poly- hedron Pi of RA (1 ≤ i ≤ q). For this definition to be well founded, if a deal D belongs to two different polyhedra Pi and Pj of RA , it must be uiA (D) = ujA (D). Note that, again, any differentiable utility function can be approximated ar- bitrarily well with a piecewise-linear utility, provided RA is decomposed in an enough number of polyhedra. In this setting, we define a full strategy for agent A for negotiation processes π = hV, s, k, Ri for which k ≥ 2, i.e., in which exchanged proposals can contain multiple deals. Although our strategy is correct independently of the opponent region shape, it is designed for the common cases where agent A believes that the opponent feasibility region is the union of a small number of convex sub- regions (not necessarily polyhedra). Hence, a task of agent A while following the strategy is to discover non-convexities of the opponent region during negotiation and take them into account. Our strategy is inspired by (but different from) the well-known mechanism of Monotonic Concessions (MC) [13]. It has three phases, utility-driven, non- obstructionist, and terminating phases, which are executed in the given order. 4.1 Utility-Driven Phase Agent A keeps and dynamically revises two utility thresholds, α and u, which are, respectively, the responding and the proposing threshold. At each step t such that ag(t) = A, agent A uses: (a) threshold α to decide whether to take a now decision S (if t > 1), by including, in the proposal Pt she will propose next, a deal in ♦ NoN (t − 1) (possibly accepting one deal in Pt−1 ), and (b) threshold u to select the other deals to include in Pt (t ≥ 1). By generalising [6,12], α is a function of the agent A utility of the best deal Dnext that would be chosen in step (b). In particular, α is uA (Dnext ) − span · ξ, where span is the absolute difference of the extreme values of uA in RA and 0 ≤ ξ ≤ 1 is a parameter (possibly varying during the negotiation) called respond policy. Hence, if ξ = 1, agent A accepts all acceptable deals and takes a now decision whenever possible, behaving in a non-obstructionist way (Definition 7). On the other extreme, if ξ = 0 the agent accepts only incoming deals D ∈ RA that are not worse than the best proposal Dnext that would be chosen next in step (b), upon rejection of D. 54 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts Our strategy for this phase is decomposed into responding, proposing and conceding sub-strategies as in [8], after an initialisation phase where agent A sets u to the highest utility of deals in RA (as in the spirit of MC). Responding. At step t ≥ 2, after that agent A has received proposal Pt−1 6=⊥, α proposal Pt is chosen as follows. Let RS A = {D ∈ RA | uA (D) ≥ α}. α (1) If Pt−1 contains deals in RA − ♦ Never (t − 2), then Pt = {D}, where D is one such a deal giving agent A the highest utility (i.e., agent A accepts the best deal D among those acceptable in Pt−1 granting herself at least utility α). Otherwise: S S α (2) Pt contains a deal in ( ♦ NoN (t − 1) ∩ RA ) − ♦ Never (t − 2) if and only if this region is not empty (now decision taken). Given that the closer deals in a set D defining NoN (t − 1) (see Definition 5) the more likely they belong to a single convex sub-region of RB , as for (2) agent A selects a deal with the highest utility in a set D having the minimum diameter. Proposing. At any step t ≥ 1 such that ag(t) = A, if agent A has not accepted an incoming deal (case (1) of the responding sub-strategy), proposal Pt contains u additional deals (as to make |Pt | = k ≥ 2). Let RA = {D ∈ RA | uA (D) ≥ u} (which is again a union of bounded polyhedra, as uA is piecewise-linear). Deals u to be proposed in Pt are carefully selected among vertices of RA S(some of them can be vertices of the overall region RA ) which do not belong to ♦ Never (t − 2), u as agent A needs to comply with the NoN rule. If t > 1, vertices of RA to be proposed will be carefully selected by reasoning on the evidence provided by the past opponent behaviour. The reasoning is as follows. LetSn̂(t) be the minimum number of convex sub-regions that must compose RB − ♦ Never (t − 1), i.e., the opponent region minus the regions for which the opponent has taken a never decision (and in which, by the NoN rule, no agreements can be found in the sequel): n̂(t) is the minimum value such that there exists a n̂(t)-partition {D1 , . . . , Dn̂(t) } of dealsB (t − 1) (i.e., a mapping of S opponent deal to one sub-region) such that for all 1 ≤ j ≤ n̂(t), conv(Dj ) ∩ each ♦ Never (t − 1) = ∅. S Agent A temporarily focuses on n̂(t), assuming that RB − ♦ Never (t − 1) is the union of exactly n̂(t) convex sub-regions. We call this assumption Non- obstructionist Opponent Assumption (NOA). Under NOA, agent A tries to re- gard the past opponent behaviour as non-obstructionist, hence S interprets the already taken never decisions as an admission that RB ∩ ♦ Never (t − 1) = ∅ (Proposition 2). Value n̂(t) is the minimum number of convex sub-regions that must compose RB which is consistent with this (optimistic) hypothesis. Agent A computes the subsets D of S the opponent deals that might belong to the same convex sub-region of RB − ♦ Never (t − 1), provided that NOA is correct. We call these sets of deals Possible Opponent Clusters (POCs):    ∃ n̂(t)-partition D1 , . . . , Dn̂(t) S of dealsB (t−1) K(t)= D⊆dealsB (t−1) (1) s.t. ∀j ∈[1, n̂(t)] conv(Dj )∩ ♦ Never (t−1)=∅ Let proj(R, R0 ) (the projection of region R onto R0 ) be the set of points X for which there exists Y ∈ R such that XY intersects R0 [2]. Region proj(R, R0 ) is an unbounded polyhedron if both R and R0 are polyhedra (see Fig. 2a, where proj(R, R0 ) is the unbounded grey area) and proj(R, R0 ∪ R00 ) = proj(R, R0 ) ∪ 55 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts proj(R, R00 ). Provided that NOA is correct, agent A can derive (Proposition 3) that region \ S Π(t) = proj(conv(D), ♦ Never (t − 1)) D∈K(t) does not contain agreements that can be still reached. Proposition S 3. If, at step t ≥ 3 s.t. ag(t) = A, NOA is correct, then Π(t) ∩ (RB − ♦ Never (t − 1)) = ∅. Example 5 (Alice vs. Bob (cont.)). Consider Fig. 2b. At step 4 Bob sent Alice proposal P4 = {B4a }. At step 5 (Alice’s turn), n̂(5) is 3, as it isS clear that B2a , B2b , a and B4 belong to all-different convex sub-regions of RBob − ♦ Never (4). POCs are K(5) = {{B2a },S{B2b }, {B4a }}. Region Π(5) is the area in light-grey: if NOA is correct (RBob − ♦ Never (4) or, equivalently, RBob if Bob is non-obstructionist, consists Sof exactly 3 convex sub-regions), then no X ∈ Π(5) can belong to RBob − ♦ Never (4). S Besides always ignoring vertices in ♦ Never (t−2) (as to comply with the NoN rule), as a result of Proposition 3 agent A (exploiting NOA) can temporarily u ignore vertices of RA in Π(t) while choosing deals to propose at step t. By exploiting the fail-first principle, we define the following criterion S (best vertex u under NOA) to select the next vertices in RA − Π(t) (and not in ♦ Never (t − 2)) to propose: those that, if rejected, would make the highest number of vertices be excluded in the next step, under NOA. u S Conceding. When no more vertices in RA − Π(t) (and not in ♦ Never (t − 2)) can be proposed, agent A reduces threshold u, if possible, by a given amount ∆u, whose value, possibly varying during time (see, e.g, [5]), depends on the application. Reducing u is in the spirit of MC (where the agent increases during time the opponent utility of the proposed deals). Differently from MC, here agent A reduces own utility of the deals she proposes (with the goal of approaching opponent’s demand), as she has no information about opponent utility. u Let T̂ (ag(T̂ ) = A) be the step in which agent A reduces u and RA becomes equal to RA (i.e., u cannot be further reduced). From step T̂ onwards, the strategy of agent A moves to the non-obstructionist phase. 4.2 Non-Obstructionist Phase Our strategy for this phase is decomposed into responding and proposing sub- strategies. As utility threshold u has already reached its minimum, in this phase there is no conceding sub-strategy. Responding. The responding sub-strategy is identical to that of the utility- driven phase with α = u. Given that in the non-obstructionist phase u is at its minimum, agent A accepts any incoming acceptable deal and takes a now decision whenever possible. Thus, the agent is now certainly non-obstructionist, independently of the value of her respond policy ξ. Proposing. As a result of acting in a non-obstructionist way, from step T̂ on- wards the following result holds: 56 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts S Proposition 4. For each step t ≥ T̂ such that ag(t) = A, RA ∩ ♦ Never (t−2) = S RA ∩ ♦ Never (T̂ − 2). Hence, for each step t ≥ T̂ such that ag(t) = A, if agent A has not accepted an incoming deal, the region in which the additional deals to propose S will be selected (as to make |Pt | = k ≥ 2 whenever possible), i.e., RA − ♦ Never (t − 2), S is steadily equal to RA − ♦ Never (T̂ − 2). S In this phase, agent A aims at proposing vertices of RA − ♦ Never (T̂ − 2) with the goal of eventually covering it within the never set of the opponent, as to reach S the termination condition of Proposition 1. Unfortunately, as both RA and ♦ Never (T̂ − 2) are unions of polyhedra, their difference might not be represented as a union of polyhedra. Anyway, it can be always represented as a union of Not Necessarily Closed polyhedra (i.e., polyhedra possibly defined by some strict inequalities, with some of their faces and vertices not belonging to them). In order to comply with the NoN rule, the agent must not propose S vertices of RA − ♦ Never (T̂ − 2) not belonging to that region, as they would S belong to ♦ Never (T̂ − 2). The problem is solved by computing a suitable under- S S approximation bRA − ♦ Never (T̂ − 2)c ⊆ RA − ♦ Never (T̂ − 2) which can be defined as a union of bounded (and closed) polyhedra. Note that such an under- approximation can be computed in order to make the error err S S RA = (RA − ♦ Never (T̂ − 2)) − bRA − ♦ Never (T̂ − 2)c arbitrarily small. As a special case, if agentSA was non-obstructionist from the err beginning of the negotiation process, RA ∩ ♦ Never (T̂ − 2) = ∅ and RA = ∅. Agent A continues to use both NOA and Π(t) as defined in the utility- S driven phase. In particular, the agent proposes vertices of bRA − ♦ Never (T̂ −2)c which are not in Π(t). When no more such vertices can be proposed, NOA is gradually relaxed (i.e., n̂(t) is gradually increased) and the remaining vertices of S bRA − ♦ Never (T̂ − 2)c are enabled. By construction, n̂(t) cannot grow beyond the number of deals proposed by the opponent so far. If also in that case Π(t) S S covers bRA − ♦ Never (T̂ − 2)c, the agent sets Π(t) to ♦ Never (t − 1), hence assumes that RB consists of at least one convex sub-region not yet disclosed by the opponent (i.e., not containing any of the past incoming deals). As it happens in the utility-driven phase, given that multi-deal proposals are allowed (k ≥ 2), all vertices will be proposed within a finite number of steps inde- pendently of the number of now decisions taken. When all vertices have been pro- posed and no agreement has been reached, agent A enters the terminating phase. 4.3 Terminating Phase In this phase, agent A continues by sending emptySproposals until she S receives and err accepts an acceptable deal or infers RA −RA ⊆ ♦ Never (T −1)∪ ♦ Never (T −2). Proposition 5 states that also this condition will arise in a finite number of steps. Proposition 5. Let π = hV, s, k, Ri be a negotiation process (k ≥ 2) where the NoN rule is enforced. If any agent A ∈ {0, 1} uses the strategy above, then, within a finite number of steps T ≥ T̂ ≥S2 such that ag(T )S= A, either an agreement is err found or condition RA − RA ⊆ ♦ Never (T − 1) ∪ ♦ Never (T − 2) is satisfied. 57 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts S Legend: ♦ Never (4) Π(5) 90' X R' A 1b 60' A3b  A1a B 4a B 2b   30' A a   Y B 2a 3   R     m 1pm m m 7pm m m m    9a p 4p 6p 3p 6p 12 Mon Tue Wed (a) proj(R, R0 ) (c) Categorical variables (grey area) (b) Alice vs. Bob (end of step 4) Fig. 2 Condition of Proposition 5 can be considered the termination condition of err Proposition 1 in case agent A had admissible region RA − RA . Given that err region RA can be chosen as to be arbitrarily small, agent A can terminate the negotiation when this condition is reached. Any possible remaining acceptable err deals would be in the (arbitrarily small) region RA . We stress again that, in case agent A is non-obstructionist from the begin- err ning, for all t ≥ T̂ such that ag(t) = A, RA can be made empty. Hence, as it S S happens for any acceptable deal in RA ∩ Never (t − 2) = RA ∩ ♦ Never (T̂ − 2), ♦ err any acceptable deal in RA can be considered as an opportunity (with arbitrarily S small Euclidean distance to RA ∩ ♦ Never (T̂ − 2)) that agent A had to sacrifice for having behaved in an obstructionist way (at most) up to step T̂ − 2. 5 Handling Discrete and Categorical Variables The NoN rule works also when (some of) the variables are discrete (e.g., integer), if we consider the union of the integer hulls [15] of the polyhedra in the NoN and Never sets of Definition 5. Integer Linear Programming results tell us that the integer hull of a polyhedron can still be represented with linear (plus integrality) constraints. Vertices of this new polyhedron have integer coordinates. Hence, the NoN rule as well as the strategy above and the underlying projection-based rea- soning can be adapted to prune the space of the possible agreements: S only the branches that deal with now decisions need to be refined (deals in ♦ NoN (t − 1) S proposed at step t need to have integer coordinates). Also, RA − ♦ Never (T̂ − 2) err can always be represented by an union of closed polyhedra, hence RA can be always made empty. Categorical variables can be tackled by fixing an ordering of their domain (common to both parties) and mapping them onto integers. Fig. 2c shows Alice’s region in a variation of Example 1 where variables are categorical. 6 Implementation and Experiments Our framework has at its core well-studied tasks in computational geometry and (integer) linear programming [15]. However, to our knowledge, the exact com- plexity of the agent’s reasoning is unknown, as these core tasks must be repeated 58 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts on sets of exchanged deals. Still, existing libraries of computational geometry al- gorithms can manage the size of instances needed for practical scenarios. We have implemented a system that uses the Parma Polyhedra Library (PPL) [1] to compute polyhedra, convex hulls, and projections, and an all-solutions SAT solver [9] to revise n̂(t) and Possible Opponent Clusters (POCs) (the problem is reduced to hypergraph-colouring). Performance on these sub-problems are very good: PPL completes most of the required tasks within very few seconds (on a reasonably small set of variables, e.g., 3–4) and the generated SAT instances are trivial. Although the number of sets in NoN and Never can grow exponentially, by keeping only (depending on the case) their ⊆-maximal or ⊆-minimal members (which is enough to enforce the NoN rule and to perform the needed reasoning), the overall memory requirements become, in the instances we consider below, compatible with the amount of RAM available on an ordinary PC. In the following we present an empirical evaluation of the computational feasibility of the approach. Negotiations have been performed between two iden- tical agents. We evaluated our implementation on both random and structured instances using a single computer (a PC with a dual-core AMD Opteron 3GHz and 8GB RAM) for both agents. At each step, agents can exchange contracts of at most k = 2 deals. Note that, as our approach requires agents to comply with the NoN rule, it cannot be evaluated against other negotiators. Random Instances. We generated 100 random negotiation instances over 3 variables. Feasibility regions are unions of 3 random polyhedra, each with at most 10 vertices. In about 44% of the instances R0 ∩ R1 6= ∅. The average vol- ume of the intersection is 2.19% of the volume of each agent’s region (stddev is 4.5%). Agents have random piecewise-linear utilities and concede constant ∆u = 0.2span each time all vertices of Rau (a ∈ {0, 1}) belong to Π. Such negotiations terminate in < 5 minutes and 20–30 steps. Agreements were found in > 95% of the instances for which R0 ∩R1 6= ∅. Fig. 3 shows average time, success rate (i.e., number of negotiations closed successfully / number of negotiations such that R0 ∩ R1 6= ∅), and average quality of the agreement found for each agent (the quality of an agreement D for agent a ∈ {0, 1} is (ua (D) − La )/(Ha − La ), where Ha and La are, respectively, the highest and lowest values of agent a utility in R0 ∩R1 ) as a function of the respond policies used (ξ0 and ξ1 ). It can be seen that moderate respond policies (intermediate values of ξ) lead to very high probabilities (> 97%) of finding an agreement if one exists; more- over, the quality of such agreements for the two agents is similar if their respond policies are similar (fairness). Conversely, if agents use very different values for ξ, the more conceding agent unsurprisingly gets lower utility with the agreement, but negotiations are more often aborted by the other, more demanding, agent. Structured Instances. We evaluated our system on the 6 scenarios in Table 1a: two scenarios (AB1, AB2) of the Alice vs. Bob example, two scenarios (SU1, SU2) of a negotiation problem regarding the rental of a summerhouse, and two scenarios (EZ1, EZ2) of a variation of the England-Zimbabwe problem of [10] adapted to our domain (real variables and no known bounds for their domains). A description of these negotiation scenarios is omitted for space reasons. Table 1a shows also some relevant properties of these negotiation scenarios. Column “vars” gives the number of negotiation variables. Columns “polys” and “con” give, respectively, the number of polyhedra and the overall number of 59 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts Avg. Negotiation time Success ratio Avg. agr. quality for agent 0 Avg. agr. quality for agent 1 300 1 0.32 280 0.99 0.31 0.98 260 0.97 0.3 240 0.96 0.29 220 0.95 0.94 0.28 200 0.93 0.27 180 0.92 0.91 0.26 160 0 0 0.9 1 0.2 0.4 0.2 0.8 0.6 0.8 1 0.25 1 0.6 0.4 0.6 0.8 0.6 0.8 1 0.6 0.8 0.4 0.4 Respond policy for agent 1 0.8 1 1 Respond policy for agent 1 0.2 0 0.2 0.4 0.4 0.6 0 Respond policy for agent 1 0.2 0 0.2 0 Respond Respond policy for agent 0 policy for agent 0 Respond policy for agent 0 (a) Time (sec) (b) Success rate (c) Avg agreement quality Fig. 3: Experimental results for random instances R0 R1 vol(R0 ∩R1 ) vol(R0 ∩R1 ) agr. time Scenario vars Scenario ξ0 ξ1 steps polys vol(R0 ) vol(R1 ) found (sec) polys con. polys con. AB1 1 1 Y 20 1.09 321 AB1 2 3 12 3 12 < 10−8 % < 10−8 % AB1 0 0 N 24 1.21 450 AB2 2 3 12 5 20 – – AB2 1 1 N 20 1.33 466 SU1 3 3 12 4 16 1.5% 2.0% SU1 0 0 N 417 38.44 20 413 SU2 3 3 12 4 16 – – SU1 0.4 0 Y 347 15.22 5679 EZ1 4 2 8 2 8 0.10% 0.05% SU2 1 1 N 513 63.12 29 148 EZ2 4 2 8 2 8 0.15% 0.04% EZ1 0 0 Y 80 1237 3 115 508 EZ2 0 0 Y 92 2836 8 057 011 (a) Properties of negotiation scenarios (b) Experimental results Table 1: Structured instances linear constraints defining each agent feasibility region, R0 and R1 . The two last columns give the ratio of the volume of R0 ∩ R1 (i.e., the volume of the space of the possible agreements) with respect to the volume of the feasibility region of each agent (“–” means that R0 ∩ R1 is empty, hence no agreement is possible). Table 1b shows some results on the above negotiation scenarios, under dif- ferent values of the respond policies of each agent (ξ0 and ξ1 ). All instances have been run with k (the maximum number of deals in a proposal) equal to 2. For each instance, column “agr. found” tells whether an agreement has been found (an agreement exists if and only if R0 ∩ R1 6= ∅, see Table 1a), column “steps” gives the number of negotiation steps needed to conclude the negotiation pro- cess, column “time” gives the overall negotiation time, and column “polys” gives the overall number of polyhedra computed by PPL during the process. For each negotiation instance, the number of all-SAT instances solved to compute POCs (see formula (1)) is equal to the number of negotiation steps. Our results show that enforcing NoN is computationally feasible: negotiation processes with hundreds of interaction steps could be performed in minutes, even when NoN enforcement and agents reasoning require the computation of millions of polyhedra and the resolution of hundreds of all-SAT instances. 7 Conclusions In this paper we defined a new protocol rule, Now or Never (NoN), for bi- lateral negotiation processes which allows self-motivated competitive agents to 60 T.Mancini Now or Never: Negotiating Efficiently with Unknown Counterparts efficiently carry out multi-variable negotiations with remote untrusted parties, where privacy is a major concern and agents know nothing about their oppo- nent. NoN has been explicitly designed as to ensure a continuous progress of the negotiation, thus neutralising malicious or inefficient opponents. We have also presented a NoN-compliant strategy for an agent that, under mild assumptions on her feasibility region, allows her to derive, in a finite number of steps and independently of the behaviour of her opponent, that there is no hope to find an agreement. We finally evaluated the computational feasibility of the overall approach on random and structured instances of practical size. Acknowledgements. This research was founded by the EU 7th Framework Pro- gramme under grant agreements n. 317761 (SmartHG) and n. 600773 (PAEON). References 1. R. Bagnara, P. M. Hill, and E. Zaffanella. The Parma Polyhedra Library: Toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems. Sci. of Comp. Progr., 72(1–2):3–21, 2008. 2. M. Cadoli. Proposal-based negotiation in convex regions. In Proc. of CIA 2003, v. 2782 of LNCS, pp. 93–108. Springer, 2003. 3. S. E. Conry, K. Kuwabara, and R. A. Lesser, V. R. Meyer. Multistage negotiation for distributed constraint satisfaction. IEEE Transactions on Systems, Man and Cybernetics, 21(6):462–477, 1991. 4. S. Costantini, G. De Gasperis, A. Provetti, and P. Tsintza. A heuristic approach to proposal-based negotiation: with applications in fashion supply chain management. Math. Problems in Engin., 2013. Article ID 896312. 5. P. Faratin, C. Sierra, and N. R. Jennings. Negotiation decision functions for au- tonomous agents. Intl. J. Robotics & Autonomous Systems, 24(3–4):159–182, 1998. 6. P. Faratin, C. Sierra, and N. R. Jennings. Using similarity criteria to make issue trade-offs in automated negotiations. Artif. Intell., 142(2):205–237, 2002. 7. N. R. Jennings, T. J. Norman, P. Faratin, P. O’Brien, and B. Odgers. Autonomous agents for business process management. Appl. Artif. Intell., 14(2):145–189, 2000. 8. G. Lai and K. Sycara. A generic framework for automated multi-attribute negoti- ation. Group Decision and Negotiation, 18(2):169–187, 2009. 9. D. Le Berre and A. Parrain. The sat4j library, release 2.2. Journal on Satisfiability, Boolean Modeling and Computation, 7:59–64, 2010. 10. R. Lin, S. Kraus, D. Tykhonov, K. Hindriks, and C. M. Jonker. Supporting the design of general automated negotiators. In Proc. of ACAN 2009, 2009. 11. R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry. Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artif. Intell., 172(6–7):823–851, 2008. 12. T. Mancini. Negotiation exploiting reasoning by projections. In Proc. of PAAMS 2009, Advances in Intelligent and Soft Computing, 2009. Springer. 13. J. S. Rosenschein and G. Zlotkin. Rules of Encounter: Designing Conventions for Automated Negotiations Among Computers. The MIT Press, 1994. 14. A. Rubinstein. Perfect equilibrium in a bargaining model. Econometrica, 50(1):97– 109, 1982. 15. A. Schrijver. Theory of linear and integer programming. John Wiley & Sons, 1998. 16. K. P. Sycara, S. Roth, N. Sadeh, and M. Fox. Distributed constrained heuristic search. IEEE Transactions on Systems, Man and Cybernetics, 21(6):446–461, 1991. 61