Where Porceddu is better than Pasta (DISCUSSION PAPER) Paolo Ciaccia1 , Davide Martinenghi2 , and Riccardo Torlone3 1 2 3 Università di Bologna Politecnico di Milano Università Roma Tre Abstract. Preferences and contexts are fundamental aspects for decid- ing the best choices among possible options. We formalize the problem of propagating preferences from more generic to more specific contexts and study the key properties of propagation within an algebraic framework. 1 Introduction Preferences and their influence on choices have been studied in a variety of fields, including psychology, sociology, economics, artificial intelligence, and data management. In all of the above fields, it is widely recognized that preferences highly depend on the context, as shown in the following example. Example 1. We are ordering food at a restaurant in Italy: normally, then, we prefer pasta to beef. In Sardinia, though, we enjoy “porceddu” more than pasta. During summer, however, our preferred choice is just a fresh salad. As in the example, we consider contexts as states (such as “Italy in summer”) that can be compared via a generalization hierarchy, so that, say, “Sardinia” is more specific than “Italy”. Preferences naturally propagate along the hierarchy, from the more generic to the more specific contexts (a preference defined for Italy normally holds in any Italian place). Things are not that simple, though. Example 2. If we are in Sardinia in the summer, all the preferences given in Example 1 apply, since they refer to more generic contexts. Yet, the preferences defined in “Sardinia” and “Italy in summer” should take precedence over those given for “Italy”, whereas the preference in “Sardinia” is on a par with the pref- erence in “Italy in summer”. Then, porceddu and salad are the best alternatives, since no other food is preferable to them in the given context. Our research provides a principled approach to context-aware preference propa- gation. We develop a general framework whose only requirements are as follows: – the contexts belong to a poset, i.e., a set C with a (strict) partial order relation ăC on its elements: c1 ăC c2 means that the context c1 is more specific than the context c2 (and that c2 is more generic than c1 ); Copyright c 2020 for this paper by its authors. Use permitted under Creative Com- mons License Attribution 4.0 International (CC BY 4.0). This volume is published and copyrighted by its editors. SEBD 2020, June 21-24, 2020, Villasimius, Italy. – preferences define a strict partial order ă on the domain of objects O, where o1 ă o2 means that the object o2 is preferable to the object o1 ; – each stated preference is associated with one or more contexts. The basic properties of the propagation process, implicitly at the basis of earlier approaches and naturally arising from the observations made for Example 2, are: 1. coherence: preferences only propagate from more generic contexts; 2. fairness: unordered contexts do not take precedence over each other; 3. specificity: a more specific context takes precedence over a more generic one. Building on these properties, we focus on three main problems: – (CFS) Determine whether there exist well-behaved (i.e., Coherent, Fair and Specific) methods for the propagation of preferences. – (OrdRel) Efficiently determine the preference (order relation) between any two objects in a context according to a given propagation method. – (Best) Establish the best objects in c wrt. the preferences propagated to c. We tackle these problems through an axiomatization of the properties of prop- agation, and do so with an algebraic framework based on two abstract binary operators that express preference propagation by means of Preference Composi- tion (PC) expressions: ` for unordered contexts, and B for ordered contexts. Example 3. A possible PC-expression for propagating preferences to “summer in Sardinia” (c4 ) is c4 Bppc2 `c3 qBc1 q. Here, first the preferences in “Sardinia” (c2 ) and those in “summer in Italy” (c3 ) are combined with `, since the two contexts are unordered. The result is then combined with the preferences in “Italy” (c1 ) using B, since this context is more generic than both c2 and c3 . Finally, the result is combined with c4 using B, since this is the most specific context. This paper summarizes some of the main findings reported in [3]. Here, we only focus on the axiomatization of ` and B and the study of their possible interpretations complying with the stated axioms. One of the main results is that, under mild assumptions, the well-known Pareto and Prioritized composition [1, 4] operators are the only possible interpretations of ` and B, respectively. Another notable achievement is a propagation method (OC) that somehow provides the “ultimate” semantics for preference propagation. However, we refer to [3] for a full account of the solutions to Problems CFS and OrdRel. Here, we only hint at an algorithmic approach for propagating preferences according to the OC method, and characterize its asymptotic complexity, which, remarkably, is independent of the underlying domain size. Finally, we discuss how to determine the best objects according to the propagated preferences (Problem Best). 2 Preliminaries We consider the well-known binary relation model for expressing preferences over a domain of objects O [1, 4, 8], where a preference relation over objects of a domain O is a strict partial order ă on O, associated with an unordered relation „. A refinement of „ allows some unordered objects to be considered as indifferent: an indifference relation « is a reflexive, symmetric, and transitive subset of the unordered relation „ such that if o1 « o2 then, for all o P O, o1 ă o po ă o1 q entails o2 ă o po ă o2 q. If o1 „ o2 , but o1 6« o2 , then o1 and o2 are incomparable, denoted o1 k o2 . Since k“„ ´ «, it suffices to consider the pair xă, «y, collectively called a preference structure. Let θ be one of ă, ą, «, k, where ą denotes the inverse of ă; we say that the order relation between objects o1 and o2 is θ if o1 θo2 . Example 4. Let us consider the objects pasta, beef, salad, and porceddu. A pos- sible preference structure over these objects is: beef ă pasta, beef ă salad, and pasta « salad. In words, pasta and salad are both preferable to beef, whereas pasta and salad are indifferent. It follows that porceddu is incomparable with all other foods, i.e., porceddu k o, for o P tpasta, beef, saladu. For a finite domain O, the best objects (i.e., maximal elements) in O according to ă can be selected by the Best operator βă pOq “ to P O | Eo1 P O, o ă o1 u. For instance, according to the preferences of Example 4, the best objects are βă pOq “ tpasta, salad, porcedduu. For any non-empty domain O, βă pOq is never empty [1]. In case O is infinite, βă is typically applied to a finite subset of O. 3 Preference Propagation Throughout the paper we consider a context poset C and a domain O, and assume that each context c P C is associated with a preference structure xăc , «c y over O, called the ground preference structure in c. We also call xăC , «C y “ txăc , «c y | c P Cu a (preference structure) configuration over C. porceddu pasta beef c1 salad salad porceddu salad porceddu beef pasta c2 c3 beef pasta c4 salad porceddu beef pasta Fig. 1: A configuration for a context poset; each ground preference relation is shown next to its context (no ground preferences in c4 , shown as a white disk). Example 5. The context poset in Figure 1 reports the ground preference struc- tures discussed in Example 2. In context c4 (“summer in Sardinia”) all objects are indifferent. In context c1 (“Italy”), pasta is preferable to beef and beef to salad, while, in context c2 (“Sardinia”), porceddu is preferable to both pasta and beef, and pasta «c2 beef; similarly, pasta «c3 beef and pasta «c4 beef. Preferences propagate along the poset C, and we call complete preference relation in c, denoted by Păc , the result of combining ăc with the ground pref- erences defined in the other contexts in C, according to a propagation method P. Such a method also defines how indifference of objects is propagated to c, de- noted by P«c (and thus also the unordered relation P„c and the incomparability c relation P k ), thereby defining the complete preference structure x Păc , P«c y. In abstract terms, a propagation method P is a function that associates a context poset C, a configuration xăC , «C y over C, and a target context c P C, with a complete preference structure x Păc , P«c y. In this paper, we focus on propagation methods implemented through algebraic expressions. We shall now capture the basic ideas underlying preference propagation. Let us denote with Ctcu the successor poset of c, i.e., poset C 1 “ tc1 P C | c ďC c1 u, where c1 ďC 1 c2 iff c1 ďC c2 . Context c1 is relevant for c when a change in the ground preferences in c1 may affect the propagated preferences in c. Definition 1 (Coherence). A propagation method P is coherent wrt. C if, for every context c in C, the relevant contexts for c according to P are exactly those in Ctcu; P is coherent if it is coherent wrt. every context poset C. When combining preferences in unordered contexts, a cautious approach is to propagate only the preferences that hold in both contexts. The (more flexible) way we undertake just aims to avoid the propagation of conflicting preferences. Definition 2 (Fairness). P is fair for c in C if the following holds for all un- ordered contexts c1 and c2 in Ctcu, all o1 , o2 P O and all configuration xăC , «C y: if i) o2 ăc1 o1 , ii) o1 ăc2 o2 , iii) @ci pcďC ci ăC c1 _ cďC ci ăC c2 q ñ o1 «ci o2 , then o1 and o2 are unordered in the complete preferences for c, i.e., o1 P„c o2 . A method P is fair if it is fair for every context c in every poset C. So, if c1 and c2 disagree on the order of o1 and o2 while o1 and o2 are indifferent in all the more specific contexts, then o1 and o2 are not ordered in Păc . When combining ordered contexts, we give more importance to preferences that hold in contexts “closer” to the target context. Definition 3 (Specificity). P is specific for a context c in C if the following holds for every c1 in Ctcu, every o1 , o2 P O and every configuration xăC , «C y: if i) o1 ăc1 o2 , ii) o1 «ci o2 for each c ďC ci ăC c1 , and iii) it is either o1 ăc2 o2 or o1 «c2 o2 for all c2 P Ctcu such that: 1) c2 is unordered wrt. c1 , and 2) o1 «c3 o2 @c3 such that c ďC c3 ăC c2 , then o1 Păc o2 . A method P is specific if it is specific for every context c in every poset C. Thus if o2 is preferable to o1 in c1 and such objects are indifferent in all the more specific contexts, then this preference does indeed propagate to context c (preferences from more generic contexts are overridden by those in more specific contexts). The preference must not propagate if this violates fairness. Example 6. Consider Figure 1. With a fair propagation method P, pasta and porceddu must be unordered in context c4 , since pasta ăc2 porceddu whereas porceddu ăc3 pasta, and similarly for beef and porceddu, i.e., pasta P„c4 porceddu and beef P„c4 porceddu. If P is specific, then pasta is preferable to beef in con- texts c2 , c3 , and c4 , i.e., beef Păc2 pasta, beef Păc3 pasta, and beef Păc4 pasta. 4 Preference Composition Expressions We now characterize two binary operators, ` (for unordered contexts) and B (for ordered contexts), combining two ground preference structures xăc1 , «c1 y and xăc2 , «c2 y into a new preference structure, denoted xăc1 ` ăc2 , «c1 ` «c2 y and, respectively, xăc1 B ăc2 , «c1 B «c2 y. Clearly, ` is commutative but B is not. Both operators are associative and idempotent and both have the full indifference structure ∅« “ x∅, O ˆ Oy as identity element, i.e., contexts in which all elements are indifferent should not influence the result. Axiomatically: Definition 4 (` operator). A ` operator satisfies the following axioms, for all objects o1 , o2 P O and all preference structures xă1 , «1 y, xă2 , «2 y, xă3 , «3 y: i) xă1 , «1 y ` xă2 , «2 y “ xă2 , «2 y ` xă1 , «1 y (commutativity) ii) pxă1 , «1 y`xă2 , «2 yq `xă3 , «3 y “ xă1 , «1 y` pxă2 , «2 y`xă3 , «3 yq (ass.) iii) xă1 , «1 y ` xă1 , «1 y “ xă1 , «1 y (idempotence) iv) xă1 , «1 y ` ∅« “ xă1 , «1 y (identity element) v) o1 ă1 o2 , o2 ă2 o1 ñ po1 ă1 ` ă2 o2 q ^ po2 ă1 ` ă2 o1 q (fairness) Definition 5 (B operator). A B operator satisfies the following axioms, for all objects o1 , o2 P O and all preference structures xă1 , «1 y, xă2 , «2 y, xă3 , «3 y: i) pxă1 , «1 yBxă2 , «2 yq Bxă3 , «3 y “ xă1 , «1 yB pxă2 , «2 yBxă3 , «3 yq (ass.) ii) xă1 , «1 y B xă1 , «1 y “ xă1 , «1 y (idempotence) iii) xă1 , «1 y B ∅« “ ∅« B xă1 , «1 y “ xă1 , «1 y (identity element) iv) o1 ă1 o2 ñ o1 ă1 B ă2 o2 (specificity) Preference structures can be combined with ` and B to form a PC-expression. Definition 6 (PC-expression). A PC-expression over C is any expression E of the form: E ::“ c | pE ` Eq | pE B Eq | K, where c is a context in C. The base case is the name of some context c, denoting the preference structure xăc , «c y; one can also compose PC-expressions (i.e., preference structures) via ` and B, and denote the full indifference structure ∅« via the K symbol. PC-expressions can be used to compute complete preference structures (and thus to implement propagation methods). Fairness and specificity can be ex- tended to PC-expressions in a straightforward way. 5 Interpreting the Propagation Operators In this section we investigate on possible interpretations of the operators ` and B. We start with a general result about idempotent semirings. A semiring is an algebraic structure with an associative and commutative additive operator (like `) and an associative multiplicative operator (like B), which is both left- and right-distributive over addition. Let pPRO , `, Bq be an algebraic structure, where PRO denotes the set of all preference structures over a domain O. Then, with the additional hypothesis that B distributes over `, pPRO , `, Bq would be a semiring in which both operators are idempotent, i.e., an idempotent semiring. However, the following major result rules out the possibility of using idem- potent semirings for providing an interpretation to the propagation operators, since distributivity of B over ` turns out to be incompatible with the axioms of specificity and fairness of the operators. Theorem 1. No pPRO , `, Bq structure is an idempotent semiring. The cause of incompatibility of distributivity with the axioms of the operators lies only in assuming that B right-distributes over `. For this reason, here we consider the case in which pPRO , `, Bq is an idempotent left near-semiring, that is, an algebraic structure satisfying all requirements of idempotent semirings except for right-distributivity of B. We still assume that B left-distributes over `. Consider Pareto and Prioritized composition as interpretations of ` and B. Definition 7 (Pareto and Prioritized comp.). Let xă1 , «1 y and xă2 , «2 y be two preference structures over a domain O. The Prioritized composition of xă1 , «1 y and xă2 , «2 y, written xă1 , «1 y Bxă2 , «2 y, is defined as o1 ă1 B ă2 o2 ô po1 ă1 o2 q _ po1 ă2 o2 ^ o1 «1 o2 q o1 «1 B «2 o2 ô o1 «1 o2 ^ o1 «2 o2 . and their Pareto composition, written xă1 , «1 y ‘ xă2 , «2 y, is: o1 ă1 ‘ ă2 o2 ô po1 ă1 o2 ^ o1 ă2 o2q_po1 ă1 o2 ^ o1 «2 o2q_po1 «1 o2 ^ o1 ă2 o2q o1 «1 ‘ «2 o2 ô o1 «1 o2 ^ o1 «2 o2 . where o1 and o2 are any two objects in O. Intuitively, Prioritized composition gives precedence to preferences in ă1 , while preferences in ă2 are used only if two objects are indifferent according to ă1 . Conversely, Pareto considers the two preference relations equally important. Example 7. Consider the objects: i) o1 : a comedy with Favino, ii) o2 : a comedy without Favino, iii) o3 : a drama with Favino, iv) o4 : a drama without Favino. Let ă1 be the preference relation corresponding to “I prefer comedies to all other genres” and ă2 to “I prefer movies with Favino to all other movies”. If we consider ă1 , then o1 and o2 are both preferable to o3 and o4 ; also, o1 «1 o2 and o3 «1 o4 . With ă2 , we have instead that o1 and o3 are both preferable to o2 and o4 , with o1 «2 o3 and o2 «2 o4 . Let ăPar “ă1 ‘ ă2 ; then o2 ăPar o1 , o3 ăPar o1 , o4 ăPar o2 , o4 ăPar o3 , with o2 kPar o3 . Let now ăPri “ă1 B ă2 ; then o4 ăPri o3 , o3 ăPri o2 , o2 ăPri o1 . The intuition is that o1 is always the best, since it satisfies both preferences, while o4 is the worst one; o2 and o3 remain unordered if no priority is assumed between ă1 and ă2 (if ‘ is used), whereas o2 is preferred to o3 when ă1 has priority over ă2 (via B). Both Prioritized and Pareto composition preserve strict partial orders (both yield a strict partial order when applied to two strict partial orders), whereas this is not guaranteed by replacing in their definition « with „ [1]. It is known that both are associative and ‘ is also commutative. Both operators are idempotent and have ∅« as the identity. Finally, we can show that B left-distributes over ‘. Theorem 2. pPRO , ‘, Bq is an idempotent left near-semiring. One may wonder whether other interpretations, besides the one based on ‘ and B, exist for the ` and B operators. Our answer is negative for an important class of operators, which we call independent of irrelevant objects. Definition 8 (IIO operator). An operator ˛ is independent of irrelevant ob- jects (IIO) if, for any two objects o and o1 in O, the order relation between o and o1 according to the combined preference structure xă1 , «1 y˛xă2 , «2 y only depends on the order relation between o and o1 according to xă1 , «1 y and xă2 , «2 y. Thus, to determine the order relation between o1 and o2 , an IIO operator does not need to consider any other objects. Both ‘ and B are IIO. The following theorems show that ‘ and B are the only possible IIO interpretations of ` and B: there is no other IIO interpretation of ` and B that satisfies all the axioms. Theorem 3. Operator ‘ is the only IIO ` operator. Theorem 4. Operator B is the only IIO B operator. 6 Solutions and complexities Problem CFS is solved by exhibiting a propagation method, called Object- specific Cover (OC) propagation, based on PC-expressions using ‘ and B. It can be shown that no straightforward combination of the ground preferences in a context poset attains both fairness and specificity (each pair of objects seems to require a dedicated PC-expression). However, as a major result, OC implements propagation through a single PC-expression, denoted RGC pcq, for all pairs of objects; this is obtained, algebraically, by maximally “grouping on the right”. Definition 9 (PC-expression for OC propagation). Let c1 P Ctcu and let tc1 , . . . , ck u be immediately more specific than c1 . The expression RGC pc, c1 q is de- fined as RGC pc, cq “ c; and RGC pc, c1 q “ pRGC pc, c1 q‘. . .‘RGC pc, ck qq Bc1 if c ăC c1 . Let ĉ1 , . . . , ĉn be maximal in Ctcu. Then: RGC pcq “ RGC pc, ĉ1 q‘. . .‘RGC pc, ĉn q. For the poset in Figure 1, RGC pc4 q “ ppc4 Bc2 q‘pc4 Bc3 qq Bc1 ” c4 Bpc2 ‘c3 q Bc1 . Ultimately, OC is the only method guaranteeing both fairness and specificity. Theorem 5. For any method P, if Păc Ă OCăc then P is not specific. Theorem 6. For any coherent method P based on ‘ and B, whose PC-expression only depends on c and C, if OCăc Ă Păc then P is not both fair and specific. So, the OC semantics is “final” and therefore used to study the other problems. For both OrdRel and Best, the exact complexity depends on the underlying context model as well as on the language used for expressing preferences. We remain parametric wrt. these aspects and assume that two context descriptions can be compared in Opδq time, and that Opγq is the complexity of determining the order relation of any two objects according to ground preferences in a context. Examples are the context model in [5–7], for which Opδq “ Opdq, where d is the number of contextual dimensions and the language of [1], where, if preferences are expressed via a CNF with n conjuncts, we have Opγq “ Opn2 q. Problem OrdRel can be solved by first computing the PC-expression RGC pcq given by OC and then using it against the configuration to compute the complete preferences. However, materializing RGC pcq may be inefficient due to repeated sub-expressions, leading, in the worst case, to a size of the PC-expression that is exponential in the number of contexts in Ctcu. Yet, via efficient bookkeeping structures, we can solve Problem OrdRel in Op|A| ¨ pδ ` wpAq ` γqq time, where A is the subset of C with only the contexts with some ground preferences, and wpAq is its width. Problem Best requires at most OpN 2 q comparisons for a set of N objects, leading to a complexity of Op|A| ¨ pδ ` N 2 ¨ pwpAq ` γqqq. 7 Conclusion and future work In this paper we have discussed how preferences propagate when they depend on the context. We have proposed an algebraic model for expressing preference propagation, based on two abstract operators, and have shown that Pareto and Prioritized composition are the only natural possible interpretations satisfying all the required algebraic properties. Finally, we have studied the problem of efficiently computing the best objects according to the preferences propagated to a target context, and have shown that this can be done with polynomial complexity in all the main involved parameters. Further research might analyze the application of the principles of preference propagation to numerical preferences. An example of the possibilities opened by this line of research is discussed in [2], where qualitative preferences expressed through constraints are imposed over numeric (quantitative) attributes. References 1. J. Chomicki. Preference formulas in relational queries. TODS, 28(4):427–466, 2003. 2. P. Ciaccia, D. Martinenghi. Reconciling skyline and ranking queries. PVLDB, 10(11):1454–1465, 2017. 3. P. Ciaccia, D. Martinenghi, and R. Torlone. Foundations of context-aware preference propagation. Journal of the ACM, 67(1):4:1–4:43, 2020. 4. W. Kießling. Foundations of preferences in database systems. VLDB, 311–322, 2002. 5. D. Martinenghi, R. Torlone. Querying context-aware databases. FQAS, 76–87, 2009. 6. D. Martinenghi, R. Torlone. Querying databases with taxonomies.ER, 377–390, 2010. 7. D. Martinenghi, R. Torlone. Taxonomy-based relaxation of query answering in relational databases. VLDB J., 23(5):747–769, 2014. 8. M. Öztürk et al. Preference modelling. MCDA: State of the Art Surveys, 27–59. 2005.