=Paper=
{{Paper
|id=Vol-2738/paper7
|storemode=property
|title=Modeling Interdependent Preferences over Incomplete Knowledge Graph Query Answers
|pdfUrl=https://ceur-ws.org/Vol-2738/LWDA2020_paper_7.pdf
|volume=Vol-2738
|authors=Till Affeldt,Stephan Mennicke,Wolf-Tilo Balke
|dblpUrl=https://dblp.org/rec/conf/lwa/AffeldtMB20
}}
==Modeling Interdependent Preferences over Incomplete Knowledge Graph Query Answers==
Modeling Interdependent Preferences over Incomplete Knowledge Graph Query Answers Till Affeldt1[0000−0001−6440−5654] , Stephan Mennicke2[0000−0002−3293−2940] , and Wolf-Tilo Balke1[0000−0002−5443−1215] 1 Institute for Information Systems, TU Braunschweig, Braunschweig, Germany 2 Knowledge-Based Systems Group, TU Dresden, Dresden, Germany Abstract. We study the properties of optimal patterns, our novel exten- sion of Sparql, that allows for a fine-grained control over incompleteness of query answers in knowledge graphs. Optimal patterns encode pref- erences over the completeness of query answers. In this paper, we add language constructs for expressing dependencies between conflicting pref- erences. Furthermore, we provide discussions and proofs of fundamental results concerning Sparql with optimal patterns. Keywords: knowledge graphs · structural preferences · Sparql. 1 Introduction One of the major difficulties in working with knowledge graphs is that entities may vastly differ in their structural representation, even if they are of the same kind. Indeed, due to limitations of the underlying sources and knowledge extrac- tion processes it is quite common that in practical instances two entities (repre- sented as nodes or resources) share a type, but may be characterized in totally different ways regarding their properties. The consequences are unexpectedly small or even empty result sets when querying basic graph patterns (conjunctive queries), usually followed by a lot of manual query refinements in the retrieval process. Thus, operators for handling such heterogeneity are mandatory when designing a robust query language for knowledge graphs. Sparql [11] already of- fers optional patterns, which handle incompleteness in RDF graphs by returning complete matches (i. e., mandatory plus optional parts) and incomplete matches. Optional patterns and well-behaving subclasses of them have been heavily dis- cussed over the last decade [12,13,2,6,4]. However, the diversity of result types remains unchanged. Suppose we query for three patterns A, B, and C: A OPTIONAL B OPTIONAL C. Then we may get results satisfying A alone, results that additionally satisfy B or C, and results that satisfy all three patterns. In fact, the number of result Copyright c 2020 by the paper’s authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). schemas can be exponential in the number of OPTIONAL operators in a query. We recently proposed optimal patterns [1], which come with a built-in preference semantics [8], so that only maximal results are returned. In the example above, we would not return results satisfying A alone when there are results that satisfy more patterns. The new OPTIMAL operator for Sparql enables the expression of structural as well as some value preferences. These preferences can be used to satisfy a user’s information need if no perfect results can be found. For instance, a user may ask for a car with sunroof and air conditioner that costs less than 10,000e. If such configurations are not available in the data, cars with fewer features or higher prices are still relevant. Evaluating queries that use our OPTIMAL operator automatically retrieves the best possible answers. Thus, a user can receive useful information instead of having to manually tweak her search settings until a result can be found. OPTIMAL positions itself between the rather restrictive query conjunction and the loose optional patterns of Sparql. In our previous work [1] we have shown that the operator can be used in real-world scenarios to achieve fine- grained control over inherent incompleteness of RDF knowledge bases. It can also be used to model multiple preferences simultaneously, both of similar as well as prioritized relevance. Moreover, using optimal patterns results in readable and maintainable query representations. After we introduce some notational conventions in Sect. 2, we briefly recap syntax and semantics of optimal patterns in Sect. 3. In addition to what has been shown in our previous work, we discuss fundamental design decisions and properties of optimal patterns (Sections 3.1 and 3.2). In particular, we discuss how optimal patterns behave in open world knowledge bases like RDF. Optimal patterns alone lack the ability to describe complex dependencies between preference patterns. As we will argue with concrete query examples in Sect. 4.1, we cannot express what we call positive and negative dependencies. For instance, the preferred color of a car may only be relevant for a certain type of car but irrelevant for others. Similarly, optimal patterns do not allow for model- ing the next-best preference if one preference cannot be satisfied. Therefore, we extend our earlier work introducing structural preferences into Sparql with a focus on retrieving reasonably sized result sets over incomplete data sets. We an- alyze how multiple preferences can affect each other leading to a characterization of different types of dependencies (Sect. 4). In order to cope with these different types, we then introduce a suitable set of operators to be used in conjunction with OPTIMAL. Modeling preferences with Sparql has been the goal of several language ex- tensions [5,10,14,15]. However, all of these works focus on a preference model based on preferred data values, i. e., they allow for expressing certain prefer- ences over the attribute values such as the lowest price for a car or a specific set of brands. But, while these extensions work quite well for value-based ranking and selection operations, users cannot express preferences with respect to the completeness of query results. Indeed, expressing a simple query such as I am looking for cars, preferably including information on prices and/or brands will result in rather cumbersome expressions. And even if a user’s preferences are not completely satisfiable in some given knowledge graph, he/she is still looking for cars and expects that cars are returned whenever possible. Another difference is that value preferences do usually not consider structural preference depen- dencies. This is because no structural dependencies can exist if the returned results are guaranteed to be structurally complete. In this case also the need for non-structural dependencies is greatly reduced: The same results can often be achieved by defining a priority order for preferences, in particular by eval- uating independent preferences first. For a detailed discussion of other related approaches see [1]. 2 Preliminaries Since Sparql is the recommended query language for the Resource Description Framework (RDF) [11], we base our notions on that of RDF. Therefore, assume infinite and disjoint universes of IRIs I and literals L. The core construct of RDF is that of an RDF triple (s, p, o) ∈ I × I × IL3 In such a triple, predicates (p) connect resources (s) to other resources or actual data values (o). Sets of RDF triples form RDF graphs G. Sparql offers expressions for querying information from RDF graphs. We restrict our presentation to pattern matching capabilities, e. g., leaving out path queries. Abstract syntax and semantics of Sparql follow the standard nota- tion [11,12]. Let V denote the universe of variables, e. g., x, y, z ∈ V. A triple pattern is a triple (u, a, v) ∈ VI × VI × VIL, i. e., variables x ∈ V may occur in every component of triple patterns. Sets of triple patterns G also relate to graph-like structures, then called basic graph patterns (BGPs). Let t = (u, a, v) be a triple pattern and G an RDF graph. A partial function µ : VIL → IL is called a match of t in G iff (a) µ(c) = c for all c ∈ IL, (b) {u, a, v} ⊆ dom(µ)4 , and (c) (µ(u), µ(a), µ(v)) ∈ G. By ∅IL we denote the empty match, being the partial function that respects (a) but is otherwise unde- fined. The notion of a match naturally extends to BGPs G by requiring that µ is a match of all triple patterns t ∈ G. We denote the set of all matches of t (G, resp.) in G by JtKG (JGKG , resp.). Sparql further supports complex queries in terms of unions, query conjunc- tions, optional patterns, and filter conditions. The semantics of such queries Q (w. r. t. RDF graphs G) can be found in [12] and is denoted by JQKG . The core notion of compatibility for query conjunctions and optional patterns is also re- quired for a formal account of OPTIMAL. In general, two partial functions µ1 , µ2 : VIL → IL are compatible, denoted µ1 µ2 , iff for all x ∈ dom(µ1 ) ∩ dom(µ2 ), µ1 (x) = µ2 (x), i. e., µ1 and µ2 agree on the variables they share. In conjunctions and optional patterns, only compatible matches to subpatterns are joined in the result sets. 3 We use IL as shorthand for I ∪ L. 4 dom(µ) refers to the set of all elements of VIL for which µ is defined. 3 SPARQL with OPTIMAL Throughout this section, we present our extension of Sparql by optimal pat- terns. Additional examples, an encoding by standard Sparql, as well as an initial evaluation may be found in [1]. Here, we give a brief account of the syn- tax and semantics. Furthermore, we discuss the properties of optimal patterns formally (cf. Sections 3.1 and 3.2). As with the other Sparql structures, OPTIMAL combines queries Q1 and Q2 to an optimal pattern Q1 OPTIMAL Q2 . The intuition of optimal patterns is that we state a preference for matches that are complete w. r. t. Q1 and Q2 , i. e., in the optimal case, we get the matches of Q1 AND Q2 . Only if this goal is impossible to reach, i. e., there could be matches of Q1 but no compatible ones of Q2 , we expect the matches of Q1 . The semantics of optimal patterns will adhere to the following property: JQ1 AND Q2 KG ⊆ JQ1 OPTIMAL Q2 KG ⊆ JQ1 OPTIONAL Q2 KG (1) More concretely, if there are matches of Q1 AND Q2 in G, then JQ1 OPTIMAL Q2 KG = JQ1 ANDQ2 KG . Unlike the respective optional pattern Q1 OPTIONALQ2 , we do not include a single match of only Q1 in this case. This matching behavior allows us to express preferences over the completeness of the matches w. r. t. a given RDF graph. Only if JQ1 ANDQ2 KG = ∅, the respective semantics of optimal patterns and optional patterns align. As long as Q1 and Q2 are free of optimal patterns, the previous paragraph tells us the whole story of the new operator. The combination of (several) opti- mal patterns with filter constraints allows for formulating meaningful preference patterns on data values and structure. Example 1. OPTIMAL can be used to model preferences over attribute values. A simple price preference could be: I prefer cheap cars under 15,000e over other cars under 20,000e over more expensive cars. It can be adequately modeled in the form of: ?car rdf:typeof Car OPTIMAL { ?car ex:price ?price FILTER( ?price < 15000 ) } OPTIMAL { ?car ex:price ?price FILTER( ?price < 20000 ) } The query will return all cars that are cheaper than 15,000e if any exist. If not, then it will return all cars that are cheaper than 20,000e instead. If no such car exists either, then the query will return all cars in the data set that are more expensive or have no available price information. Example 2. A user can pose structural preferences as well. Such a query could be I prefer to see cars with available price and brand information. Such a query could be adequately modeled as: ?car rdf:typeof Car OPTIMAL { ?car ex:price ?price } OPTIMAL { ?car ex:brand ?brand } The query will return all cars with information for price and brand. If no such cars can be retrieved, it will instead return all cars with price information. If that also results in an empty set, then the query will return all cars with brand information instead. Only if all of these attempts yield empty results, then the query will return all cars instead. The last example shows how optimal patterns may be used to impose an or- dering on the preferred structure. To allow for the formulation of preferences of equal importance, we extended the syntax of optimal patterns Q1 OPTIMAL Q2 , allowing Q2 to be a finite list of queries Q12 , Q22 , . . . , Qn2 , i. e., the syntax of these optimal patterns is Q1 OPTIMAL (Q12 , Q22 , . . . , Qn2 ). The intuition is still that Q1 is necessary to be matched. However, a simple cor- respondence as in Equation (1) cannot be derived. In the optimal case, we find matches of all of the queries in the optional pattern, i. e., JQ1 AND Q12 AND Q22 AND . . . AND Qn2 KG ⊆ JQ1 OPTIMAL (Q12 , Q22 , . . . , Qn2 )KG . Whenever we cannot reach the optimal case, we will obtain dominant matches as answers. Example 3. Given the query from Example 2, a possible answer could be a car for a price of 17,000e but of no known brand. This result contains more information than one with completely unknown attributes. Thus, the first match dominates the second one. A match concerning a third car with a price of 18,000e and a known brand contains even more information on the other hand. Thus, the first match is dominated by the third one. If we consider price and brand to be of equal importance, then the first match would not dominate a fourth one that only has brand information. The two matches would be incomparable. If we consider the price to be more important instead, then the fourth match does not appear in the result set. The just sketched process is like asking for a skyline over the price (which does not include the fourth match) and then construct a skyline over the brand for the previous results. Hence, the new construct requires a Pareto-style semantic, which entails a skyline of matches. Subsequently, we order candidate matches by Pareto dominance. Only the maximal candidates (w. r. t. Pareto dominance) are considered matches of optimal patterns. Definition 1 (Candidate Matches). Let Q be an optimal pattern, i. e., Q = Q0 OPTIMAL (Q1 , Q2 , . . . , Qk ) for some integer k > 0. A partial function µ : VIL → IL is called a candidate match of Q in G iff there are µ0 , µ1 , µ2 , . . . , µk , such that 1. µ0 ∈ JQ0 KG , 2. µi ∈ JQi KG ∪ {∅IL } (0 < i ≤ k), 3. µi µj for all i, j ∈ {0, 1, 2, . . . , k}, and 4. µ = µ0 ∪ µ1 ∪ µ2 ∪ . . . ∪ µk . The first requirement accounts for the fact that Q0 is the necessary pattern to be matched. Note, every other part of µ may be the empty match ∅IL . Using the separation of candidate matches µ of Q (in G) into µ0 , µ1 , µ2 , . . . , µk , we say that µ covers Qi (0 ≤ i ≤ k) iff µi 6= ∅IL . Note, Q0 must be covered by all candidate matches. We denote by coverQ (µ) the set of all sub-queries Qj ∈ {Q0 , Q1 , Q2 , . . . , Qk } covered by µ. Then the semantics of optimal patterns boils down to the maximal matches w. r. t. inclusion of the sets of covered sub- queries. Definition 2 (Semantics of optimal patterns). Let Q be an optimal pat- tern, G an RDF graph, and µ, µ0 two candidate matches of Q in G. µ0 dominates µ w. r. t. Q, denoted by µ ≺Q µ0 , iff coverQ (µ) ( coverQ (µ0 ). Candidate match µ of Q in G is called a match of Q in G iff there is no candidate match µ0 of Q in G with µ ≺Q µ0 . The set of all matches of Q in G is denoted by JQKG . This formalization allows us to derive correctness of (1) for the special case of optimal patterns with a single query as the right-hand side. Proposition 1. For all Sparql queries Q1 and Q2 , (1) holds. Proof. Let µ be a match of Q1 AND Q2 . Then there are matches µi ∈ JQi KG (i = 1, 2) with µ = µ1 ∪ µ2 and µ1 µ2 . Thus, µ is a candidate match of Q0 = Q1 OPTIMAL Q2 . Furthermore, every match of Q1 AND Q2 covers Q1 and Q2 in Q0 . Hence, µ cannot be dominated by any other match of Q0 in G. Let µ0 be a match of Q0 = Q1 OPTIMAL Q2 . We need to show that µ0 is a match of Q1 OPTIONAL Q2 . Towards a contradiction assume, µ0 is not a match of Q1 OPTIONAL Q2 . Then µ0 ∈ JQ1 KG but there is a match µ00 ∈ JQ2 KG , such that µ0 µ00 . But then µ0 ∪ µ00 is a candidate match of Q0 and coverQ0 (µ0 ∪ µ00 ) = {Q1 , Q2 } ) {Q1 } = coverQ0 (µ0 ), which contradicts the assumption that µ0 is a match of Q0 , i. e., µ0 is not domi- nated by any other candidate match. t u Subsequently, we justify the syntax style of query lists in optional patterns and we consider RDF’s open world assumption in the context of optimal patterns. 3.1 Justifying Optimal Patterns Our divergence from the standard binary operator structure of Sparql may seem peculiar at first. However, this was a necessary step to allow for express- ing preferences of equal importance. Suppose, we want to express a preference of query Q over Q1 and Q2 , but Q1 and Q2 are equally important. Without the query list notation, we have several candidates using optimal patterns and standard Sparql operators: – Q OPTIMAL(Q1 UNIONQ2 ): This query returns all the matches of Q, prefer- ably joined with matches of Q1 or Q2 . However, as soon as Q1 and Q2 have a shared variable, we will hardly see any matches fulfilling all three sub-queries. – Q OPTIMAL (Q1 AND Q2 ): From this query we may expect only matches that fulfill all three sub-queries or only Q. Hence, if Q1 cannot be matched, we will also not see matches that fulfill Q and Q2 . – (Q OPTIMAL Q1 ) AND (Q OPTIMAL Q2 ): This query is quite close to the desired Pareto query. Here, Q1 and Q2 are seen as independent through the use of query conjunction. For queries of that shape we may actually observe matches fulfilling all three sub-queries or only a subset of these. However, this query also accommodates some weird matching behavior. Suppose, Q1 and Q2 share a variable x, that does not occur in Q, but Q1 and Q2 cannot simultaneously be matched in some RDF graph G. Then the query above may easily yield no results at all, although Q may be matched together with Q1 or Q2 separately. Besides the possibilities above, it is always possible to unfold the Pareto query ex- ponentially. However, such queries are hardly readable and maintainable. There- fore, we decided to expand Sparql’s operator set in this respect. 3.2 Optimal Patterns and Certain Answers How well do optimal patterns cope with RDF’s open world assumption (OWA). The OWA influences the way the query semantics is actually executed. We often see query semantics following the certain answers perspective [2]. A match is a certain answer of a query Q in some OWA database iff the match is a match in every possible interpretation of the OWA database. Regarding RDF, every superset RDF graph H of G is a possible interpretation of G. According to Arenas and Pèrez, the certain answers perspective for Sparql is manifested by T CertainAnswers(Q, G) := H⊇G JQKH . Iterating over all infinitely many extensions H of G is infeasible in practice. Therefore, handy characterizations can be proven. One such characterization is concerned with the monotonicity of the given query Q. A query Q is mono- tone iff for all RDF graph G ⊆ H, JQKG ⊆ JQKH , i. e., adding more informa- tion may only lead to more query results. Since more information may lead to different matches that cover more sub-queries, Arenas and Pèrez introduced the notion of weak monotonicity. A query Q is weakly monotone iff for all RDF graphs G ⊆ H. µ ∈ JQKG implies the existence of a µ0 ∈ JQKH with µ ⊆ µ0 . It can easily be shown that in the case of (weakly) monotone queries Q, CertainAnswers(Q, G) coincides with JQKG . Unfortunately, not all queries containing optimal patterns are weakly mono- tone. For instance, the third query shape in Sect. 3.1 is not monotone. Consider Q1 and Q2 to share a variable x, that is not shared with Q. Now construct G in such a way that it fulfills Q and Q1 but not Q2 . Disjointly add a minimal structure to G that fulfills Q and Q2 but not Q1 . Then we simultaneously match Q and Q1 as well as Q and Q2 , but the respective matches are necessarily in conflict in x. Therefore, the result set will easily be empty, where it used to be non-empty when Q2 was violated. Hence, optimal patterns are neither better nor worse than optional patterns, at least w. r. t. certain answers. It is simply two different styles of querying and we have to let the user decide which style is appropriate in which scenarios. 4 Adding Dependencies over Preferences In this section we propose a solution for modeling dependencies by introducing an additional set of operators that complement OPTIMAL. Preferences may exist in various different types and combinations. Being able to model such dependencies is necessary to adequately model complex user requests. Sometimes a subquery preference simply cannot return useful results for a given preference unless another part of the query also returns valid matches. This is the case when a query asks for a possibly missing node that is only indirectly related to an enforced match. For example, a query may ask for a person’s car as well as their car’s model. The person’s car can be trivially retrieved. However, if that person does not own a car, it is impossible to give an adequate answer regarding the car’s model. We call this kind of dependency a structural depen- dency because these directly depend on the (in-)completeness in the knowledge graph structure. In contrast, we call all dependencies that relate to data values within the modeled preference as non-structural dependencies, e. g., a user may prefer the color red for sports cars but not necessarily for other cars as well. If one preference cannot be answered unless the found answers satisfy a spec- ified condition, i. e., also cover a more relevant preference, we speak of a positive dependency. The previously given examples fall into this category. We speak of negative dependencies if one preference should only be answered if a specified condition remains unsatisfied, i. e., a specific and more relevant preference is not covered. For example, a user may prefer the color blue for any car that is not a sports car with either no color preference, or a different one for sports cars. Negative dependencies also model different alternatives of unequal importance, e. g., a user interested in buying a car needs to know about the price of the object. Most likely, that user will prefer offers made in his own currency. If no suitable matches can be found, offers in foreign currencies may still be relevant. Dependencies are not guaranteed to be 1:1 relations. Especially structurally dependent preferences often rely on the same condition, implying a 1:m relation. For example, asking for a person’s car and its properties will result in a structural dependency of all properties towards the retrieval of a matching car. When modeling multiple alternatives of unequal importance, the opposite behavior is true. For example, a car manual should only be displayed in an unknown language if it is not available in any language the user is fluent in. This yields an n:1 relationship between preferences. Mixed types of questions can also lead to general n:m relations between preferences which need to be accounted for. 4.1 Limitations of OPTIMAL Expressing interdependent preferences is already possible using existing Sparql operators which requires significant effort, even more so than individual pref- erences. By the new OPTIMAL operator we have developed a tool to ease this process. We have also shown how it can be used for individual preferences. How- ever, it is still not ideal for modeling dependencies. Let us assume a user is looking for a car, preferably a sports car. If it happens to be a sports car, then the user prefers it to be red – otherwise any color is acceptable. Simultaneously, the user also prefers cars with available price infor- mation. Using multiple optimal patterns in a naive way (as shown in Example 4) does not represent these requirements. Example 4. The following query will look for all cars, preferably sports cars. Among those, it will look for red ones. This works well if the database contains any sports car. ?car rdf:typeof Car OPTIMAL { ?car rdf:typeof SportsCar, ?car ex:hasPrice ?price } OPTIMAL { ?car ex:hasColor "red" } Additionally, the query also describes a dominance of red cars over non-red cars. A more accurate model can be achieved by nesting one optimal pattern into an- other one, being a technique also commonly used when modeling interdependent optional patterns. Example 5. The following query looks for sports cars with red color first and then combines it with other preferences. As a result, red sports cars will be preferred over other sports cars, and sports cars in general will be preferred over other cars. ?car rdf:typeof Car OPTIMAL { ?car rdf:typeof SportsCar OPTIMAL { ?car ex:hasColor "red" }, ?car ex:hasPrice ?price } This time, red cars that are no sports cars will no longer be preferred over other cars. So we managed to fix the underlying problem in our model. However, the model is still not correct. Because the color preference is evaluated first, red sports cars will always be ranked better than non-red sports cars. This also means that a red sports car without price will be ranked higher than a non-red sports car with a price. In our use case, these two preferences are meant to be incomparable. Modeling preferences with optimal patterns alone causes side effects by intro- ducing unwanted ranking of actually incomparable query answers. For sure, it is always possible to model them correctly using complex filter conditions (cf. en- codings in [1]). However, our goal is to ease the modeling process. The required use of unintuitive filters is very similar to the original motivation. Thus, we would like to have a simple set of operators that lets a user define dependencies between preferences directly. 4.2 Syntax and Semantics of THEN and OTHERWISE In order to enable an easy way of modeling dependencies, we introduce two new operators called THEN and OTHERWISE. Both operators can be used on the right-hand side of an OPTIMAL operator in order to fine-tune requirements of a preference’s matching behavior. A THEN B defines a positive dependency of B towards A, meaning that preference B should only be answered if preference A also yields a non-empty set of answers. A OTHERWISE B, on the other hand, defines a negative dependency of B towards A, meaning that preference B is only answered if preference A does return an empty set of answers. For our OPTIMAL operator we have already introduced lists of preferences for simultaneous evaluation (cf. Sect. 3). We apply the same concept to both sides of the new operators. For a preference A and a set of dependent pref- erences B towards A, A THEN (b1 , b2 , ..., bm ) applies a matching restriction for all dependent preferences bj ∈ B. They will only be answered if preference A is covered as well. Similarly, A OTHERWISE (b1 , b2 , ..., bm ) applies a negative de- pendency restriction to all bj ∈ B. For a set of independent preferences A and dependent preference B towards A, (a1 , a2 , ..., an ) THEN B applies a matching restriction on B. It will only be answered if at least one ai ∈ A is covered. When modeling a restriction towards all preferences in A (instead of any member) a simple conjunction can be used in order to turn the conditions into a single one. This method is best suited for handling alternative routes that lead to similar objects. Likewise,(a1 , a2 , ..., an ) OTHERWISE B applies a matching restriction on B. It will only be answered if none of the preferences ai ∈ A are covered. If both A and B are a set of preferences then both the respective restrictions apply for all bj ∈ B towards all ai ∈ A. Just like the OPTIMAL operator, we define THEN and OTHERWISE as left- associative. That way, the syntax for all following preferences Bi ∈ {B1 , ..., Bn } of the form A 1 B1 2 ... n Bn (with i ∈ {THEN, OTHERWISE}) will always imply a relation towards A, making the query easier to read. Query groups and preference list separators can be used to model the order of execution explicitly. Example 6. The following query adequately models the query we were looking for in Sect. 4.1 and thereby resolves our issues with optimal patterns. ?car rdf:typeof Car OPTIMAL ({ ?car rdf:typeof SportsCar } THEN { ?car ex:color "red" }, ?car ex:price ?price ) Example 7. In the following query (with negative dependencies), we consider the price in US-Dollar as a fallback if no offers in Euros can be found. Finding both currencies is not better than only Euros, though. ?car rdf:typeof Car OPTIMAL ({ ?car ex:price EUR ?p eur } OTHERWISE { ?car ex:price USD ?p usd }) 4.3 Encoding Dependencies In our previous work [1], we have demonstrated that optimal patterns can be en- coded using existing query operators in order to work under standard Sparql 1.0 semantics. The same holds for the new set of operators. We have found two dif- ferent styles of encodings – one using UNION and one using OPTIONAL. Since the use of optional patterns follows more or less standard procedure, we restrict our presentation to the union-style encoding. The encoding using UNION has proven to be more stable regarding perfor- mance. For this style we first construct a superset of desired results by combining all possible preference combinations. For every such combination we bind a fresh variable in the respective subquery to mark which combination an answer candi- date is generated from. As a next step, we retrieve the results for every subquery a second time (but with changed variable names) inside an EXISTS-operator. That way, we can determine which combinations yield at least one candidate match. Lastly, we use FILTER-conditions to remove any answers that are dominated by at least one combination with a non-empty set of matches. This method is easily changed to adapt for the new operators. When con- structing the superset, we have to redefine what a possible combination is. Every THEN or OTHERWISE operation results in additional constraints on this set. A preference A that is dependent on B is not supposed to be matched unless a map- ping for B also appears in the answer. Thus, any combination including mappings for A but not B has to be removed from the superset. Likewise, a preference A that has a negative dependency on B is only supposed to match if no mapping for B appears in the answer. Thus, any combination including mappings for both A and B has to be removed from the superset. Our prototypical implementation5 can easily be extended. In terms of performance, we leave out an extensive analysis here. Usage of the new operators only results in removal of answer candidates. Thus, any encoded query using THEN or OTHERWISE will be shorter and most likely slightly faster than an independent preference. 5 Conclusion We have presented a new set of operators for expressing structural preferences over the completeness of knowledge graph query results. The primary extension consists of so-called optimal patterns, which we introduced in [1]. Here, we added the possibility of stating semantic dependencies between conflicting preferences. Furthermore, we discussed fundamental design decisions. Although our discussion regarding certain answers (cf. Sect. 3.2) attests opti- mal patterns to have a similar behavior as optional patterns, a detailed discussion about the expressive power/complexity may still reveal important differences. For future work, we plan to overcome the limitations we observed in [1] with our encoding approach of evaluating optimal patterns by implementing custom evaluation strategies that may even use Pareto-specific optimizations [3,9,7]. Acknowledgements. This work is partly supported by Deutsche Forschungsge- meinschaft (DFG, German Research Foundation) in project number 389792660 (TRR 248, Center for Perspicuous Systems) and Emmy Noether grant KR 4381/1-1 (DIAMOND). 5 available at Github: https://github.com/ifis-tu-bs/optisparql References 1. Affeldt, T., Mennicke, S., Balke, W.T.: Preference-driven Control over Incomplete- ness of Knowledge Graph Query Answers. In: 12th ACM Web Science Conference 2020. WebSci’20, Southampton, United Kingdom, Association for Computing Ma- chinery, New York, NY, USA (July 2020) 2. Arenas, M., Pérez, J.: Querying Semantic Web Data with SPARQL. In: Proceed- ings of the Thirtieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. pp. 305–316. PODS ’11, ACM, New York, NY, USA (2011), event-place: Athens, Greece 3. Balke, W.T., Güntzer, U., Zheng, J.X.: Efficient Distributed Skylining for Web Information Systems. In: Bertino, E., Christodoulakis, S., Plexousakis, D., Christophides, V., Koubarakis, M., Böhm, K., Ferrari, E. (eds.) Advances in Database Technology - EDBT 2004. pp. 256–273. Lecture Notes in Computer Sci- ence, Springer, Berlin, Heidelberg (2004) 4. Cheng, S., Hartig, O.: OPT+: A Monotonic Alternative to OPTIONAL in SPARQL. Journal of Web Engineering 18(1), 169–206 (2019) 5. Gueroussova, M., Polleres, A., McIlraith, S.A.: Sparql with qualitative and quan- titative preferences. In: OrdRing@ ISWC. pp. 2–8 (October 2013) 6. Kaminski, M., Kostylev, E.V.: Beyond Well-designed SPARQL. In: Martens, W., Zeume, T. (eds.) 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), vol. 48, pp. 5:1–5:18. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2016) 7. Keles, I., Hose, K.: Skyline Queries over Knowledge Graphs. In: Ghidini, C., Hartig, O., Maleshkova, M., Svátek, V., Cruz, I., Hogan, A., Song, J., Lefrançois, M., Gandon, F. (eds.) The Semantic Web – ISWC 2019. pp. 293–310. Lecture Notes in Computer Science, Springer International Publishing, Cham (2019) 8. Kießling, W.: Foundations of preferences in database systems. In: Proceedings of the 28th international conference on Very Large Data Bases. pp. 311–322. VLDB ’02, VLDB Endowment, Hong Kong, China (2002) 9. Morse, M., Patel, J.M., Jagadish, H.V.: Efficient skyline computation over low- cardinality domains. In: Proceedings of the 33rd international conference on Very large data bases. pp. 267–278. VLDB ’07, VLDB Endowment, Vienna, Austria (Sep 2007) 10. Pivert, O., Slama, O., Thion, V.: Sparql extensions with preferences: A survey. In: Proceedings of the 31st Annual ACM Symposium on Applied Computing. p. 1015–1020. SAC ’16, Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2851613.2851690 11. Prud’hommeaux, E., Seaborne, Andy: SPARQL Query Language for RDF. Tech. rep., W3C (2008), https://www.w3.org/TR/rdf-sparql-query/ 12. Pérez, J., Arenas, M., Gutierrez, C.: Semantics and Complexity of SPARQL. ACM Trans. Database Syst. 34(3), 16:1–16:45 (Sep 2009) 13. Schmidt, M., Meier, M., Lausen, G.: Foundations of SPARQL Query Optimization. In: Proceedings of the 13th International Conference on Database Theory. pp. 4–33. ICDT ’10, ACM, New York, NY, USA (2010), event-place: Lausanne, Switzerland 14. Siberski, W., Pan, J.Z., Thaden, U.: Querying the semantic web with preferences. In: International Semantic Web Conference. pp. 612–624. Springer (2006) 15. Troumpoukis, A., Konstantopoulos, S., Charalambidis, A.: An extension of sparql for expressing qualitative preferences. In: International Semantic Web Conference. pp. 711–727. Springer (July 2017)