1. Introduction

A Rule-Based Approach to Specifying Preferences over Conflicting Facts and Querying Inconsistent Knowledge Bases (Extended Abstract)

Meghyn Bienvenu

Camille Bourgaux

Katsumi Inoue

Robin Jean

2 0 DI ENS, ENS, CNRS, PSL University & Inria , Paris , France 1 National Institute of Informatics , Tokyo , Japan 2 Univ. Bordeaux , CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence , France

2025

This extended abstract summarizes our KR'25 paper [1], where we introduce a declarative rule-based framework for specifying and computing a priority relation between conflicting facts. As the expressed preferences may contain undesirable cycles, we consider the problem of determining when a set of preference rules always yields an acyclic relation, and we also explore a pragmatic approach that extracts an acyclic relation by applying various cycle removal techniques. As a step towards an end-to-end system for querying inconsistent knowledge bases, we present a preliminary implementation and experimental evaluation of the framework, which employs answer set programming to evaluate the preference rules, apply the desired cycle resolution techniques to obtain a priority relation, and answer queries under prioritized-repair semantics.

eol>inconsistency handling preference specification inconsistency-tolerant query answering ontology-mediated query answering answer set programming

1. Introduction

Inconsistency-tolerant semantics are a well-established approach to querying data inconsistent w.r.t. some constraints, both in the relational database and ontology-mediated query answering settings (cf. recent surveys [ 2, 3 ]). Such semantics typically rely on (subset) repairs, defined as maximal subsets of the data consistent w.r.t. the constraints. The most well-known, called the AR semantics in the KR community and corresponding to consistent query answering in the database community, considers that a Boolean query holds true if it holds in every repair. The more cautious IAR semantics amounts to querying the repairs intersection, and the less cautious brave semantics only requires that the query holds in some repair.

Since an inconsistent dataset may have a lot of repairs, several notions of preferred repairs have been proposed in the literature, to restrict the possible worlds considered to answer queries, for example by taking into account some information about the reliability of the data [ 4, 5, 6, 7, 8 ]. In particular, since its introduction by [ 9 ], the framework of prioritized databases, in which a (binary acyclic) priority relation between conflicting facts is used to define three kinds of optimal repairs, has attracted attention, with numerous theoretical results [ 10, 11, 12, 13 ], and an implementation [14]. However, the crucial question of obtaining the priority relation was left unaddressed, preventing the adoption of this framework in practice. Indeed, it is not realistic to expect users to manually input a binary relation between the facts.

2. Specifying Priority Relations via Rules

We propose a framework for specifying a priority relation between conflicting facts. We use preference rules to state that, when some conditions are satisfied, a fact should generally be preferred to another fact. The rule conditions may naturally refer to the presence (or absence) of facts in the dataset. However, typically we may also want to exploit information about the facts themselves (e.g. the date they were added), provided in metadata.

Example 1. Consider a DL knowledge base ex = (ex, ex) about a university. The ontology expresses that associate and full professors (APr and FPr) are faculty members (Fac) and clerical staf workers (Cleric) are administrative staf workers ( Adm). Moreover, one cannot be both an associate and a full professor, or an administrative staf worker and a faculty member.

ex = {APr ⊑ Fac, FPr ⊑ Fac, APr ⊑ ¬FPr, ∃Teach ⊑ Fac, Cleric ⊑ Adm, Adm ⊑ ¬Fac} ex = {APr(), FPr(), Cleric(), Adm(), Teach(, ), Adm(), APr()} We associate to ex a meta-database ℳex = (idex, ℱex), where idex is a function that associates an identifier to each fact of ex: idex(APr()) = 1, idex(FPr()) = 2, idex(Cleric()) = 3, idex(Adm()) = 4, idex(Teach(, )) = 5, idex(Adm()) = 6, idex(APr()) = 7, and ℱex records the year that facts have been added to the university database:

{Date(1, 2014), Date(2, 2025), Date(3, 2013), < (2013, 2014), < (2013, 2025), < (2014, 2025)}. We then define the following preference rules.

Date(1, 1)∧Date(2, 2)∧ < (2, 1) → pref(1, 2) 1 = id(FPr()) ∧ 2 = id(APr()) → pref(1, 2) ⊑ Adm ∧ ⊑ Fac ∧ ¬(∃Teach(, )) ∧ 1 = id( ()) ∧ 2 = id(()) → pref(1, 2) The first rule states a general preference for keeping more recently added facts. The second one states if we have both FPr() and APr(), we prefer to keep FPr(), capturing the domain knowledge that associate professors are promoted into full professors. The third rule states that if a person is declared to belong both to a subclass of Adm and a subclass of Fac, but there is no Teach-fact for the person in the dataset, then the Adm-related facts are deemed more reliable. Observe that it uses ontology axioms with variables in order to simplify rule formulation (avoiding the need to write separate rules for every pair of subclasses of Adm and Fac). These three preference rules induce the following preferences over the facts of ex, designated by their identifiers:

{pref(2, 1), pref(2, 3), pref(1, 3), pref(6, 7)}.

Given a set of preference rules Σ , let ≻ Σ be the binary relation over facts obtained from the preferences over facts induced by Σ , restricted to facts that appear together in some conflict (minimal set of facts inconsistent with the logical theory ). This relation may still fail to be a priority relation if it contains a cycle, as priority relations are required to be acyclic. We explore two complementary approaches to tackling this issue: identifying preference rules which are guaranteed to yield an acyclic relation, and employing diferent methods to extract an acyclic sub-relation from ≻ Σ.

Checking acyclicity of preference rules We show that, given a logical theory , the problem of deciding whether a set of preference rules Σ yields an acyclic ≻ Σ for every dataset and meta-database is undecidable in general. However, for DL-Lite ontologies and preference rules whose bodies are essentially conjunctive queries, this problem is in coNP.

Resolving cycles to get a priority relation Ideally the preference ruleset Σ would always yield an acyclic ≻ Σ, but this cannot be assumed in general. Furthermore, cycles can naturally arise when users create rules that capture diferent criteria, e.g. prefer more recent facts and prefer facts from more trusted sources. To ensure acyclicity in such cases, one would need to create more complex rules whose bodies consider diferent combinations of the criteria, making rules much harder for users to specify and understand. We thus advocate a pragmatic approach: give users free rein to specify preferences as they see fit, then apply cycle resolution techniques to extract a suitable acyclic sub-relation should any cycles arise. To allow for a more fine-grained specification of the preferences, we assume that Σ is increment . partitioned into priority levels Σ 1, . . . , Σ , so that a preference induced by a preference rule from Σ is considered more important than one induced by a preference rule from Σ with > , and will thus be preferably kept in the cycle elimination process. Given two facts and such that ≻ Σ , we denote by level (,

) the minimal index such that ≻ Σ . We consider four cycle resolution techniques: • Going up (≻ ): Let ≻ := ∅ and := 1. Then while ≻ ∪ ≻ Σ is acyclic, let ≻ :=≻ ∪ ≻ Σ and ≻ Σ1 . Then for = 2 to , add to ≻ not belong to any cycle w.r.t. ≻

∪ ≻ Σ . level (,

) = , (, • Refined going up ( ≻ ): Let ≻ :=

≻ ) is in a cycle w.r.t. ≻

} and decrement . • Going down (≻ ): Let ≻ :=≻ Σ and := . Then while ≻ is cyclic, let ≻ :=≻ ∖{(, ) | Σ1 , then remove every (,

) that occurs in a cycle w.r.t. all pairs (, ) such that level (, ) = and (, ) does • Grounded (≻ ): Let ≻ := ∅. Then until a fixpoint is reached, add to ≻ all pairs (, ) such that ≻ Σ and for every cycle of ≻ Σ containing (, level (, ) < level (, ), or there is (, ) ∈ such that ≻ ∪{(, )} is cyclic. ), either there is (, ) ∈ such that

We relate these cycle removal strategies to notions that have been proposed in the literature to select a single consistent set of facts from a knowledge base whose dataset is partitioned into priority levels [ 15, 12 ], and show that ≻ ≻⊆ ≻⊆ and ≻ ≻⊆ ≻⊆ computed in polynomial time from the relations ≻ Σ .

3. ASP Implementation and Experiments

, and that each of these relations can be We implement our approach using answer set programming (ASP) [16, 17] to evaluate the preference rules, apply the desired cycle resolution techniques to obtain a priority relation, and answer queries under optimal repair-based semantics (AR, IAR or brave semantics based on Pareto- or completionoptimal repairs), towards an end-to-end system for querying inconsistent knowledge bases. Our system takes as input logic programs representing the input, and computes the query answers under the chosen semantics w.r.t. ≻

for the chosen ∈ {, , , }. All building blocks can be encoded into ASP programs that a Python program combines and passes to the ASP solver clingo [18] to check whether the resulting program has a stable model. However, we found more eficient in practice to split the computation into several steps and implement some of them in Python. Our approach applies to any logical theory such that:

Rew (, ) s.t. (, ) |= ′(⃗).

dataset , (, ) |= ⊥ if there exists → ⊥ ∈ Inc( ) such that |= ; and 1. there exists a set Inc( ) of rules of the form → ⊥ with a Boolean CQ, such that for every 2. for every CQ (⃗) there exists a set Rew (, ) of rules of the form ′(⃗) → (⃗) with ′ a CQ such that for every s.t. (, ) ̸|= ⊥ and tuple ⃗, (, ) |= (⃗) if there exists ′(⃗) → (⃗) ∈ programs.

These conditions are fulfilled, e.g., when is a set of denial constraints (then, Inc( ) = and Rew (, ) = { → }), or when is a DL-Lite ontology. Regarding preference rules, we handle rules whose bodies are CQs with negation and comparison operators. We expect that the KB = (, ), meta-database ℳ = (id, ℱ ), preference rules Σ = Σ 1 ∪ · · · ∪ Σ , and query are all given as ASP

Our main goal is to compare the diferent approaches to obtaining a priority relation from preferences rules, in terms of run time and size of the priority relation. We also compare our ASP implementation of the optimal repair-based semantics with orbits, the existing SAT-based implementation. We use the CQAPri benchmark [19], a synthetic benchmark adapted from LUBM2∃0 [20] to evaluate inconsistencytolerant query answering over DL-Lite KBs. We also consider its extension with two priority relations given by the orbits benchmark [14] for the comparison with orbits, and add a denial constraint to experiment with non-binary conflicts. For the meta-data, we randomly generate facts of the form date(id( ), ), source(id( ), ) and reliability(, ). We define four preference rules which express that one prefers more recent facts, facts with a more reliable source, FPr() over APr() facts, and APr() over GrSt() facts, and partition the rules in one, two or three priority levels.

We were not able to compute ≻ even on the simplest case because it overflows the number of atoms clingo can handle. However, we managed to compute the other priority relations for almost all small datasets (>75K) and several medium size datasets (>463K), even in cases with a large proportion of facts in conflict. Interestingly, on all instances for which we computed them, ≻ never compares more than 5% more pairs of facts than ≻ , while ≻ is often reduced to the empty relation. From a computational point of view, ≻ is significantly faster to compute than ≻ and ≻ . This indicates that ≻ may be a good method to use in practice. Regarding the computation of optimal repair-based semantics, our system is by far slower than orbits. However, we note that we manage to answer some queries under AR and brave semantics based on completion-optimal repairs when orbits runs out of time or memory.

Acknowledgments

This work was supported by the ANR AI Chair INTENDED (ANR-19-CHIA-0014).

Declaration on Generative AI

The authors have not employed any Generative AI tools. [14] M. Bienvenu, C. Bourgaux, Querying inconsistent prioritized data with ORBITS: algorithms, implementation, and experiments, in: Proceedings of KR, 2022. [15] S. Benferhat, Z. Bouraoui, K. Tabia, How to select one preferred assertional-based repair from inconsistent and prioritized DL-Lite knowledge bases?, in: Proceedings of IJCAI, 2015. [16] V. Lifschitz, Answer Set Programming, Springer, 2019. URL: https://doi.org/10.1007/ 978-3-030-24658-7. doi:10.1007/978-3-030-24658-7. [17] M. Gebser, R. Kaminski, B. Kaufmann, T. Schaub, Answer Set Solving in Practice, Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan & Claypool Publishers, 2012. URL: https:// doi.org/10.2200/S00457ED1V01Y201211AIM019. doi:10.2200/S00457ED1V01Y201211AIM019. [18] M. Gebser, B. Kaufmann, R. Kaminski, M. Ostrowski, T. Schaub, M. Schneider, Potassco: The potsdam answer set solving collection, AI Commun. 24 (2011) 107–124. [19] C. Bourgaux, Inconsistency Handling in Ontology-Mediated Query Answering. (Gestion des incohérences pour l’accès aux données en présence d’ontologies), Ph.D. thesis, University of Paris-Saclay, France, 2016. [20] C. Lutz, I. Seylan, D. Toman, F. Wolter, The combined approach to OBDA: Taming role hierarchies using filters, in: Proceedings of ISWC, 2013.

[1]

Bienvenu ,

Bourgaux ,

Inoue ,

Jean , A rule-based approach to specifying preferences over conflicting facts and querying inconsistent knowledge bases , in: Proceedings of KR (to appear), 2025 .

[2]

L. E.

Bertossi , Database repairs and consistent query answering: Origins and further developments , in: Proceedings of PODS , 2019 .

[3]

Bienvenu , A short survey on inconsistency handling in ontology-mediated query answering , Künstliche Intelligenz 34 ( 2020 ) 443 - 451 .

[4]

Lopatenko ,

L. E.

Bertossi , Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics , in: Proceedings of ICDT , 2007 .

[5]

Du , G. Qi,

Shen , Weight-based consistent query answering over inconsistent SHIQ knowledge bases , Knowl. Inf. Syst . 34 ( 2013 ) 335 - 371 .

[6]

Bienvenu ,

Bourgaux ,

Goasdoué , Querying inconsistent description logic knowledge bases under preferred repair semantics , in: Proceedings of AAAI , 2014 .

[7]

Calautti ,

Greco ,

Molinaro , I. Trubitsyna , Preference-based inconsistency-tolerant query answering under existential rules , Artif. Intell . 312 ( 2022 ) 103772 .

[8]

Lukasiewicz ,

Malizia ,

Molinaro , Complexity of inconsistency-tolerant query answering in datalog+/- under preferred repairs , in: Proceedings of KR , 2023 .

[9]

Staworko ,

Chomicki ,

Marcinkowski , Prioritized repairing and consistent query answering in relational databases , Annals of Mathematics and Artificial Intelligence (AMAI) 64 ( 2012 ) 209 - 246 .

[10]

Kimelfeld ,

Livshits , L. Peterfreund, Detecting ambiguity in prioritized database repairing , in: Proceedings of ICDT , 2017 .

[11]

Kimelfeld ,

Livshits , L. Peterfreund, Counting and enumerating preferred database repairs , Theor. Comput. Sci . 837 ( 2020 ) 115 - 157 .

[12]

Bienvenu ,

Bourgaux , Querying and repairing inconsistent prioritized knowledge bases: Complexity analysis and links with abstract argumentation , in: Proceedings of KR , 2020 .

[13]

Bienvenu ,

Bourgaux , Inconsistency handling in prioritized databases with universal constraints: Complexity analysis and links with active integrity constraints , in: Proceedings of KR , 2023 .