=Paper=
{{Paper
|id=Vol-3739/invited-1
|storemode=property
|title=Querying Inconsistent Prioritized Data (Abstract of Invited Talk)
|pdfUrl=https://ceur-ws.org/Vol-3739/invited-1.pdf
|volume=Vol-3739
|authors=Camille Bourgaux
|dblpUrl=https://dblp.org/rec/conf/dlog/Bourgaux24
}}
==Querying Inconsistent Prioritized Data (Abstract of Invited Talk)==
<pdf width="1500px">https://ceur-ws.org/Vol-3739/invited-1.pdf</pdf>
<pre>
                                Querying Inconsistent Prioritized Data
                                Camille Bourgaux1
                                1
                                    DI ENS, ENS, CNRS, PSL University & Inria, 45 rue d’Ulm, 75005 Paris, France


                                                                         Abstract
                                                                         This extended abstract accompanies an invited talk at DL 2024 and presents a summary of some recent
                                                                         results on querying inconsistent prioritized data.

                                                                         Keywords
                                                                         inconsistency handling, prioritized data, optimal repairs


                                   Real-world datasets are plagued by data quality issues which may render the data inconsistent
                                w.r.t. a set of constraints, be they given by database dependencies or ontologies. A prominent
                                way to handle such inconsistent data is to use inconsistency-tolerant semantics to obtain
                                meaningful answers to queries [1, 2, 3]. Most of these semantics are based on the notion of
                                a (subset) repair, which is an inclusion-maximal subset of the data that is consistent with the
                                constraints. In many scenarios, this basic notion of repair can be refined by exploiting preference
                                information about facts. Preferred repairs can then be used in place of subset repairs in any
                                repair-based semantics. This paper presents a summary of some recent results [4, 5, 6] on an
                                approach where preferences are given by a binary priority relation between conflicting facts [7].
                                Formal definitions A knowledge base (KB) 𝒦 = (𝒟, 𝒯 ) consists of a dataset 𝒟 and a logical
                                theory 𝒯 : 𝒟 is a finite set of facts, and 𝒯 a finite set of first-order logic (FOL) sentences.
                                Typically, 𝒯 is an ontology, e.g., in description logic, or a set of database constraints, e.g., denial
                                constraints of the form ∀𝑥⃗ ¬(𝛼1 ∧ . . . ∧ 𝛼𝑛 ), where each 𝛼𝑖 is a relational or inequality atom. A
                                KB 𝒦 = (𝒟, 𝒯 ) is consistent if 𝒟 ∪ 𝒯 has a model, and inconsistent otherwise (𝒦 |= ⊥). A set
                                of facts ℬ is 𝒯 -consistent if (ℬ, 𝒯 ) is consistent. A conflict of 𝒦 is an inclusion-minimal subset
                                𝒞 ⊆ 𝒟 such that (𝒞, 𝒯 ) |= ⊥; Conf (𝒦) denotes the set of conflicts of 𝒦. A (subset) repair of
                                𝒦 = (𝒟, 𝒯 ) is an inclusion-maximal 𝒯 -consistent subset ℛ ⊆ 𝒟; SRep(𝒦) denotes the set of
                                repairs of 𝒦. Following [7], we assume that preferences between conflicting facts are available:
                                Definition 1. A priority relation ≻ for a KB 𝒦 = (𝒟, 𝒯 ) is an acyclic binary relation over 𝒟
                                such that 𝛼 ≻ 𝛽 implies {𝛼, 𝛽} ⊆ 𝒞 for some 𝒞 ∈ Conf (𝒦). It is total if for all 𝛼 ̸= 𝛽 such that
                                {𝛼, 𝛽} ⊆ 𝒞 for some 𝒞 ∈ Conf (𝒦), either 𝛼 ≻ 𝛽 or 𝛽 ≻ 𝛼. A completion of ≻ is a total priority
                                relation ≻′ ⊇ ≻. A prioritized KB 𝒦≻ is a KB 𝒦 = (𝒟, 𝒯 ) with a priority relation ≻ for 𝒦.
                                Definition 2 (Optimal repairs). Let 𝒦≻ be a prioritized KB with 𝒦 = (𝒟, 𝒯 ) and ℛ ∈ SRep(𝒦).
                                A Pareto improvement of ℛ is a 𝒯 -consistent ℬ ⊆ 𝒟 such that there is 𝛽 ∈ ℬ ∖ ℛ with 𝛽 ≻ 𝛼
                                for every 𝛼 ∈ ℛ ∖ ℬ. A global improvement of ℛ is a 𝒯 -consistent ℬ ⊆ 𝒟 such that ℬ ≠ ℛ and
                                for every 𝛼 ∈ ℛ ∖ ℬ, there exists 𝛽 ∈ ℬ ∖ ℛ such that 𝛽 ≻ 𝛼. The repair ℛ is:
                                  DL 2024: 37th International Workshop on Description Logics, June 18–21, 2024, Bergen, Norway
                                $ camille.bourgaux@ens.fr (C. Bourgaux)
                                 0000-0002-8806-6682 (C. Bourgaux)
                                                                       © 2024 Copyright for this paper by its author. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                                    CEUR
                                    Workshop
                                    Proceedings
                                                  http://ceur-ws.org
                                                  ISSN 1613-0073
                                                                       CEUR Workshop Proceedings (CEUR-WS.org)


CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
Table 1
Data complexity [7, 4, 5]. Upper bounds hold for conjunctive queries and FOL fragments for which
consistency checking and BCQ entailment is in P. Lower bounds hold for atomic queries and any
fragment that extends DL-Litecore or functional dependencies.
                        Globally-optimal                  Pareto-optimal                 Completion-optimal
    AR, IAR                Π𝑝2 -complete                 coNP-complete                     coNP-complete
    Brave                  Σ𝑝2 -complete                  NP-complete                       NP-complete


       • Pareto-optimal if there is no Pareto improvement of ℛ;
       • globally-optimal if there is no global improvement of ℛ;
       • completion-optimal if ℛ is a globally-optimal repair of 𝒦≻′ , for some completion ≻′ of ≻.

We denote by PRep(𝒦≻ ), GRep(𝒦≻ ) and CRep(𝒦≻ ) the sets of Pareto-, globally- and completion-
optimal repairs. It holds that CRep(𝒦≻ ) ⊆ GRep(𝒦≻ ) ⊆ PRep(𝒦≻ ) ⊆ SRep(𝒦).

   If ≻ is induced by assigning scores to facts, then Pareto-, globally- and completion-optimal
repairs coincide [8, 9], and also coincide with the ⊆𝑃 -repairs from [10].
   A Boolean conjunctive query (BCQ) 𝑞 is entailed by a KB 𝒦 (𝒦 |= 𝑞) if 𝑞 holds in every model
of 𝒦. Hence, 𝒦 |= ⊥ implies that 𝒦 |= 𝑞 for every 𝑞. Meaningful answers can be obtained from
inconsistent KBs by using inconsistency-tolerant semantics based on repairs. The AR semantics
defines plausible query answers, and is used for consistent query answering in databases [11].
The brave semantics considers all possible answers, while the IAR semantics identifies the most
reliable ones. These semantics can be parametrized by the type of repair they consider.

Definition 3 (Inconsistency-tolerant semantics). Fix X ∈ {𝑆, 𝑃, 𝐺, 𝐶} and consider a prioritized
KB 𝒦≻ with 𝒦 = (𝒟, 𝒯 ) and a BCQ 𝑞. Then 𝑞 is entailed by 𝒦≻ under

       • X-AR semantics, denoted 𝒦≻ |=𝑋AR 𝑞, if (ℛ, 𝒯 ) |= 𝑞 for every ℛ ∈ XRep(𝒦≻ );
       • X-brave semantics, denoted 𝒦≻ |=𝑋brave 𝑞, if (ℛ, 𝒯 ) |= 𝑞 for some ℛ  ∈ XRep(𝒦≻ );
                                        𝑋
                                                                            ⋂︀
       • X-IAR semantics, denoted 𝒦≻ |=𝐼𝐴𝑅 𝑞, if (ℬ, 𝒯 ) |= 𝑞 where ℬ = ℛ∈XRep(𝒦≻ ) ℛ.

The semantics are related as follows: 𝒦≻ |=𝑋            𝑋           𝑋
                                           𝐼𝐴𝑅 𝑞 ⇒ 𝒦≻ |=AR 𝑞 ⇒ 𝒦≻ |=brave 𝑞.

   Table 1 gives the data complexity of BCQ entailment under optimal repair-based semantics.1
Note that S-AR entailment is coNP-hard even in very basic settings [12, 13], while S-brave or
S-IAR entailment is tractable for DL-Lite ontologies or denial constraints [14].
Practical SAT-based approaches [5] The (co)NP complexity results naturally suggest the
interest of employing SAT solvers to compute query answers under X-AR, X-brave and X-IAR
semantics for X ∈ {P, C}, as it has already be done for S-AR [15, 16]. There are several ways
of employing SAT encodings to compute answers under (optimal) repair-based semantics, by
exploiting different reasoning modes of SAT solvers. The orbits system implements different
algorithms and encodings for the case where conflicts are binary. It takes as input the conflicts,
1
    The lower bound for G-IAR and G-brave with functional dependencies follows from the proof of [7, Theorem 2].
priority relation, and the potential query answers associated with their causes, where a cause
for 𝑞 is a minimal 𝒯 -consistent subset 𝒞 ⊆ 𝒟 such that (𝒞, 𝒯 ) |= 𝑞. Our experimental
comparison shows that the choice of algorithm and encoding variant may have huge impact
on the computation time. Moreover, while in some cases our results can be used to single out
some approaches as more effective, more often there are no clear winner(s). For example, we
consider two ways of encoding the absence of a cause, and choosing one instead of the other
may make a notable difference, but which one is best depends on the KB and query.
Connections with abstract argumentation [4] An argumentation framework (AF) is a pair
(Args, ⇝) where Args is a finite set of arguments and ⇝ ⊆ Args × Args is the attack relation:
𝛼 attacks 𝛽 if 𝛼 ⇝ 𝛽. The semantics of AFs is based on extensions, which can be seen as
coherent sets of arguments. Many notions of extension have been considered, in particular
preferred extensions and stable extensions [17]. When these two kinds of extension coincide, the
AF is called coherent. We define preference-based set-based argumentation framework (PSETAF),
a variant of AF based on collective attacks with a preference relation between arguments. We
exhibit a natural translation of a prioritized KB 𝒦≻ into a PSETAF 𝐹𝒦,≻ that uses 𝒟 as the
arguments and show that ℛ ∈ PRep(𝒦≻ ) iff it is a stable extension of 𝐹𝒦,≻ . Moreover, if ≻ is
transitive or if 𝒦 has only binary conflicts, then 𝐹𝒦,≻ is coherent so ℛ ∈ PRep(𝒦≻ ) iff it is a
preferred extension of 𝐹𝒦,≻ . Since there is no notion of extension that corresponds to globally-
or completion-optimal repairs, this speaks to the interest of adopting Pareto-optimal repairs.
   The argumentation connection allows us to propose a new notion of grounded repair, directly
inspired by the grounded extension from argumentation. The (unique) grounded repair is
contained in the intersection of Pareto-optimal repairs and can be computed in polynomial time
from the dataset and conflicts. Moreover, it is more productive than the Elect semantics [18]
and lies between the non-defeated and the prioritized inclusion-based non-defeated repairs that
have been proposed in the case where the priority relation is score structured [19].
Connections with active integrity constraints [6] In the database setting, active integrity
constraints (AICs) state how to resolve constraint violations [20, 21]. A ground AIC is a formula
of the form 𝛼1 ∧ · · · ∧ 𝛼𝑛 ∧ ¬𝛽1 ∧ · · · ∧ ¬𝛽𝑚 → {𝐴1 , . . . , 𝐴𝑘 } where the 𝛼𝑗 , 𝛽𝑗 are facts and
each 𝐴𝑖 is an update action of the form −𝛼𝑗 for some 1 ≤ 𝑗 ≤ 𝑛 or +𝛽𝑗 for some 1 ≤ 𝑗 ≤ 𝑚.
The semantics of a set of AICs is based on the notion of a repair update, defined as a minimal
set 𝒰 of update actions such that 𝒟 ∖ {𝛼 | −𝛼 ∈ 𝒰} ∪ {𝛼 | +𝛼 ∈ 𝒰} satisfies the constraints.
Many proposals have been made to select the appropriate repair updates to take into account a
set of AICs, such as founded, well-founded, grounded and justified repair updates.
   Given a prioritized database (i.e., a prioritized KB such that 𝒯 is a set of database constraints),
we construct a set of ground AICs such that the Pareto-optimal repairs coincide with the repairs
obtained by applying founded, grounded and justified repair updates w.r.t. the generated set of
AICs. We also exhibit a translation of a ‘well-behaved’ set of AICs into a prioritized database
and again relate Pareto-optimal repairs with founded, grounded and justified repair updates.
We take this as further evidence that Pareto-optimal repairs are an especially natural notion.
   These results hold not only for denial constraints but also for universal constraints of the
form ∀𝑥 ⃗ ¬(𝛼1 ∧ . . . ∧ 𝛼𝑛 ∧ ¬𝛽1 ∧ · · · ∧ ¬𝛽𝑚 ), where repairs can also be obtained by adding
facts (while minimizing the symmetric difference with the original database). Such constraints
and repairs could be explored in the KB setting, e.g., for ontologies with closed predicates [22].
Acknowledgments
I thank the DL 2024 workshop chairs for inviting me. The work summarized in this extended
abstract has been conducted together with Meghyn Bienvenu (CNRS & University of Bordeaux)
and supported by the ANR AI Chair INTENDED (ANR-19-CHIA-0014).


References
 [1] M. Bienvenu, C. Bourgaux, Inconsistency-tolerant querying of description logic knowledge
     bases, in: Tutorial Lectures of the 12th International Reasoning Web Summer School, 2016.
 [2] M. Bienvenu, A short survey on inconsistency handling in ontology-mediated query
     answering, Künstliche Intelligenz 34 (2020) 443–451.
 [3] J. Wijsen, Foundations of query answering on inconsistent databases, SIGMOD Record 48
     (2019) 6–16.
 [4] M. Bienvenu, C. Bourgaux, Querying and repairing inconsistent prioritized knowledge
     bases: Complexity analysis and links with abstract argumentation, in: Proceedings of the
     17th International Conference on Principles of Knowledge Representation and Reasoning
     (KR), 2020.
 [5] M. Bienvenu, C. Bourgaux, Querying inconsistent prioritized data with ORBITS: algorithms,
     implementation, and experiments, in: Proceedings of the 19th International Conference
     on Principles of Knowledge Representation and Reasoning (KR), 2022.
 [6] M. Bienvenu, C. Bourgaux, Inconsistency handling in prioritized databases with universal
     constraints: Complexity analysis and links with active integrity constraints, in: Proceed-
     ings of the 20th International Conference on Principles of Knowledge Representation and
     Reasoning, (KR), 2023.
 [7] S. Staworko, J. Chomicki, J. Marcinkowski, Prioritized repairing and consistent query
     answering in relational databases, Annals of Mathematics and Artificial Intelligence
     (AMAI) 64 (2012) 209–246.
 [8] E. Livshits, B. Kimelfeld, Counting and enumerating (preferred) database repairs, in: Pro-
     ceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database
     Systems (PODS), 2017.
 [9] C. Bourgaux, Inconsistency Handling in Ontology-Mediated Query Answering. (Ges-
     tion des incohérences pour l’accès aux données en présence d’ontologies), Ph.D. thesis,
     University of Paris-Saclay, France, 2016.
[10] M. Bienvenu, C. Bourgaux, F. Goasdoué, Querying inconsistent description logic knowledge
     bases under preferred repair semantics, in: Proceedings of the 28th AAAI Conference on
     Artificial Intelligence (AAAI), 2014.
[11] L. E. Bertossi, Database Repairing and Consistent Query Answering, Synthesis Lectures
     on Data Management, Morgan & Claypool Publishers, 2011.
[12] S. Arming, R. Pichler, E. Sallinger, Complexity of repair checking and consistent query
     answering, in: Proceedings of the 19th International Conference on Database Theory
     (ICDT), 2016.
[13] M. Bienvenu, On the complexity of consistent query answering in the presence of simple
     ontologies, in: Proceedings of the 26th AAAI Conference on Artificial Intelligence (AAAI),
     2012.
[14] M. Bienvenu, R. Rosati, Tractable approximations of consistent query answering for robust
     ontology-based data access, in: Proceedings of the 23rd International Joint Conference on
     Artificial Intelligence (IJCAI), 2013.
[15] M. Bienvenu, C. Bourgaux, F. Goasdoué, Computing and explaining query answers over
     inconsistent DL-Lite knowledge bases, Journal of Artificial Intelligence Research (JAIR)
     64 (2019) 563–644.
[16] A. A. Dixit, P. G. Kolaitis, A SAT-based system for consistent query answering, in: Pro-
     ceedings of the 22nd International Conference on Theory and Applications of Satisfiability
     Testing (SAT), 2019.
[17] P. M. Dung, On the acceptability of arguments and its fundamental role in nonmonotonic
     reasoning, logic programming and n-person games, Artif. Intell. 77 (1995) 321–358.
[18] S. Belabbes, S. Benferhat, J. Chomicki, Elect: An inconsistency handling approach for
     partially preordered lightweight ontologies, in: Proceedings of the 15th International
     Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR), 2019.
[19] S. Benferhat, Z. Bouraoui, K. Tabia, How to select one preferred assertional-based repair
     from inconsistent and prioritized DL-Lite knowledge bases?, in: Proceedings of the 24th
     International Joint Conference on Artificial Intelligence (IJCAI), 2015.
[20] S. Flesca, S. Greco, E. Zumpano, Active integrity constraints, in: Proceedings of the
     6th International ACM SIGPLAN Conference on Principles and Practice of Declarative
     Programming (PPDP), 2004.
[21] B. Bogaerts, L. Cruz-Filipe, Fixpoint semantics for active integrity constraints, Artif. Intell.
     255 (2018) 43–70.
[22] C. Lutz, I. Seylan, F. Wolter, The data complexity of ontology-mediated queries with closed
     predicates, Log. Methods Comput. Sci. 15 (2019).

</pre>