1613-0073

Yuqicheng Zhu

yuqicheng.zhu@ipvs.uni-stuttgart.de 0 2

Nico Potyka

PotykaN@cardiff.ac.uk 1

Bo Xiong

Trung-Kien Tran

Mojtaba Nayyeri

Steffen Staab

Evgeny Kharlamov

0 Bosch Center for Artificial Intelligence, Renningen, Germany
1 Cardiff University, Cardiff, UK
2 University of Stuttgart, Stuttgart, Germany

The Description Logic (DL) ℰℒ is a lightweight DL that offers a favorable trade-off between expressive power and reasoning complexity and has been widely used in many real-world applications. Statistical ℰℒ (𝒮ℰℒ) extends ℰℒ by allowing conditional probabilities over axioms. Unlike other probabilistic DLs, the probabilistic semantics of 𝒮ℰℒ is statistical, meaning that probabilities express proportions in a population rather than subjective beliefs. One major challenge is that reasoning in 𝒮ℰℒ is ExpTime-complete. To overcome this problem, we propose to use embeddings to perform approximate inference over 𝒮ℰℒ ontologies. This poster paper reports on the progress of this ongoing research: it illustrates the approach on a simplified example, provides preliminary findings, and outlines future work.

ISWC 2023 Posters and Demos: 22nd International Semantic Web Conference, November 6-10, 2023, Athens, Greece
∗Corresponding author. †These authors contributed equally.

Description logics, Uncertain reasoning, Ontology embeddings

CEUR ceur-ws.org

1. Introduction

Description logics (DLs) [1] are logical languages used for representing ontological knowledge. Different DLs balance expressive power and reasoning complexity. One of the most prominent DLs is the Existential Language (ℰℒ) [2], which supports conjunction and existential quantification. ℰℒ is sufficiently expressive for most ontologies that occur in practice and has polynomial reasoning complexity. Due to its appealing properties, ℰℒ has become one of the underlying formalisms of the standardized Web Ontology Language (OWL2 EL) [3].

In ℰℒ ontologies, knowledge is expressed by subsumption relationships like Politician ⊑ Person, meaning that politicians are persons. However, there is usually uncertainty about our real-world knowledge. Statistical ℰℒ [4] (𝒮ℰℒ) is a statistical variant of ℰℒ that allows reasoning about statistics of a population. 𝒮ℰℒ ontologies are composed of probabilistic conditionals of the form (D ∣ C)[l, u], where 0 ≤ l ≤ u ≤ 1. For example, (Politician ∣ Doctor)[0.1, 0.2] expresses that around 10-20% of persons with a doctorate degree are politicians. Unfortunately, reasoning in 𝒮ℰℒ is ExpTime-complete [5] and, therefore, provably intractable.

To overcome the practical limitations, we propose using embeddings to perform approximate reasoning in 𝒮ℰℒ. The intuitive idea is to map concepts to boxes in a vector space. The statistical proportions in the ontology are maintained by guaranteeing similar proportions between the volumes of the boxes. We can then reason about arbitrary concepts by computing new proportions between volumes in the vector space. Technically, we generalize the ℰℒ embedding BoxEL from [6] to 𝒮ℰℒ. In practice, the ontology is typically not perfectly represented by the embedding. However, we assume that the approximation error is small whenever the embedding error is small. To test this assumption, we derive a sound inference rule (Probabilistic Modus Ponens) for 𝒮ℰℒ and use it to measure the approximation quality of our approach empirically. Our experiments show that both the embedding and inference errors are typically very small.

2. Embedding and Approximating 𝒮ℰℒ

The DL ℰℒ [7] describes individuals, concepts and their relationships using a set of individual names, a set of concept names and a set of role/relation names. Roughly speaking, ℰℒ allows talking about concepts that are formed from atomic concepts by taking conjunctions (denoted by ⊓) and existential quantification (denoted by ∃). Existential quantification in DLs asserts the existence of a role successor. For example, ∃has.Child refers to the set of objects that have (role) a child (concept). ℰℒ TBoxes are collections of subsumption relationships between concepts that are called general concept inclusions (GCIs). A GCI has the form C ⊑ D, where C and D are ℰℒ concepts.

𝒮ℰℒ is a probabilistic extension of ℰℒ that allows reasoning about statistical statements [4]. The basic syntactic elements are (probabilistic) conditionals (D ∣ C)[l, u], where C, D are ℰℒ concept descriptions and l, u are probabilities such that l ≤ u. Intuitively, (D ∣ C)[l, u] expresses that the proportion of individuals in C that also belong to D is between l and u. If l = u, we simplify notation and just write (D ∣ C)[l]. 𝒮ℰℒ generalizes ℰℒ in the sense that (D ∣ C)[1] is semantically equivalent to C ⊑ D [4]. To illustrate the additional expressiveness of 𝒮ℰℒ, let us consider some statistical beliefs about food.

( | )[0.4] ( | )[0.1] ( | )[0] ( ⊓ ℎ | )[0.4] ( ⊓ |ℎ )[0.35] ( ℎ. | )[0.25] (ℎ | ℎ. )[0.9] (ℎ | )[0.45] ( | )[ 1 ] (| )[0.1]

The first two rows state proportions of different food products (they do not have to be disjoint, so the probabilities do not need to sum up to 1). The third row represents deterministic knowledge: it states that ice cream and spicy pizza are disjoint and that ice cream is always classified as food. The last two rows show more complex examples with conjunction and existential quantification. For instance, the conditional (∃eatenWith.IceCream ∣ Food)[0.25] expresses that 25% of food products are eaten with ice cream.
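The statistical reading of a conditional can be made concrete on a finite population. The following is a minimal sketch, not code from the paper: it writes a conditional with condition concept C, target concept D and bounds l, u, treats both concepts as plain sets of individuals, and checks whether the proportion of C-individuals that also belong to D lies in [l, u]. The helper names `proportion` and `satisfies` and the toy population are hypothetical, and treating a conditional with an empty condition set as satisfied is an assumption made here.

```python
from typing import Optional, Set


def proportion(d: Set[int], c: Set[int]) -> Optional[float]:
    """Proportion of individuals in c that also belong to d.

    Returns None when c is empty, since the proportion is undefined then.
    """
    if not c:
        return None
    return len(c & d) / len(c)


def satisfies(d: Set[int], c: Set[int], lower: float, upper: float) -> bool:
    """Check a conditional (d | c)[lower, upper] against a finite population."""
    p = proportion(d, c)
    if p is None:
        # Assumption of this sketch: an empty condition set satisfies
        # every conditional.
        return True
    return lower <= p <= upper


# Toy population: 20 doctors, 3 of whom are politicians,
# plus one politician (individual 25) who is not a doctor.
doctors = set(range(20))
politicians = {0, 1, 2, 25}

print(proportion(politicians, doctors))
print(satisfies(politicians, doctors, 0.1, 0.2))
```

Because the semantics only counts individuals, the politician outside the condition set has no effect on the result.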

Given an arbitrary 𝒮ℰℒ conditional (D ∣ C)[l, u], we first replace it with the conditional (A_D ∣ A_C)[l, u], where A_C, A_D are new concept names corresponding to C and D. To guarantee the equivalences A_C ≡ C and A_D ≡ D, we add four ℰℒ GCIs A_C ⊑ C, C ⊑ A_C, A_D ⊑ D, D ⊑ A_D. To perform approximate inference on this knowledge base, we embed the GCIs using the geometric interpretation in BoxEL [6]. Concretely, concepts are modeled as boxes (i.e., axis-aligned hyperrectangles) and relations as affine transformations between boxes (see Figure 1). Furthermore, we embed the remaining atomic conditionals by additional loss terms that encourage the ratio between the volume of the intersection of the boxes of A_C and A_D and the volume of the box of A_C to respect the bounds expressed by the conditional. Having computed the embedding, we can perform approximate inference by computing unknown proportions in the embedding space.
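The volume-based reading of a conditional can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: boxes are given as pairs of lower and upper corners, the function names are hypothetical, and the hinge-style `interval_loss` is only one plausible way to penalize a ratio that leaves the interval of a conditional (the text above does not specify the exact loss).

```python
from math import prod
from typing import List, Tuple

Box = Tuple[List[float], List[float]]  # (lower corner, upper corner)


def volume(lower: List[float], upper: List[float]) -> float:
    """Volume of an axis-aligned box; empty boxes get volume 0."""
    return prod(max(u - l, 0.0) for l, u in zip(lower, upper))


def intersect(a: Box, b: Box) -> Box:
    """Intersection of two axis-aligned boxes, which is again a box."""
    lower = [max(x, y) for x, y in zip(a[0], b[0])]
    upper = [min(x, y) for x, y in zip(a[1], b[1])]
    return lower, upper


def conditional_proportion(box_d: Box, box_c: Box) -> float:
    """Estimate the proportion of the condition box covered by the target box."""
    vol_c = volume(*box_c)
    if vol_c == 0.0:
        raise ValueError("condition box has zero volume")
    return volume(*intersect(box_d, box_c)) / vol_c


def interval_loss(p: float, lower: float, upper: float) -> float:
    """Assumed hinge-style penalty: zero iff lower <= p <= upper."""
    return max(lower - p, 0.0) + max(p - upper, 0.0)


# Two-dimensional toy boxes (hypothetical values).
box_c = ([0.0, 0.0], [2.0, 2.0])
box_d = ([1.0, 0.0], [3.0, 1.0])

p = conditional_proportion(box_d, box_c)
print(p, interval_loss(p, 0.2, 0.3))
```

Once the boxes are fixed, the same `conditional_proportion` call answers queries about pairs of concepts that never occurred together in a conditional, which is the sense in which inference becomes a volume computation.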

3. Preliminary Experimental Results

Proposition 1 (Probabilistic Modus Ponens (PMP)). If (C)[l_1, u_1] and (D ∣ C)[l_2, u_2], then (D)[l_3, u_3], where l_3 = l_1 ⋅ l_2 and u_3 = min{1, u_1 ⋅ u_2 + 1 − l_1}.
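As a worked instance of the rule, the sketch below reads it as follows: from bounds [l1, u1] on the proportion of C and bounds [l2, u2] on the proportion of D within C, it derives bounds [l3, u3] on the proportion of D with l3 = l1 ⋅ l2 and u3 = min{1, u1 ⋅ u2 + 1 − l1}. The function name and the input validation are assumptions of this illustration.

```python
from typing import Tuple


def probabilistic_modus_ponens(
    l1: float, u1: float, l2: float, u2: float
) -> Tuple[float, float]:
    """Derive bounds on the proportion of D.

    (l1, u1) bound the proportion of C in the population,
    (l2, u2) bound the proportion of D among the members of C.
    """
    for low, high in ((l1, u1), (l2, u2)):
        if not 0.0 <= low <= high <= 1.0:
            raise ValueError("bounds must satisfy 0 <= lower <= upper <= 1")
    # Lower bound: at least l1 of the population is in C,
    # and at least l2 of those are in D.
    l3 = l1 * l2
    # Upper bound: members of C contribute at most u1 * u2,
    # and at most 1 - l1 of the population lies outside C.
    u3 = min(1.0, u1 * u2 + 1.0 - l1)
    return l3, u3


print(probabilistic_modus_ponens(0.5, 0.5, 0.5, 0.5))
```

A proportion read off an embedding for D can then be compared with the derived interval, which is one way to quantify an inference error without running an exact reasoner.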

Preliminary experimental results are presented in Table 1. The results are mostly in line with our assumption that if the embedding error (the loss of the logical terms in the embedding) is small, the inference error is small as well. In future work, we will try to establish an analytic connection between the embedding error (which is known after computing the embedding) and the potential inference error (which is unknown).

4. Related Work

Our work builds upon knowledge graph (KG) embeddings that map entities and relations into a vector space to model the relationships between entities. Most KG embeddings encode factual/instance-level knowledge expressed by triples ⟨head entity, relation, tail entity⟩ but ignore the terminological/concept-level knowledge expressed by logical axioms. Kulmanov et al. [9] proposed embedding ℰℒ concepts as n-balls and relations as translations between them. However, as balls are not closed under intersection, they cannot faithfully represent concept intersection. BoxEL [6] and ELBE [10] overcome this issue by embedding concepts as axis-parallel boxes. ELBE models relations as translations, while BoxEL replaces translations by affine transformations. Box2EL [11] further considers one-to-many and many-to-many relations and embeds both concepts and roles as boxes. However, none of these methods is based on the probabilistic semantics that underlies 𝒮ℰℒ ontologies.

5. Discussion and Outlook

This poster paper presents our ongoing research on using box embeddings for approximate inference over 𝒮ℰℒ ontologies. Our preliminary experiments show small embedding and inference errors and indicate that the known embedding error can be used to bound the unknown inference error. We plan to exploit this in future work by reporting confidence intervals rather than point probabilities for queries. A related avenue for future investigation is to generate embeddings uniformly at random in order to better represent the entailed probability interval (while our embedding always returns point probabilities, 𝒮ℰℒ knowledge bases typically entail interval probabilities). Additionally, incorporating region-based role embeddings, as proposed in [11], may help to further reduce the embedding error and consequently the inference error.

References

[1] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, P. F. Patel-Schneider (Eds.), The Description Logic Handbook: Theory, Implementation, and Applications, Cambridge University Press, 2003.
[2] F. Baader, B. Morawska, Unification in the description logic EL, in: RTA, Springer, 2009, pp. 350–364.
[3] B. C. Grau, I. Horrocks, B. Motik, B. Parsia, P. Patel-Schneider, U. Sattler, OWL 2: The next step for OWL, Web Semantics: Science, Services and Agents on the World Wide Web 6 (2008) 309–322.
[4] R. Peñaloza, N. Potyka, Towards statistical reasoning in description logics over finite domains, in: International Conference on Scalable Uncertainty Management (SUM), Springer, 2017, pp. 280–294.
[5] B. Bednarczyk, Statistical EL is ExpTime-complete, Information Processing Letters 169 (2021) 106113.
[6] B. Xiong, N. Potyka, T. Tran, M. Nayyeri, S. Staab, Faithful embeddings for EL++ knowledge bases, in: U. Sattler, A. Hogan, C. M. Keet, V. Presutti, J. P. A. Almeida, H. Takeda, P. Monnin, G. Pirrò, C. d’Amato (Eds.), International Semantic Web Conference (ISWC), volume 13489 of Lecture Notes in Computer Science, Springer, 2022, pp. 22–38.
[7] F. Baader, Least common subsumers and most specific concepts in a description logic with existential restrictions and terminological cycles, in: International Joint Conference on Artificial Intelligence (IJCAI), Morgan Kaufmann, 2003, pp. 319–324.
[8] F. Mahdisoltani, J. Biega, F. Suchanek, YAGO3: A knowledge base from multilingual Wikipedias, in: 7th Biennial Conference on Innovative Data Systems Research (CIDR), 2014.
[9] M. Kulmanov, W. Liu-Wei, Y. Yan, R. Hoehndorf, EL embeddings: Geometric construction of models for the description logic EL++, in: IJCAI, ijcai.org, 2019, pp. 6103–6109.
[10] X. Peng, Z. Tang, M. Kulmanov, K. Niu, R. Hoehndorf, Description logic EL++ embeddings with intersectional closure, CoRR abs/2202.14018 (2022).
[11] M. Jackermeier, J. Chen, I. Horrocks, Box2EL: Concept and role box embeddings for the description logic EL++, CoRR abs/2301.11118 (2023).