<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Learning Ontology Axioms over Knowledge Graphs via Representation Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Leyuan Zhao</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiaowang Zhang</string-name>
          <email>xiaowangzhang@tju.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kewen Wang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhiyong Feng</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhe Wang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Intelligence and Computing, Tianjin University</institution>
          ,
          <addr-line>Tianjin 300350</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Information and Communication Technology, Griffith University</institution>
          ,
          <addr-line>Brisbane, QLD 4111</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Tianjin Key Laboratory of Cognitive Computing and Application</institution>
          ,
          <addr-line>Tianjin</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents a representation learning model called SetE, which models a predicate as a subspace of a semantic space in which entities are vectors. Within SetE, a type, as a unary predicate, is encoded as a set of vectors, and a relation, as a binary predicate, is encoded as a set of pairs of vectors. A new approach is proposed to compute the subsumption of predicates in a semantic space by employing linear programming methods to determine whether the entities of a type belong to a super-type, and thus an algorithm for learning OWL axioms is developed. Experiments on real datasets show that SetE can efficiently learn various forms of axioms with high quality.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Ontology construction is a core task of ontology engineering. It has been a
research challenge in both knowledge representation and machine learning
communities. This is because ontologies are often based on logical formalisms such as
description logics (DLs), and contain more complex logical structures than graph
databases or RDF triples. DL-Learner [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is among the first practical systems
to learn ontological expressions, including complex DL class descriptions. Many
methods for learning new first-order formulas and rules have been developed in
Inductive Logic Programming (ILP), but they are often unable to handle very
large ontologies. Recently, some attempts have been made to effectively learn
rules, such as [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], over KGs through techniques in knowledge representation
learning, but the rules they learn are not typical ontological axioms. What is more,
conventional embedding models (e.g., TransE, TransR, DistMult and SimplE)
mainly focus on KG completion; they only embed entities and relations without
modeling unary predicates. TransC [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is the first to differentiate types (unary predicates)
from entities; it encodes each type as a sphere and can learn the SubClassOf
relationship between types. However, the encoding of relations and types in TransC
is split, which prevents it from learning the SubPropertyOf relation and other
complex axioms (e.g., SubClassOf(ObjectSomeValuesFrom(P, C), D)).
      </p>
      <p>In this paper, we propose a novel unified embedding model (called SetE) for KG
unary predicates (types) and binary predicates (relations), treating types as sets
of entities and relations as sets of entity pairs. On this basis, subsumption
is transformed into the relative position of set boundaries, which can be efficiently
computed by linear programming (LP). We provide an algorithm for learning
positive OWL axioms over large-scale knowledge graphs.</p>
    </sec>
    <sec id="sec-2">
      <title>Our Approach</title>
      <p>In this section, we will introduce SetE and the learning algorithm.</p>
      <p>(a) A simple illustration of the embeddings of Baguette and Quartz. (b) The structure of SetE.</p>
      <p>
Following the same intuition, the entity pair &lt;s, o&gt; can be considered as an
instance of the relation p, so we model the fact &lt;s, p, o&gt; as follows, where s
and o are the head and tail entities of the relation p and concate(s, o) denotes the
concatenation of the two vectors s and o.</p>
      <p>g(s, p, o) = concate(s, o)^T p = Σ_{i=1}^{2n} [concate(s, o)]_i [p]_i
To train the model, we introduce a type boundary Bt ∈ R such that for every entity e
of type t we have f(e, t) &gt; Bt, and for every e ∉ t we have f(e, t) &lt; Bt. The relation
boundary Br is defined analogously. Like previous models, we generate negative samples
and use SGD to train SetE.</p>
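      <p>To make the scoring functions concrete, the following minimal sketch (plain NumPy; the embedding dimension, boundary values and margin are illustrative choices, and the type score f(e, t) = e^T t is assumed from the structure sketched above) computes the relation score g(s, p, o) together with hinge-style penalties that push positive samples above a boundary and negative samples below it.</p>
      <preformat>
import numpy as np

n = 50                 # embedding dimension (illustrative value)
B_t, B_r = 1.0, 1.0    # type and relation boundaries (illustrative values)

def f(e, t):
    """Type score: inner product of an entity vector e and a type vector t (assumed form)."""
    return float(np.dot(e, t))

def g(s, p, o):
    """Relation score: concatenate head and tail vectors, then take the
    inner product with the relation vector p (which has dimension 2n)."""
    return float(np.dot(np.concatenate([s, o]), p))

def boundary_loss(score, boundary, is_positive, margin=0.1):
    """Hinge penalty: positive samples should score above the boundary and
    negative samples below it, each by at least `margin`."""
    if is_positive:
        return max(0.0, boundary + margin - score)
    return max(0.0, score - (boundary - margin))
      </preformat>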
      <p>LP to Subsumption Subsumption in a KG includes SubClassOf and SubPropertyOf.
We take SubClassOf as an example to show how it can be transformed into
an LP under our model. The axiom SubClassOf(C, D) means that all entities that
are instances of C must be instances of D, i.e., f(e, tC) &gt; Bt implies f(e, tD) &gt;
Bt, where tC and tD are the type embeddings of C and D. We therefore convert this into a
linear program that computes the minimum value of f(e, tD) subject to
e ∈ C (i.e., f(e, tC) &gt; Bt). If this minimum value is greater than the boundary
Bt, then every entity e of type C also satisfies D, and we obtain the axiom
SubClassOf(C, D).</p>
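      <p>A minimal sketch of this LP check with SciPy is shown below. It assumes, purely for illustration, that entity vectors lie in [0, 1]^n (the exact feasible region used by SetE is not given here), and the helper name subclassof_holds is hypothetical.</p>
      <preformat>
import numpy as np
from scipy.optimize import linprog

def subclassof_holds(t_C, t_D, B_t, lower_bound):
    """Decide SubClassOf(C, D): minimize f(e, t_D) over all e with f(e, t_C) >= B_t.

    The feasible region [0, 1]^n for entity vectors is an assumption of this sketch;
    lower_bound is the acceptance threshold (Bt in the text above, LBt in Algorithm 1).
    """
    n = len(t_C)
    res = linprog(
        c=np.asarray(t_D),          # objective: minimize t_D . e
        A_ub=[-np.asarray(t_C)],    # encodes the constraint t_C . e >= B_t
        b_ub=[-B_t],
        bounds=[(0.0, 1.0)] * n,
    )
    # If even the smallest achievable f(e, t_D) among instances of C exceeds the
    # threshold, every instance of C is also an instance of D.
    return res.success and res.fun > lower_bound
      </preformat>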
      <p>
        Learning Ontology Axioms Based on the previous analysis, we use linear
programming on embeddings to learn the following forms: A1, SubClassOf(C, D);
A2, SubPropertyOf(P, Q); A3, SubClassOf(ObjectSomeValuesFrom(P, C), D);
A4, SubClassOf(ObjectIntersectionOf(C, D), Range(F)). The algorithm for learning
SubClassOf(C, D) is given below. Line 3 means that if the values in tC are smaller
than or equal to those in tD in every dimension, then we can directly conclude that, for any
e, if f(e, tC) &gt; Bt then f(e, tD) &gt; Bt. Finally, Filter() returns axioms whose
SC (standard confidence, defined in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]) is greater than MinSC.
Algorithm 1 Learning SubClassOf Axioms from a KG
Input: a KG K, and two real numbers LBt and MinSC ∈ [0, 1]
Output: a set O of SubClassOf axioms
1: E := SetE(K); O := ∅
2: for type embeddings tC and tD in E do
3:   if Σ_{i=1}^{n} ([tC]_i ≤ [tD]_i ? 1 : 0) = n then
4:     Add SubClassOf(C, D) to O
5:   else if LP(tC, tD) &gt; LBt then
6:     Add SubClassOf(C, D) to O
7:   end if
8: end for
9: O := Filter(O, MinSC)
10: return O
      </p>
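      <p>Putting the two tests together, the sketch below mirrors the loop of Algorithm 1; it reuses the hypothetical subclassof_holds helper from the previous sketch, assumes type embeddings are given as a name-to-vector dictionary, and omits the final Filter() step on standard confidence.</p>
      <preformat>
from itertools import permutations

def learn_subclassof(type_embeddings, B_t, LB_t):
    """Sketch of Algorithm 1 (without the standard-confidence filter)."""
    axioms = set()
    for (C, t_C), (D, t_D) in permutations(type_embeddings.items(), 2):
        # Line 3: if t_D dominates t_C in every dimension, then (for non-negative
        # entity vectors) f(e, t_C) > B_t already implies f(e, t_D) > B_t.
        if all(td >= tc for tc, td in zip(t_C, t_D)):
            axioms.add(("SubClassOf", C, D))
        # Line 5: otherwise fall back to the linear program from the previous sketch.
        elif subclassof_holds(t_C, t_D, B_t, LB_t):
            axioms.add(("SubClassOf", C, D))
    return axioms
      </preformat>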
    </sec>
    <sec id="sec-3">
      <title>Experiments and Evaluation</title>
      <p>
        The experiment on YAGO39K aims to evaluate the effectiveness of SetE by
comparing it with the state-of-the-art model TransC [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] in SubClassOf classification. We report four metrics: Accuracy, Precision, Recall and F1-score. TransC
was trained with the configuration in its report. SubClassOf axioms were removed
from the training set. To reflect real data, where negative samples far exceed
positive ones (e.g., #negative : #positive is 226:1 in DBpedia 2016 OWL), we increase
the proportion of negative samples during the experiment.
      </p>
      <p>The results in Table 1 indicate that SetE outperforms TransC and improves
further as the proportion of negative samples increases. The Precision of
SetE is much higher than that of TransC (up to 87.03% for rate 1:10), which shows that
SetE is more cautious in making positive judgments, i.e., SetE distinguishes
positive samples better.</p>
      <p>1:10
1:1
1:4
1:10
57.95
50.96
48.17</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>In this paper, we present a new model, SetE, that specifically represents types and
relations in a semantic space, which reduces subsumption checking to linear
programming. Our proposal utilizes logical relationships to characterize the semantic
features of expressive types, which gives the learning process a degree of interpretability. In the
future, we will improve the quality of the expressive axioms learned and consider
negated axioms as well.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work is supported by the National Key Research and Development Program
of China (2017YFC0908401) and the National Natural Science Foundation of
China (61976153,61972455). Xiaowang Zhang is supported by the Peiyang Young
Scholars in Tianjin University (2019XRX-0032).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Buhmann, L.,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Westphal</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>DL-Learner - A framework for inductive learning on the semantic web</article-title>
          .
          <source>J. Web Semant.</source>
          <volume>39</volume>
          ,
          <fpage>15</fpage>
          –
          <lpage>24</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Improving knowledge graph embedding using simple constraints</article-title>
          .
          <source>In: Proc. of ACL</source>
          <year>2018</year>
          , pp.
          <fpage>110</fpage>
          –
          <lpage>121</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Lv</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hou</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Differentiating concepts and instances for knowledge graph embedding</article-title>
          .
          <source>In: Proc. of EMNLP</source>
          <year>2018</year>
          , pp.
          <fpage>1971</fpage>
          –
          <lpage>1979</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Omran</surname>
            ,
            <given-names>P.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Scalable rule learning via learning representation</article-title>
          .
          <source>In: Proc. of IJCAI</source>
          <year>2018</year>
          , pp.
          <fpage>2149</fpage>
          –
          <lpage>2155</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>