<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">A System for Reasoning-based Link Prediction in Large Knowledge Graphs</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Hong</forename><surname>Wu</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">College of Intelligence and Computing</orgName>
								<orgName type="institution">Tianjin University</orgName>
								<address>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Zhe</forename><surname>Wang</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">School of Information and Communication Technology</orgName>
								<orgName type="institution">Griffith University</orgName>
								<address>
									<country key="AU">Australia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Xiaowang</forename><surname>Zhang</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">College of Intelligence and Computing</orgName>
								<orgName type="institution">Tianjin University</orgName>
								<address>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Pouya</forename><forename type="middle">Ghiasnezhad</forename><surname>Omran</surname></persName>
							<affiliation key="aff2">
								<orgName type="department">Research School of Computer Science</orgName>
								<orgName type="institution">Australian National University</orgName>
								<address>
									<country key="AU">Australia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Zhiyong</forename><surname>Feng</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">College of Intelligence and Computing</orgName>
								<orgName type="institution">Tianjin University</orgName>
								<address>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Kewen</forename><surname>Wang</surname></persName>
							<email>k.wang@griffith.edu.au</email>
							<affiliation key="aff1">
								<orgName type="department">School of Information and Communication Technology</orgName>
								<orgName type="institution">Griffith University</orgName>
								<address>
									<country key="AU">Australia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">A System for Reasoning-based Link Prediction in Large Knowledge Graphs</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">268BF7EFB68F82F19D9944FBF819CA95</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T14:06+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This poster paper presents an efficient method R-Linker for link prediction in large knowledge graphs, based on rule learning. The scalability and efficiency is achieved by a combination of several optimisation techniques. Experimental results show that R-Linker is able to handle KGs with over 10 million of entities and more efficient than existing state-of-the-art methods including RLvLR and AMIE+ in rule learning stage for link prediction.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Knowledge graphs (KGs), a new generation of knowledge bases, have received significant attention in semantic technologies. As a KG is usually very large (of size over 10 million entities), it is infeasible for manual construction. Also, KGs are usually incomplete. Thus, it is useful and challenging to automatically construct and enrich KGs. Link prediction is one of important tasks for automated construction of KGs. Given an entity e and a (binary) relation R, the problem of link prediction is to find an entity e such that the triple (e, R, e ) (or equivalently, the fact R(e, e )) is in the KG. A large number of methods for link prediction have bee proposed in the literature, including Neural LP, Node+LinkFeat and DISMULT <ref type="bibr" target="#b0">[1]</ref>. However, most of these methods work only for relatively small KGs like WN18 and FB15K.</p><p>AMIE+ <ref type="bibr" target="#b3">[4]</ref> and RLvLR <ref type="bibr" target="#b5">[6]</ref> are among more recent methods that are able to predict links for larger KGs of size over 10 millions, and thus these methods are much more scalable than other rule learners such as <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b4">5]</ref>. As these two methods are essentially rule learners, they can address the link prediction in a more general form. For convenience, we refer this link prediction as Reasoning-based Link Prediction or just R-link prediction. Specifically, given a relation R, we want to find a pair (or pairs) e and e of entities such that triple (e, R, e ) is in the KG. In particular, RLvLR demonstrates that the technique of embedding in representation learning is promising for handling R-link prediction problem in large KGs. In order to extract information on the nationality of persons, one can learn a rule like BornIn(x, y) ∧ Country(y, z) → Nationality(x, z). 
Rules are explicit knowledge (compared to a neural network) and can provide human-understandable explanations of learning results (e.g., link predictions) based on them. Thus, it is useful and important to extract rules from KGs automatically.</p><p>In this poster paper, we further push the envelope by developing a more efficient method, R-Linker, for R-link prediction in large KGs. The scalability and efficiency of R-Linker are achieved by a combination of several optimisation techniques. First, we use an adapted embedding for rule learning; second, we introduce a new sampling strategy called Hierarchical Sampling; moreover, we develop new techniques for rule search and rule evaluation. As a result, we have implemented a new system, R-Linker, for link prediction in large KGs. Our experiments show that R-Linker is able to handle KGs with over 10 million entities and is more efficient than other methods, including RLvLR and AMIE+, in the rule learning stage of link prediction. R-Linker is available at https://www.dropbox.com/sh/c8ent25u3qp4vp1/AABc6Jl3zTRtOkdTwHaoDBDUa?dl=0</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">A Rule-based Model</head><p>Unlike other statistical relational models, we adopt rule-based models for link prediction, with the obvious advantage that the learned models (as sets of logical rules) are explainable and reusable. In what follows, we describe how we construct such models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Embedding-based Rule Selection</head><p>Inspired by <ref type="bibr" target="#b5">[6]</ref>, we learn such rule-based models via predicate embeddings; yet unlike <ref type="bibr" target="#b5">[6]</ref> using matrix embeddings, we adopt TransE vector embeddings which can significantly improve learning efficiency. As we demonstrate in the experiments, adopt a simpler form of embeddings does not compromise the learning accuracy. In <ref type="bibr" target="#b1">[2]</ref>, vector embeddings r and e are constructed for each relation R and each entity e in the KG. When a fact R(e, e ) exists in the KG, the embeddings satisfy e+r ≈ e . We extend it to an embedding characterisation for closed-path rules, that is, first-order Horn rules of the form R 1 (x, z 1 ) ∧ R 2 (z 1 , z 2 ) ∧ . . . ∧ R n (z n−1 , y) → R(x, y) with x, y, z 1 , . . . , z n−1 being variables. There are two aspects we hope to capture: (1) the composition of relations R 1 , . . . R n associates entities (in place of x and y) similarly as relation R does; and (2) the co-occurrence of arguments in the positions of x, y, z 1 , . . . , z n−1 .</p><p>For (1), it requires for each pair of entities (e, e ), e+r 1 +• • •+r n −e ≈ e+r−e . We define a measure sim(r 1 + • • • + r n , r), where sim is the L2 norm of vector distances. For (2), we use the notion of argumentation embeddings from <ref type="bibr" target="#b5">[6]</ref>. More specifically, for each relation R, two vector embeddings r 1 and r 2 are computed by averaging the entity embeddings (as vectors) of all the entities occurring in the position of respectively, the subject and object arguments of R. 
Then, for x occurring as the subject argument of both R_1 and R, y occurring as the object argument of both R_n and R, and z_i (1 ≤ i ≤ n − 1) occurring as the object argument of R_i and the subject argument of R_{i+1}, we have the following measure:</p><formula xml:id="formula_0">sim(r_1^1, r^1) + sim(r_n^2, r^2) + sim(r_1^2, r_2^1) + · · · + sim(r_{n−1}^2, r_n^1).</formula></div>
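As an illustration, the rule-scoring measure above can be sketched in Python. The dictionary names (rel_emb, subj_arg, obj_arg) and the choice of sim as the negative L2 distance (so that higher means more similar) are our assumptions for this sketch, not details from the paper:

```python
import numpy as np

def sim(u, v):
    # Similarity as negative L2 distance: higher means more similar
    # (the paper defines sim via the L2 norm of the vector difference).
    return -np.linalg.norm(u - v)

def score_rule(body_rels, head_rel, rel_emb, subj_arg, obj_arg):
    """Score a closed-path rule R1(x,z1) ^ ... ^ Rn(z_{n-1},y) -> R(x,y).

    rel_emb[R]  : TransE vector of relation R
    subj_arg[R] : mean embedding of entities in R's subject position
    obj_arg[R]  : mean embedding of entities in R's object position
    """
    # Aspect (1): the composed body relations should translate like the head,
    # i.e. r_1 + ... + r_n should be close to r.
    path = sum(rel_emb[r] for r in body_rels)
    s = sim(path, rel_emb[head_rel])
    # Aspect (2): argument co-occurrence at the shared variables.
    s += sim(subj_arg[body_rels[0]], subj_arg[head_rel])   # x position
    s += sim(obj_arg[body_rels[-1]], obj_arg[head_rel])    # y position
    for a, b in zip(body_rels, body_rels[1:]):             # z_i positions
        s += sim(obj_arg[a], subj_arg[b])
    return s
```

Rules whose score exceeds a threshold would then be kept as candidates for the more expensive confidence evaluation.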
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Hierarchical Data Sampling</head><p>A major challenge in the computation of embeddings is that existing methods cannot scale over large KGs, even for vector embeddings. Hence, we propose a new data sampling strategy, called hierarchical sampling, to reduce the sizes of input KGs by focusing on entities that are relevant to the link prediction task. Intuitively, for each link prediction task, the link (i.e., a relation) R is often given, and we sample entities (and facts) in the KG that are directly or indirectly related to R for embedding construction. Consider a KG K = (E, F ) with E being the set of all entities and F being the set of all facts (i.e., triples) in K. Our sampling method selects a (small) subset E ⊆ E that are relevant to R and focus on the facts F only about E (not mentioning other entities). Since each rule in our model forms a path, our sampling method also deploys a breath-first tree search. As shown in Figure <ref type="figure" target="#fig_0">1</ref> (a), the first sampled entities E 0 are those occurring in facts about R. Then, E 1 are those entities that occur in any facts (not necessarily about R) mentioning entities from E 0 . Similarly, E i+1 are those entities that occur in any facts mentioning entities from E i , for each i ≥ 1 till a prescribed depth. Figure <ref type="figure" target="#fig_0">1</ref> (b) shows how our sampling method preserves closed-path rules. 
For a rule of length 3, R_1(x, z) ∧ R_2(z, y) → R(x, y), and each supporting instance R_1(e, e′) ∧ R_2(e′, e′′) → R(e, e′′), the entities e and e′′ will be sampled into E_0 and e′ into E_1.</p><p>One optimisation is loop elimination during the breadth-first search: as shown in Figure <ref type="figure" target="#fig_0">1</ref> (a), if a repeated entity is found on a path (shown in light colour), the path is no longer explored. This avoids redundant atoms in rules, for example R_1(x, y) ∧ R_1^−(y, x) ∧ R_1(x, y) → R(x, y). Furthermore, by recording the path information during the search, the method eliminates a large number of invalid compositions of relations and can effectively suggest candidate rules. Other optimisations include selecting a bounded number of neighbours for each entity, and pruning relations with low frequency.</p><p>The evaluation of candidate rules, through the computation of standard confidence and head coverage, is often expensive, and much research effort has been dedicated to optimising this computation. A key step is to compute the support degree, i.e., the number of entity pairs in the KG that make both the body and the head of the rule true. From the above discussion, we can quickly narrow the search down to entities directly connected to those in E_0, and since the relation in the head is known, we can first check whether a pair of entities satisfies the head. These optimisations prove to be quite effective.</p></div>
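A minimal Python sketch of the breadth-first sampling described above (the function and variable names are ours; the actual R-Linker implementation differs, e.g., it additionally performs loop elimination, neighbour bounding, and low-frequency relation pruning, all omitted here):

```python
from collections import defaultdict

def hierarchical_sample(facts, target_rel, depth):
    """Sample entities relevant to target_rel by breadth-first expansion.

    facts: iterable of (subject, relation, object) triples.
    E_0 holds entities in facts about target_rel; E_{i+1} holds entities
    co-occurring in any fact with an entity from E_i.
    """
    # Index: entity to its neighbouring entities (via any fact).
    nbr = defaultdict(set)
    for s, r, o in facts:
        nbr[s].add(o)
        nbr[o].add(s)
    # E_0: entities occurring in facts about the target relation.
    layer = {e for s, r, o in facts if r == target_rel for e in (s, o)}
    sampled = set(layer)
    # E_1 .. E_depth: expand one hop at a time.
    for _ in range(depth):
        layer = {n for e in layer for n in nbr[e]} - sampled
        sampled |= layer
    # Keep only facts whose both arguments were sampled.
    kept = [(s, r, o) for s, r, o in facts if s in sampled and o in sampled]
    return sampled, kept
```

Embeddings are then computed on the reduced KG (sampled, kept) instead of the full one, which is what makes the embedding step tractable for KGs with tens of millions of entities.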
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Experiments</head><p>We compared our system with RLvLR, AMIE+ and Neural LP on rule learning and link prediction, on common benchmarks FB15K(-237), Wikidata, DBPedia 3.8, and YAGO2s.</p><p>For large KGs Wikidata, DBPedia 3.8, and YAGO2s, Table <ref type="table" target="#tab_0">1</ref> shows our system outperforms both RLvLR and AMIE+ in learning efficiency, as shown by the average numbers of rules (#R) and quality rules (#QR, standard confidence over 0.7) learned per hour. Table <ref type="table">2</ref> shows that compared to RLvLR and Neural LP, the model constructed by our system demonstrates better accuracy on link prediction. Table <ref type="table">2</ref> shows the comparison of our rule-based model against statistical models on FB15K-237. While our model has competitive performance on link prediction, its major advantage is that rule-based models are explainable and reusable. We plan to compare our method with some other approaches such as <ref type="bibr" target="#b6">[7]</ref>.</p><p>Table <ref type="table">2</ref>: Link prediction on large KGs.</p><p>Learner FB75K Wikidata MRR Hits@10 MRR Hits@10 R-Linker 0.37 59.0 0.33 39. </p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 :</head><label>1</label><figDesc>Fig. 1: (a) Breath-first search for sampling; (b) Sampling preserves closed-paths.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Rule learning on large KGs.</figDesc><table><row><cell>KG</cell><cell cols="3">R-Linker #R #QR #R #QR #R #QR RLvLR AMIE+</cell></row><row><cell cols="2">DBpedia 13.38 3.67</cell><cell>11 2.37 1.97</cell><cell>0.11</cell></row><row><cell cols="4">Wikidata 37.38 18.52 23.56 10.62 &lt;0.09 &lt;0.03</cell></row><row><cell cols="4">YAGO2s 9.71 2.28 6.56 1.88 &lt;0.56 &lt;0.05</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 3 :</head><label>3</label><figDesc>Link prediction on FB15K-237.</figDesc><table><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell>Learner</cell><cell cols="2">MRR Hits@10</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell>DISTMULT</cell><cell>0.25</cell><cell>40.8</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell>3</cell><cell cols="2">Node+LinkFeat 0.23</cell><cell>34.7</cell></row><row><cell>RLvLR</cell><cell>0.34</cell><cell>43.4</cell><cell>0.29</cell><cell>38.9</cell><cell>Neural LP</cell><cell>0.24</cell><cell>36.1</cell></row><row><cell cols="2">Neural LP 0.13</cell><cell>25.7</cell><cell>-</cell><cell>-</cell><cell>RLvLR</cell><cell>0.24</cell><cell>39.3</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell>R-Linker</cell><cell>0.24</cell><cell>38.1</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A Review of Relational Machine Learning for Knowledge Graph</title>
		<author>
			<persName><forename type="first">M</forename><surname>Nickel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Murphy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Tresp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Gabrilovich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of IEEE</title>
				<meeting>IEEE</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="volume">1041</biblScope>
			<biblScope unit="page" from="1" to="23" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Translating embeddings for modeling multi-relational data</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bordes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Usunier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>García-Durán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Weston</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Yakhnenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NIPS</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="page" from="2787" to="2795" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Scalekb: scalable learning and inference over large knowledge bases</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Goldberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The VLDB Journal</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="893" to="918" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Fast rule mining in ontological knowledge bases with amie+</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Galárraga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Teflioudi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Hose</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M</forename><surname>Suchanek</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The VLDB Journal</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="707" to="730" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Rule learning from knowledge graphs guided by embedding models</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">T</forename><surname>Ho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Stepanova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">H</forename><surname>Gad-Elrab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kharlamov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Weikum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ISWC pp</title>
				<meeting>ISWC pp</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="72" to="90" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Scalable rule learning via learning representation</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">G</forename><surname>Omran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. AAAI</title>
				<meeting>AAAI</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="2149" to="2155" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Blrn: End-to-end learning of knowledge base representations with latent, relational, and numerical features</title>
		<author>
			<persName><forename type="first">A</forename><surname>García-Durán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Niepert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. UAI</title>
				<meeting>UAI</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="372" to="381" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
