1. Introduction

ReliK: A Reliability Measure for Knowledge Graph Embeddings [Extended Abstract]

Maximilian K. Egger

Wenyue Ma

Davide Mottin

Panagiotis Karras

0 1

Ilaria Bordino

Francesco Gullo

Aris Anagnostopoulos

Knowledge Graph Embeddings, Reliability, Knowledge Graphs

0 Aarhus University, Department of Computer Science , Aabogade 34, 8200 Aarhus , Denmark 1 Copenhaghen University, Department of Computer Science , Universitetsparken 1, 2100 København , Denmark 2 Dept of Computer, Control, and Management Eng., Sapienza University of Rome , Via Ariosto 25, 00185 Rome , Italy 3 Unicredit , Via Molfetta 101, 00171 Rome , Italy

Can we assess a priori how well a knowledge graph embedding will perform on a specific downstream task and in a specific part of the knowledge graph? Knowledge graph embeddings (KGEs) represent entities and relationships of a knowledge graph (KG) as vectors. KGEs are generated by optimizing an embedding score, which assesses whether a triple exists in the graph. KGEs have been proven efective in a variety of downstream tasks, including, for instance, predicting relationships among entities. However, the problem of anticipating the performance of a given KGE in a certain downstream task and locally to a specific individual triple, has not been tackled so far. In this paper, we fill this gap with ReliK, a Reliability measure for KGEs. ReliK relies solely on KGE embedding scores, is task- and KGE-agnostic, and requires no KGE retraining. Through extensive experiments, we attest that ReliK correlates well with both common downstream tasks, such as tail or relation prediction and triple classification, and advanced downstream tasks, such as rule mining and question answering, while preserving locality.

1. Introduction

Knowledge graphs (KGs) are sets of facts (i.e., triples such as “da Vinci,” “painted,” “Mona Lisa”) that interconnect entities (“da Vinci,” “Mona Lisa”) via relationships (“painted”) [ 1, 2 ]. Entities and relationships correspond to nodes and (labeled) edges of the KG, respectively. Knowledge graph embeddings (KGEs) [ 3 ] are popular techniques to generate a vector representation for entities and relationships of a KG. A KGE is computed by optimizing a scoring function that provides an embedding score as an indication of whether a triple actually exists in the KG. KGEs have been extensively used as a crucial building block of state-of-the-art methods for a variety of downstream tasks commonly carried out on the Web, such as knowledge completion [ 4 ], whereby a classifier is trained on the embeddings to predict the existence of a triple; or head/tail prediction [ 5 ], which aims to predict entities of a triple, as well as more advanced ones, including rule mining [ 6 ], query answering [ 7 ], and entity alignment [ 8, 9, 10, 11 ]. Motivation. So far, the choice of an appropriate KGE method has depended on the downstream task, the characteristics of the input KG, and the computational resources. The existence of many diferent scoring functions, including linear embeddings [ 12 ], bilinear [ 6 ], based on complex numbers [ 13 ], or projections [ 14 ] further complicates this choice. Alas, the literature lacks a unified measure to quantify how reliable the performance of a KGE method can be for a certain task beforehand, without performing such a potentially slow task. Furthermore, KGE performance on a specific downstream task is typically assessed in a global way, that is, in terms of how accurate a KGE method is for that task on the entire KG. However, the performance of KGEs for several practical applications (e.g., knowledge completion [ 4 ]) typically varies across the parts of the KG. This requires carrying out a performance assessment of KGE locally to specific parts of the KG, rather than globally.

Contributions. We address all the above shortages of the state of the art in KGE and introduce ReliK (Reliability for KGEs), a simple, yet principled measure that quantifies the reliability of how a KGE will perform on a certain downstream task in a specific part of the KG, without executing that task or (re)training that KGE. To the best of our knowledge, no measure like ReliK exists in the literature. ReliK relies exclusively on embedding scores as a black box, particularly on the ranking determined by those scores (rather than the scores themselves). ReliK is simple, intuitive, and easy-to-implement. Despite that, its exact computation requires processing all the possible combinations of entities and relationships, for every single fact of interest. We address this major challenge by devising an approximation to ReliK.

Apart from experimenting with ReliK in basic downstream tasks, such as entity/relation prediction, we also showcase ReliK on two advanced downstream tasks, query answering, which ifnds answers to complex logical queries over KGs, and rule mining, deduces logic rules, with the purpose of cleaning the KG from spurious facts or expanding the information therein.

2. Preliminaries

A knowledge graph (KG) : ⟨ℰ , ℛ, ℱ ⟩ is a triple consisting of a set ℰ of entities, a set ℛ of relationships, and a set ℱ ⊂ ℰ × ℛ × ℰ of facts. A fact is a triple ℎ = (ℎ, , ) , where ℎ ∈ ℰ is the head, ∈ ℰ is the tail, and ∈ ℛ is the relationship. For instance, entities “Leonardo da Vinci” and “Mona Lisa,” and relationship “painted” form the triple (“Leonardo da Vinci,” “painted,” “Mona Lisa”). The set ℱ of facts form an edge-labeled graph whose nodes and labeled edges correspond to entities and relationships, respectively. We say a triple ℎ is positive if it actually exists in the KG (i.e., ℎ ∈ ℱ ), negative otherwise (i.e., ℎ ∈/ ℱ ). Knowledge graph embedding. A KG embedding (KGE) [ 3, 5, 15 ] is a representation of entities and relationships in a -dimensional (≪|ℰ | ) space, typically, the real R space or the complex C space. For instance, TransE [ 12 ] represents a triple ℎ as entity vectors eℎ, e ∈ R and relation vector e ∈ R, and DistMult [ 6 ] represents the relationship as a matrix W ∈ R× . Although KGEs can difer (significantly) from one another in their definition, a common key aspect of all KGEs is that they are typically defined based on a so-called embedding scoring function or simply embedding score. This is a function : ℰ × ℛ × ℰ → R, which quantifies how likely a triple ℎ ∈ ℰ × ℛ × ℰ exists in based on the embeddings of its head (ℎ), relationship (), and tail (). Specifically, the higher (ℎ), the more likely the existence of ℎ. For instance, TransE’s score (ℎ) = −‖ eℎ + e − e‖ is the (ℓ1 or ℓ2) distance between the “translation” from ℎ’s embedding to ’s embedding through ’s embedding [ 12 ].

KGEs are typically learned via optimization (e.g., gradient descent) of a loss function defined based on the embedding score. This training process can be computationally expensive, especially if it has to be repeated for multiple KGEs. KGEs learned this way are shown to be efective for a number of downstream tasks [ 5 ], such as predicting the existence of a triple, but do not ofer any prior indication on their performance [ 16 ]. Moreover, existing benchmarks [ 15 ] show global performance on the entire graph rather than local on subgraphs. To this end, in this work, we provide an answer to the following key question: Is there a measure that provides a prior indication on the performance of an embedding on a specific subgraph?

3. KGE Reliability

We would like a measure of reliability that properly assesses how the embedding of a triple would perform on certain tasks and data, without knowing them in advance. To this end, we aim to fulfill the following desiderata for a KGE reliability measure. (R1) Embedding-agnostic. Is independent of the specific KGE method. This is to have a measure fully general. (R2) Learning-free. Requires no further KGE training. This is primarily motivated by eficiency, but also for other reasons, such as privacy or unavailability of the data used for KGE training. (R3) Task-agnostic. Is independent of the specific downstream task. by anticipating the performance of a KGE in general, for any downstream task. (R4) Locality. Is a good predictor of KGE performance locally to a given triple, that is, in a close surrounding neighborhood of that triple. This is important, as a KGE model may be more or less efective based on the diferent parts of the KG it is applied to.

3.1. The Proposed ReliK Measure

Design principles. To fulfill (R1) and (R2), the KGE reliability measure should not engage with the internals of the computation of KGEs. Thus, we need to treat the embeddings as vectors and the embedding score as a black-box function that provides only an indication of the actual existence of a triple. Although the absolute embedding scores are incomparable to one another, we observe that the distribution of positive and negative triples is significantly diferent. Specifically, we assume the relative ranking of a positive triple to be higher than that of a negative. Otherwise, we multiply the score by − 1. This leads to the following main observation.

Observation 1. A KGE reliability measure that uses the position of a triple relative to other triples via a ranking defined based on the embedding score fulfills (R1) and (R2).

Furthermore, comparing a triple to all other (positive or negative) triples might be inefective. This is because the absolute score does not provide an indication of performance. We thus advocate that a local approach that considers triples relative to a neighborhood is more appropriate, and propose a measure that fulfills (R4). The soundness of (R4) is better attested in our experiments in Section 4. To meet (R3), the KGE reliability measure should not exploit any peculiarity of a downstream task in its definition. Indeed, this is accomplished by our measure, as we show next.

Definition.

negative triples, that is, triples with head ℎ that do not exist in . Similarly, we compute − () for tail . We define the

head-rank ℎ of a triple ℎ as the position of the triple in the rank obtained using score for a specific KGE relative to all the negative triples having head ℎ.

For a triple ℎ = (ℎ, , ) we compute the neighbor set − (ℎ) of all possible rank (ℎ) = ⃒⃒ { ∈ − (ℎ) : () > (ℎ)}⃒⃒ + 1 The tail-rank rank (ℎ) for tail is defined similarly. reciprocal of the head- and tail-rank

Our reliability measure, ReliK, for a triple ℎ is ultimately defined as the average of the ReliK(ℎ) = 1 (︂ 2

1 rank (ℎ) +

1 rank (ℎ) ︂) We define the ReliK score of a set ⊆ ℱ

of triples as the average ReliK of the triples in the set.

Rationale. ReliK ranges from (0, 1], with higher values corresponding to better reliability. In fact, the lower the head-rank rank (ℎ) or tail-rank rank (ℎ), the better the ranking of ℎ induced by the underlying embedding scores, relatively to the nonexisting triples in ℎ’s neighborhood, complies with the actual existence of ℎ in the KG.

It is easy to see that ReliK achieves (R1) and (R2) by relying on the relative ranking rather than the absolute scores. It also fulfills (R3) as it involves no downstream tasks at all, and (R4) as it is based on the local (i.e., 1-hop) neighborhood of a target triple.

3.2. Eficiently Computing

ReliK

We define our estimator as Computing ReliK (Eq. (1)) requires Ω (|ℰ | · |ℛ| ) time, as it needs to scan the entire negative neighborhood of the target triple. For large KGs, repeating this for a (relatively) high number of triples may be computationally too heavy. For this purpose, here we focus on approximate versions of ReliK, which properly trade of between accuracy and eficiency.

The main intuition behind the ReliK approximation is that the precise ranking of the various potential triples is not actually needed. Rather, what it matters is just the number of those triples that exhibit a higher embedding score than the target triple. As such, we estimate ReliK by evaluating the fraction of triples in the sample that have a higher score than the triple under consideration and then scaling this fraction to the total number of negative triples.

Let be a random subset of elements selected without replacement independently and uniformly at random from the negative neighborhood − (ℎ) of the head ℎ of a triple ℎ. The size | | trades of between eficiency and accuracy of the estimator, and it may be defined based on the size of − (ℎ). Define also rank (ℎ) = |{ ∈ : () > (ℎ)}| + 1, to be the rank of the score (ℎ) that the KGE assigns to ℎ, among all the triples in the sample. We similarly compute and rank for tail’s neighborhood − ().

ReliKApx = rank (ℎ) | − (ℎ)| )︂ − 1 rank (ℎ) | − ()| )︂ − 1]︃ In words, we simply scale up the rank induced by the sample to the entire set of negative triples. Theorem 1. Equation 2 is an upper bound for ReliK; that is E[ReliKApx(ℎ)] ≥ ReliK(ℎ).

Theorem 1 follows immediately from the Jensen’s inequality [ 17 ] and the fact that E[rank (ℎ)] = | | · rank (ℎ) .

| − (ℎ)| Algorithm. To compute ReliKApx, we first sample, uniformly at random, negative triples from the head neighborhood and the tail neighborhood. We can save computation time by first or the tail in common with ℎ to update the corresponding rank. ifltering the triples in ∪ by score, that is, by considering only those with score higher than the input triple ℎ, and then checking whether a triple in ∪ has either the head triple.The sample size trades of between accuracy and eficiency of the estimation. Time complexity. ReliKApx runs in () time to sample the negative triples for each positive

4. Experimental evaluation

We evaluate ReliK on four downstream tasks, six embeddings, and four datasets. Embeddings. We include six established KGE methods, TransE [ 12 ], DistMult [ 6 ], RotatE [ 13 ], PairRE [ 14 ], ComplEx [ 18 ], ConvE [ 19 ] representative of the major embedding families (Sec. 5). Datasets. We perform experiments on KGs with diferent characteristics:

Countries [20] is

a small graph from geographical locations; FB15k237 [21] is a sample of Freebase KG [22] with 14k nodes, 310k facts, and 237 relationships; Codex [23] is a collection of three datasets of incremental size, Codex-S (2 entities, 36 triples), Codex-M (17 entities, 200 facts), and Codex-L (78 entities, 610 facts) extracted from Wikidata and Wikipedia.; YAGO2 [24] is a KG automatically extracted from Wikidata, which comprises 834k entities and 948k facts. . Experimental setup. We implement our approximate and exact ReliK in Python v3.9.13 (Code at: https://github.com/AU-DIS/ReliK). We train the embedding with Pykeen v1.10.1 [ 15 ], with default parameters besides the embedding dimension = 50 and training loop sLCWA. We report an average of 5 experiments using 5-fold cross validation with 80-10-10 split.

4.1. Common Downstream Tasks

We test ReliK on the ability to anticipate the results of common tasks for KGEs [ 5, 3 ]. We measure the statistical significance of Pearson correlation among two ranking tasks, tail and relation prediction, and the triple classification task. To evaluate ReliK on diferent areas of the graph and diferent graph topologies, we sample random subgraphs of Codex-S with 60 nodes by initially selecting a starting node uniformly at random and then including nodes and edges by a random walk with restart [25] with restart probability 1 = 0.2, until the subgraph comprises 60 nodes. For Codex-M and Codex-L we use size 100 and for FB15k237 we use 200 − nodes. We report the average ReliK on 100 random subgraphs.

Ranking tasks (T1). In the first experiments, we measure the Pearson correlation between ReliK and the performance on ranking tasks with mean reciprocal rank (MRR) [26]. The first task, tail prediction [ 12, 14, 13 ], assesses the ability of the embedding to predict the tail given the head and the relation, thus answering the query (ℎ, , ?) where the tail is unknown. The second task, relation prediction, assesses the ability of the embedding to predict the undisclosed relation of a triple (ℎ, ?, ). The common measure used for tail and relation prediction is MRR, which provides an indication of how close to the top the score ranks the correct tail (or relation). Consistently with previous approaches [ 12, 14, 13 ], we employ the filtered approach in which we consider for evaluation only negative triples that do not appear in either the train, test, or validation set. Table 1 reports the correlations alongside the statistical significance in terms of the p-value. We marked in red, high p-values (> 0.05), which suggest no correlation, and Pearson score values that indicate inverse correlation. Generally, ReliK exhibits significant correlation across embeddings and tasks. Noteworthy, even though ReliK (see Eq. (1)) does not explicitly target tail or head rankings by including both, we observe significant correlation on tail prediction in most embeddings and datasets. Because of the considerable training time, we only report results for RotatE on Codex-S. Comparing the actual results of the various tasks, it is also clear in most cases in which we do not have correlation that the results are too close to distinguish; for example, ComplEx having only results close to 0. In such cases, the results indicate that the particular embedding method needs additional training.

Tail (MRR)

Relation (MRR)

Classific. (Acc.) KGE

Pearson p-value

Pearson p-value TransE 0.83 LDistMult 0.49 -xRotatE – eodPairRE − 0.04 CComplEx 0.82 ConvE 0.59 TransE 0.24 7DistMult − 0.05 23kRotatE – 51PairRE 0.80 FBComplEx 0.20 ConvE 0.09 1.13− 26 0.97 2.10− 07 0.78 – – 0.68 0.95 1.03− 25 0.91 4.26− 11 − 0.07 Tuning Subgraph Size. Next, we analyze how ReliK tail relation classifier correlates with the tasks presented in Section 4.1 on n0.8 subgraphs of varying size with the TransE embedding. rso0.6 Figure 1 reports the correlation values for all three tasks, eaP0.4 only including those values where the p-value is be- 0 20 40 60 80 100 120 140 low 0.05. We observe that ReliK’s correlation generally Subgraph size increases with subgraphs of up to 100 nodes on Codex- Figure 1: Correlation vs subgraph size S. After that point, we note an unstable behavior in all on Codex-S. tasks. This is consistent with the assumption that ReliK is a measure capturing local reliability. To strike a balance between quality and time we test on subgraphs with 60 nodes for Codex-S in all experiments. Yet, as tasks are of diferent nature, the subgraph size can be tuned in accordance with the task to provide more accurate results.

4.2. Complex Downstream Tasks

Query answering (T3). We show how ReliK can improve query-answering tasks. Complex logical queries on KGs are working with diferent query structures. We focus on queries of chaining multiple predictions or having an intersection of predictions, from diferent query structures that have been described in recent work [27, 28]. We keep the naming convention introduced by Ren and Leskovec [27]. We evaluate a selection of 1000 queries per type (1,2,3,2,3) from their data on the FB15k237 graph.The queries of type are 1 to 3 hops from a given entity with fixed relation labels that point to a solution, whereas queries of type are the intersection of 2 or 3 predictions pointing towards the same entity. We evaluate ReliK on the ability to detect whether an instance of an answer is true or false. We compute ReliK on TransE embeddings trained on the entire FB15k237. Figure 2 shows the average ReliK scores for positive and negative answers. ReliK clearly discriminates between positive and negative instances, often by a large margin.

· 10− 5Query Answering

Detecting true instances. To showcase performance on the downstream task (T4), we compare ReliK with the reciprocal rank (RR) of a combination of the tail and the relation embeddings on the ability to detect whether an instance of a rule is true or false. This task is particularly important to quantify the real confidence of a rule [ 33]. To this end, we use a dataset [34] comprising 23 324 manually annotated instances over 26 rules extracted from YAGO2 using the AMIE [29] and RudiK [31] methods. We compute ReliK on TransE embeddings trained on the entire YAGO2. Figure 2 shows the average ReliK scores for positive and negative instances. ReliK discriminates between positive and negative instances, often by a large margin, whereas RR often confounds positive and negative instances.

5. Related Work

Knowledge graph embeddings are commonly used to detect missing triples, correcting errors, or question answering [ 3, 5 ]. Translational embeddings in the TransE [ 12 ] family and the recent PairRE [ 14 ] assume that the relationship performs a translation from the head to the tail. Semantic embeddings, such as DistMult [ 6 ] or HolE [35], interpret the relationship as a multiplicative operator. Complex embeddings, such as RotatE [ 13 ] and ComplEx [ 18 ], use complex-valued vectors and operations in the complex plane. Neural-network embeddings, such as ConvE [ 19 ], perform sequences of nonlinear operations.

Embedding calibration. An orthogonal direction to ours is embedding calibration [36, 37]. Calibration methods provide efective ways to improve the existing embeddings on various tasks, by altering the embedding vectors in subspaces with low accuracy [36], by reweighing the output probabilities in the respective tasks [37], or by matrix factorization [38]. Evaluation of embeddings. ReliK bears an interesting connection with ranking-based quality measures, in particular with the mean reciprocal rank (MRR) and HITS@k for head, tail, and relation prediction [ 3, 36, 39, 12, 14, 13 ]. Even though MRR and HITS@k provide a global indication of performance, ReliK is suitable for local analysis. Yet, current global measures have been recently shown to be biased towards high-degree nodes [40].

6. Conclusion

Aiming to develop a measure that prognosticates the performance of a knowledge graph embedding on a specific subgraph, we introduced ReliK, a KGE reliability measure agnostic to the choice of the embeddings, the dataset, and the task. To allow for eficient computation, we proposed a sampling-based approximation, which we show to achieve similar results to the exact ReliK in a fraction of the time. Our experiments confirm that ReliK anticipates the performance on a number of common and complex downstream tasks for KGEs. These results suggest that ReliK may be used in other domains, as well as a debugging tool for KGEs.

Acknowledgments

M. Egger is supported by the Danish Council for Independent Research grant DFF-1051-00062B. W. Ma is supported by the China Scholarship Council grant 202110320012. [20] G. Bouchard, S. Singh, T. Trouillon, On approximate reasoning capabilities of low-rank vector spaces, in: AAAI, 2015. [21] K. Toutanova, D. Chen, Observed versus latent features for knowledge base and text inference, in: Proceedings of the 3rd workshop on continuous vector space models and their compositionality, 2015, pp. 57–66. [22] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, J. Taylor, Freebase: a collaboratively created graph database for structuring human knowledge, in: SIGMOD, 2008, pp. 1247–1250. [23] T. Safavi, D. Koutra, Codex: A comprehensive knowledge graph completion benchmark, in: EMNLP, 2020, pp. 8328–8350. [24] J. Hofart, F. M. Suchanek, K. Berberich, G. Weikum, Yago2: A spatially and temporally enhanced knowledge base from wikipedia, Artificial intelligence 194 (2013) 28–61. [25] H. Tong, C. Faloutsos, J.-Y. Pan, Fast random walk with restart and its applications, in:

ICDM, IEEE, 2006, pp. 613–622. [26] N. Craswell, Mean reciprocal rank., Encyclopedia of database systems 1703 (2009). [27] H. Ren, J. Leskovec, Beta embeddings for multi-hop logical reasoning in knowledge graphs,

Advances in Neural Information Processing Systems 33 (2020) 19716–19726. [28] Y. Bai, X. Lv, J. Li, L. Hou, Answering complex logical queries on knowledge graphs via query computation tree optimization (2023). [29] L. A. Galárraga, C. Teflioudi, K. Hose, F. M. Suchanek, AMIE: association rule mining under incomplete evidence in ontological knowledge bases, in: TheWebConf, 2013, pp. 413–422. [30] L. Galárraga, C. Teflioudi, K. Hose, F. M. Suchanek, Fast rule mining in ontological knowledge bases with amie ++, VLDBJ 24 (2015) 707–730. [31] S. Ortona, V. V. Meduri, P. Papotti, Robust discovery of positive and negative rules in knowledge bases, in: ICDE, 2018, pp. 1168–1179. [32] R. Agrawal, R. Srikant, et al., Fast algorithms for mining association rules, in: VLDB, volume 1215, Santiago, Chile, 1994, pp. 487–499. [33] M. Loster, D. Mottin, P. Papotti, J. Ehmüller, B. Feldmann, F. Naumann, Few-shot knowledge validation using rules, in: TheWebConf, 2021, pp. 3314–3324. [34] M. Loster, D. Mottin, P. Papotti, J. Ehmüller, B. Feldmann, F. Naumann, Colt dataset, 2021.

URL: https://hpi.de/naumann/projects/repeatability/datasets/colt-dataset.html, accessed on November 1, 2023. [35] M. Nickel, V. Tresp, H.-P. Kriegel, et al., A three-way model for collective learning on multi-relational data., in: ICML, volume 11, 2011, pp. 3104482–3104584. [36] T. Safavi, D. Koutra, E. Meij, Evaluating the calibration of knowledge graph embeddings for trustworthy link prediction, in: EMNLP, 2020. [37] P. Tabacof, L. Costabello, Probability calibration for knowledge graph embedding models, in: ICLR, 2020. [38] C. Demir, J. Lienen, A.-C. Ngonga Ngomo, Kronecker decomposition for knowledge graph embeddings, in: HT, 2022, pp. 1–10. [39] F. Bianchi, G. Rossiello, L. Costabello, M. Palmonari, P. Minervini, Knowledge graph embeddings and explainable ai, in: Knowledge Graphs for eXplainable Artificial Intelligence: Foundations, Applications and Challenges, IOS Press, 2020, pp. 49–72. [40] S. Tiwari, I. Bansal, C. R. Rivero, Revisiting the evaluation protocol of knowledge graph completion methods for link prediction, in: TheWebConf, 2021, pp. 809–820.

[1]

Hogan , E. Blomqvist,

Cochez , C. d'Amato, G. de Melo,

Gutierrez ,

Kirrane ,

J. E. L.

Gayo ,

Navigli ,

Neumaier ,

A. N.

Ngomo ,

Polleres ,

S. M.

Rashid ,

Rula ,

Schmelzeisen ,

J. F.

Sequeda ,

Staab ,

Zimmermann , Knowledge graphs , ACM CSUR 54 ( 2022 ) 71 : 1 - 71 : 37 .

[2]

Weikum , Knowledge graphs 2021 : A data odyssey , PVLDB 14 ( 2021 ) 3233 - 3238 .

[3]

Wang ,

Mao ,

Wang ,

Guo , Knowledge graph embedding: A survey of approaches and applications , TKDE 29 ( 2017 ) 2724 - 2743 .

[4]

Wang ,

Chen ,

Ban ,

Usman ,

Guan , S. Liu, T. Wu,

Chen , Knowledge graph quality control: A survey , Fundamental Research 1 ( 2021 ) 607 - 626 .

[5]

Ji ,

Pan , E. Cambria,

Marttinen ,

S. Y.

Philip , A survey on knowledge graphs: Representation, acquisition, and applications , Trans. Neural Netw. Learn. Syst . 33 ( 2021 ) 494 - 514 .

[6]

Yang , S. W.-t. Yih,

He ,

Gao ,

Deng , Embedding entities and relations for learning and inference in knowledge bases , in: ICLR , 2015 .

[7]

Wu ,

Xu ,

Lin , W. Zhang, A holistic approach for answering logical queries on knowledge graphs , in: ICDE , 2023 , pp. 2345 - 2357 .

[8]

S. S.

Bhowmick ,

E. C.

Dragut , W. Meng, Globally aware contextual embeddings for named entity recognition in social media streams , in: ICDE , 2023 , pp. 1544 - 1557 .

[9]

Huang ,

Sun ,

Chen ,

Xu ,

Ren , W. Hu, Deep active alignment of knowledge graph entities and schemata , PACMMOD 1 ( 2023 ) 159 : 1 - 159 : 26 .

[10]

Zeakis ,

Papadakis ,

Skoutas ,

Koubarakis , Pre-trained embeddings for entity resolution: An experimental analysis , PVLDB 16 ( 2023 ) 2225 - 2238 .

[11]

Zhong ,

Zhang ,

Fan ,

Dou , Semantics driven embedding learning for efective entity alignment , in: ICDE , 2022 , pp. 2127 - 2140 .

[12]

Bordes ,

Usunier ,

Garcia-Duran ,

Weston ,

Yakhnenko , Translating embeddings for modeling multi-relational data , NeurIPS 26 ( 2013 ).

[13]

Sun ,

Deng ,

Nie ,

Tang , Rotate: Knowledge graph embedding by relational rotation in complex space , in: ICLR , 2019 .

[14]

Chao ,

He ,

Wang ,

Chu , Pairre: Knowledge graph embeddings via paired relation vectors , in: ACL , 2021 , pp. 4360 - 4369 .

[15]

Ali ,

Berrendorf ,

C. T.

Hoyt ,

Vermue ,

Galkin ,

Sharifzadeh ,

Fischer ,

Tresp ,

Lehmann , Bringing light into the dark: A large-scale evaluation of knowledge graph embedding models under a unified framework , TPAMI 44 ( 2021 ) 8825 - 8845 .

[16]

Jain ,

J.-C.

Kalo , W.-T. Balke,

Krestel , Do embeddings actually capture knowledge graph semantics? , in: ESWC, 2021 , pp. 143 - 159 .

[17]

J. L. W. V.

Jensen , Sur les fonctions convexes et les inégalités entre les valeurs moyennes , Acta mathematica 30 ( 1906 ) 175 - 193 .

[18]

Trouillon ,

Welbl , S. Riedel, É. Gaussier, G. Bouchard, Complex embeddings for simple link prediction , in: ICML, PMLR, 2016 , pp. 2071 - 2080 .

[19]

Dettmers ,

Minervini ,

Stenetorp ,

Riedel , Convolutional 2d knowledge graph embeddings , in: AAAI , volume 32 , 2018 .