<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>TransHExt: A Weighted Extension for TransH on Weighted Knowledge Graph Embedding</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kong Wei Kun</string-name>
          <email>kong.diison@jaist.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xin Liu</string-name>
          <email>xin.liu@aist.go.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Teeradaj Racharak</string-name>
          <email>racharak@jaist.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Le-Minh Nguyen</string-name>
          <email>nguyenml@jaist.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST)</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Information Science, Japan Advanced Institute of Science and Technology</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <fpage>23</fpage>
      <lpage>27</lpage>
      <abstract>
        <p>Many methods have been proposed for embedding ordinary knowledge graphs (KGs), which has greatly promoted various applications. However, current knowledge graph embedding algorithms cannot encode weighted knowledge graphs (WKGs), a generalized form of ordinary KGs. This paper presents a promising approach that enables existing models to be extended to encode weighted knowledge graphs. Taking TransH as an example, we propose TransHExt. TransHExt shows competitive performance on both the link prediction and weight prediction tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>weighted knowledge graph embedding</kwd>
        <kwd>weighted extension</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. WeExt: Weighted Extension Framework</title>
      <p>The framework of WeExt is shown in Figure 1. We introduce an add-on weight prediction module consisting of preprocessing and a neural weight predictor (denoted by $\varphi$) to predict the weight of a given triple. The preprocessing is defined as the scoring function of the base model without the norm. We implement the neural weight predictor as a multi-layer feed-forward neural network.</p>
      <p>Figure 1: The framework of WeExt. The embedded head entity, relation, and tail entity are fed both to the scoring function of the base model and to the weight prediction module; the add-on components constitute the weight prediction module.</p>
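      <p>The weight predictor itself is a small feed-forward network. Below is a minimal sketch, assuming PyTorch and the layer sizes reported in Section 2 (50 inputs, hidden layers of 300, 100, and 50 neurons, one output, ReLU activations); the class and variable names are illustrative, not the authors' code.</p>
      <preformat>
import torch
import torch.nn as nn

class NeuralWeightPredictor(nn.Module):
    """Maps a preprocessed triple vector to a predicted weight w_hat."""

    def __init__(self, dim=50):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 300), nn.ReLU(),
            nn.Linear(300, 100), nn.ReLU(),
            nn.Linear(100, 50), nn.ReLU(),
            nn.Linear(50, 1),
        )

    def forward(self, x):
        # x is the preprocessed vector, i.e. the base model's scoring
        # function without the norm (for TransH, see below).
        return self.net(x).squeeze(-1)
      </preformat>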
      <p>The preprocessing of the embedded entities and relations varies with the scoring function $f(\cdot)$ of the base KGE model. The preprocessing serves two purposes: one is to back-propagate the loss from the neural weight predictor to all learnable parameters; the other is to keep the loss from the scoring function and the loss from the neural weight predictor as consistent as possible in the adjustment of the parameters during back-propagation. For TransH, the scoring function is
$f(h, r, t) = -\left\| (\mathbf{h} - \mathbf{w}_r^{\top}\mathbf{h}\,\mathbf{w}_r) + \mathbf{d}_r - (\mathbf{t} - \mathbf{w}_r^{\top}\mathbf{t}\,\mathbf{w}_r) \right\|_2^2,$
where $\mathbf{d}_r$ is the relation-specific translation vector and $\mathbf{w}_r$ is the normal vector of the relation-specific hyperplane.</p>
      <p>Correspondingly, the weight $\hat{w}$ predicted by the weight prediction component is
$\hat{w} = \varphi\left( (\mathbf{h} - \mathbf{w}_r^{\top}\mathbf{h}\,\mathbf{w}_r) + \mathbf{d}_r - (\mathbf{t} - \mathbf{w}_r^{\top}\mathbf{t}\,\mathbf{w}_r) \right).$
We call this model TransHExt, i.e., the extension of TransH based on WeExt.</p>
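      <p>As a hedged illustration of this preprocessing and prediction step, the following sketch assumes PyTorch tensors and a unit normal vector for each relation hyperplane; the function names are ours, not the authors'.</p>
      <preformat>
import torch

def transh_translation(h, t, d_r, w_r):
    """Un-normed TransH translation: h_perp + d_r - t_perp."""
    # Project h and t onto the relation-specific hyperplane with normal w_r.
    h_perp = h - (w_r * h).sum(-1, keepdim=True) * w_r
    t_perp = t - (w_r * t).sum(-1, keepdim=True) * w_r
    return h_perp + d_r - t_perp

def transh_score(h, t, d_r, w_r):
    """TransH scoring function: negative squared L2 norm of the translation."""
    return -transh_translation(h, t, d_r, w_r).norm(p=2, dim=-1) ** 2

# TransHExt feeds the same vector, without taking the norm, to the
# neural weight predictor:
# w_hat = predictor(transh_translation(h, t, d_r, w_r))
      </preformat>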
      <sec id="sec-2-1">
        <title>1.1. Training Protocol</title>
        <p>For a given positive training set $\Delta = \{\langle (h_i, r_i, t_i), w_i \rangle\}_{i=0}^{n}$, we generate a negative set $\Delta' = \{(h', r_i, t')\} = \{(h', r_i, t_i) \mid h' \in \mathcal{E} \setminus \{h_i\}\} \cup \{(h_i, r_i, t') \mid t' \in \mathcal{E} \setminus \{t_i\}\}_{i=0}^{n}$. We measure the accuracy of the neural weight predictor with respect to the error of the weight prediction on each triple of $\Delta$: $g(h, r, t, w) = -\frac{w - |w - \hat{w}|}{w}$ if $\hat{w} \in [0, 2w]$, and $g(h, r, t, w) = 0$ otherwise. We adopt the margin ranking loss as the loss function for the proposed models:
$\mathcal{L} = \sum_{\langle (h,r,t), w \rangle \in \Delta} \; \sum_{(h',r',t') \in \Delta'} \left[ \gamma + f(h, r, t) + g(h, r, t, w) - f(h', r', t') \right]_+,$
where $\gamma$ is the margin. Since negative triples represent unencoded facts in the real world, it is not necessary to measure the weight prediction loss on this set.</p>
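        <p>A minimal sketch of this training objective, assuming PyTorch; the negative sampling helper and tensor names are illustrative, and the weight-error term $g$ is passed in precomputed.</p>
        <preformat>
import random
import torch

def corrupt(h, r, t, entities, rng=random):
    """Build one negative triple by replacing the head or the tail."""
    if rng.random() > 0.5:
        return rng.choice([e for e in entities if e != h]), r, t
    return h, r, rng.choice([e for e in entities if e != t])

def weext_loss(pos_score, neg_score, weight_error, gamma=1.0):
    """Margin ranking loss [gamma + f(h,r,t) + g(h,r,t,w) - f(h',r',t')]_+."""
    return torch.clamp(gamma + pos_score + weight_error - neg_score, min=0).sum()
        </preformat>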
      </sec>
      <sec id="sec-2-2">
        <title>1.2. Evaluation Protocol</title>
        <p>We evaluate the developed TransHExt on link prediction and weight prediction. For link prediction, given a test triple $(h_i, r_i, t_i)$, the head entity $h_i$ is removed and replaced by each entity of the dictionary in turn to form all possible triples. Triple scores are calculated by the scoring function $s = f(h, r_i, t_i)$. We sort $s$ in ascending order and obtain the rank $\mathrm{rk}_i$ of the $i$-th correct triple among all its possible triples. We adopt the mean rank ($\mathrm{MR} = \frac{1}{|S|} \sum_{i=1}^{|S|} \mathrm{rk}_i$), the mean reciprocal rank ($\mathrm{MRR} = \frac{1}{|S|} \sum_{i=1}^{|S|} \frac{1}{\mathrm{rk}_i}$), and $\mathrm{Hits@}N = \frac{1}{|S|} \sum_{i=1}^{|S|} [\mathrm{rk}_i \leq N]$ over the test set $S$ to measure the performance of the models on link prediction. The $[\cdot]$ is the indicator function, which outputs 1 if the input is true and 0 otherwise. This whole procedure is repeated while removing $t_i$ instead of $h_i$.</p>
        <p>For weight prediction, we predict the weight of a given triple and report the mean squared error (MSE) and the mean absolute error (MAE) between the predicted weight and the real weight.</p>
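        <p>For concreteness, these link prediction metrics can be computed from the list of ranks of the correct triples; a plain-Python sketch with illustrative helper names follows.</p>
        <preformat>
def mean_rank(ranks):
    return sum(ranks) / len(ranks)

def mean_reciprocal_rank(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_n(ranks, n):
    # Indicator function: count correct triples with rank at most n.
    return sum(1 for r in ranks if n >= r) / len(ranks)
        </preformat>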
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2. Experiments and Results</title>
      <p>We conducted experiments on the CN15K, NL27K, and PPI5K datasets [<xref ref-type="bibr" rid="ref7">7</xref>]. CN15K is a subgraph of ConceptNet [<xref ref-type="bibr" rid="ref4">4</xref>], containing 15,000 entities and 241,158 weighted triples in English. NL27K is extracted from NELL [<xref ref-type="bibr" rid="ref9">9</xref>], a weighted KG obtained from webpage reading; it contains 27,221 entities, 404 relations, and 175,412 weighted triples. PPI5K is a subset of the protein-protein interaction knowledge base STRING [<xref ref-type="bibr" rid="ref5">5</xref>], containing 271,666 weighted triples over 4,999 proteins and 7 interactions. STRING labels the interactions between proteins with their probabilities of occurrence.</p>
      <p>We implemented the neural weight predictor as a 4-layer feed-forward network with 50 inputs, three hidden layers of 300, 100, and 50 neurons, and one output layer, with ReLU activations. We set the learning rate of the stochastic gradient descent optimizer to 0.0001. We trained the models for 3,000 epochs, evaluated them every 100 epochs, and chose the best result according to MRR. The margin $\gamma$ was set to 1, and the dimension of the embeddings to 50.</p>
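      <p>The corresponding training loop, sketched under the same PyTorch assumption; train_one_epoch and evaluate_mrr are hypothetical helpers, and model stands for the base TransH embeddings together with the weight predictor.</p>
      <preformat>
import torch

model = NeuralWeightPredictor()  # plus the base TransH parameters in practice
optimizer = torch.optim.SGD(model.parameters(), lr=0.0001)

best_mrr = 0.0
for epoch in range(3000):
    train_one_epoch(model, optimizer)  # hypothetical helper
    if (epoch + 1) % 100 == 0:
        mrr = evaluate_mrr(model)      # hypothetical helper
        best_mrr = max(best_mrr, mrr)  # keep the checkpoint with the best MRR
      </preformat>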
      <p>The results of link prediction and weight prediction are shown in Table 1 and Table 2, respectively. TransHExt outperforms TransH on link prediction in general, but gets a worse Hits@1 score on CN15K, NL27K, and PPI5K. We assume this may be caused by the preprocessing, and we will explore more preprocessing methods for WeExt. For weight prediction, TransHExt outperforms both URGE and UKGE on CN15K and PPI5K, but performs worse than UKGE on NL27K. We hypothesize that this is caused by the different weight distributions of the datasets; we will explore this further in future work.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>This work was supported by JST SPRING, Grant Number JPMJSP2102. This work was also partially supported by JSPS Grant-in-Aid for Early-Career Scientists (Grant Number 22K18004), JSPS Grant-in-Aid for Scientific Research (Grant Number 21K12042), and the New Energy and Industrial Technology Development Organization (Grant Number JPNP20006).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>X.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <article-title>EDG-based question decomposition for complex question answering over knowledge bases</article-title>
          ,
          <source>in: ISWC</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Power</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Callan</surname>
          </string-name>
          ,
          <article-title>Explicit semantic ranking for academic search via knowledge graph embedding</article-title>
          ,
          <source>WWW</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S. J.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <article-title>Deep reasoning with knowledge graph for social relationship understanding</article-title>
          ,
          <source>in: IJCAI</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Speer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Havasi</surname>
          </string-name>
          ,
          <article-title>ConceptNet 5.5: An open multilingual graph of general knowledge</article-title>
          ,
          <source>in: AAAI</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] D. Szklarczyk, A. Franceschini, S. Wyder, K. Forslund, D. Heller, J. Huerta-Cepas, M. Simonovic, A. C. J. Roth, A. Santos, K. Tsafou, M. Kuhn, P. Bork, L. J. Jensen, C. von Mering, STRING v10: protein-protein interaction networks, integrated over the tree of life, Nucleic Acids Research 43 (2015) D447-D452.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] Z. Wang, J. Zhang, J. Feng, Z. Chen, Knowledge graph embedding by translating on hyperplanes, in: AAAI, 2014.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] X. Chen, M. Chen, W. Shi, Y. Sun, C. Zaniolo, Embedding uncertain knowledge graphs, in: AAAI, 2019.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] T. Dettmers, P. Minervini, P. Stenetorp, S. Riedel, Convolutional 2D knowledge graph embeddings, in: AAAI, 2018.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] T. M. Mitchell, W. W. Cohen, E. R. Hruschka, P. P. Talukdar, B. Yang, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. A. Platanios, A. Ritter, M. Samadi, B. Settles, R. C. Wang, D. Wijaya, A. K. Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling, Never-ending learning, Communications of the ACM 61 (2018) 103-115.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] J. Hu, R. Cheng, Z. Huang, Y. Fang, S. Luo, On embedding uncertain graphs, in: CIKM, 2017.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>