TransHExt: a Weighted Extension for TransH on Weighted Knowledge Graph Embedding

Kong Wei Kun1, Xin Liu2, Teeradaj Racharak1 and Le-Minh Nguyen1

1 School of Information Science, Japan Advanced Institute of Science and Technology
2 Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST)

Abstract
Many methods have been proposed for embedding ordinary knowledge graphs (KG), and they greatly promote various applications. However, current knowledge graph embedding algorithms cannot encode weighted knowledge graphs (WKG), a generalized form of the ordinary KG. This paper presents a promising approach that extends existing models to encode weighted knowledge graphs. Taking TransH as an example, we propose TransHExt. TransHExt shows competitive performance on both the link prediction task and the weight prediction task.

Keywords
weighted knowledge graph embedding, weighted extension, TransHExt, framework

Knowledge graphs (KG) are thriving and promote many downstream applications in the semantic web, such as question answering over RDF triples [1], and in related areas such as academic search [2] and social relationship recognition [3]. Facts encoded in a KG are formalized as triples $(h, r, t)$, in which $h$ denotes a head entity, $t$ denotes a tail entity, and $r$ denotes a relation between $h$ and $t$. Weighted knowledge graphs (WKG) generalize (crisp) knowledge graphs by associating a weight in $(0, 1]$ with each triple. This formalism has been used to represent uncertainty [4] and even out-of-band knowledge [5] in a growing number of scenarios. Weighted triples can better model the interactions between entities, such as the interactions of proteins in STRING [5].

Most knowledge graph embedding (KGE) models are targeted at learning representations of KGs without weights. To leverage existing models, such as TransH [6], to embed facts encoded in WKGs, we propose WeExt, an extension method that adds a weight prediction module onto a KGE model (called the base model). To illustrate our intuition, we select TransH and present its extension, TransHExt, in this work. Our experiments show that TransHExt outperforms TransH on the link prediction task and UKGE [7] on the weight prediction task.

The 21st International Semantic Web Conference, October 23–27, 2022, Hangzhou, China
kong.diison@jaist.ac.jp (K. W. Kun); xin.liu@aist.go.jp (X. Liu); racharak@jaist.ac.jp (T. Racharak); nguyenml@jaist.ac.jp (L. Nguyen)
https://staff.aist.go.jp/xin.liu/ (X. Liu); https://sites.google.com/view/teeradaj (T. Racharak)
ORCID: 0000-0002-2958-9969 (K. W. Kun); 0000-0002-2336-7409 (X. Liu); 0000-0002-8823-2361 (T. Racharak); 0000-0002-2265-1010 (L. Nguyen)
Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org).

1. WeExt: Weighted Extension

The framework of WeExt is shown in Figure 1. We introduce an add-on weight prediction module, consisting of a preprocessing step and a neural weight predictor (denoted by $nwp$), to predict the weight of a given triple. The preprocessing is defined as the scoring function of the base model without the norm. We implement the neural weight predictor as a multi-layer feed-forward neural network; a sketch is given after Figure 1.

[Figure 1 shows an architecture diagram: the head entity, relation, and tail entity are embedded; the base model applies its scoring function to the embedded triple, while the weight prediction module preprocesses the embeddings and feeds the result to the neural weight predictor.]

Figure 1: The framework of WeExt. The green components are the base KGE model. The blue components are the weight prediction module. The preprocessing of the embedded entities and relation varies with the scoring function $f(\cdot)$ of the base KGE model.
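To make the add-on concrete, the following is a minimal sketch of the neural weight predictor, assuming PyTorch (the paper does not name its framework, and the class name is ours). The layer widths and activations follow the implementation details in Section 2: a 50-dimensional input, hidden layers of 300, 100, and 50 neurons, and one output layer, all with ReLU activations.

```python
import torch
import torch.nn as nn

class NeuralWeightPredictor(nn.Module):
    """Multi-layer feed-forward nwp: maps a preprocessed triple
    representation to a scalar predicted weight w_p.

    Sizes follow Section 2 (50 -> 300 -> 100 -> 50 -> 1); the paper
    states the output layer also uses a ReLU activation.
    """

    def __init__(self, dim: int = 50):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 300), nn.ReLU(),
            nn.Linear(300, 100), nn.ReLU(),
            nn.Linear(100, 50), nn.ReLU(),
            nn.Linear(50, 1), nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) preprocessed embeddings -> (batch,) predicted weights
        return self.net(x).squeeze(-1)
```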
The preprocessing serves two purposes: one is to back-propagate the loss from the neural weight predictor to all learnable parameters; the other is to make the loss from the scoring function and the loss from the neural weight predictor adjust the parameters as consistently as possible during back-propagation. For TransH, the scoring function is:

$$ s = -\left\| (\mathbf{h} - \mathbf{w}_r^\top \mathbf{h}\,\mathbf{w}_r) + \mathbf{d}_r - (\mathbf{t} - \mathbf{w}_r^\top \mathbf{t}\,\mathbf{w}_r) \right\|_2^2 $$

where $\mathbf{d}_r$ is the relation-specific translation vector and $\mathbf{w}_r$ is the normal vector of the relation-specific hyperplane. Correspondingly, the predicted weight $w_p$ calculated by the weight prediction module is:

$$ w_p = nwp\left( (\mathbf{h} - \mathbf{w}_r^\top \mathbf{h}\,\mathbf{w}_r) + \mathbf{d}_r - (\mathbf{t} - \mathbf{w}_r^\top \mathbf{t}\,\mathbf{w}_r) \right) $$

We call this model TransHExt, i.e., the extension of TransH based on WeExt.
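As an illustration, here is a minimal sketch of the TransHExt forward pass under the same PyTorch assumption; the function names are ours, and embedding lookups and the unit-norm constraint on $\mathbf{w}_r$ are assumed to be handled elsewhere.

```python
import torch

def transh_project(e: torch.Tensor, w_r: torch.Tensor) -> torch.Tensor:
    # Project entity embedding e onto the hyperplane whose (unit) normal is w_r.
    return e - (e * w_r).sum(-1, keepdim=True) * w_r

def transhext_forward(h, d_r, w_r, t, nwp):
    # Preprocessing: the TransH scoring function without the norm,
    # i.e. the translation residual on the relation-specific hyperplane.
    residual = transh_project(h, w_r) + d_r - transh_project(t, w_r)
    score = -(residual.norm(p=2, dim=-1) ** 2)  # TransH score s
    weight = nwp(residual)                      # predicted weight w_p
    return score, weight
```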
1.1. Training Protocol

For a given positive training set $S = \{\langle (h_i, r_i, t_i), w_i \rangle\}_{i=0}^{u}$, we generate a negative set $S' = \{(h', r, t')\} = \{(h'_i, r_i, t_i) \mid h'_i \in E \setminus \{h_i\}\} \cup \{(h_i, r_i, t'_i) \mid t'_i \in E \setminus \{t_i\}\}_{i=0}^{u}$, where $E$ is the set of entities. We measure the accuracy of the neural weight predictor with respect to the error of the weight prediction for each triple of $S$:

$$ acc(h, r, t, w) = \begin{cases} \dfrac{w - |w - w_p|}{w}, & w_p \in [0, 2w] \\ 0, & \text{otherwise} \end{cases} $$

We adopt the margin ranking loss as the loss function for the proposed models:

$$ \mathcal{L} = \sum_{\langle (h,r,t), w \rangle \in S} \; \sum_{(h',r',t') \in S'} \left[ \gamma + f(h,r,t) + acc(h,r,t,w) - f(h',r',t') \right]_+ $$

where $\gamma$ is the margin and $[\cdot]_+ = \max(0, \cdot)$. Since negative triples represent facts that are not encoded in the real-world KG, it is not necessary to measure the weight prediction loss on this set.
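A minimal sketch of the accuracy term and the loss above, again assuming PyTorch; the function names are ours, and the pairing of each positive triple with its negatives is simplified to element-wise batches.

```python
import torch

def weight_accuracy(w: torch.Tensor, w_p: torch.Tensor) -> torch.Tensor:
    # acc(h, r, t, w) = (w - |w - w_p|) / w if w_p is in [0, 2w], else 0.
    acc = (w - (w - w_p).abs()) / w
    in_range = (w_p >= 0) & (w_p <= 2 * w)
    return torch.where(in_range, acc, torch.zeros_like(acc))

def weext_loss(f_pos, f_neg, w, w_p, gamma: float = 1.0) -> torch.Tensor:
    # Margin ranking loss with the weight-prediction accuracy term added
    # on the positive side, summed over positive/negative pairs.
    return torch.clamp(gamma + f_pos + weight_accuracy(w, w_p) - f_neg, min=0).sum()
```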
1.2. Evaluation Protocol

We evaluate the developed TransHExt on link prediction and weight prediction. For link prediction, given a test triple $(h_i, r_i, t_i)$, $h_i$ is removed and the head entity is replaced by each entity of the dictionary in turn to form all possible triples. Triple scores are calculated by the scoring function $s = f(h, r_i, t_i)$. We sort $s$ in ascending order and obtain the rank $\mathrm{rk}_i$ of the $i$-th correct triple among all its possible triples. We adopt mean rank (MR $= \frac{1}{|u|} \sum_{i=1}^{|u|} \mathrm{rk}_i$), mean reciprocal rank (MRR $= \frac{1}{|u|} \sum_{i=1}^{|u|} \frac{1}{\mathrm{rk}_i}$), and Hits@N ($= \frac{1}{|u|} \sum_{i=1}^{|u|} \mathbb{I}[\mathrm{rk}_i \le N]$) to measure the performance of the models on link prediction, where $\mathbb{I}[\cdot]$ is the indicator function, which outputs 1 if the input is true and 0 otherwise. The whole procedure is repeated while removing $t_i$ instead of $h_i$.

For weight prediction, we predict the weight of a given triple and report the mean squared error (MSE) and mean absolute error (MAE) between the predicted weight and the real weight.

2. Experiment and Result

We conducted experiments on the CN15K, NL27K, and PPI5K [8] datasets. CN15K is a subgraph of ConceptNet [4], containing 15,000 entities and 241,158 weighted triples in English. NL27K is extracted from NELL [9], a weighted KG obtained from webpage reading; it contains 27,221 entities, 404 relations, and 175,412 weighted triples. PPI5K is a subset of the protein-protein interaction knowledge base STRING [5] that contains 271,666 weighted triples over 4,999 proteins and 7 interaction types. STRING labels the interactions between proteins with their probabilities of occurrence.

We implemented the neural weight predictor as a 4-layer neural network with a 50-dimensional input, three hidden layers of 300, 100, and 50 neurons, and one output layer, all with ReLU activations. We set the learning rate $\lambda$ of the stochastic gradient descent optimizer to 0.0001. We trained the models for 3000 epochs, evaluated them every 100 epochs, and chose the best result according to MRR. The margin $\gamma$ was set to 1, and the dimension of the embeddings was set to 50.

Table 1
Results on Link Prediction

Dataset  Model      MR    MRR       Hits@1    Hits@3    Hits@10
CN15K    TransH     1711  0.000585  0.042132  0.085516  0.140822
         TransHExt  1452  0.000689  0.035010  0.086012  0.155301
NL27K    TransH     322   0.003104  0.143687  0.270522  0.390837
         TransHExt  230   0.004351  0.085257  0.261615  0.430276
PPI5K    TransH     49    0.020226  0.002689  0.123945  0.310025
         TransHExt  26    0.038565  0.000093  0.158699  0.400097

Table 2
Results on Weight Prediction

           CN15K          NL27K          PPI5K
Model      MSE    MAE     MSE    MAE     MSE    MAE
URGE [10]  10.32  22.72   7.48   11.35   1.44   6
UKGE       8.61   19.9    2.36   6.9     0.95   3.79
TransHExt  4.34   12.68   3.05   9.77    0.33   2.16

The results of link prediction and weight prediction are shown in Table 1 and Table 2, respectively. TransHExt outperforms TransH on link prediction in general, but obtains worse Hits@1 scores on CN15K, NL27K, and PPI5K. We assume this may be caused by the preprocessing, and we will explore more preprocessing methods for WeExt. For weight prediction, TransHExt outperforms both URGE and UKGE on CN15K and PPI5K, but performs worse than UKGE on NL27K. We hypothesize that this is caused by the different weight distributions of the datasets and will explore it further in future work.

Acknowledgments

This work was supported by JST SPRING, Grant Number JPMJSP2102. It was also partially supported by JSPS Grant-in-Aid for Early-Career Scientists (Grant Number 22K18004), JSPS Grant-in-Aid for Scientific Research (Grant Number 21K12042), and the New Energy and Industrial Technology Development Organization (Grant Number JPNP20006).

References

[1] X. Hu, Y. Shu, X. Huang, Y. Qu, EDG-based question decomposition for complex question answering over knowledge bases, in: ISWC, 2021.
[2] C. Xiong, R. Power, J. Callan, Explicit semantic ranking for academic search via knowledge graph embedding, in: WWW, 2017.
[3] Z. Wang, T. Chen, J. S. J. Ren, W. Yu, H. Cheng, L. Lin, Deep reasoning with knowledge graph for social relationship understanding, in: IJCAI, 2018.
[4] R. Speer, J. Chin, C. Havasi, ConceptNet 5.5: an open multilingual graph of general knowledge, in: AAAI, 2017.
[5] D. Szklarczyk, A. Franceschini, S. Wyder, K. Forslund, D. Heller, J. Huerta-Cepas, M. Simonovic, A. C. J. Roth, A. Santos, K. Tsafou, M. Kuhn, P. Bork, L. J. Jensen, C. von Mering, STRING v10: protein–protein interaction networks, integrated over the tree of life, Nucleic Acids Research 43 (2015) D447–D452.
[6] Z. Wang, J. Zhang, J. Feng, Z. Chen, Knowledge graph embedding by translating on hyperplanes, in: AAAI, 2014.
[7] X. Chen, M. Chen, W. Shi, Y. Sun, C. Zaniolo, Embedding uncertain knowledge graphs, in: AAAI, 2019.
[8] T. Dettmers, P. Minervini, P. Stenetorp, S. Riedel, Convolutional 2D knowledge graph embeddings, in: AAAI, 2018.
[9] T. M. Mitchell, W. W. Cohen, E. R. Hruschka, P. P. Talukdar, B. Yang, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. A. Platanios, A. Ritter, M. Samadi, B. Settles, R. C. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling, Never-ending learning, Communications of the ACM 61 (2018) 103–115.
[10] J. Hu, R. Cheng, Z. Huang, Y. Fang, S. Luo, On embedding uncertain graphs, in: CIKM, 2017.