TransHExt: a Weighted Extension for TransH on
Weighted Knowledge Graph Embedding
Kong Wei Kun1 , Xin Liu2 , Teeradaj Racharak1 and Le-Minh Nguyen1
1 School of Information Science, Japan Advanced Institute of Science and Technology
2 Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST)


Abstract
Many methods have been proposed for embedding ordinary knowledge graphs (KG), greatly promoting
various applications. However, current knowledge graph embedding algorithms cannot encode weighted
knowledge graphs (WKG), a generalized form of the ordinary KG. This paper presents a promising
approach that extends existing models to encode weighted knowledge graphs. Taking TransH as an
example, we propose TransHExt, which shows competitive performance on both the link prediction and
weight prediction tasks.

Keywords
Weighted knowledge graph embedding, weighted extension


   Knowledge graphs (KG) are thriving and promoting many downstream applications in the
semantic web, such as question answering over RDF triples [1], and in related areas such as
academic search [2] and social relationship recognition [3]. Facts encoded in a KG are formalized
as triples (β„Ž, π‘Ÿ, 𝑑), in which β„Ž denotes a head entity, 𝑑 denotes a tail entity, and π‘Ÿ denotes a relation
between β„Ž and 𝑑.
   Weighted knowledge graphs (WKG) generalize (crisp) knowledge graphs by associating a
weight in (0, 1] with each triple. This formalism has been used to represent uncertainty [4]
and even out-of-band knowledge [5] in a growing number of scenarios. Weighted triples can
better model the interactions between entities, such as the interactions of proteins in
STRING [5].
   Most knowledge graph embedding (KGE) models are targeted at learning representations of
KGs without weights. To leverage existing models, such as TransH [6], for embedding facts
encoded in WKGs, we propose WeExt, an extension method that adds a weight prediction
module onto a KGE model (called the base model). To illustrate our intuition, we select TransH
and show its extension, called TransHExt, in this work. Our experiments show that TransHExt
outperforms TransH on the link prediction task and UKGE [7] on the weight prediction task.




The 21st International Semantic Web Conference, October 23–27, 2022, Hangzhou, China
kong.diison@jaist.ac.jp (K. W. Kun); xin.liu@aist.go.jp (X. Liu); racharak@jaist.ac.jp (T. Racharak);
nguyenml@jaist.ac.jp (L. Nguyen)
https://staff.aist.go.jp/xin.liu/ (X. Liu); https://sites.google.com/view/teeradaj (T. Racharak)
ORCID: 0000-0002-2958-9969 (K. W. Kun); 0000-0002-2336-7409 (X. Liu); 0000-0002-8823-2361 (T. Racharak);
0000-0002-2265-1010 (L. Nguyen)
                                       Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, http://ceur-ws.org
1. WeExt: Weighted Extension
The framework of WeExt is shown in Figure 1. We introduce an add-on weight prediction
module consisting of preprocessing and a neural weight predictor (denoted by 𝑛𝑀𝑝) to predict
the weight for a given triple. The preprocessing is defined as the scoring function of a base model
without the norm. We implement the neural weight predictor using a multi-layer feed-forward
neural network.

[Figure: the base model (relation and entity embeddings, scoring function) feeding the embedded head entity, tail entity, and relation into the weight prediction module.]
Figure 1: The framework of WeExt. The green components are the base KGE model; the blue
components form the weight prediction module.


   The preprocessing of the embedded entities and relation varies with the scoring function 𝑓 (β‹…)
of the base KGE model. The preprocessing serves two purposes: one is to back-propagate the
loss from the neural weight predictor to all learnable parameters; the other is to keep the loss
from the scoring function and the loss from the neural weight predictor as consistent as possible
in how they adjust the parameters during back-propagation. For TransH, the scoring function is:

$$s = -\left\| (\mathbf{h} - \mathbf{w}_r^{\top}\mathbf{h}\,\mathbf{w}_r) + \mathbf{d}_r - (\mathbf{t} - \mathbf{w}_r^{\top}\mathbf{t}\,\mathbf{w}_r) \right\|_2^2$$

where $\mathbf{d}_r$ is the relation-specific translation vector and $\mathbf{w}_r$ is the normal vector of the
relation-specific hyperplane. Correspondingly, the predicted weight $w_p$ computed by the weight
prediction component is:

$$w_p = nwp\left( (\mathbf{h} - \mathbf{w}_r^{\top}\mathbf{h}\,\mathbf{w}_r) + \mathbf{d}_r - (\mathbf{t} - \mathbf{w}_r^{\top}\mathbf{t}\,\mathbf{w}_r) \right)$$
We call this model TransHExt, i.e., TransH extended with WeExt.
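To make the preprocessing concrete, the following is a minimal PyTorch sketch of the TransHExt forward pass. This is our illustration, not the authors' released code; `nwp` denotes the neural weight predictor, whose architecture is described in Section 2.

```python
import torch
import torch.nn as nn

def transh_translation(h, t, d_r, w_r):
    """Pre-norm TransH translation, used as the WeExt preprocessing.

    h, t, d_r, w_r: (batch, dim) tensors; w_r is assumed to be unit-norm.
    """
    h_proj = h - (h * w_r).sum(dim=-1, keepdim=True) * w_r  # h - w_r^T h w_r
    t_proj = t - (t * w_r).sum(dim=-1, keepdim=True) * w_r  # t - w_r^T t w_r
    return h_proj + d_r - t_proj

def transh_score(h, t, d_r, w_r):
    # TransH scoring function: negative squared L2 norm of the translation error.
    return -transh_translation(h, t, d_r, w_r).norm(p=2, dim=-1) ** 2

def predict_weight(nwp: nn.Module, h, t, d_r, w_r):
    # WeExt feeds the same pre-norm vector into the neural weight predictor.
    return nwp(transh_translation(h, t, d_r, w_r))
```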

1.1. Training Protocol
For a given positive training set

$$S = \{ \langle (h_i, r_i, t_i), w_i \rangle \}_{i=0}^{e},$$

we generate a negative set

$$S' = \{ (h', r, t') \} = \{ (h'_i, r_i, t_i) \mid h'_i \in E \setminus \{h_i\} \} \cup \{ (h_i, r_i, t'_i) \mid t'_i \in E \setminus \{t_i\} \}_{i=0}^{e}.$$

We measure the accuracy of the neural weight predictor with respect to the error of the weight
prediction on each triple of 𝑆:

$$acc(h, r, t, w) = \begin{cases} \dfrac{w - |w - w_p|}{w}, & w_p \in [0, 2w] \\ 0, & \text{otherwise} \end{cases}$$

We adopt the margin ranking loss as the loss function for the proposed models:

$$\mathcal{L} = \sum_{\langle (h,r,t), w \rangle \in S} \; \sum_{(h',r',t') \in S'} \gamma + f(h, r, t) + acc(h, r, t, w) - f(h', r', t')$$

where 𝛾 is the margin. Since negative triples represent facts that are not encoded in the real
world, it is not necessary to measure the weight prediction loss on this set.
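As a sketch of this computation (our reading, assuming paired positive and negative scores; the clamp at zero is the usual hinge of a margin ranking loss, left implicit in the formula above):

```python
import torch

def acc(w, w_p):
    # acc(h, r, t, w): closeness of the predicted weight w_p to the gold
    # weight w, relative to w; zero whenever w_p falls outside [0, 2w].
    inside = (w_p >= 0) & (w_p <= 2 * w)
    return torch.where(inside, (w - (w - w_p).abs()) / w, torch.zeros_like(w))

def weext_margin_loss(f_pos, f_neg, w, w_p, gamma=1.0):
    # Margin ranking loss over paired positive/negative triple scores, with
    # the weight-prediction accuracy term added on the positive side.
    return torch.clamp(gamma + f_pos + acc(w, w_p) - f_neg, min=0.0).sum()
```

For example, with 𝑀 = 0.8 and 𝑀𝑝 = 0.6, the predictor's accuracy is (0.8 βˆ’ |0.8 βˆ’ 0.6|)/0.8 = 0.75.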

1.2. Evaluation Protocol
We evaluate the developed TransHExt on link prediction and weight prediction. For link
prediction, given a test triple (β„Žπ‘– , π‘Ÿπ‘– , 𝑑𝑖 ), the head entity β„Žπ‘– is removed and replaced by each
entity of the dictionary in turn to form all possible triples. Triple scores are calculated by the
scoring function 𝑠 = 𝑓 (β„Ž, π‘Ÿπ‘– , 𝑑𝑖 ). We sort 𝑠 in ascending order and obtain rk𝑖 , the rank of the
𝑖-th correct triple among all its possible triples. We adopt mean rank
(MR $= \frac{1}{|e|} \sum_{i=1}^{|e|} \mathrm{rk}_i$), mean reciprocal rank
(MRR $= \frac{1}{|e|} \sum_{i=1}^{|e|} \frac{1}{\mathrm{rk}_i}$), and
Hits@N $= \frac{1}{|e|} \sum_{i=1}^{|e|} \mathbb{I}[\mathrm{rk}_i \le N]$ to measure the performance
of the models on link prediction, where 𝕀[β‹…] is the indicator function, which outputs 1 if the
input is true and 0 otherwise. The whole procedure is repeated while removing 𝑑𝑖 instead of β„Žπ‘– .
   For weight prediction, we predict the weight of a given triple and report the mean squared
error (MSE) and mean absolute error (MAE) between the predicted weight and the real weight.
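A minimal sketch of both evaluation protocols, assuming `ranks` collects rk𝑖 over all head- and tail-replacement queries and `w`, `w_p` are the gold and predicted weights on the test set:

```python
import numpy as np

def link_prediction_metrics(ranks, ns=(1, 3, 10)):
    # MR, MRR, and Hits@N computed from the ranks of the correct triples.
    ranks = np.asarray(ranks, dtype=float)
    metrics = {"MR": ranks.mean(), "MRR": (1.0 / ranks).mean()}
    for n in ns:
        metrics[f"Hits@{n}"] = (ranks <= n).mean()  # fraction ranked in top n
    return metrics

def weight_prediction_metrics(w, w_p):
    # MSE and MAE between predicted and gold weights.
    err = np.asarray(w_p, dtype=float) - np.asarray(w, dtype=float)
    return {"MSE": (err ** 2).mean(), "MAE": np.abs(err).mean()}
```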


2. Experiments and Results
We conducted experiments on the CN15K, NL27K, and PPI5K [8] datasets. CN15K is a subgraph
of ConceptNet [4], containing 15,000 entities and 241,158 weighted triples in English. NL27K is
extracted from NELL [9], a weighted KG obtained from webpage reading; it contains 27,221
entities, 404 relations, and 175,412 weighted triples. PPI5K is a subset of the protein-protein
interaction knowledge base STRING [5] that contains 271,666 weighted triples over 4,999
proteins and 7 interaction types. STRING labels the interactions between proteins with their
probabilities of occurrence.
   We implemented the neural weight predictor as a four-layer feed-forward network: a
50-dimensional input, three hidden layers of 300, 100, and 50 neurons, and one output layer,
with ReLU activations. We set the learning rate πœ† of the stochastic gradient descent optimizer
to 0.0001. We trained the models for 3000 epochs, evaluated them every 100 epochs, and chose
the best result according to MRR. The margin 𝛾 was set to 1, and the dimension of the
embeddings to 50.
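For concreteness, the weight predictor described above corresponds to a network of the following shape (a sketch; the text leaves the output activation unspecified):

```python
import torch.nn as nn

# 50-dimensional input, three hidden layers of 300, 100, and 50 neurons with
# ReLU activations, and a single output neuron for the predicted weight.
nwp = nn.Sequential(
    nn.Linear(50, 300), nn.ReLU(),
    nn.Linear(300, 100), nn.ReLU(),
    nn.Linear(100, 50), nn.ReLU(),
    nn.Linear(50, 1),
)
```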
Table 1
Results on Link Prediction

  Dataset   Model        MR     MRR       Hits@1    Hits@3    Hits@10
  CN15K     TransH      1711    0.000585  0.042132  0.085516  0.140822
            TransHExt   1452    0.000689  0.035010  0.086012  0.155301
  NL27K     TransH       322    0.003104  0.143687  0.270522  0.390837
            TransHExt    230    0.004351  0.085257  0.261615  0.430276
  PPI5K     TransH        49    0.020226  0.002689  0.123945  0.310025
            TransHExt     26    0.038565  0.000093  0.158699  0.400097

Table 2
Results on Weight Prediction

              CN15K           NL27K           PPI5K
  Model       MSE     MAE     MSE     MAE     MSE     MAE
  URGE [10]   10.32   22.72   7.48    11.35   1.44    6
  UKGE        8.61    19.9    2.36    6.9     0.95    3.79
  TransHExt   4.34    12.68   3.05    9.77    0.33    2.16


   The results of link prediction and weight prediction are shown in Table 1 and Table 2,
respectively. TransHExt outperforms TransH on link prediction in general, but obtains a worse
Hits@1 score on CN15K, NL27K, and PPI5K. We conjecture that this may be caused by the
preprocessing, and we will explore more preprocessing methods for WeExt. For weight
prediction, TransHExt outperforms both URGE and UKGE on CN15K and PPI5K, but performs
worse than UKGE on NL27K. We hypothesize that this is caused by the different weight
distributions of the datasets and will explore this further in future work.


Acknowledgments
This work was supported by JST SPRING, Grant Number JPMJSP2102. This work was also
partially supported by JSPS Grant-in-Aid for Early-Career Scientists (Grant Number 22K18004),
JSPS Grant-in-Aid for Scientific Research (Grant Number 21K12042), and the New Energy and
Industrial Technology Development Organization (Grant Number JPNP20006).


References
 [1] X. Hu, Y. Shu, X. Huang, Y. Qu, EDG-based question decomposition for complex question
     answering over knowledge bases, in: ISWC, 2021.
 [2] C. Xiong, R. Power, J. Callan, Explicit semantic ranking for academic search via knowledge
     graph embedding, WWW (2017).
 [3] Z. Wang, T. Chen, J. S. J. Ren, W. Yu, H. Cheng, L. Lin, Deep reasoning with knowledge
     graph for social relationship understanding, in: IJCAI, 2018.
 [4] R. Speer, J. Chin, C. Havasi, ConceptNet 5.5: An open multilingual graph of general
     knowledge, in: AAAI, 2017.
 [5] D. Szklarczyk, A. Franceschini, S. Wyder, K. Forslund, D. Heller, J. Huerta-Cepas, M. Si-
     monovic, A. C. J. Roth, A. Santos, K. Tsafou, M. Kuhn, P. Bork, L. J. Jensen, C. von Mering,
     STRING v10: protein–protein interaction networks, integrated over the tree of life, Nucleic
     Acids Research 43 (2015) D447–D452.
 [6] Z. Wang, J. Zhang, J. Feng, Z. Chen, Knowledge graph embedding by translating on
     hyperplanes, in: AAAI, 2014.
 [7] X. Chen, M. Chen, W. Shi, Y. Sun, C. Zaniolo, Embedding uncertain knowledge graphs,
     AAAI (2019).
 [8] T. Dettmers, P. Minervini, P. Stenetorp, S. Riedel, Convolutional 2d knowledge graph
     embeddings, in: AAAI, 2018.
 [9] T. M. Mitchell, W. W. Cohen, E. R. Hruschka, P. P. Talukdar, B. Yang, J. Betteridge, A. Carl-
     son, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed,
     N. Nakashole, E. A. Platanios, A. Ritter, M. Samadi, B. Settles, R. C. Wang, D. Wijaya, A. K.
     Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling, Never-ending learning, Communica-
     tions of the ACM 61 (2018) 103–115.
[10] J. Hu, R. Cheng, Z. Huang, Y. Fang, S. Luo, On embedding uncertain graphs, CIKM (2017).