From Node Embeddings to Triple Embeddings
Valeria Fionda1 , Giuseppe Pirrò2
1
    DeMaCS, University of Calabria, via Pietro Bucci 30B, 87036, Rende, Italy
2
    DI, Sapienza Università di Roma, Piazzale Aldo Moro, 5, 00185 Roma, Italy


Abstract
An extended version of this paper has been published at the 34th AAAI Conference on Artificial Intelligence (AAAI) under the title “Learning Triple Embeddings from Knowledge Graphs”. Graph embedding techniques make it possible to learn high-quality, low-dimensional graph representations useful in various tasks, from node classification to clustering. Knowledge graphs are particular types of graphs characterized by several distinct types of nodes and edges. Existing knowledge graph embedding approaches have focused only on learning embeddings of nodes and predicates. However, the basic pieces of information stored in knowledge graphs are triples; thus, an interesting problem is that of learning embeddings of triples as a whole. In this paper we report on Triple2Vec, a new technique to directly compute triple embeddings in knowledge graphs. Triple2Vec leverages the idea of line graph and extends it to the context of knowledge graphs. Embeddings are then generated by adopting the Skip-gram model, where sentences are replaced with walks on a weighted version of the line graph.

Keywords
Knowledge Graphs, Triple Embeddings, Line Graph




1. Introduction
In recent years, learning graph representations using low-dimensional vectors has received
attention as a viable support to various (machine) learning tasks, from node classification to
clustering [1]. Several approaches have focused on computing node embeddings for homogeneous
graphs, which include only one type of edge (e.g., [2], [3]), or for knowledge graphs (aka
heterogeneous information networks), which are characterized by several distinct types of nodes
and edges [4] (e.g., [5], [6], [7]). There have also been attempts to obtain edge embeddings in
homogeneous graphs by applying some operator (e.g., average, Hadamard product) to the
embeddings of the edge endpoint nodes [3]. In the knowledge graph landscape, node and predicate
embeddings are learned separately, and these pieces of information can then be aggregated
(e.g., [8], [9], [10], [11]).
   In this paper we report on a technique to directly learn triple embeddings from knowledge
graphs, which builds upon the notion of the line graph of a graph; in particular, we leverage the
fact that triples can be turned into nodes. As an example, the directed line graph obtained from
the knowledge graph in Fig. 1 (a) is shown in Fig. 1 (b), where the two copies of the node (Lauren
Oliver, Americans) correspond to the triples having nationality and citizenship as a predicate,
respectively. It would be tempting to directly apply embedding techniques to the nodes of

AIxIA 2021 Discussion Papers
fionda@mat.unical.it (V. Fionda); pirro@di.uniroma1.it (G. Pirrò)
0000-0002-7018-1324 (V. Fionda); 0000-0002-7499-5798 (G. Pirrò)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)
Figure 1: A knowledge graph (a), (b) its directed line graph, and (c) its triple line graph.
the directed line graph to obtain triple embeddings. However, we detected two main problems.
The first is that it is impossible to discern between the two triples encoded by the nodes (Lauren
Oliver, Americans). The second is that the directed line graph is disconnected and, as such, it
becomes problematic to learn triple embeddings via random walks. Therefore, we introduce the
notion of triple line graph 𝒢𝐿 of a knowledge graph 𝐺. In 𝒢𝐿 , nodes are the triples of 𝐺, and an
edge is introduced whenever two triples of 𝐺 share an endpoint.
   This construction guarantees that 𝒢𝐿 is connected if 𝐺 is connected. The triple line graph for
the graph in Fig. 1 (a) is shown in Fig. 1 (c). Unfortunately, directly working with 𝒢𝐿 may lead to
low-quality embeddings, because edges in 𝒢𝐿 are added based on adjacency only. Thus, we introduce
a mechanism to weight the edges of 𝒢𝐿 based on predicate relatedness [12]. The weight of an
edge between two nodes of 𝒢𝐿 is equal to the semantic relatedness between the predicates in the
triples of 𝐺 represented by the two nodes. As an example, in Fig. 1 (c) the weight of the edge
between the nodes of 𝒢𝐿 (M. Damon, birthPlace, Cambridge) and (Cambridge, country, United
States) is equal to the relatedness between birthPlace and country.
   To compute triple embeddings, we generate truncated random walks, in the form of
sequences of nodes, on the weighted triple line graph. Note that weights based on semantic
relatedness bias the random walker so that nodes of the weighted triple line graph linked by
related predicates obtain similar contexts. The underlying idea is that similar contexts lead
to similar embeddings. Finally, the walks are fed into the Skip-gram model [13], which gives
the embeddings of the nodes of the weighted triple line graph, that is, the embeddings of the
original triples.
   The remainder of the paper is organized as follows. We introduce some preliminary definitions
in Section 2. In Section 3, we introduce the notion of triple line graph of a knowledge graph
along with an algorithm to compute it. Section 4 describes the Triple2Vec approach to learn
triple embeddings from knowledge graphs. In Section 5, we discuss an experimental evaluation.
We conclude and sketch future work in Section 6.


2. Preliminaries
A Knowledge Graph (G) is a kind of heterogeneous information network. It is a node and edge
labeled directed multigraph 𝐺=(𝑉𝐺 , 𝐸𝐺 , 𝑇𝐺 ) where 𝑉𝐺 is a set of uniquely identified vertices
representing entities (e.g., D. Lynch), 𝐸𝐺 a set of predicates (e.g., director) and 𝑇𝐺 a set of triples
of the form (𝑠, 𝑝, 𝑜) representing directed labeled edges, where 𝑠, 𝑜 ∈ 𝑉𝐺 and 𝑝 ∈ 𝐸𝐺 . The
line graph 𝒢𝐿 =(𝑉𝐿 , 𝐸𝐿 ) of an undirected graph 𝐺=(𝑉𝐺 , 𝐸𝐺 ) is such that: (i) each node of 𝒢𝐿
represents an edge of 𝐺; (ii) two vertices of 𝒢𝐿 are adjacent iff their corresponding edges in 𝐺
have a node in common. Starting from 𝐺, it is possible to compute the number of nodes and
edges of 𝒢𝐿 as follows: (i) $|V_L| = |E_G|$; (ii) $|E_L| = \frac{1}{2}\sum_{v \in V_G} d_v^2 - |E_G|$, where $d_v$ denotes the
degree of the node $v \in V_G$.
  The concept of line graph has been extended to other types of graphs, including multigraphs
and directed graphs. The extension to multigraphs adds a different node in the line graph for
each edge of the original multigraph. If the graph 𝐺 is directed, the corresponding line graph
𝒢𝐿 will also be directed; its vertices are in one-to-one correspondence to the edges of 𝐺 and its
edges represent two-length directed paths in 𝐺.
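
To make the node and edge counts above concrete, the following minimal sketch (assuming the networkx library, which is not part of the paper's implementation) builds the line graph of a small undirected graph and checks both formulas.

```python
import networkx as nx

# A small undirected graph (a path with three edges).
G = nx.Graph([("Invictus", "Morgan Freeman"),
              ("Morgan Freeman", "Americans"),
              ("Lauren Oliver", "Americans")])
L = nx.line_graph(G)  # nodes of L correspond to edges of G

assert L.number_of_nodes() == G.number_of_edges()             # |V_L| = |E_G|
expected = sum(d * d for _, d in G.degree()) // 2 - G.number_of_edges()
assert L.number_of_edges() == expected                        # |E_L| = 1/2 Σ d_v² − |E_G|
```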


3. Triple Line Graph of a Knowledge Graph
As discussed in the Introduction, applying the notion of line graph to knowledge graphs
would lead to counter-intuitive behaviors. Consider the graph 𝐺 in Fig. 1 (a). Fig. 1 (b) shows
the directed line graph 𝒢𝐿 , obtained by applying the standard definition, where nodes of 𝒢𝐿 are
in one-to-one correspondence with the edges of 𝐺. Moreover, from the same sub-figure it can be
noticed that each directed edge of 𝒢𝐿 corresponds to a path of length 2 in 𝐺.
   At this point, three main issues arise. First, the standard definition associates to each node of
the directed line graph the two endpoints of the corresponding edge; however, the edge labels
in a knowledge graph carry a semantic meaning, which is completely lost if only the endpoints
are considered. As an example, it is not possible to discern between the two copies of the node
(Morgan Freeman, Americans). Second, the edges of the line graph are determined by considering
their direction. This disregards the fact that edges in 𝐺 witness some semantic relation between
their endpoints (i.e., entities) that can be interpreted bidirectionally. As an example, according
to the definition of directed line graph, the two nodes (Lauren Oliver, Americans) in 𝒢𝐿 remain
isolated since the corresponding edges do not belong to any two-length path in 𝐺 (see Fig. 1
(a)-(b)). However, consider the triple (Lauren Oliver, nationality, Americans): while the edge
label from Lauren Oliver to Americans states the relation nationality, the same label in the
opposite direction states the relation is nationality of.
   Hence, in the case of knowledge graphs, two nodes of the line graph must be connected by
an edge if they form a two-length path in the original knowledge graph no matter the edge
direction, as the semantics of edge labels can be interpreted bidirectionally. Finally, triples
encode some semantic meaning via predicates, and the desideratum is to preserve this semantics
when connecting two nodes (i.e., triples of 𝐺) in the line graph. Because of these issues, we
introduce the novel notion of triple line graph suitable for KGs.

Definition 1. (Triple Line Graph). Given a knowledge graph 𝐺 = (𝑉𝐺 , 𝐸𝐺 , 𝑇𝐺 ), the associated
triple line graph 𝒢𝐿 = (𝑉𝐿 , 𝐸𝐿 , 𝑤) is such that: (i) each node of 𝒢𝐿 represents a triple of 𝐺; (ii) two
vertices of 𝒢𝐿 , say (𝑠1 , 𝑝1 , 𝑜1 ) and (𝑠2 , 𝑝2 , 𝑜2 ), are adjacent if, and only if, $\{s_1, o_1\} \cap \{s_2, o_2\} \neq \emptyset$;
(iii) the function 𝑤 associates a weight in the range [0, 1] to each edge of 𝒢𝐿 .

  We devised an algorithm (outlined in Algorithm 1) to compute the triple line graph 𝒢𝐿 of a
knowledge graph 𝐺; it is at the core of Triple2Vec's computation of triple embeddings.
After initializing 𝒢𝐿 to the empty triple line graph (line 1), the algorithm iterates
  Input : Knowledge Graph 𝐺
  Output : 𝒢𝐿 : the Triple Line Graph associated to 𝐺
    1: 𝒢𝐿 = (∅, ∅, ∅)
    2: for all (𝑠, 𝑝, 𝑜) in 𝐺 do
    3:   add the node (𝑠, 𝑝, 𝑜) to 𝒢𝐿
    4: end for
    5: for all 𝑠 ∈ 𝑉𝐺 do
    6:   𝐼(𝑠) = ∅
    7:   for all (𝑠, 𝑝, 𝑜) (resp., (𝑜, 𝑝, 𝑠)) in 𝐺 do
    8:      add (𝑠, 𝑝, 𝑜) (resp., (𝑜, 𝑝, 𝑠)) to 𝐼(𝑠)
    9:      for all pair 𝑛, 𝑛′ in 𝐼(𝑠) do
   10:         add the edge (𝑛, 𝑛′ ) to 𝒢𝐿
   11:         set 𝑤(𝑛, 𝑛′ ) = computeEdgeWeight(𝑛, 𝑛′ )
   12:      end for
   13:   end for
   14: end for
   15: return 𝒢𝐿
                           Algorithm 1: BuildTripleLineGraph (𝐺)


over the triples of the input 𝐺 and adds a node to 𝒢𝐿 for each visited triple (lines 2-3), thus
inducing a bijection from the set of triples of 𝐺 to the set of nodes of 𝒢𝐿 . Besides, if two triples
share a node in 𝐺, then an edge is added between the corresponding nodes of 𝒢𝐿 (lines
5-14). In particular, the data structure 𝐼(𝑠) (line 6) keeps track, for each node 𝑠 of 𝐺, of the
triples in which 𝑠 appears as subject or object (lines 7-8). Since such triples correspond to
nodes of the triple line graph, by iterating over pairs of triples in 𝐼(𝑠) it is possible to add the
desired edge between the corresponding nodes of 𝒢𝐿 (lines 9-10). The algorithm also applies
a generic edge weighting mechanism (line 11) based on predicate relatedness (Section 4).
   By inspecting Algorithm 1, we observe that 𝒢𝐿 can be computed in time $\mathcal{O}(|T_G|^2 \times costWeight)$,
where $costWeight$ is the cost of computing the weight between two nodes of 𝒢𝐿 .
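
For illustration, a possible Python rendering of Algorithm 1 is sketched below; it is not the authors' implementation. Triples are assumed to be plain (s, p, o) tuples, and compute_edge_weight is left as a placeholder for the relatedness-based weighting of Section 4.

```python
from collections import defaultdict
from itertools import combinations

def build_triple_line_graph(triples, compute_edge_weight):
    """Sketch of Algorithm 1. triples: iterable of (s, p, o) tuples;
    compute_edge_weight: placeholder for the weighting of Section 4."""
    nodes = set(triples)                  # one node of G_L per triple of G
    incident = defaultdict(set)           # I(s): triples where s is subject or object
    for (s, p, o) in triples:
        incident[s].add((s, p, o))
        incident[o].add((s, p, o))
    weights = {}                          # edges of G_L with their weights
    for triples_at_entity in incident.values():
        for n1, n2 in combinations(triples_at_entity, 2):
            edge = frozenset((n1, n2))    # the two triples share an endpoint
            weights[edge] = compute_edge_weight(n1, n2)
    return nodes, weights
```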


4. Triple2Vec: Learning Triple Embeddings
We now describe Triple2Vec, which directly learns triple embeddings from knowledge graphs.
Triple2Vec includes four main phases: (i) building the triple line graph (Section 3); (ii) weighting
the triple line graph edges; (iii) computing walks on the weighted triple line graph; and (iv)
computing embeddings via the Skip-gram model.
Triple Line Graph Edge Weighting. As the counting formulas in Section 2 suggest, the number of
edges in the (triple) line graph can be large. This structure is much denser than that of the
original graph and may significantly affect the dynamics of the graph in terms of random walks.
To remedy this drawback, we introduce an edge weighting mechanism (line 11 of Algorithm 1). The
desideratum is a strategy to compute walks such that the neighborhood of a
triple includes triples that are semantically related. To this end, we leverage a predicate
relatedness measure [14].
   This measure is based on the Triple Frequency, defined as $TF(p_i, p_j) = \log(1 + C_{i,j})$, where
$C_{i,j}$ counts the number of times the predicates $p_i$ and $p_j$ link the same subjects and objects.
Moreover, it uses the Inverse Triple Frequency, defined as $ITF(p_j, E) = \log\frac{|E|}{|\{p_i : C_{i,j} > 0\}|}$ [12].
Based on TF and ITF, for each pair of predicates $p_i$ and $p_j$ we can build a (symmetric) matrix
$CM$ where each element is $CM(i, j) = TF(p_i, p_j) \times ITF(p_j, E)$. The final predicate relatedness
matrix $M_R$ is constructed such that $M_R(p_i, p_j) = Cosine(W_i, W_j)$, where $W_i$ (resp., $W_j$)
is the row of $p_i$ (resp., $p_j$) in $CM$. This approach guarantees that the more related the predicates
in the triples representing two nodes of the triple line graph are, the higher the weight of the edge
between these nodes.
   Driving the random walks according to a relatedness criterion can capture both the graph
topology, in terms of triple-to-triple relations (i.e., edges in the triple line graph), and semantic
proximity, in terms of relatedness between predicates in triples.
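
As an illustration, the relatedness matrix could be computed along the following lines; this is a reconstruction from the definitions above (assuming NumPy), not the authors' code. C is the matrix of co-occurrence counts $C_{i,j}$, and the number of predicates is taken as $|E|$.

```python
import numpy as np

def relatedness_matrix(C):
    """C[i][j]: how many subject-object pairs are linked by both p_i and p_j."""
    C = np.asarray(C, dtype=float)
    n = C.shape[0]                            # number of predicates, |E|
    tf = np.log1p(C)                          # TF(p_i, p_j) = log(1 + C_ij)
    df = np.count_nonzero(C > 0, axis=0)      # |{p_i : C_ij > 0}| per column j
    itf = np.log(n / np.maximum(df, 1))       # ITF(p_j, E)
    CM = tf * itf[np.newaxis, :]              # CM(i, j) = TF(p_i, p_j) * ITF(p_j, E)
    rows = CM / np.maximum(np.linalg.norm(CM, axis=1, keepdims=True), 1e-12)
    return rows @ rows.T                      # M_R(p_i, p_j) = Cosine(W_i, W_j)
```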
Computing Walks. Triple2Vec leverages a language model approach to learn the final triple
embeddings. As such, it requires a “corpus” of both nodes and sequences of nodes, similarly to
word embedding techniques that require words and sequences of words (i.e., sentences). We
leverage truncated graph walks: the idea is to start from each node of 𝒢𝐿 (representing a triple
of the original graph) and provide a context for each such node in terms of a sequence of
other nodes, as sketched below. Although walks have been used by previous approaches (e.g., [3]),
none of them has tackled the problem of computing triple embeddings.
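
A minimal sketch of such weight-biased truncated walks follows; it is an illustrative reconstruction in which neighbors and weight are assumed to describe the weighted triple line graph built earlier.

```python
import random

def weighted_walks(neighbors, weight, num_walks=10, walk_length=100):
    """neighbors[u]: list of nodes adjacent to u in the triple line graph;
    weight(u, v): the relatedness-based edge weight w(u, v)."""
    walks = []
    for _ in range(num_walks):
        for start in neighbors:               # one walk per node per round
            walk = [start]
            while len(walk) < walk_length and neighbors[walk[-1]]:
                u = walk[-1]
                ws = [weight(u, v) for v in neighbors[u]]
                # pick the next node with probability proportional to w(u, v)
                walk.append(random.choices(neighbors[u], weights=ws, k=1)[0])
            walks.append(walk)
    return walks
```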
Computing Triple Embeddings. Once the “corpus” (in terms of the set of walks 𝒲) is
available, the last step of the Triple2Vec workflow is to compute the embeddings of the nodes of
𝒢𝐿 , which correspond to the embeddings of the triples of the input graph 𝐺. The embedding
we seek can be seen as a function $f : V_L \rightarrow \mathbb{R}^d$, which projects nodes of the weighted triple
line graph 𝒢𝐿 into a low-dimensional vector space, where $d \ll |V_L|$, so that neighboring nodes
are in proximity in the vector space. For every node $u \in V_L$, $N(u) \subset V_L$ is the set of its neighbors,
which is determined by the computed walks. The co-occurrence probability of two nodes $v_i$
and $v_{i+1}$ in a set of walks 𝒲 is given by the sigmoid of the dot product of their vector embeddings
$e_{v_i}$ and $e_{v_{i+1}}$:
                                 $p((e_{v_i}, e_{v_{i+1}}) \in \mathcal{W}) = \sigma(e_{v_i}^{\top} e_{v_{i+1}})$                       (1)
where $\sigma$ is the sigmoid function and $e_{v_i}^{\top} e_{v_{i+1}}$ is the dot product of the vectors $e_{v_i}$ and $e_{v_{i+1}}$. As
the computation of (1) is computationally demanding [3], we use negative sampling to train
the Skip-gram model. Negative sampling randomly selects nodes that do not appear together in
a walk as negative examples, instead of considering all nodes in the graph. If a node $v_i$ appears
in a walk of another node $v_{i+1}$, then the vector embedding $e_{v_i}$ is closer to $e_{v_{i+1}}$ than to the
embedding of any other randomly chosen node. The probability that a node $v_i$ and a randomly
chosen node $v_j$ do not appear together in a walk starting from $v_i$ is given by
$p((e_{v_j}, e_{v_i}) \notin \mathcal{W}) = \sigma(-e_{v_i}^{\top} e_{v_j})$. For any two nodes $v_i$ and $v_{i+1}$, the negative sampling
objective of the Skip-gram model to be maximized is given by the following objective function:
                       $\mathcal{O}(\theta) = \log \sigma(e_{v_i}^{\top} e_{v_{i+1}}) + \sum_{j=1}^{k} \mathbb{E}_{v_j}\left[\log \sigma(-e_{v_i}^{\top} e_{v_j})\right],$              (2)
Figure 2: Triple classification results in terms of Micro and Macro F1.
where 𝜃 denotes the set of all parameters and 𝑘 is the number of negative samples. For the
optimization of the objective function, we use the parallel asynchronous stochastic gradient
descent algorithm [15].
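
This last phase admits a compact realization with the Gensim library mentioned in Section 5. The following sketch is illustrative rather than the authors' code: parameter values mirror those reported in Section 5, and walks is assumed to be the output of the walk-generation step.

```python
from gensim.models import Word2Vec

# Each node of the triple line graph (i.e., a triple of G) is rendered as a
# string token so that walks can be treated as sentences by the Skip-gram model.
sentences = [[str(node) for node in walk] for walk in walks]

model = Word2Vec(sentences,
                 vector_size=128,  # embedding dimension d
                 window=10,        # context window size w
                 sg=1,             # use the Skip-gram model
                 negative=50,      # number of negative samples k in Eq. (2)
                 workers=4)        # asynchronous parallel SGD (Hogwild) [15]

# model.wv[str(t)] is the embedding of the triple encoded by node t.
```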


5. Experiments
We now report on the evaluation and the comparison with related work. Triple2Vec has been
implemented in Python and uses the Gensim1 library to learn embeddings. In the experiments,
we used three real-world datasets. DBLP [16] is about authors, papers, venues, and topics, and
contains ∼16K nodes, ∼52K edges, and 4 predicate types; authors are labeled with one
among four labels (i.e., database, data mining, machine learning, and information retrieval).
Foursquare [7], with ∼30K nodes and ∼83K edges, includes four different kinds of entities,
that is, users, places, points of interest, and timestamps; each point of interest is also
associated with one among 10 labels. Finally, we used a subset of Yago [16] in the domain of movies,
including ∼22K nodes, ∼89K edges, and 5 predicate types, where each movie is assigned one or
more among 5 available labels.
   Since there are no benchmarks available for the evaluation of triple embedding approaches,
we constructed two benchmarks for triple classification and triple clustering by using the labels
available in the datasets. For DBLP the 4 labels for author nodes have been propagated to paper
nodes and from paper nodes to topic and venue nodes. For Yago, the labels for movies have
been propagated to actors, musicians and directors. For FourSquare, the 10 labels for points of
interest have been propagated to places, users and timestamps. In this way, each node
has been labeled with a subset of labels, and the nodes of 𝒢𝐿 (i.e., the triples of 𝐺) have been labeled

    1 https://radimrehurek.com/gensim
Figure 3: DBLP triple embedding visualization: (a) Triple2Vec; (b) metapath2vec.

with the union of the labels associated with the triple endpoints. To each subset of labels we
assigned a different meta-label.
   We compared Triple2Vec with two groups of related approaches: (i) metapath2vec [6],
node2vec [3] (implemented in StellarGraph, https://www.stellargraph.io), and DeepWalk [2],
configured with the best parameters reported in their respective papers; these compute
embeddings for nodes only (not for predicates), and a triple embedding was obtained by
averaging the embeddings of the triple endpoints; (ii) ConvE [9] and RotatE [11], proposed for
knowledge graph embedding, configured with the best parameters reported in their respective
papers and implemented in pykg2vec (https://github.com/Sujit-O/pykg2vec); these compute
embeddings for each node and each predicate, and triple embeddings were obtained by concate-
nating the embeddings of the triple endpoints and the predicate. For the sake of space, we only
report the best results, obtained by setting the parameters of Triple2Vec as follows: number of
walks per node 𝑛=10, maximum walk length 𝐿=100, window size (necessary for the context
in the Skip-gram model) 𝑤=10. Moreover, we used 𝑑=128 as the dimension of the embeddings.
The number of negative samples Γ is set to 50. All results are the average of 10 runs.
Results on Triple Classification. To carry out this task, we trained a one-vs-rest logistic
regression model, giving as input the triple embeddings along with their labels (the labels of the
nodes of 𝒢𝐿 ). Then, we computed the Micro-F1 and Macro-F1 scores by varying the percentage of
training data, as sketched below. The results of the evaluation are reported in Fig. 2. We observe
that Triple2Vec consistently outperforms its competitors. This is especially true on the DBLP and
Yago datasets. We also note that metapath2vec performs worse than node2vec and DeepWalk,
although the former has been proposed to work on knowledge graphs. This may be explained by
the fact that the metapaths used in the experiments [7], while able to capture node embeddings,
fall short in capturing edge (triple) embeddings. As for the other group of approaches, we can see
that their performance is in some cases even worse than that of the first group, although predicate
embeddings are also considered. This may be because the goal of these approaches is to learn entity
and predicate embeddings for link prediction. Another reason is the fact that these approaches
compute a single embedding per predicate and node, which is the same for all triples in which it
appears.
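
The classification protocol could be implemented as follows (a sketch assuming scikit-learn, which the paper does not name; X holds the triple embeddings and y the meta-labels):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

def evaluate(X, y, train_fraction):
    """Train a one-vs-rest logistic regression on a fraction of the data
    and return (Micro-F1, Macro-F1) on the held-out remainder."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=train_fraction, random_state=0)
    clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    return (f1_score(y_te, pred, average="micro"),
            f1_score(y_te, pred, average="macro"))
```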
Results on Triple Clustering and Visualization. To get a better account of how triple
embeddings are placed in the embedding space, we used t-SNE [17] to obtain a 2-d representation
of the (𝑑-dimensional) triple embeddings obtained from Triple2Vec and metapath2vec, as sketched
below. We only report results for DBLP (Fig. 3), as we observed similar trends on the
other datasets. Moreover, metapath2vec was the system giving the “most graphically readable”
results. We note that while Triple2Vec can clearly identify groups of triples (i.e., triples labeled
with the same labels), metapath2vec offers a less clear perspective. We can explain this behavior
by the fact that Triple2Vec defines a specific strategy for triple embeddings based on the
notion of semantic proximity, while triple embeddings for metapath2vec have been obtained
from the embeddings of the endpoint nodes according to predefined metapaths.
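
The visualization step admits a simple sketch (assuming scikit-learn and matplotlib; X holds the 𝑑-dimensional triple embeddings and y the integer-encoded meta-labels used for coloring):

```python
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

coords = TSNE(n_components=2, random_state=0).fit_transform(X)  # d dims -> 2
plt.scatter(coords[:, 0], coords[:, 1], c=y, s=4, cmap="tab10")
plt.title("Triple embeddings (t-SNE)")
plt.show()
```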


6. Concluding Remarks and Future Work
We discussed a technique to directly learn triple embeddings from knowledge graphs. While for
homogeneous graphs there have been some sub-optimal proposals [3], for knowledge graphs
this problem had never been explored. The presented solution builds upon the notion of line graph
coupled with semantic proximity. Although existing approaches can be adapted, we have shown
that: (i) simply aggregating the embeddings of the triple endpoint nodes is problematic when
the endpoint nodes are connected by more than one predicate; (ii) even concatenating node
and predicate embeddings would not work, since it is not possible to discriminate the role of the
same entities and predicates across all the triples in which they appear.


References
 [1] H. Cai, V. W. Zheng, K. C.-C. Chang, A Comprehensive Survey of Graph Embedding:
     Problems, Techniques, and Applications, Transactions on Knowledge and Data Engineering
     30 (2018) 1616–1637.
 [2] B. Perozzi, R. Al-Rfou, S. Skiena, Deepwalk: Online Learning of Social Representations, in:
     Proc. of KDD, 2014, pp. 701–710.
 [3] A. Grover, J. Leskovec, node2vec: Scalable Feature Learning for Networks, in: Proc. of Int.
     Conference on Knowledge Discovery and Data Mining, 2016, pp. 855–864.
 [4] Q. Wang, Z. Mao, B. Wang, L. Guo, Knowledge Graph Embedding: A Survey of Approaches
     and Applications, Trans. on Knowledge and Data Engineering 29 (2017) 2724–2743.
 [5] P. Ristoski, H. Paulheim, Rdf2vec: RDF Graph Embeddings for Data Mining, in: Proc. of
     International Semantic Web Conference, 2016, pp. 498–514.
 [6] X. Dong, N. V. Chawla, A. Swami, metapath2vec: Scalable Representation Learning for
     Heterogeneous Networks, in: Proc. of Int. Conference on Information and Knowledge
     Management, 2017, pp. 135–144.
 [7] R. Hussein, D. Yang, P. Cudré-Mauroux, Are Meta-Paths Necessary?: Revisiting Heteroge-
     neous Graph Embeddings, in: Proc. of Conference on Information and Knowledge
     Management, 2018, pp. 437–446.
 [8] T. Trouillon, J. Welbl, S. Riedel, É. Gaussier, G. Bouchard, Complex Embeddings for Simple
     Link Prediction, in: Proc. of Int. Conf. on Machine Learning, 2016, pp. 2071–2080.
 [9] T. Dettmers, P. Minervini, P. Stenetorp, S. Riedel, Convolutional 2D Knowledge Graph
     Embeddings, in: Proc. of the AAAI Conference, 2018, pp. 1811–1818.
[10] H. Xiao, M. Huang, X. Zhu, TransG: A Generative Model for Knowledge Graph Embedding,
     in: Proc. of Association for Computational Linguistics, 2016, pp. 2316–2325.
[11] Z. Sun, Z.-H. Deng, J.-Y. Nie, J. Tang, RotatE: Knowledge Graph Embedding by Rela-
     tional Rotation in Complex Space, in: Proc. of International Conference on Learning
     Representations, 2019.
[12] G. Pirrò, Building Relatedness Explanations from Knowledge Graphs, Semantic Web 10
     (2019) 963–990.
[13] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean, Distributed Representations
     of Words and Phrases and their Compositionality, in: Proc. of International
     Conference on Neural Information Processing, 2013, pp. 3111–3119.
[14] V. Fionda, G. Pirrò, Fact checking via evidence patterns, in: Proc. of International Joint
     Conference on Artificial Intelligence, 2018, pp. 3755–3761.
[15] B. Recht, C. Re, S. Wright, F. Niu, Hogwild: A Lock-Free Approach to Parallelizing
     Stochastic Gradient Descent, in: Proc. of International Conference on Neural
     Information Processing, 2011, pp. 693–701.
[16] Z. Huang, N. Mamoulis, Heterogeneous Information Network Embedding for Meta path
     based Proximity, arXiv preprint arXiv:1701.05291 (2017).
[17] L. van der Maaten, G. Hinton, Visualizing Data Using t-SNE, J. of Machine Learning Research
     9 (2008) 2579–2605.