<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Enriching Wikidata with Semantified Wikipedia Hyperlinks</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Armand</forename><surname>Boschin</surname></persName>
							<email>armand.boschin@telecom-paris.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Télécom Paris</orgName>
								<orgName type="institution" key="instit2">Institut Polytechnique de Paris</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Thomas</forename><surname>Bonald</surname></persName>
							<email>thomas.bonald@telecom-paris.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Télécom Paris</orgName>
								<orgName type="institution" key="instit2">Institut Polytechnique de Paris</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Enriching Wikidata with Semantified Wikipedia Hyperlinks</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">463466E9162CBE40AB212213E2DBF8EC</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T11:47+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Wikidata</term>
					<term>knowledge graph</term>
					<term>embedding</term>
					<term>relation prediction</term>
					<term>negative sampling</term>
					<term>relation typing</term>
					<term>machine learning</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We propose a novel approach to enrich Wikidata with the textual content of Wikipedia. Specifically, we leverage knowledge graph (KG) embedding models to classify the hyperlinks between Wikipedia articles and predict the corresponding facts. For instance, we would like to complete the triple (Berlin, *, Germany) with the relation capital of, given a hyperlink from Berlin to Germany in Wikipedia. While existing KG embedding models can be used for this task of relation prediction, they were not explicitly designed for it and their performance is not satisfactory. In this paper, we propose two methods that greatly improve the performance of these models on this task: first, a new negative sampling method that balances the roles of entities and relations during training; second, a method to exploit the types of entities in the selection of candidate relations. We obtain accuracy scores as high as 94% on the popular FB15k237 dataset and 75% on WDV5, an extraction of Wikidata. The efficiency of the approach is illustrated on some Wikipedia pages, where new facts unknown to Wikidata are predicted by our method.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>In recent years, Wikipedia has become the largest open-source collection of knowledge. Its textual content is however mostly unstructured, the structured information being mainly limited to the content of infoboxes (e.g., place and date of birth for articles on humans). Hyperlinks form another structure that is not yet fully integrated in Wikidata. The main challenge is that, in order to know the meaning of a hyperlink, an agent needs to read the text in which the hyperlink is embedded. While some hyperlinks do not correspond to relevant facts, we claim that they are a rich source of information to complete Wikidata.</p><p>To illustrate this, one can look at the level 5 of Wikipedia vital articles (https://en.wikipedia.org/wiki/Wikipedia:Vital_articles/Level/5), that is, about 40,000 pages serving as a centralized watchlist to track the quality of the most important articles. These Wikipedia pages are linked by slightly more than 3 million hyperlinks, far more than the approximately 200,000 facts linking the corresponding entities in Wikidata. For instance, there is a link from the page Henri Poincaré to the page Optics in Wikipedia. This suggests the existence of a relation linking the two entities, here field of work. This fact is not present in Wikidata.</p><p>Formally, a KG consists of a set of vertices called entities (e.g., person, place, date, concept) linked by directed edges in the form of triples (h, r, t), where h (resp. t) is the head (resp. tail) entity and r is a relation carrying the semantic nature of the edge. When a triple is known to be true, it is called a fact.</p><p>In this paper, we address the issue of relation prediction: finding the relation linking some given head and tail entities. For instance, we would like to complete the triple (Berlin, *, Germany) with the relation capital of, assuming the fact is not in the KG. This task is also known as the semantification of a link. 
For this, we leverage the embeddings of the entities and relations of the KG to compute scores on possible triples. Though most existing work on embeddings has focused on the task of link prediction, that is, completing either the triple (*, capital of, Germany) (head prediction) or (Berlin, capital of, *) (tail prediction), we show that embeddings can also perform notably well for relation prediction. We propose two techniques for this: first, we adapt the training of the models by balancing the roles of entities and relations in the negative sampling step; second, we use the types of entities to filter candidate relations.</p><p>These techniques prove very efficient, allowing a simple embedding model like TransE <ref type="bibr" target="#b2">[3]</ref> to reach an accuracy of 94% on the popular FB15k237 dataset and 75% on WDV5, an extraction of Wikidata based on the level 5 of Wikipedia vital articles. This suggests that Wikidata can be significantly enriched by the semantification of Wikipedia hyperlinks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Contributions.</head><p>The main contributions of this work are the following:</p><p>• An approach to enrich Wikidata by the semantification of the hyperlinks of Wikipedia. • A novel negative sampling technique for improving the ability of KG embedding models to predict relations, without affecting their performance on link prediction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related work</head><p>KG embedding. A KG embedding model is defined as a function f that computes a score for any triple (h, r, t) using some vector representations of h, r and t. By extension, the vectors representing entities and relations are called embeddings. KG embeddings have been specifically designed for link prediction<ref type="foot" target="#foot_0">3</ref>: given an entity h (resp. t) and a relation r, the model is used to predict an entity t (resp. h) so that the fact (h, r, t) is the most likely to be true. This prediction is done by selecting the entity with the highest score among all entities.</p><p>There are three categories of models, depending on the form of the scoring function f and thus on the way entities and relations interact in the vector space (see <ref type="bibr" target="#b19">[20]</ref> for more details):</p><p>• Linear models, where h, r, t are linked by a linear relation in the vector space. Some projections can be added to increase the expressiveness of the model. Examples of such models include TransE <ref type="bibr" target="#b2">[3]</ref> and TransH <ref type="bibr" target="#b21">[22]</ref>. • Bilinear models, where the relation r is a bilinear form of h and t in the vector space. Examples include RESCAL <ref type="bibr" target="#b9">[10]</ref> and ComplEx <ref type="bibr" target="#b17">[18]</ref>. • Deep models, based on neural networks, possibly including attention mechanisms <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b20">21]</ref>. 
These models give state-of-the-art performance in link prediction but are usually heavy, hard to train, and prone to over-fitting.</p><p>Negative sampling. The scoring function f of an embedding model is expected to discriminate facts from false statements and thus needs to be trained with both. Since most KGs do not record false statements, training is usually done under the Closed World Assumption (CWA), i.e., unknown triples are considered false. This may seem contradictory, as the model is then used to predict unknown facts that are expected to be true. It is however the only way to learn meaningful scoring functions f. The random generation of false statements is known as Negative Sampling (NS). It has a major impact on the performance of the trained model <ref type="bibr" target="#b6">[7]</ref>.</p><p>Given some known fact (h, r, t), the usual way to create a false statement from it (under the CWA) is to randomly choose either the head or the tail entity and to replace it with another random entity of the KG <ref type="bibr" target="#b2">[3]</ref>. This technique was improved in <ref type="bibr" target="#b21">[22]</ref> by using a Bernoulli parameter (see Section 3). The replacement of the relation is rarely considered. It is mentioned in <ref type="bibr" target="#b22">[23]</ref> but neither precisely described nor studied, as it is not the main focus of that article. We propose a modification of the Bernoulli NS technique that includes random replacement of the relation, to get high performance in both link prediction and relation prediction.</p><p>Type Filtering. Most KGs assign one or several types to each entity through an rdf:type relation (e.g., the P31: "instance of " relation in Wikidata). 
The types of entities have mainly been used in link prediction, either to enforce type constraints in negative sampling or to select the candidate entities <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b22">23]</ref>.</p><p>KBs can also enforce type constraints on relations via rdfs:domain and rdfs:range constraints. In relation prediction, selecting candidate relations with these constraints seems natural, but the constraints can be missing or too coarse-grained, making the filtering either too restrictive or ineffective. In Wikidata, relation constraints are hints for the editors, not firm restrictions <ref type="foot" target="#foot_1">4</ref>. We propose a method to infer such constraints simply from the rdf:type relation of the KG at hand and use the resulting constraints in relation prediction. We show that it has a major impact on performance.</p><p>NLP for relation prediction. There are two main tasks tackled by NLP methods. The first is relation prediction, also known as relation extraction, which consists in predicting the semantic relation linking two entities using sentences describing these entities. The best performing models rely on deep neural networks with attention mechanisms <ref type="bibr" target="#b18">[19,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b24">25]</ref>. The second task is relation linking, that is, linking relations of a KG to plain-text surface forms. Some interesting articles are <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b23">24]</ref>. Our task of relation prediction in KGs is different, as it relies only on the graph structure of the KG and not on any textual content. A method combining both approaches is left as an interesting perspective for future research.</p></div>
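To make the first two model families concrete, their scoring functions can be written in a few lines. This is an illustrative sketch only (in plain Python; DistMult stands in here as the simplest bilinear form, while the paper's experiments use ComplEx):

```python
import math

def transe_score(h, r, t):
    """TransE (linear): a fact should satisfy h + r ≈ t, so the score is
    the negative distance between h + r and t (higher is better)."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def distmult_score(h, r, t):
    """DistMult (bilinear): the relation acts as a diagonal bilinear
    form, score = sum_i h_i * r_i * t_i."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

# Toy 3-dimensional embeddings: when t = h + r exactly,
# TransE gives its maximal score (0) to the triple (h, r, t).
h, r = [1.0, 0.0, 2.0], [0.0, 1.0, -1.0]
t = [hi + ri for hi, ri in zip(h, r)]
assert transe_score(h, r, t) == 0.0
assert distmult_score(h, r, t) == -2.0  # 1*0*1 + 0*1*1 + 2*(-1)*1
```

Real implementations (e.g., TorchKGE, used later in the paper) compute these scores on batches of triples with tensor operations, but the geometry is the same.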
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Relation linking</head><p>Hyperlink semantification. Very few works exist on the semantification of Wikipedia hyperlinks using the graph structure of the KG only. The approach of Galarraga et al. <ref type="bibr" target="#b4">[5]</ref> is based on rule mining. A limitation of this method is that it can only predict relations for entities matching the body of a mined rule. Our technique, based on KG embedding, applies to all links.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Background</head><p>Bernoulli Negative Sampling. The usual negative sampling technique (noted BerNS) relies on relation-specific Bernoulli distributions to choose whether the head or the tail entity of a fact should be replaced, so as to maximize the probability of the resulting triple being false <ref type="bibr" target="#b21">[22]</ref>. Formally, a Bernoulli parameter p r is computed for each relation r as follows:</p><formula xml:id="formula_0">p r = ρ r t,h / (ρ r t,h + ρ r h,t )</formula><p>where ρ r t,h (resp. ρ r h,t ) is the average number of tail entities per head entity (resp. head entities per tail entity) among all known facts involving r. This parameter p r is the probability of replacing the head entity of the fact.</p><p>As an example, consider "author of ", which is a one-to-many relation (one author and many potential books). In that case, the head entity (an author) is more likely to be replaced than the tail entity (a book), yielding a false statement with greater probability.</p></div>
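The Bernoulli parameter can be computed directly from the list of known facts. Below is a minimal sketch in plain Python (the function name and the toy triples are ours, for illustration):

```python
from collections import defaultdict

def bernoulli_params(facts):
    """For each relation r, compute p_r = tph / (tph + hpt), where tph is
    the average number of tails per head and hpt the average number of
    heads per tail, over all facts involving r."""
    tails_of = defaultdict(lambda: defaultdict(set))  # r -> head -> tails
    heads_of = defaultdict(lambda: defaultdict(set))  # r -> tail -> heads
    for h, r, t in facts:
        tails_of[r][h].add(t)
        heads_of[r][t].add(h)
    params = {}
    for r in tails_of:
        tph = sum(len(s) for s in tails_of[r].values()) / len(tails_of[r])
        hpt = sum(len(s) for s in heads_of[r].values()) / len(heads_of[r])
        params[r] = tph / (tph + hpt)  # probability of replacing the head
    return params

# One-to-many relation: one author, two books.
facts = [("hugo", "author of", "les_miserables"),
         ("hugo", "author of", "notre_dame")]
print(bernoulli_params(facts)["author of"])  # 2 / (2 + 1) = 0.666...
```

As expected for a one-to-many relation, p_r is above 1/2: the head (the author) is replaced more often than the tail.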
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Model training.</head><p>Training a model comes down to finding its parameters (the embeddings) so that the scoring function gives high scores to facts and low scores to false statements. Given a training set of facts denoted (h, r, t), the corresponding false statements (h′, r, t′) are generated by NS. Then for each pair of a fact (h, r, t) and a false statement (h′, r, t′), a loss ℓ(f (h, r, t), f (h′, r, t′)) measuring the gap between the corresponding scores is computed. This loss should be high for close scores. Examples include the logistic loss and the margin loss <ref type="bibr" target="#b15">[16]</ref>. Minimizing the overall loss (e.g., by gradient descent <ref type="bibr" target="#b12">[13]</ref>) gives a scoring function f that is expected to discriminate facts from false statements.</p></div>
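For instance, the margin loss mentioned above can be written directly from its definition; a minimal sketch (the margin value is a hyper-parameter, set to 1 here purely for illustration):

```python
def margin_loss(score_fact, score_false, margin=1.0):
    """Margin loss: zero when the fact outscores the false statement by
    at least `margin`; positive (and to be minimized) otherwise, so the
    loss is high when the two scores are close."""
    return max(0.0, margin - (score_fact - score_false))

assert margin_loss(2.0, 0.5) == 0.0  # well-separated scores: no loss
assert margin_loss(1.0, 0.9) > 0.5   # close scores: high loss
```

Minimizing this quantity over all (fact, false statement) pairs by gradient descent pushes the embeddings so that facts outscore the corresponding negatives.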
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Relation prediction</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Approach</head><p>Our approach relies on the following techniques. KG Embedding. KG embedding models can be used for relation prediction the same way they are used in the aforementioned link prediction: given two entities h and t, an embedding model and its scoring function f, the relations of the graph can be ranked in decreasing order of score:</p><formula xml:id="formula_1">f (h, r 1 , t) &gt; f (h, r 2 , t) &gt; • • • &gt; f (h, r k , t).</formula><p>The relation r 1 is then predicted, corresponding to the fact (h, r 1 , t). Note that this method also applies to undirected links, by ranking the scores of the predictions for both directed links (h, t) and (t, h). This is especially useful when some relations have no reciprocal. Balanced Negative Sampling. Simple experiments show that off-the-shelf linear models like TransE perform very poorly in relation prediction (3% of Hit@1 on FB15k237, see Table <ref type="table" target="#tab_3">2a</ref>). This suggests that the representation of relations is not as good as that of entities. It turns out that entities and relations play similar roles in the training procedure except for the NS step. Usually, only the entities are randomly replaced to get false statements (see Section 3). We propose a simple modification of BerNS to balance the roles of entities and relations during training. Rather than always replacing one of the two entities of a known fact, we replace the relation with some probability p, and apply BerNS otherwise (see Algorithm 1). This new method is called Balanced Negative Sampling (BalNS). The default value for p is set to 1/2. Experiments have shown that the value of this parameter has no major impact on the performance of the approach as long as it is greater than 0.1.</p></div>
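The ranking step above amounts to scoring every candidate relation against the fixed entity pair; a minimal sketch, with a toy scoring function standing in for a trained embedding model (the scores below are made up for illustration):

```python
def predict_relation(h, t, relations, f):
    """Rank the relations by decreasing score f(h, r, t); the first
    element of the returned list is the predicted relation r_1."""
    return sorted(relations, key=lambda r: f(h, r, t), reverse=True)

# Toy scoring function: a dictionary of made-up scores for (Berlin, Germany).
scores = {"capital of": 0.9, "shares border with": 0.4, "member of": 0.1}
f = lambda h, r, t: scores[r]
ranking = predict_relation("Berlin", "Germany", list(scores), f)
print(ranking[0])  # capital of
```

For an undirected link, the same ranking would simply be run over the scores of both (h, t) and (t, h) before taking the maximum.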
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Algorithm 1: Balanced Negative Sampling (BalNS).</head><p>Input: (h, r, t), a fact Input: p, probability to replace the relation Output: (h′, r′, t′), a false statement Data: T , the facts in the KG Data: p r , Bernoulli parameter for relation r 1 (h′, r′, t′) ← (h, r, t) 2 while (h′, r′, t′) ∈ T do</p><formula xml:id="formula_2">3 u ← uniform random variable on [0, 1] 4 if u &lt; p then 5 r′ ← random relation 6 else 7 (h′, t′) ← BerNS(h, r, t) 8 return (h′, r′, t′)</formula><p>Type Filtering for relation prediction. Another key technique to improve the quality of relation prediction is Type Filtering (TF). An entity e is said to have the type t if the fact (e, rdf:type, t) is known. To predict the relation linking h and t, only relations that are known to link entities of the type(s) of h to entities of the type(s) of t should be considered. Formally, we say that a relation r links type a to type b if there exists some known fact (h, r, t) with the head entity h of type a and the tail entity t of type b. Then, to predict the relation missing in (h, *, t), we propose to consider as candidates only the relations r linking any type of h to any type of t. The corresponding algorithm for relation prediction is described in Algorithm 2. Observe that if the head entity h and/or the tail entity t is not typed, the candidate relations are the relations that are involved in a training fact with either h as a head entity or t as a tail entity. Finally, if no relation meets any constraint, there is no filtering, i.e., all relations are selected. Regarding speed, this step has no significant impact on the global computation time with proper indexes, linking entities to their types and types to possible relations.</p></div>
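Algorithm 1 translates almost line for line into Python. The sketch below is ours, not the paper's code; the BerNS step is inlined as a head-or-tail replacement driven by the Bernoulli parameter p_r:

```python
import random

def bal_ns(fact, facts, entities, relations, p=0.5, p_r=None):
    """Balanced Negative Sampling: with probability p, replace the
    relation; otherwise apply BerNS (replace the head with probability
    p_r[r], the tail otherwise). Resample until the corrupted triple
    is not a known fact."""
    h, r, t = fact
    corrupted = fact
    while corrupted in facts:
        if random.random() < p:
            corrupted = (h, random.choice(relations), t)
        else:  # BerNS step
            if random.random() < (p_r or {}).get(r, 0.5):
                corrupted = (random.choice(entities), r, t)
            else:
                corrupted = (h, r, random.choice(entities))
    return corrupted

facts = {("berlin", "capital of", "germany")}
neg = bal_ns(("berlin", "capital of", "germany"), facts,
             entities=["berlin", "germany", "paris"],
             relations=["capital of", "member of"])
assert neg not in facts  # a false statement, under the CWA
```

Each corruption restarts from the original fact, so exactly one slot of the triple (head, relation, or tail) differs in the returned false statement.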
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Algorithm 2: Relation prediction with Type Filtering (TF).</head><p>Input: h, t, entities Input: f , scoring function Output: r, relation linking h to t Data: T , the facts in the KG Data: R, the relations in the KG</p><formula xml:id="formula_3">1 A ← types of h 2 B ← types of t 3 if |A| &gt; 0 and |B| &gt; 0 then 4 R′ ← {r ∈ R : ∃(a, b) ∈ A × B, ∃h′, t′ : type(h′) = a, type(t′) = b, (h′, r, t′) ∈ T } 5 else 6 R′ ← {r ∈ R : ∃e : (h, r, e) ∈ T } ∪ {r ∈ R : ∃e : (e, r, t) ∈ T } 7 if |R′| = 0 then 8 R′ ← R 9 r ← arg max({f (h, r, t), r ∈ R′}) 10 return r</formula><p>To summarize, our approach relies on the following steps:</p><p>1. Training the model (e.g., TransE or ComplEx) with BalNS (Algorithm 1). 2. Predicting relations with TF (Algorithm 2).</p></div>
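The candidate selection of Algorithm 2 reduces to a few set operations over the training facts; a simplified sketch (ours), in which a relation is a candidate when some training fact links a type of h to a type of t:

```python
def tf_candidates(h, t, facts, types, all_relations):
    """Type Filtering: select the candidate relations for (h, *, t)."""
    A, B = types.get(h, set()), types.get(t, set())
    if A and B:
        cands = {r for (h2, r, t2) in facts
                 if types.get(h2, set()) & A and types.get(t2, set()) & B}
    else:
        # Untyped fallback: relations seen with h as head or t as tail.
        cands = {r for (h2, r, t2) in facts if h2 == h or t2 == t}
    return cands or set(all_relations)  # no match at all: no filtering

types = {"berlin": {"city"}, "paris": {"city"},
         "germany": {"country"}, "france": {"country"}}
facts = {("paris", "capital of", "france"),
         ("france", "shares border with", "germany")}
print(tf_candidates("berlin", "germany", facts, types,
                    ["capital of", "shares border with"]))
# {'capital of'} -- the only relation known to link a city to a country
```

The scoring function f is then evaluated only on the returned candidates, which is what keeps the TF step cheap with the proper indexes.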
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Evaluation</head><p>Given an embedding model trained with BalNS and some known fact (h, r, t) of a test set, all relations selected by TF are ranked by decreasing score. The rank of the true relation r is recorded as the recovery rank (if the true relation r is not selected by TF, the rank is set to the maximum). The usual link prediction metrics, namely the Mean Reciprocal Rank (MRR, the average of the inverses of the recovery ranks) and Hit@k (the proportion of tests in which the recovery rank is at most k), can then be reported. In the filtered setting, any relation that is ranked better than r and that is known to lead to a fact (i.e., one in the training set) is discarded, so that the model is not penalized for predicting known facts that are simply more likely than the target one.</p></div>
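Once the recovery ranks are collected, the two metrics reduce to a few lines; a minimal sketch:

```python
def mrr(ranks):
    """Mean Reciprocal Rank: average of 1/rank over the test facts."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hit_at_k(ranks, k):
    """Hit@k: proportion of tests where the recovery rank is at most k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

ranks = [1, 2, 1, 4]  # toy recovery ranks for four test facts
print(mrr(ranks))          # (1 + 0.5 + 1 + 0.25) / 4 = 0.6875
print(hit_at_k(ranks, 1))  # 0.5
```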
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Experiments</head><p>The experiments aim at assessing the performance of our approach on existing KGs and at showing its practical interest on a real-world task, namely the semantification of Wikipedia hyperlinks. All experiments can be reproduced using the publicly available code<ref type="foot" target="#foot_2">5</ref> and data<ref type="foot" target="#foot_3">6</ref>.</p><p>Datasets. The datasets used in the experiments are shown in Table <ref type="table" target="#tab_1">1</ref>. One of the most common datasets used to evaluate the quality of KG embeddings is a subset of Freebase called FB15k237 <ref type="bibr" target="#b16">[17]</ref>. The typing relation from Freebase is however not included in it, and resources are no longer available online since the discontinuation of the Freebase project <ref type="bibr" target="#b1">[2]</ref>. Types were therefore imported from Wikidata using a matching between the two KBs. Attention was paid to preventing data leakage by removing any imported fact that could match an existing validation or test fact. For comparability reasons, the new facts were not used to train the embedding models, only for the TF step. Only 18.6% of entities are typed, see Table <ref type="table" target="#tab_1">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Dataset</head><p>We also introduce WDV5, a new dataset containing the facts linking entities of Wikidata corresponding to the level 5 of Wikipedia vital articles (see Section 1). To type entities, only typing facts included in the dataset are used (i.e., all types are entities of WDV5). In particular, not all entities are typed (only 56%, see Table <ref type="table" target="#tab_1">1</ref>). It is important to note that WDV5 is a raw extract from Wikidata, without any pre-processing. As such, we expect the corresponding experiments to be more representative of real use cases than those based on FB15k237.</p><p>For the semantification of Wikipedia hyperlinks, we use Wikivitals+, an extraction of the level 5 of Wikipedia vital articles and the hyperlinks between them. We only keep the pages that have a corresponding Wikidata entity. This dataset provides many hyperlinks that are natural candidates for true facts, after relation prediction.</p><p>Baseline. In order to measure the impact of using a KG embedding model for ranking the candidates selected by TF, we compare our approach to a simple baseline that ranks the candidate relations by popularity in the training set, i.e., by number of facts.</p><p>Process. Two off-the-shelf embedding models were chosen for the experiments:</p><p>• TransE <ref type="bibr" target="#b2">[3]</ref>, the simplest linear model, intuitive and fast to train and apply.</p><p>• ComplEx <ref type="bibr" target="#b17">[18]</ref>, the best bilinear model, with twice as many parameters, longer to train and apply.</p><p>The models were trained using the Adam optimization algorithm <ref type="bibr" target="#b5">[6]</ref>, dropout for regularization <ref type="bibr" target="#b14">[15]</ref> and early stopping with 100 epochs of patience (on the filtered validation MRR for link prediction). 
All experiments were done using Python 3.8, PyTorch 1.7.0 <ref type="bibr" target="#b10">[11]</ref>, TorchKGE 0.16.25 <ref type="bibr" target="#b3">[4]</ref>, pytorch-ignite 0.4.4 and an Nvidia Titan V GPU with CUDA 10.1. The hyper-parameters of the embedding models were tuned using hyperopt 0.2.5. The possible values, along with those chosen, are listed in the provided supplemental material.</p><p>In the case of FB15k237, the split between train, validation and test sets is the one set by Toutanova et al. <ref type="bibr" target="#b16">[17]</ref>. For WDV5, we split the dataset at random with 80% of the facts for training, 10% for validation (for choosing hyper-parameters) and 10% for testing. The reported metrics are averaged over 6 distinct random splits and independent training procedures.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">Performance</head><p>The results for relation prediction are shown in Table <ref type="table" target="#tab_3">2</ref>, with metrics computed in a filtered setting, for different variants of the model so as to assess the respective gains of the proposed techniques:</p><p>• Original: The base model (either TransE or ComplEx) trained with BerNS.</p><p>• BalNS: The base model trained with BalNS.</p><p>• TF: The base model evaluated with Type Filtering (TF).</p><p>• BalNS &amp; TF: The base model trained with BalNS and evaluated with TF.</p><p>The original version of TransE is not efficient on FB15k237 (only 3% of Hit@1). ComplEx, however, performs notably well on the same dataset (89% of Hit@1). It seems less sensitive to the unbalanced roles of entities and relations during training. We suspect however that the score of ComplEx on FB15k237 mainly results from over-fitting, due to the lack of new datasets in the KG embedding literature over the past few years and the over-engineering of FB15k237 (it is the second version of the subset). This has already been argued in <ref type="bibr" target="#b0">[1]</ref> and is confirmed by the fact that TransE and ComplEx have almost the same scores (around 45% of Hit@1) on the new dataset WDV5, which is a raw extraction from Wikidata.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Balanced Negative Sampling.</head><p>Training with BalNS has a strong impact on the relation prediction performance of the models: training TransE on FB15k237 with BalNS rather than BerNS increases the Hit@1 from 3% to 91%. This confirms the intuition that the relation embeddings were not well trained. The difference is less impressive for ComplEx on FB15k237, but the original ComplEx model already performs quite well on this dataset. On WDV5, there is a large increase in Hit@1 for both models: 24% for TransE and 28% for ComplEx.</p><p>Type filtering. TF has a strong impact on the performance of the models. Looking at Hit@1 on WDV5, TransE goes from 45% to 58% and ComplEx goes from 45% to 76%. Note that Type Filtering alone (the baseline) performs almost as well as the original embedding models. It is however largely beaten by the combination of TF with scoring by an embedding model: the gain from using an embedding model is substantial.</p><p>Complete model. The combination of BalNS and TF gives the best results. On FB15k237, the increase in performance of TransE is impressive (Hit@1 from 3% to 94%) and makes this model almost as efficient as ComplEx. This is obtained with the additional facts imported from Wikidata for TF, but the scores of TransE simply trained with BalNS (and without TF) are already close to those of ComplEx. On WDV5, all performance metrics are significantly improved by our approach. The two models, which perform similarly in their original form, remain close. On average, ComplEx beats TransE by 1% in Hit@1, but the scores of TransE are much more stable from one split to another, as shown by the lower standard deviation. The intervals of fluctuation of MRR and Hit@1 tend to be reduced when the model is trained with BalNS. 
This is particularly true for TransE, whose standard deviation for each metric is very small.</p><p>It is remarkable to get almost identical performance with TransE and ComplEx, knowing that TransE has half as many parameters as ComplEx, is more geometrically intuitive, and requires 6 times fewer operations for each gradient descent step during training.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2">Application to Wikipedia Hyperlinks</head><p>In order to predict the relation associated with a hyperlink, we use the TransE embedding of WDV5 trained with BalNS and applied using TF. When two pages are linked and the corresponding Wikidata entities are involved in a fact of Wikidata (110,311 out of 3,008,116 hyperlinks), we can compare the predicted relation to the ground truth. We obtain an accuracy of 84%. This good score is expected, as the model is trained on WDV5 facts and some hyperlinks indeed correspond to existing facts. It is however interesting to look at cases where the prediction differs from the true fact. We have observed that the model can hardly predict directed relations (e.g., parent-child) or distinguish semantically close relations (e.g., employer and educated at for links between scholars and universities). This is not surprising, as the only available data is the structure of the KG. Some other mistakes come from the embedding model itself; for example, headquarter location always has a lower score than twinned administrative body for some reason, making all the headquarter predictions wrong.</p><p>In Table <ref type="table">3</ref>, we report for two pages the semantified hyperlinks that got the highest scores. It is reassuring to see that most of the resulting facts are true, though many of them are already known in Wikidata. A few mistakes could be avoided using a little context (i.e., textual information), but these results suggest that our approach is able to correctly semantify many links.</p><p>It seems however difficult to automatically produce a full dataset in this way. First, the scores of embedding models are usually not normalized, so comparing them works fine locally (e.g., looking at the links of a particular page), but comparing the scores of the three million possible facts is not feasible. 
Second, many facts that get a high score are very likely but require additional information not present in the data. For example, the three most likely facts resulting from semantified hyperlinks of Wikivitals+ are:</p><p>• (Serbia, member of, World Trade Organization): Serbia's application is still under review. • (Taiwan, member of, World Trade Organization): Taiwan is already a member of the WTO as Chinese Taipei, but not under its own name. • (Kosovo, member of, Interpol): Kosovo's application was rejected in 2018.</p><p>Clearly, some additional textual content is needed in these cases.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Conclusion</head><p>We have proposed a novel approach to relation prediction by KG embedding. Our approach is based on two key ideas: Balanced Negative Sampling in the training of the embedding model, and Type Filtering to select candidate relations. We have shown that this approach performs well using an embedding model as simple as TransE, opening the way to robust and explainable predictions. Our results suggest that the model can be used to enrich Wikidata through the semantification of Wikipedia hyperlinks associated with known entities. This approach is however not yet fully automatable, and performance still needs to be improved for that goal. For future work, we would like to further improve our negative sampling technique by replacing the relation with some probability that depends on the considered fact (h, r, t), instead of a fixed probability. It also seems necessary to integrate some context, for example from textual data (like the descriptions of the relations and the articles themselves), to help the embedding model in its choices. A fully automated process for enriching Wikidata with semantified Wikipedia hyperlinks seems however not out of reach.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>• A novel filtering technique for relation prediction where candidate relations are selected through the types of the head and tail entities. • A new dataset, WDV5, consisting of the facts between entities of Wikidata corresponding to the level 5 of Wikipedia vital articles.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 :</head><label>1</label><figDesc>Key features of the datasets used in experiments.</figDesc><table><row><cell></cell><cell cols="5">Entities/nodes Facts/edges Relations Types Typed entities</cell></row><row><cell>FB15k237</cell><cell>14,541</cell><cell>310,116</cell><cell>237</cell><cell>73</cell><cell>2,719</cell></row><row><cell>WDV5</cell><cell>39,062</cell><cell>231,744</cell><cell cols="2">607 1,206</cell><cell>22,883</cell></row><row><cell>Wikivitals+</cell><cell cols="2">39,062 3,008,116</cell><cell></cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 2:</head><label>2</label><figDesc>Results of relation prediction on FB15k237 and WDV5.</figDesc><table /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_0">Note that some authors refer to this task as relation prediction, see <ref type="bibr" target="#b8">[9]</ref> for instance. We make a clear distinction between link prediction (head or tail entity unknown) and relation prediction (relation unknown).</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_1">https://www.wikidata.org/wiki/Help:Property_constraints_portal</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_2">https://gitlab.telecom-paris.fr/aboschin/hyperlinks-semantification</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_3">https://netset.telecom-paris.fr/pages/wikivitals+.html</note>
		</body>
		<back>
			<div type="annex">
			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Realistic re-evaluation of knowledge graph completion methods: An experimental study</title>
		<author>
			<persName><forename type="first">F</forename><surname>Akrami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Saeef</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<idno type="DOI">10.1145/3318464.3380599</idno>
		<ptr target="https://doi.org/10.1145/3318464.3380599" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data</title>
				<meeting>the 2020 ACM SIGMOD International Conference on Management of Data<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1995" to="2010" />
		</imprint>
	</monogr>
	<note>SIGMOD &apos;20</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Freebase: A collaboratively created graph database for structuring human knowledge</title>
		<author>
			<persName><forename type="first">K</forename><surname>Bollacker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Evans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Paritosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Sturge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Taylor</surname></persName>
		</author>
		<idno type="DOI">10.1145/1376616.1376746</idno>
		<ptr target="https://doi.org/10.1145/1376616.1376746" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data</title>
				<meeting>the 2008 ACM SIGMOD International Conference on Management of Data<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="1247" to="1250" />
		</imprint>
	</monogr>
	<note>SIGMOD &apos;08</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Translating Embeddings for Modeling Multi-relational Data</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bordes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Usunier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Garcia-Duran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Weston</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Yakhnenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems 26</title>
				<editor>
			<persName><forename type="first">C</forename><forename type="middle">J C</forename><surname>Burges</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Welling</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Z</forename><surname>Ghahramani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><forename type="middle">Q</forename><surname>Weinberger</surname></persName>
		</editor>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="2787" to="2795" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">TorchKGE: Knowledge Graph Embedding in Python and PyTorch</title>
		<author>
			<persName><forename type="first">A</forename><surname>Boschin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">KDD-IWKG</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<date type="published" when="2020-08">Aug 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Rule Mining for Semantifying Wikilinks</title>
		<author>
			<persName><forename type="first">L</forename><surname>Galárraga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Symeonidou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Moissinac</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">LDOW@WWW</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Adam: A method for stochastic optimization</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">P</forename><surname>Kingma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ba</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">3rd International Conference on Learning Representations, ICLR 2015</title>
				<editor>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Lecun</surname></persName>
		</editor>
		<meeting><address><addrLine>San Diego, CA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2015">May 7-9, 2015</date>
		</imprint>
	</monogr>
	<note>Conference Track Proceedings</note>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Analysis of the impact of negative sampling on link prediction in knowledge graphs</title>
		<author>
			<persName><forename type="first">B</forename><surname>Kotnis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Nastase</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1708.06816</idno>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Type-constrained representation learning in knowledge graphs</title>
		<author>
			<persName><forename type="first">D</forename><surname>Krompaß</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Baier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Tresp</surname></persName>
		</author>
		<editor>Arenas, M., Corcho, O., Simperl, E., Strohmaier, M., d&apos;Aquin, M., Srinivas, K., Groth, P., Dumontier, M., Heflin, J., Thirunarayan, K., Staab, S.</editor>
		<imprint>
			<date type="published" when="2015">2015</date>
			<publisher>Springer International Publishing</publisher>
			<biblScope unit="page" from="640" to="655" />
			<pubPlace>Cham</pubPlace>
		</imprint>
	</monogr>
	<note>The Semantic Web -ISWC</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Learning attention-based embeddings for relation prediction in knowledge graphs</title>
		<author>
			<persName><forename type="first">D</forename><surname>Nathani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chauhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kaul</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 57th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A Three-way Model for Collective Learning on Multi-relational Data</title>
		<author>
			<persName><forename type="first">M</forename><surname>Nickel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Tresp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">P</forename><surname>Kriegel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 28th International Conference on International Conference on Machine Learning</title>
				<meeting>the 28th International Conference on International Conference on Machine Learning<address><addrLine>Bellevue, WA, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Omnipress</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="809" to="816" />
		</imprint>
	</monogr>
	<note>ICML&apos;11</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Automatic differentiation in PyTorch</title>
		<author>
			<persName><forename type="first">A</forename><surname>Paszke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gross</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chintala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Chanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Devito</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Desmaison</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Antiga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lerer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 31st Conference on Neural Information Processing Systems</title>
				<meeting>the 31st Conference on Neural Information Processing Systems<address><addrLine>Long Beach, CA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017-10">Oct 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Cross-sentence n-ary relation extraction with graph LSTMs</title>
		<author>
			<persName><forename type="first">N</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Poon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Quirk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">T</forename><surname>Yih</surname></persName>
		</author>
		<idno type="DOI">10.1162/tacl_a_00049</idno>
		<ptr target="https://doi.org/10.1162/tacl_a_00049" />
	</analytic>
	<monogr>
		<title level="j">Transactions of the Association for Computational Linguistics</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="101" to="115" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">An overview of gradient descent optimization algorithms</title>
		<author>
			<persName><forename type="first">S</forename><surname>Ruder</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1609.04747</idno>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Falcon 2.0: An entity and relation linking tool over Wikidata</title>
		<author>
			<persName><forename type="first">A</forename><surname>Sakor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Vidal</surname></persName>
		</author>
		<idno type="DOI">10.1145/3340531.3412777</idno>
		<ptr target="https://doi.org/10.1145/3340531.3412777" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 29th ACM International Conference on Information and Knowledge Management</title>
				<meeting>the 29th ACM International Conference on Information and Knowledge Management<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="3141" to="3148" />
		</imprint>
	</monogr>
	<note>CIKM &apos;20</note>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Dropout: A simple way to prevent neural networks from overfitting</title>
		<author>
			<persName><forename type="first">N</forename><surname>Srivastava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hinton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Krizhevsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Salakhutdinov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">56</biblScope>
			<biblScope unit="page" from="1929" to="1958" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Knowledge Representation and Rule Mining in Entity-Centric Knowledge Bases</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M</forename><surname>Suchanek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lajus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Boschin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Weikum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Reasoning Web. Explainable Artificial Intelligence: 15th International Summer School 2019</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Krötzsch</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Stepanova</surname></persName>
		</editor>
		<meeting><address><addrLine>Bolzano, Italy; Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2019-09-24">September 20-24, 2019</date>
			<biblScope unit="page" from="110" to="152" />
		</imprint>
	</monogr>
	<note>Tutorial Lectures</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Representing Text for Joint Embedding of Text and Knowledge Bases</title>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Pantel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Poon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Choudhury</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gamon</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/D15-1174</idno>
		<ptr target="https://doi.org/10.18653/v1/D15-1174" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</title>
				<meeting>the 2015 Conference on Empirical Methods in Natural Language Processing<address><addrLine>Lisbon, Portugal</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1499" to="1509" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Trouillon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">R</forename><surname>Dance</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Welbl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Riedel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Gaussier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Bouchard</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1702.06879</idno>
		<title level="m">Knowledge Graph Completion via Complex Tensor Factorization</title>
				<imprint>
			<date type="published" when="2017-02">Feb 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Relation Classification via Multi-Level Attention CNNs</title>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>De Melo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/P16-1123</idno>
		<ptr target="https://doi.org/10.18653/v1/P16-1123" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<meeting>the 54th Annual Meeting of the Association for Computational Linguistics<address><addrLine>Berlin, Germany</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016-08">Aug 2016</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1298" to="1307" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Knowledge Graph Embedding: A Survey of Approaches and Applications</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Guo</surname></persName>
		</author>
		<idno type="DOI">10.1109/TKDE.2017.2754499</idno>
		<ptr target="https://doi.org/10.1109/TKDE.2017.2754499" />
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="issue">12</biblScope>
			<biblScope unit="page" from="2724" to="2743" />
			<date type="published" when="2017-12">Dec 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Knowledge Graph Embedding via Graph Attenuated Attention Networks</title>
		<author>
			<persName><forename type="first">R</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2019.2963367</idno>
		<ptr target="https://doi.org/10.1109/ACCESS.2019.2963367" />
	</analytic>
	<monogr>
		<title level="m">IEEE Access</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="5212" to="5224" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Knowledge graph embedding by translating on hyperplanes</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence</title>
				<meeting>the Twenty-Eighth AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1112" to="1119" />
		</imprint>
	</monogr>
	<note>AAAI&apos;14</note>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Representation learning of knowledge graphs with hierarchical types</title>
		<author>
			<persName><forename type="first">R</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence</title>
				<meeting>the Twenty-Fifth International Joint Conference on Artificial Intelligence</meeting>
		<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="2965" to="2971" />
		</imprint>
	</monogr>
	<note>IJCAI&apos;16</note>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Relation linking for Wikidata using bag of distribution representation</title>
		<author>
			<persName><forename type="first">X</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Natural Language Processing and Chinese Computing</title>
				<editor>
			<persName><forename type="first">X</forename><surname>Huang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Jiang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Zhao</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Feng</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Hong</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="652" to="661" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Graph convolution over pruned dependency trees improves relation extraction</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/D18-1244</idno>
		<ptr target="https://doi.org/10.18653/v1/D18-1244" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</title>
				<meeting>the 2018 Conference on Empirical Methods in Natural Language Processing<address><addrLine>Brussels, Belgium</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018-11">Oct-Nov 2018</date>
			<biblScope unit="page" from="2205" to="2215" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
