=Paper=
{{Paper
|id=Vol-1409/paper-02
|storemode=property
|title=Rule Mining for Semantifying Wikilinks
|pdfUrl=https://ceur-ws.org/Vol-1409/paper-02.pdf
|volume=Vol-1409
|dblpUrl=https://dblp.org/rec/conf/www/GalarragaSM15
}}
==Rule Mining for Semantifying Wikilinks==
Luis Galárraga, Danai Symeonidou, Jean-Claude Moissinac
Télécom ParisTech, Paris, France
{luis.galarraga, danai.symeonidou, jean-claude.moissinac}@telecom-paristech.fr
Copyright is held by the owner/author(s). WWW2015 Workshop: Linked Data on the Web (LDOW2015).

ABSTRACT

Wikipedia-centric Knowledge Bases (KBs) such as YAGO and DBpedia store the hyperlinks between articles in Wikipedia using wikilink relations. While wikilinks are signals of semantic connection between entities, the meaning of such connection is most of the time unknown to KBs, e.g., for 89% of wikilinks in DBpedia no other relation between the entities is known. The task of discovering the exact relations that hold between the endpoints of a wikilink is called wikilink semantification. In this paper, we apply rule mining techniques on the already semantified wikilinks to propose relations for the unsemantified wikilinks in a subset of DBpedia. By mining highly supported and confident logical rules from KBs, we can semantify wikilinks with very high precision.

1. INTRODUCTION

Some of the most prominent KBs such as DBpedia [1] or YAGO [19] build upon accurate information extraction on the semi-structured parts of Wikipedia articles such as infoboxes, Wikipedia categories and hyperlinks between articles, namely wikilinks. Even though wikilinks account for more than 25% of the non-literal facts in DBpedia, they are rarely exploited. Nevertheless, the fact that two entities are connected via a hyperlink accurately suggests a semantic connection between them. The goal of this paper is to discover the exact meanings of such connections.

Some wikilinks are already semantified in KBs. YAGO and DBpedia, for example, know that Barack Obama links to USA and is also a citizen and the President of that country. KBs can extract such information because it is usually available in the infoboxes; however, if the information lies somewhere outside the infoboxes, KBs will not see it, leading to unsemantified wikilinks (see [9, 21] for automatic population of infoboxes from text). This is the case for 89% of wikilinks in DBpedia. For instance, the Wikipedia article of Barack Obama links to the article of the 2009 Nobel Prize, but DBpedia does not know that he won the Nobel Prize. In some other cases, the semantic connection encoded in a wikilink can be vague and opaque and even not modeled in the schema of the KB. For example, Obama’s article also links to the articles for cocaine and ovarian cancer.

In this work, we show how to leverage the already semantified wikilinks to semantify the others. This is achieved by learning frequent semantic patterns from the relations in the KB and the wikilinks. If we observe that people often link to the countries where they come from, we can suggest that unsemantified wikilinks from people to countries convey a nationality relationship. This example also implies that the types of entities play an important role when semantifying wikilinks. For instance, the fact that France links to Spain suggests that the implicit relation carried by the wikilink holds between countries (or even places) and therefore discards any relation with an incompatible signature. If we assume that wikilinks between countries encode a trade partnership, we can formulate this pattern as a logical rule:

linksTo(x, y) ∧ is(x, Country) ∧ is(y, Country) ⇒ deals(x, y)

Given an unsemantified wikilink between two countries, this rule will predict that they must be trade partners. Such predictions could be proposed as candidate facts to populate KBs. Still, this application scenario would require the rules to have a certain quality, i.e., they should be statistically significant and draw correct conclusions in most cases. This would avoid capturing noisy or irrelevant patterns and making wrong predictions.

The process of learning logical rules from structured data is known as Rule Mining. In this paper, we resort to a method called AMIE [6] to mine logical rules from KBs. We then use the rules to draw conclusions and compute a list of the most likely candidate meanings (relations) between the entities of unsemantified wikilinks. Using a straightforward inference method, we can discover meanings for 180K unsemantified wikilinks with very high precision.

In addition to the semantification of wikilinks, and to further emphasize their value, we discuss their effect on the task of rule mining. We observe that sometimes they can increase the confidence of the obtained rules. For instance, assuming that a rule mining approach learns the rule:

currentMember(x, y) ⇒ team(x, y)

we observe that by requiring the existence of a wikilink between the entities:

linksTo(x, y) ∧ currentMember(x, y) ⇒ team(x, y)

we achieve higher confidence. This observation could be
leveraged by data inference and link prediction approaches. It also provides additional insights about the KB.

2. RELATED WORK

Link prediction. The task of discovering semantic links between entities in KBs is often referred to in the literature as link prediction. Due to the prominence of the Semantic Web, the problem has been extensively studied using multiple paradigms.

Statistical graphical models such as Bayesian Networks [5] and Markov Logic Networks (MLN) [17] offer a theoretically rigorous framework for data inference in KBs. Given a KB and a set of soft weighted rules expressed in first order logic, MLNs support multiple inference tasks such as probability calculation for queries and predictions, and MAP (Maximum a Posteriori) inference. The major drawback of such methods is that in the original formulation they do not scale to the size of current KBs. Nevertheless, there have been initiatives to extend the applicability of MLNs to large datasets [14].

Some approaches represent KBs as matrices or tensors [12, 13]. Under this paradigm, for instance, a KB can be represented as a three-dimensional tensor where the fact r(x, y) is encoded as 1 in the cell with coordinates (r, x, y). Methods such as RESCAL [12], among others [13, 18], resort to tensor factorization and latent factor analysis on the matrix representation of the KB, in order to estimate the confidence of the missing cells, i.e., how likely the missing facts are true based on the latent features in the data. Even though the scores are often given a probabilistic interpretation, they are not probabilities in a strict sense. Unlike our approach, this line of methods does not rely on explicitly formulated rules to perform inference.

A third family of approaches [7, 20, 3] resorts to embedding models to formulate the link prediction problem. In [20], entities are represented as vectors in an embedding space, while relations are defined as transformations on those vectors, e.g., the transformation nationality maps the vector of Barack Obama to the vector of USA. Methods based on embedding models are very effective at predicting values for functional relations, e.g., place of birth, and still perform fairly well for one-to-many relations, e.g., children. Unlike the previous methods, the approach proposed in [10] relies on a graph representation for KBs and applies random walks and path ranking methods to discover new facts in large KBs. In a similar fashion, [11] mines frequent meta-paths on data graphs, i.e., sequences of data types connected by labeled edges, and uses them to predict links between entities.

All the approaches mentioned so far tackle the link prediction problem in KBs in a general way. Our approach, in contrast, has a more focused scope, since we aim at predicting semantic links for entities for which there exists a signal of semantic connection, namely a wikilink.

Wikilinks for type induction. Some approaches have leveraged the semantic value conveyed by wikilinks for the task of type inference in KBs. The work presented in [15] represents the set of wikilinks as a directed graph where each entity is replaced by its most specific type in the DBpedia type hierarchy. The method discovers frequent subgraph patterns on such a graph. These are called Encyclopedic Knowledge Patterns (EKP). EKPs can be used to describe classes of entities and therefore predict the types for untyped entities, e.g., instances of soccer players will often link to instances of coaches and soccer leagues. While this method also makes use of the instance information to mine patterns, it does not aim to discover relations between entities. Thus, it does not make use of any other relations holding between the endpoints of wikilinks. In the same spirit, [16] builds upon EKPs and uses the instance information to map both entities and classes to a vector space. A similarity function on this space is used to compute the distance of an entity to the prototypical vector of classes and predict the types for untyped entities.

3. PRELIMINARIES

3.1 Rule Mining

Our proposal to semantify wikilinks relies on logical rules mined from a KB and its wikilinks. In this paper we use a logical notation to represent rules and facts in a KB, e.g., the fact that Angela Merkel is a citizen of Germany is expressed as nationality(Angela Merkel, Germany). An atom is a fact where at least one of the arguments of the relation is a variable, e.g., nationality(x, Germany). We say that an atom holds in a KB if there exists an assignment for the variables in the atom that results in a fact in the KB. Moreover, we say that two atoms are connected if they share at least one variable. The building blocks for logical rules are conjunctions of transitively connected atoms. For example, the rule that says that married couples have the same nationality can be expressed as:

nationality(x, y) ∧ spouse(x, z) ⇒ nationality(z, y)

This is a Horn rule. The left-hand side of the implication is a conjunction of connected atoms called the body, whereas the right-hand side is the head. In this paper, we focus on closed Horn rules, i.e., rules where each variable occurs in at least two atoms of the rule. Closed Horn rules always conclude concrete facts for assignments of the variables to values in the KB. If the KB knows nationality(Barack Obama, USA) and spouse(Barack Obama, Michelle Obama), our example rule will conclude nationality(Michelle Obama, USA). If the conclusion of a rule does not exist in the KB, we call it a prediction. Rule Mining approaches require a notion of counter-examples and precision for rules, to account for the cases where the rules err. In the next section we describe such notions as well as a method to learn closed Horn rules from potentially incomplete KBs.

3.2 AMIE

AMIE [6] is a system that learns closed Horn rules of the form:

B1 ∧ · · · ∧ Bn ⇒ r(x, y), abbreviated B ⇒ r(x, y)

AMIE assesses the quality of rules in two dimensions: statistical significance and confidence. The first dimension is measured by the support of the rule. This metric is defined according to the following formula:

<math>\mathit{supp}(B \Rightarrow r(x, y)) := \#(x, y) : \exists z_1, \dots, z_m : B \wedge r(x, y)</math>

In other words, the support is the number of distinct assignments of the head variables for which the rule concludes a fact in the KB. Support is defined to be monotonic; given a rule, the addition of a new atom will never increase its support. Moreover, support is a measure of statistical evidence, thus, it does not gauge the precision of the rule, i.e.,
how often it draws correct or incorrect conclusions. This requires a notion of negative examples. Since KBs do not encode negative information, rule mining approaches resort to different assumptions to derive counter-evidence. Methods based on traditional association rule mining [8] resort to the Closed World Assumption (CWA). Under the CWA, any conclusion of the rule that is absent in the KB is a counter-example. This mechanism, however, contradicts the Open World Assumption that KBs make. In contrast, AMIE uses the Partial Completeness Assumption (PCA) to deduce counter-examples. The PCA is the assumption that if a KB knows some r-values for an instance, then it knows all its values. If a rule predicts a second nationality for Barack Obama, knowing that he is American, the PCA will count such deduction as a counter-example. On the other hand, if the KB did not know any nationality for Obama, then such a case would be disregarded as evidence, while the CWA would still count it as negative evidence. Notice that the PCA is perfectly safe for functional relations, e.g., place of birth, and still feasible for quasi-functions such as nationality.

The confidence of a rule under the PCA follows the formula:

<math>\mathit{pcaconf}(B \Rightarrow r(x, y)) := \frac{\mathit{supp}(B \Rightarrow r(x, y))}{\#(x, y) : \exists z_1, \dots, z_k, y' : B \wedge r(x, y')}</math>

The PCA confidence normalizes the support of the rule (number of positive examples) over the number of both the positive and the negative examples according to the PCA.

AMIE uses support and confidence as quality metrics for rules and the user can threshold on these metrics. In addition, AMIE implements a set of strategies to guarantee good runtime and rules of good quality. Examples of such strategies are prune by support and the skyline technique. To prune the search space efficiently, AMIE relies on the monotonicity of support, that is, once a rule has dropped below the given support threshold, the system can safely discard the rule and all its derivations with more atoms. The skyline technique, on the other hand, is an application of the Occam's razor principle: among a set of hypotheses with the same predictive power, the one with fewer assumptions (the simplest) should be preferred. If the system has already learned a rule of the form B ⇒ r(x, y) and then finds a more specific version of the rule, i.e., B ∧ rn(xn, yn) ⇒ r(x, y), the more specific rule will be output only if it has higher confidence.

4. SEMANTIFYING WIKILINKS

Our approach to semantify wikilinks relies on the intuition that (a) wikilinks often convey a semantic connection between entities, (b) some of them are already semantified in KBs, (c) the types of the entities in the wikilink define the signature of its implicit relation and (d) the already semantified wikilinks can help us semantify the others. The already semantified wikilinks constitute our training set. From this training set, we mine a set of semantic patterns in the form of logical rules.

To justify our intuition, we look at the types of the endpoints of semantified wikilinks in DBpedia. We restrict our analysis to the classes Person, Place and Organization. Table 1 shows the most common relations holding between pairs of those entities for which there exists at least one wikilink. For example, we observe that when a person links to a place, in 56% of the cases, the person was born in that place. Similarly, when an organization links to a place, in 19% of the cases, this corresponds to its location. We also observe that in our dataset, 81% of the links for these classes are not semantified. Rule mining techniques can help us learn the patterns suggested by Table 1 and semantify more links. For example, the fact that organizations link to the places where they are located can be expressed as:

linksTo(x, y) ∧ is(x, Org) ∧ is(y, Loc) ⇒ location(x, y)

Such a rule would allow us to predict the relation location for unsemantified wikilinks between organizations and locations. This is a link prediction task and has great value for web-extracted KBs such as YAGO or DBpedia.

We start by constructing a training set K from DBpedia 3.8 (see footnote 1) consisting of 4.2M facts and 1.7M entities, including people, places and organizations. We enhance this dataset with the type information about the entities, i.e., 8M rdf:type statements, and the wikilinks between those entities. Since we can only learn from already semantified wikilinks, we restrict the set of wikilinks to those where both endpoints participate in a relation in the data, i.e., linksTo(a, b) ∈ K iff ∃ r, r′, x, y : (r(x, a) ∨ r(a, x)) ∧ (r′(y, a) ∨ r′(a, y)). This procedure led us to a training set K with a total of 18M facts. We ran AMIE on this dataset and configured it to mine closed Horn rules of the form:

linksTo*(x, y) ∧ B ∧ is(x, C) ∧ is(y, C′) ⇒ r(x, y)

where linksTo is an alias for wikiPageWikiLink, linksTo* denotes either linksTo or linksTo⁻¹, "is" is a synonym for rdf:type and B is a conjunction of up to 2 atoms. We call them semantification rules. With support and PCA confidence thresholds 100 and 0.2 respectively, AMIE found 3546 semantification rules on the training set K. Table 2 shows examples of those rules.

We then use the rules to draw predictions of the form p := r(a, b), i.e., r(a, b) ∉ K. We restrict even further the set of predictions, by requiring the arguments to be the endpoints of unsemantified wikilinks, more precisely, ∄ r′ : r′ ≠ linksTo ∧ r′(a, b) ∈ K. Recall that those predictions may have a different degree of confidence depending on the confidence of the rules that are used to deduce them. Moreover, a prediction can in principle be deduced by multiple rules since AMIE explores the search space of rules in an exhaustive fashion. To take this observation into account, we define the confidence of a prediction p according to the following formula:

<math>\mathit{conf}(p) := 1 - \prod_{i=1}^{|R|} \left(1 - [\phi(R_i, p) \times \mathit{pcaconf}(R_i)]\right) \qquad (1)</math>

where R is the set of semantification rules and φ(Ri, p) = 1 if Ri ⊢ p, i.e., if p is concluded from rule Ri; otherwise φ(Ri, p) = 0. The rationale behind Formula 1 is that the more rules lead to a prediction, the higher the confidence on that prediction should be. The confidence is then defined as the probability that at least one of the rules Ri that concludes p applies. This can be calculated as 1 minus the probability that none of the rules holds. The latter probability is defined as the product of the probabilities that each rule in isolation does not hold, in other words (1 − pcaconf(Ri)). Formula 1 thus makes two strong assumptions. First, it confers a probabilistic interpretation to the PCA confidence.

Footnote 1: We learn rules on DBpedia 3.8 to corroborate some of their predictions automatically in DBpedia 3.9.
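As an illustration of Formula 1, the following minimal Python sketch combines the PCA confidences of the rules that conclude a prediction. The function name `prediction_confidence` and the confidence values are assumptions chosen for illustration; they are not taken from the paper's implementation or results.

```python
from math import prod


def prediction_confidence(rule_confidences):
    """Combine the PCA confidences of the rules that conclude a prediction p.

    Computes conf(p) = 1 - prod_i (1 - pcaconf(R_i)) over the rules with
    phi(R_i, p) = 1. Rules that do not conclude p contribute a factor of 1
    to the product, so they can simply be left out of the input.
    """
    return 1.0 - prod(1.0 - c for c in rule_confidences)


# Hypothetical example: two rules with PCA confidence 0.5 each conclude p.
conf_two_rules = prediction_confidence([0.5, 0.5])  # 1 - 0.5 * 0.5
conf_one_rule = prediction_confidence([0.5])        # 1 - 0.5
```

Under this combination, every additional rule that concludes the same prediction can only raise its confidence, which matches the rationale stated for Formula 1.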
{| class="wikitable"
|+ Table 1: Top-3 relations encoded in wikilinks between instances of Person, Place and Organization in DBpedia.
! Domain !! Range !! colspan="3" | Relation - % occurrences
|-
| Person || Person || successor 18% || associatedBand 11% || associatedMusicalArtist 11%
|-
| Person || Place || birthPlace 56% || deathPlace 18% || nationality 8%
|-
| Person || Organization || team 53% || almaMater 8% || party 5%
|-
| Place || Place || isPartOf 29% || country 28% || location 13%
|-
| Place || Person || leaderName 42% || architect 32% || saint 12%
|-
| Place || Organization || owner 24% || tenant 16% || operatedBy 12%
|-
| Organization || Organization || sisterStation 18% || associatedBand 15% || associatedMusicalArtist 15%
|-
| Organization || Person || currentMember 22% || bandMember 20% || formerBandMember 20%
|-
| Organization || Place || location 19% || city 17% || hometown 13%
|}
{| class="wikitable"
|+ Table 2: Some semantification rules mined by AMIE on DBpedia.
! Rule !! PCA Conf.
|-
| linksTo(x, y) ∧ parent(x, y) ∧ successor(y, x) ∧ is(x, Person) ∧ is(y, Person) ⇒ predecessor(x, y) || 1.0
|-
| linksTo(x, y) ∧ picture(x, y) ∧ is(x, ArchitecturalStructure) ∧ is(y, PopulatedPlace) ⇒ location(x, y) || 0.94
|-
| linksTo(y, x) ∧ owner(x, y) ∧ subsidiary(y, x) ∧ is(y, Co.) ∧ is(x, Co.) ⇒ owningCompany(x, y) || 1.0
|}
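The PCA confidence values in Table 2 follow the definitions of Section 3.2. As a minimal sketch of those definitions, the Python code below computes support and PCA confidence for a rule with the single body atom linksTo(x, y) and the head nationality(x, y). The set-of-pairs representation, the helper names and the toy facts are all assumptions made for illustration; AMIE itself is not implemented this way.

```python
def support(body_pairs, head_facts):
    """Count distinct (x, y) pairs for which the body holds and r(x, y) is in the KB."""
    return len(body_pairs & head_facts)


def pca_confidence(body_pairs, head_facts):
    """Support over the body pairs (x, y) whose subject x has some known r-value."""
    subjects_with_r_value = {x for x, _ in head_facts}
    pca_body_size = sum(1 for x, _ in body_pairs if x in subjects_with_r_value)
    if pca_body_size == 0:
        return 0.0
    return support(body_pairs, head_facts) / pca_body_size


# Toy KB (hypothetical facts): pairs (x, y) satisfying the body linksTo(x, y) ...
links = {("Obama", "USA"), ("Obama", "Kenya"), ("Merkel", "Germany"), ("Curie", "France")}
# ... and the known facts of the head relation nationality(x, y).
nationality = {("Obama", "USA"), ("Merkel", "Germany")}

rule_support = support(links, nationality)          # 2 positive examples
rule_pca_conf = pca_confidence(links, nationality)  # 2 / 3: (Curie, France) is ignored
rule_cwa_conf = rule_support / len(links)           # 2 / 4 under the closed world assumption
```

In this toy KB, the pair (Obama, Kenya) counts as a counter-example because a nationality is already known for that subject, whereas (Curie, France) is disregarded under the PCA because no nationality is known at all. The closed world assumption would count both as negative evidence.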
Second, it assumes that rules are independent events. While we do not claim these assumptions to be correct, they still provide a naive baseline to estimate the likelihood of facts without resorting to more sophisticated approaches for data inference. As we show later, such a naive estimator delivers satisfactory results in our scenario.

Given an unsemantified wikilink l := linksTo(a, b), Formula 1 allows us to propose a list of candidate meanings for l. If among the set of predictions there are several facts of the form ri(a, b), then each relation ri is a semantification candidate for l with confidence conf(ri(a, b)). For each unsemantified link, we propose a list of semantification candidates sorted by confidence. Our procedure proposes relation candidates for 180K unsemantified wikilinks in the training set. Since we can semantify only 1% of them by automatically checking our predictions in DBpedia 3.9, we evaluate the precision of our approach on a sample of 60 unsemantified wikilinks. We then evaluate the correctness of their rankings of semantification candidates as follows: for each wikilink we count the number of correct candidates at top 1 and top 3 of the ranking, we then add up these counts and divide them by the total number of candidates at top 1 and top 3 respectively. This gives us an estimation of the precision of our approach. Table 3 shows the estimated precision values drawn from the sample as well as the size of the Wilson Score Interval [4] at confidence 95%. The results imply that, for example, the precision at top 1 for the whole set of wikilinks lies in the interval 77% ± 10% with 95% probability.

{| class="wikitable"
|+ Table 3: Average MAP@1 and MAP@3 scores for semantification of wikilinks on DBpedia.
! Precision@1 !! Precision@3
|-
| 0.77 ± 0.10 || 0.67 ± 0.07
|}

Table 4 shows some examples of wikilinks and the ranking of semantification candidates proposed by our approach. The number in parentheses corresponds to the confidence of the semantification candidate. The candidates evaluated as correct according to our evaluation are in italics.

5. WIKILINKS FOR RULE MINING

The skyline technique implemented in AMIE prevents the system from reporting low quality rules. If AMIE finds two rules B ⇒ r(x, y) and B ∧ rn(xn, yn) ⇒ r(x, y) and the latter has lower confidence, the system will not output it because it is worse in all dimensions, i.e., it also has lower support. We therefore investigate the confidence gain carried by the addition of wikilink atoms in rules.

We first run AMIE on the DBpedia mapping-based triples. In a second run, we add the wikilinks to the mapping-based triples and instruct the system to mine, when possible, rules of the form linksTo*(x, y) ∧ B ⇒ r(x, y), i.e., if the skyline technique does not prune the longer rule. In both cases, we set a threshold of 100 positive examples for support and no confidence threshold. We report our findings in Table 5.

{| class="wikitable"
|+ Table 5: Statistics about rule mining with and without wikilinks.
|-
| Rules without wikilink || 857
|-
| Rules with wikilink || 1509
|-
| Rules with confidence gain || 1389
|-
| Weighted average gain (wag) || 0.03
|-
| Rules with gain ≥ 0.1 || 139
|}

We observe that requiring the head variables to be connected via a wikilink increases the number of rules from 857 to 1509. This occurs because in the second run, AMIE sometimes mines versions of the rules with and without the linksTo* atom. In other words, for some rules the addition of a wikilink atom provides a confidence gain. This is the case for 1389 rules as Table 5 shows. We are interested in finding how much confidence gain is carried by those rules. Thus, we define the gain of a wikilink rule as a variant of the gain metric used in association rule mining [2]:

<math>\mathit{gain}(R) := \mathit{supp}(R) \times (\mathit{pcaconf}(R) - \mathit{pcaconf}(R_{\neg \mathit{linksTo}}))</math>

That is, the gain of a wikilink rule is the product of its support and the difference in confidence with respect to the rule without the linksTo* atom. Table 5 reports an average gain of 0.03. We find, however, that for 10% of rules,
{| class="wikitable"
|+ Table 4: Some examples of semantification candidates for wikilinks. The correct candidates are in italics.
! WikiLink !! Semantification candidates
|-
| Interstate 76 (west) → Colorado State Highway || routeJunction (1.0)
|-
| J. Bracken Lee → Herbert B. Maw || predecessor (1.0), parent (0.998), governor (0.882)
|-
| WHQX → WTZE || sisterStation (1.0)
|}
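The evaluation of rankings such as those in Table 4 counts the correct candidates among the top k of every ranking and divides by the total number of candidates at top k. A minimal Python sketch of that computation follows. The function name, the data layout and the toy judgments are assumptions made for illustration and do not reproduce the paper's sample.

```python
def precision_at_k(rankings, k):
    """Pooled precision at k over a list of ranked candidate lists.

    Each ranking is a list of (relation, is_correct) pairs sorted by
    decreasing confidence. Correct candidates in the top k of all rankings
    are added up and divided by the total number of top-k candidates.
    """
    correct = sum(sum(1 for _, ok in ranking[:k] if ok) for ranking in rankings)
    total = sum(len(ranking[:k]) for ranking in rankings)
    return correct / total if total else 0.0


# Hypothetical judgments for three wikilinks (relation names are placeholders).
toy_rankings = [
    [("r1", True)],
    [("r1", True), ("r2", False), ("r3", False)],
    [("r1", False)],
]

p_at_1 = precision_at_k(toy_rankings, 1)  # 2 correct out of 3 top-1 candidates
p_at_3 = precision_at_k(toy_rankings, 3)  # 2 correct out of 5 top-3 candidates
```

Rankings with fewer than k candidates contribute only the candidates they have, so the denominator is the number of candidates actually proposed rather than k times the number of wikilinks.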
the gain can be higher than 0.1. We show some of those rules with their corresponding confidence gain in Table 6. It follows that, in the majority of cases, the wikilinks do not provide a significant confidence gain to rule mining in DBpedia. The reason lies in the fact that for 99% of the triples in the DBpedia mapping-based dataset, there is a wikilink between the arguments of the triples, that is, the addition of a wikilink atom does not provide additional information to the rule. On the other hand, for certain relations, the arguments are sometimes not connected with a wikilink. This is the case for 100K triples. In such cases, the addition of a linksTo* atom may convey a confidence gain that can be used to improve the quality of the rules.

{| class="wikitable"
|+ Table 6: Confidence gain for some rules when specialized with a linksTo atom on the head variables.
! Rule !! ∆-conf
|-
| producer(x, y) ∧ recordLabel(x, y) ⇒ artist(x, y) || 0.34
|-
| debutTeam(x, y) ⇒ team(x, y) || 0.28
|-
| officialLanguage(x, y) ⇒ spokenIn(x, y) || 0.19
|}

All our datasets and experimental results are available under http://luisgalarraga.de/semantifying-wikilinks.

6. CONCLUSIONS

While none of the major Wikipedia-centric KBs make further use of the wikilinks, in this work we have shown that they often encode latent relations between entities. Such relations may not be captured in KBs. We have shown that rule mining techniques and naive inference methods are a feasible alternative to accurately discover those implicit semantics. This wikilink semantification task can be seen as a particular case of the link prediction problem in KBs. With this work, we aim at turning the attention to the wikilinks, as they convey valuable information that can help improve the completeness of KBs.

7. ACKNOWLEDGMENTS

This work is supported by the Chair “Machine Learning for Big Data” of Télécom ParisTech and Labex DigiCosme (project ANR-11-LABEX-0045-DIGICOSME) operated by ANR as part of the program “Investissement d’Avenir” Idex Paris-Saclay (ANR-11-IDEX-0003-02).

8. REFERENCES

[1] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives. DBpedia: A nucleus for a web of open data. In ISWC, 2007.

[2] R. J. Bayardo. Mining the most interesting rules. pages 145–154, 1999.

[3] A. Bordes, N. Usunier, A. García-Durán, J. Weston, and O. Yakhnenko. Translating embeddings for modeling multi-relational data. In NIPS, 2013.

[4] L. D. Brown, T. T. Cai, and A. DasGupta. Interval estimation for a binomial proportion. Statistical Science, 2001.

[5] N. Friedman, L. Getoor, D. Koller, and A. Pfeffer. Learning probabilistic relational models. In IJCAI, 1999.

[6] L. Galárraga, C. Teflioudi, K. Hose, and F. Suchanek. AMIE: Association rule mining under incomplete evidence in ontological knowledge bases. In WWW, 2013.

[7] A. García-Durán, A. Bordes, and N. Usunier. Effective blending of two and three-way interactions for modeling multi-relational data. In ECML-PKDD, 2014.

[8] B. Goethals and J. Van den Bussche. Relational Association Rules: Getting WARMER. In Pattern Detection and Discovery, volume 2447. Springer Berlin / Heidelberg, 2002.

[9] D. Lange, C. Böhm, and F. Naumann. Extracting structured information from wikipedia articles to populate infoboxes. In CIKM, 2010.

[10] N. Lao, T. Mitchell, and W. W. Cohen. Random walk inference and learning in a large scale knowledge base. In EMNLP, 2011.

[11] C. Meng, R. Cheng, S. Maniu, P. Senellart, and W. Zhang. Discovering meta-paths in large heterogeneous information networks. In WWW, 2015.

[12] M. Nickel, V. Tresp, and H.-P. Kriegel. A three-way model for collective learning on multi-relational data. In ICML, 2011.

[13] M. Nickel, V. Tresp, and H.-P. Kriegel. Factorizing YAGO: Scalable machine learning for linked data. In WWW, 2012.

[14] F. Niu, C. Ré, A. Doan, and J. Shavlik. Tuffy: Scaling up statistical inference in markov logic networks using an rdbms. VLDB Endowment, 2011.

[15] A. Nuzzolese, A. Gangemi, V. Presutti, and P. Ciancarini. Encyclopedic knowledge patterns from wikipedia links. In ISWC, 2011.

[16] A. G. Nuzzolese, A. Gangemi, V. Presutti, and P. Ciancarini. Type inference through the analysis of wikipedia links. In LDOW, 2012.

[17] M. Richardson and P. Domingos. Markov logic networks. Mach. Learn., 62(1-2):107–136, Feb. 2006.

[18] A. P. Singh and G. J. Gordon. Relational learning via collective matrix factorization. In KDD, 2008.

[19] F. M. Suchanek, G. Kasneci, and G. Weikum. Yago: A Core of Semantic Knowledge. In WWW, 2007.

[20] Z. Wang, J. Zhang, J. Feng, and Z. Chen. Knowledge graph embedding by translating on hyperplanes. In AAAI, 2014.

[21] F. Wu and D. S. Weld. Autonomously semantifying wikipedia. In CIKM, 2007.