<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Learning to Propagate Knowledge in Web Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pasquale Minervini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudia d'Amato</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicola Fanizzi</string-name>
          <email>nicola.fanizzig@uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Volker Tresp</string-name>
          <email>volker.tresp@siemens.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science - University of Bari</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Siemens AG, Corporate Technology</institution>
          ,
          <addr-line>Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The increasing availability of structured machine-processable knowledge in the WEB OF DATA calls for machine learning methods to support standard pattern matching and reasoning based services (such as query-answering and inference). Statistical regularities can be efficiently exploited to overcome the limitations of the inherently incomplete knowledge bases distributed across the Web. This paper focuses on the problem of predicting missing class-memberships and property values of individual resources in Web ontologies. We propose a transductive inference method for inferring missing properties about individuals: given a class-membership/property value prediction problem, we address the task of identifying relations encoding similarities between individuals, and efficiently propagating knowledge across their relations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Standard query answering and reasoning services for the Semantic Web [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] (SW) mainly
rely on deductive inference. However, purely deductive inference suffers from several
limitations [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]: reasoning tasks might be computationally complex, do not always
address uncertainty and only rely on axiomatic prior knowledge. However, purely
deductive reasoning with SW representations suffers from several limitations owing to its
complexity and the inherent incompleteness and incoherence of distributed knowledge
bases (KBs) in a Web-scale scenario modeled by formal ontologies. In such a
context many complex tasks (e.g. query answering, clustering, ranking, etc.) are ultimately
based on assessing the truth of ground facts. Deciding on the truth of specific facts
(assertions) in SW knowledge bases requires to take into account the open-world form of
reasoning adopted in this context: a failure on ascertaining the truth value of a given fact
does not necessarily imply that such fact is false, but rather that its truth value cannot
be deductively inferred from the KB (e.g. because of incomplete or uncertain
knowledge). Other issues are related to the distributed nature of the data across the Web. Cases
of contradictory answers or flawed inferences may be caused by distributed pieces of
knowledge that may be mutually conflicting.
      </p>
      <p>
        The prediction of the truth value of an assertion can be cast as a classification
problem to be solved through statistical learning: individual resources in an ontology can be
regarded as statistical units, and their properties can be statistically inferred even when
they cannot be deduced from the KB. Several approaches have been proposed in the SW
literature (see [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] for a recent survey). A major issue with the methods proposed so far
is that the induced statistical models (as those produced by kernel methods, tensor
factorization, etc.) are either difficult to interpret by experts and to integrate in logic-based
SW infrastructures, or computationally impractical.
1.1
      </p>
      <sec id="sec-1-1">
        <title>Related Work</title>
        <p>
          A variety of methods have been proposed for predicting the truth value of assertions in
Web ontologies, including generative models [
          <xref ref-type="bibr" rid="ref18 ref21">18,21</xref>
          ], kernel methods [
          <xref ref-type="bibr" rid="ref16 ref4 ref8">4,8,16</xref>
          ],
upgrading of propositional algorithms [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], matrix and tensor factorization methods [
          <xref ref-type="bibr" rid="ref17 ref26 ref9">9,17,26</xref>
          ].
An issue with existing methods is that they either rely on a possibly expensive search
process, or induce statistical models that are often not easy to interpret by human
experts. Kernel methods induce models (such as separating hyperplanes) in a
highdimensional feature space implicitly defined by a kernel function. The underlying
kernel function itself usually relies on purely syntactic features of the neighborhood graphs
of two individual resources (such as their common subtrees [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]). In both cases, there is
not necessarily a direct translation in terms of domain knowledge. Latent variable and
matrix or tensor factorization methods such as [
          <xref ref-type="bibr" rid="ref17 ref21 ref26 ref9">9, 17, 21, 26</xref>
          ] try to explain the
observations in terms of latent classes or attributes, which also may be non-trivial to describe
using the domain’s vocabulary. The approaches in [
          <xref ref-type="bibr" rid="ref15 ref18">15, 18</xref>
          ] try to overcome this
limitation by making use of more complex features and knowledge representation formalisms
jointly with the ontology’s terminology: these methods involve either a search process
in a possibly very large feature space or complex probabilistic inference tasks, which
might not be feasible in practice.
1.2
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>Contribution</title>
        <p>
          We propose a transductive inference method for predicting the truth value of assertions,
which is based on the following intuition: examples (each represented by a individual
in the ontology) that are similar in some aspects tend to be linked by specific relations.
Yet it may be not straightforward to find such relations for a given learning task. Our
approach aims at identifying such relations, and permits the efficient propagation of
information along chains of related entities. It turns out to be especially useful with
realworld shallow ontologies [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] (i.e. those with a relatively simple fixed terminology and
populated by very large amounts of instance data such as citation or social networks),
in which the characteristics of related entities tend to be correlated. These features are
particularly relevant in the context of the Linked Open Data [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] (LOD). Unlike other
approaches, the proposed method can be used to elicit which relations link examples
with similar characteristics. The proposed method is efficient, since the complexity of
both inference and learning grows polynomially with the number of statistical units.
        </p>
        <p>
          As in graph-based semi-supervised learning (SSL) methods [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], we rely on a
similarity graph among examples for propagating knowledge. However, SSL methods are
often designed for propositional representations; our method addresses the problem of
learning from real ontologies, where examples can be interlinked by heterogeneous
relations. In particular, this paper makes the following contributions:
– A method to efficiently propagating knowledge among similar examples: it
leverages a similarity graph, which plays a critical role in the propagation process.
– An approach to efficiently learning the similarity matrix, by leveraging a set of
semantically heterogeneous relations among examples in the ontology.
To the best of our knowledge, our approach is the first to explicitly identify relations
that semantically encode similarities among examples w.r.t. a given learning task.
        </p>
        <p>The remainder of the paper is organized as follows. In the next section, we review
the basics of semantic knowledge representation and reasoning tasks, and we introduce
the concept of transductive learning in the context of semantic KBs. In Sect. 3, we
illustrate the proposed method based on an efficient propagation of information among
related entities, and address the problem of identifying the relations relevant to the
learning task. In Sect. 4, we provide empirical evidence for the effectiveness of the
proposed method. Finally, in Sect. 5, we summarize the proposed approach, outline its
limitations and discuss possible future research directions.
2</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Transductive Learning with Web Ontologies</title>
      <p>
        We consider ontological KBs using Description Logics (DLs) as a language to describe
objects and their relations [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] Basics elements are atomic concepts NC = fC; D; : : :g
interpreted as subsets of a domain of objects (e.g. Person or Article), and atomic
roles NR = fR; S; : : :g interpreted as binary relations on such a domain (e.g. friendOf
or authorOf). Domain objects are represented by individuals NI = fa; b; : : :g: each
represents a domain entity (such as a person in a social network, or an article in a
citation network).
      </p>
      <p>In RDF(S)/OWL 1, concepts, roles and individuals are referred to as classes,
properties and resources, respectively, and are identified by their URIs. Complex concept
descriptions can be built using atomic concepts and roles by means of specific
constructors offered by expressive DLs.</p>
      <p>A Knowledge Base (KB) K = hT ; Ai contains a TBox T , made by terminological
axioms, and an ABox A, made by ground axioms, named assertions, involving
individuals. Ind(A) denotes the set of individuals occurring in A. As inference procedure,
Instance Checking consists in deciding whether K j= Q(a) (where Q is a query concept
and a is an individual) holds. Because of the Open-World Assumption (OWA), instance
checking may provide three possible outcomes, i.e. i) K j= Q(a), ii) K j= :Q(a)
and iii) K 6j= Q(a) ^ K 6j= :Q(a). This means that failing to deductively infer the
membership of an individual a to a concept Q does not imply that a is a member of its
complement :Q.</p>
      <p>
        Given the inherent incompleteness of deductive inference under open-world
reasoning, in this work we focus on transductive inference [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]: this consists in reasoning from
observed (training) cases to a specific set of test cases, without necessarily generalizing
to unseen cases. This differs from inductive inference, where training cases are used to
infer a general model, which is then applied to test cases.
      </p>
      <p>
        The main motivation behind the choice of transductive learning is described by the
main principle in [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]: “If you possess a restricted amount of information for solving
some problem, try to solve the problem directly and never solve a more general problem
1 OWL 2 W3C Recommendation: http://www.w3.org/TR/owl-overview/
as an intermediate step. It is possible that the available information is sufficient for a
direct solution but is insufficient for solving a more general intermediate problem”.
      </p>
      <p>On the ground of the available information, the proposed approach aims at learning
a labeling function for a given target class that can be used for predicting whether
individuals belong to a class C (positive class) or to its complement :C (negative class)
when this cannot be inferred deductively. The problem can be defined as follows:
Definition 2.1 (Transductive Class-Membership Learning). Given:
– a target class C in a KB K;
– a set of examples X Ind(A) partitioned into:
a set of positive examples: X+ , fa 2 X j K j= C(a)g;
a set of negative examples: X , fa 2 X j K j= :C(a)g;
a set of neutral (unlabeled) examples: X0 , fa 2 X j a 62 X+ ^ a 62 X g;
– a space of labeling functions F with domain X and range f 1; +1g, i.e.</p>
      <p>F , ff j f : X ! f+1; 1gg;
– a cost function cost( ) : F 7! R, specifying the cost associated to each labeling
functions f 2 F ;
Find f 2 F minimizing cost( ) w.r.t. X:
f arg min cost(f ):</p>
      <p>f 2F</p>
      <p>The transductive learning task is cast as the problem of finding a labeling function
f for a target class C, defined over a finite set of labeled and unlabeled examples X
(represented by a subset of the individuals in the KB), and minimizing some arbitrary
cost criterion.</p>
      <p>Example 2.1 (Transductive Class-Membership Learning). Consider an ontology
modeling an academic domain. The problem of learning whether a set of persons is affiliated
to a given research group or not, provided a set of positive and negative examples of
affiliates, can be cast as a transductive class-membership learning problem: examples
(a subset of the individuals in the ontology, each representing a person), represented by
the set X, can be either positive, negative or neutral depending on their membership to
a target class ResearchGroupAffiliate. Examples can be either unlabeled (if their
membership to the target class cannot be determined) or labeled (if positive or
negative). The transductive learning problem reduces to finding the best labeling function f
(according to a given criterion, represented by the cost function) providing membership
values for examples in X.</p>
      <p>
        In this work, we exploit the relations holding among examples to propagate
knowledge (in the form of label information) through chains of related examples. Inspired
by graph-based semi-supervised transductive learning, the criterion on which the cost
function will be defined follows the semi-supervised smoothness assumption [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]: if two
points in a high-density region are close, then so should be the corresponding outputs.
Transductive and semi-supervised learning are closely related: in both settings, test
examples are available during the learning task (in the form of unlabeled examples). In
the proposed method, the learning task is reduced to finding a labeling function f which
varies smoothly across the similarity graph defined over examples.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Knowledge Propagation</title>
      <p>In this section, we present our method for solving the learning problem in Def. 2.1 in
the context of Web ontologies. In Sect. 3.1 we show that a similarity graph between
examples can be used to efficiently propagate label information among similar examples.
The effectiveness of this approach strongly depends on the choice of a similarity graph
(represented by its adjacency matrix W). In Sect. 3.2, we show how the matrix W can
be learned from examples, by leveraging their relationship within the ontology.
3.1</p>
      <sec id="sec-3-1">
        <title>Transductive Inference as an Optimization Problem</title>
        <p>We now propose a solution to the transductive learning problem in Def. 2.1. As
discussed in the end of Sect. 2, we need a labeling function f defined over examples X,
which is both consistent with the training labels, and varies smoothly among similar
examples (according to the semi-supervised smoothness assumption). In the following,
we assume that a similarity graph over examples in X is already provided. Such a graph
is represented by its adjacency matrix W, such that Wij = Wji 0 if xi; xj 2 X
are similar, and 0 otherwise (for simplicity we assume that Wii = 0). In Sect. 3.2 we
propose a solution to the problem of learning W from examples.</p>
        <p>
          Formally, each labeling function f can be represented by a finite-size vector, where
fi 2 f 1; +1g is the label for the i-th element in the set of examples X. According
to [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ], labels can be enforced to vary smoothly among similar examples by considering
a cost function with the following form:
        </p>
        <p>E(f ) ,
2
1 XjXj jXj</p>
        <p>X Wij (fi
i=1 j=1
fj )2 +
jXj
X fi2;
i=1
(1)
where the first term enforces the labeling function to vary smoothly among similar
examples (i.e. those connected by an edge in the similarity graph), and the second term
is a L2 regularizer (with weight &gt; 0) over f . A labeling for unlabeled examples X0
is obtained by minimizing the function E( ) in Eq. 1, constraining the value of fi to 1
(resp. 1) if xi 2 X+ (resp. xi 2 X ).</p>
        <p>
          Let L , X+ [ X and U , X0 represent labeled and unlabeled examples
respectively. Constraining f to discrete values, i.e. fi 2 f 1; +1g; 8xi 2 X0, has two main
drawbacks:
– The function f only provides a hard classification (i.e. fU 2 f 1; +1gjUj), any
measure of confidence;
– E defines the energy function of a discrete Markov Random Field, and calculating
the marginal distribution over labels fU is inherently difficult [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
        <p>
          To overcome these problems, in [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] authors propose a continuous relaxation of fU ,
where labels for unlabeled examples are represented by real values, fU 2 RjUj, which
also express a measure of the classification confidence. This allows for a simple,
closedform solution to the problem of minimizing E for a fixed fL, where fL represents the
labels of labeled examples.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Application to Class-Membership Learning We can solve the learning problem in</title>
        <p>
          Def. 2.1 by minimizing the cost function E( ) in Eq. 1. It can be rewritten as [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]:
E(f ) = f T (D
        </p>
        <p>W)f + f T = f T (L + I)f ;
where D is a diagonal matrix such that Dii = PjjX=j1 Wij and L , D W is the graph
Laplacian of W. Reordering the matrix W, the graph Laplacian L and the vector f w.r.t.
their membership to L and U , they can be rewritten as:</p>
        <p>W =</p>
        <p>WLL WLU ;
WUL WUU</p>
        <p>L =</p>
        <p>
          LLL LLU ; f =
LUL LUU
fL :
fU
The problem of finding the fU minimizing the cost function E for a fixed value for fL
has a closed form solution:
fU = (LUU + I) 1WULfL:
(2)
(3)
(4)
Complexity A solution for Eq. 4 can be computed efficiently in nearly-linear time
w.r.t. jXj. Indeed computing fU can be reduced to solving a linear system in the form
Ax = b, with A = (LUU + I), b = WULfL and x = fU . A linear system Ax = b
with A 2 Rn n can be solved in nearly linear time w.r.t. n if the coefficient matrix
A is symmetric diagonally dominant2 (SDD), e.g. using the algorithm in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], whose
time-complexity is O m log1=2 n , where m is the number of non-zero entries in A
and n is the number of variables in the system of linear equations. In Eq. 4, the matrix
(LUU + I) is SDD (since LUU is a principal submatrix of L, which is SDD [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]). An
efficient parallel solver for SDD linear systems is discussed in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ].
3.2
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Learning to Propagate Knowledge in Web Ontologies</title>
        <p>As discussed in Sect. 3.1, the proposed approach to knowledge propagation relies on a
similarity graph, represented by its adjacency matrix W.</p>
        <p>The underlying assumption of this work is that some relations among examples
in the KB might encode a similarity relation w.r.t. a specific target property or class.
Identifying such relations can help propagate information through similar examples.</p>
        <p>
          In the literature, this effect goes under the name of Guilt-by-Association [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]:
related examples influence each other, and some relations (e.g. friendship in a social
network) can encode some form of similarity w.r.t. a set of properties (such as political
views, hobbies, religious beliefs). However, depending on the learning task at hand, not
all relations are equally effective at encoding similarity relations. For example in a
social network, friends may tend to share common interests, while quiet people may tend
to prefer talkative friends and vice-versa [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>In this work, we represent each relation by means of an adjacency matrix W~ , such
that W~ ij = W~ ji = 1 iff the relation rel(xi; xj ) between xi and xj holds in the
ontology; wrel might represent any generic relation between examples (e.g. friendship
or co-authorship). For simplicity, we assume that W~ ii = 0; 8i.
2 A matrix A is SDD iff A is symmetric (i.e. A = AT ) and 8i : Aii
Pi6=j jAijj.</p>
        <p>
          Given a set of adjacency matrices W , fW~ 1; : : : ; W~ rg (one for each relation
type), we can define W as a linear combination of the matrices in W:
r
W , X
i=1
iW~ i;
with i
where i, is a parameter representing the contribution of W~ i to the adjacency matrix of
the similarity graph W. Non-negativity in ensures that W has non-negative weights,
and therefore the corresponding graph Laplacian L is positive semidefinite [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] (PSD),
leading to the unique, closed form solution in Eq. 4.
        </p>
      </sec>
      <sec id="sec-3-4">
        <title>Probabilistic Interpretation as a Gaussian Random Field Let us consider the relax</title>
        <p>
          ation of the energy function in Eq. 2, such that labels f are allowed to range in RjXj
(f 2 RjXj). It corresponds to the following probability density function over f :
The probability density function in Eq. 6 defines a Gaussian Markov Random Field [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]
(GMRF) f N (0; ), where = 1 and = (L+ I) are respectively its
covariance and inverse covariance (or precision) matrix, and j j indicates the determinant of
the covariance matrix .
        </p>
        <p>
          The covariance matrix and its inverse fully determine the independence relations
among variables in a GMRF [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]: if ij 6= 0, then there is an edge between fi and fj in
the minimal I-map GMRF of p. A zero element in the inverse covariance matrix implies
that two variables are conditionally independent given all the other variables.
Parameters Learning The parametric form of W is fully specified by the
parameters in Eq. 5, which may be unknown. We will estimate the parameters by means of
Leave-One-Out (LOO) Error minimization: given that propagation can be performed
efficiently, we are able of directly minimizing the LOO error, consisting in the
summation of reconstruction errors obtained by considering each labeled example, in turn,
as unlabeled, and predicting its label (as in [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ]). This leads to a computationally
efficient procedure for evaluating the matrix W, and yields more flexibility as arbitrary
loss functions are allowed. Let Ui , U [ fxig and Li , L fxig: the labeling vector
f and matrices W and L, for any given xi 2 L, can be rewritten as follows:
f =
fLi ;
fUi
        </p>
        <p>W =</p>
        <sec id="sec-3-4-1">
          <title>WLiLi WLiUi ;</title>
          <p>WUiLi WUiUi
L =</p>
        </sec>
        <sec id="sec-3-4-2">
          <title>LLiLi LLiUi ;</title>
          <p>LUiLi LUiUi
where w.l.o.g. we assume that the left-out example xi 2 L corresponds to the first
element in Ui (in the enumeration used for the block representation in Eq. 7). Let `(x; x^)
be a generic, differentiable loss function (e.g. `(x; x^) = jx x^j for the absolute loss, or
`(x; x^) = (x x^)2=2 for the quadratic loss). The LOO Error is defined as follows:
Q( ) ,
where eT , (1; 0; : : : ; 0) 2 Ru+1 and ^fi , eT (LUiUi + I) 1WUiLi fLi represents the
continuous label value assigned to xi as if such a value was not known in advance. The
vector eT is needed to select the first value of fUi only, i.e. the inferred continuous label
associated to the left-out example xi 2 L. This leads to the definition of the following
criterion for learning the optimal set of parameters , f ; g:</p>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>Definition 3.1 (Minimum LOO Error Parameters). Given a set of labeled (resp. un</title>
        <p>labeled) examples L (resp. U ) and a similarity matrix W defined by parameters
(according to the parametric form of W in Eq. 5), the minimum LOO Error Parameters
LOO are defined as follows:</p>
        <p>LOO , arg min Q( ) + jj jj2;
(9)
where the function Q is defined as in Eq. 8 and &gt; 0 is a small positive scalar that
weights a regularization term over (useful for avoiding some parameters to diverge).</p>
        <p>The objective function in Def. 3.1 is differentiable and can be efficiently minimized
by using gradient-based function minimization approaches such as L-BFGS.</p>
        <p>Let Zi = (LUiUi + I). The gradient of Q w.r.t. a parameter 2 is given by:
fUi
(10)
Complexity of the Gradient Calculation Let zi = @W@UiLi fLi @@Zi fUi .
Calculating Z 1</p>
        <p>i zi can be reduced to solving a linear system in the form Ax = b, with
A = Zi = (LUiUi + I) and b = zi. As discussed in Sect. 3.1, this calculation has a
nearly-linear complexity in the number of non-zero elements in A, since Zi is SDD.
4</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Empirical Evaluation</title>
      <p>The transductive inference method discussed in Sect. 3, which we will refer to as
Adaptive Knowledge Propagation (AKP), was experimentally evaluated 3 in comparison with
other approaches proposed in the literature on a variety of assertion prediction
problems. In the following, we describe the setup of experiments and their outcomes.
4.1</p>
      <sec id="sec-4-1">
        <title>Setup</title>
        <p>
          In empirical evaluations, we used an open source DL reasoner 4. In experiments, we
considered the DBPEDIA 3.9 Ontology [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. DBPEDIA [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] makes available structured
information extracted from Wikipedia the LOD cloud providing unique identifiers for
the described entities that can be dereferenced over the Web. DBPEDIA 3.9, released in
September 2013, describes 4.0 million entities.
3 Sources and datasets are available at http://lacam.di.uniba.it/phd/pmm.html
4 Pellet v2.3.1 – http://clarkparsia.com/pellet/
Experimental Setting As discussed in Sect. 3.2, parameters = f ; g in AKP are
estimated by minimizing the Leave-One-Out error Q, as described in Eq. 9. We solved
the problem by using Projected Gradient Descent, according to the gradient formulation
in Eq. 10 (enforcing 0 and &gt; 0), together with an intermediate line search to
assess the step size. The regularization parameter in Eq. 9 was fixed to = 10 8.
In this work, each of the adjacency matrices W = fW~ 1; : : : ; W~ rg is associated to a
distinct atomic role in the ontology, linking at least two examples.
        </p>
        <p>
          Before each experiment, all knowledge inherent to the target class was removed
from the ontology. Following the evaluation procedures in [
          <xref ref-type="bibr" rid="ref16 ref28">16, 28</xref>
          ], members of the
target concepts were considered as positive examples, while an equal number of negative
examples was randomly sampled from unlabeled examples. Remaining instances (i.e.
neither positive nor negative) were considered as neutral examples.
        </p>
        <p>
          Results are reported in terms of Area Under the Precision-Recall Curve
(AUCPR), a measure to evaluate rankings also used in e.g. [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], and calculated using the
procedure described in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. In each experiment, we considered the problem of predicting
the membership to each of several classes; for each of such classes, we performed a
10-fold cross validation (CV), and report the average AUC-PR obtained using each of
the considered methods. Since the folds used to evaluate each of the methods do not
vary, we report statistical significance tests using a paired, non-parametric difference
test (Wilcoxon T test). We also report diagrams showing how using a limited quantity
of randomly sampled labeled training instances (i.e. 10%; 30%; 50%; : : :, a plausible
scenario for a number of real world settings with limited labeled training data), and
using the remaining examples for testing, affects the results in terms of AUC-PR.
Setup of the Compared Methods We compared our method with state-of-the-art
approaches proposed for learning from ontological KBs. Specifically, we selected two
kernel methods: Soft-Margin SVM [23, pg. 223] (SM-SVM) and Kernel Logistic
Regression (KLR), jointly with the Intersection SubTree [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] (IST) kernel for ontological
KBs, and the SUNS [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] relational prediction model. The relational graph used by both
the RDF kernel and SUNS was materialized as follows: all hs; p; oi triples were
retrieved by means of SPARQL-DL queries (where p was either an object or a data-type
property) together with all direct type and direct sub-class relations.
        </p>
        <p>
          As in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], IST kernel parameters were ranging in d 2 f1; 2; 3; 4g and ist 2
f0:1; 0:3; : : : ; 0:9g). In order to obtain a ranking among instances (provided by
softlabels f in AKP), we applied the logistic function s to the decision boundary f instead
of the standard sign function, commonly used in the classification context (thus
obtaining s(f ( )) : X ! [0; 1]). In SM-SVM, C 2 f0:0; 10 6; 10 4; : : : ; 104; 106g,
while in KLR the weight k associated to the L2 regularizer was found considering
k 2 f10 4; 10 3; : : : ; 104g. In SUNS, parameters were selected by means of a
10fold CV within the training set by grid optimization, with t 2 f2; 4; 6; : : : ; 24g and
s 2 f0; 10 2; 10 1; : : : ; 106g.
4.2
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Results</title>
        <p>
          Similarly to [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], we evaluated the proposed approach on two prediction tasks, namely
predicting party affiliations to either the Democratic and the Republican party for US
0.8
R
P
-0.6
C
U
A
0.4
0.2
        </p>
        <p>AUC-PR results – DBpedia 3.9 Fragment</p>
        <p>AKP (A)
SVM (IST)
KLR (IST)</p>
        <p>SUNS
10% 30% 50% 70% 90%
Percentage of labeled examples used during training
Method</p>
        <p>AKP (A)
SM-SVM (IST)</p>
        <p>KLR (IST)</p>
        <p>
          SUNS
presidents and vice-presidents. The experiment illustrated in [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] uses a small RDF
fragment containing the president and vicePresident predicates only. In this
experiment, we used a real-life fragment of DBPEDIA 3.9 (obtained by means of a
crawling process), containing a number of irrelevant and possibly noisy entities and relations.
Following the procedure in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], the DBPEDIA 3.9 RDF graph was traversed starting
from resources representing US presidents and vice-presidents: all immediate
neighbors, i.e. those with a recursion depth of 1, were retrieved, together with their related
schema information (direct classes and their super-classes, together with their
hierarchy). All extracted knowledge was used to create an ALCH ontology fragment, with
78795 axioms, 16606 individuals, 132 properties and 11 classes.
        </p>
        <p>In this experiment, 82 individuals representing US presidents and vice-presidents
were interlinked by 25 relations represented by atomic roles. The proposed method,
denoted as AKP (A), makes use of such atomic roles to identify relations holding among
the examples in the ontology.</p>
        <p>Experimental results are summarized in Fig. 1. We observe that AUC-PR values
obtained with AKP (A) are significantly higher than results obtained by other methods
considered in comparison (p &lt; 0:05, except for three cases in which p &lt; 0:10). Results
show how presidents and vice-presidents linked by simple relations such as president
and vicePresident tend to be affiliated to the same political party.</p>
        <p>AKP (A) is able to identify which atomic roles are likely to link same party
affiliates. As expected, it recognizes that relations represented by the president and
vicePresident atomic roles should be associated to higher weights, which means
that presidents and their vice-presidents tend to have similar political party affiliations.
AKP (A) also recognizes that presidents (or vice-presidents) linked by the successor
atomic role are unlikely to have similar political party affiliations.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Works</title>
      <p>In this work, we proposed a semi-supervised transductive inference method for
statistical learning in the context of the WEB OF DATA. Starting from the assumption that
some relations among entities in a Web ontology can encode similarity information
w.r.t. a given prediction task (pertaining a particular property of examples, such as a
class-membership relation), we proposed a method (named Adaptive Knowledge
Propagation, or AKP) for efficiently learning the best way to propagate knowledge among
related examples (each represented by an individual) in a Web ontology.</p>
      <p>We empirically show that the proposed method is able to identify which relations
encode similarity w.r.t. a given property, and that their identification can provide an
effective method for predicting unknown characteristics of individuals. We also show that
the proposed method can provide competitive results, in terms of AUC-PR, in
comparison with other state-of-the-art methods in literature.</p>
      <p>
        We only considered relations between statistical units (i.e. training examples)
encoded by atomic roles. However, those do not always suffice: for example, in the
research group affiliation prediction task discussed in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], individuals representing
researchers in the AIFB PORTAL ontology are not related by any atomic role. We are
currently investigating other approaches to identifying meaningful relations among
individuals, for example by means of Conjunctive Query Answering [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Other research
directions involve the study of different objective functions and optimization methods.
Acknowledgments This work fulfills the objectives of the PON 02 00563 3489339
project “PUGLIA@SERVICE - Internet-based Service Engineering enabling Smart
Territory structural development” funded by the Italian Ministry of University and
Research (MIUR).
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.F</given-names>
          </string-name>
          . (eds.):
          <article-title>The Description Logic Handbook</article-title>
          . Cambridge University Press (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hendler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lassila</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>The Semantic Web</article-title>
          .
          <source>Scientific American</source>
          <volume>284</volume>
          (
          <issue>5</issue>
          ),
          <fpage>34</fpage>
          -
          <lpage>43</lpage>
          (May
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobilarov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Becker</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellmann</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>DBpedia - a crystallization point for the Web of Data</article-title>
          .
          <source>J. Web Sem</source>
          .
          <volume>7</volume>
          (
          <issue>3</issue>
          ),
          <fpage>154</fpage>
          -
          <lpage>165</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bloehdorn</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sure</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Kernel methods for mining instance data in ontologies</article-title>
          . In: Aberer,
          <string-name>
            <surname>K.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of ISWC'07. LNCS</source>
          , vol.
          <volume>4825</volume>
          , pp.
          <fpage>58</fpage>
          -
          <lpage>71</lpage>
          . Springer (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chapelle</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          , Scho¨lkopf,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Zien</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (eds.):
          <article-title>Semi-Supervised Learning</article-title>
          . MIT Press (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>M.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kyng</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>G.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pachocki</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peng</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>S.C.</given-names>
          </string-name>
          :
          <article-title>Solving SDD linear systems in nearly mlog1/2n time</article-title>
          .
          <source>In: Shmoys [24]</source>
          , pp.
          <fpage>343</fpage>
          -
          <lpage>352</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goadrich</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The relationship between Precision-Recall and ROC curves</article-title>
          . In: Cohen,
          <string-name>
            <surname>W.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of ICML'06</source>
          . pp.
          <fpage>233</fpage>
          -
          <lpage>240</lpage>
          . ACM (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Fanizzi</surname>
          </string-name>
          , N.,
          <string-name>
            <surname>d'Amato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Esposito</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Induction of robust classifiers for web ontologies through kernel machines</article-title>
          .
          <source>J. Web Sem</source>
          .
          <volume>11</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Franz</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schultz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sizov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Staab</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>TripleRank: Ranking Semantic Web Data by Tensor Decomposition</article-title>
          . In: Bernstein,
          <string-name>
            <surname>A.</surname>
          </string-name>
          , et al. (eds.)
          <source>International Semantic Web Conference. LNCS</source>
          , vol.
          <volume>5823</volume>
          , pp.
          <fpage>213</fpage>
          -
          <lpage>228</lpage>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Linked Data: Evolving the Web into a Global Data Space</article-title>
          .
          <source>Synthesis Lectures on the Semantic Web</source>
          , Morgan &amp; Claypool Publishers (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Hellmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Learning of OWL Class Descriptions on Very Large Knowledge Bases</article-title>
          .
          <source>Int. J. Semantic Web Inf. Syst</source>
          .
          <volume>5</volume>
          (
          <issue>2</issue>
          ),
          <fpage>25</fpage>
          -
          <lpage>48</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Kro¨tzsch,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Rudolph</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          :
          <article-title>Foundations of Semantic Web Technologies</article-title>
          . Chapman &amp; Hall/CRC (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Koller</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friedman</surname>
          </string-name>
          , N.:
          <article-title>Probabilistic Graphical Models: Principles and Techniques</article-title>
          . MIT Press (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Koutra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ke</surname>
          </string-name>
          , T.Y.,
          <string-name>
            <surname>Kang</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chau</surname>
            ,
            <given-names>D.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pao</surname>
            ,
            <given-names>H.K.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faloutsos</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <string-name>
            <surname>Unifying</surname>
          </string-name>
          Guiltby-Association Approaches:
          <article-title>Theorems and Fast Algorithms</article-title>
          . In: Gunopulos,
          <string-name>
            <surname>D.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of ECML/PKDD'11. LNCS</source>
          , vol.
          <volume>6912</volume>
          , pp.
          <fpage>245</fpage>
          -
          <lpage>260</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>H.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koul</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Honavar</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Learning Relational Bayesian Classifiers from RDF Data</article-title>
          . In: Aroyo,
          <string-name>
            <surname>L.</surname>
          </string-name>
          , et al. (eds.)
          <source>International Semantic Web Conference (1)</source>
          . LNCS, vol.
          <volume>7031</volume>
          , pp.
          <fpage>389</fpage>
          -
          <lpage>404</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16. Lo¨sch, U.,
          <string-name>
            <surname>Bloehdorn</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rettinger</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Graph kernels for RDF data</article-title>
          . In: Simperl,
          <string-name>
            <surname>E.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of ESWC'12. LNCS</source>
          , vol.
          <volume>7295</volume>
          , pp.
          <fpage>134</fpage>
          -
          <lpage>148</lpage>
          . Springer (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Nickel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tresp</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kriegel</surname>
            ,
            <given-names>H.P.:</given-names>
          </string-name>
          <article-title>A Three-Way Model for Collective Learning on MultiRelational Data</article-title>
          . In: Getoor,
          <string-name>
            <surname>L.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of ICML'11</source>
          . pp.
          <fpage>809</fpage>
          -
          <lpage>816</lpage>
          .
          <string-name>
            <surname>Omnipress</surname>
          </string-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Ochoa-Luna</surname>
            ,
            <given-names>J.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cozman</surname>
            ,
            <given-names>F.G.</given-names>
          </string-name>
          :
          <article-title>An Algorithm for Learning with Probabilistic Description Logics</article-title>
          . In: Bobillo,
          <string-name>
            <surname>F.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of the 5th International Workshop on Uncertainty Reasoning for the Semantic Web, URSW09. CEUR Workshop Proceedings</source>
          , vol.
          <volume>654</volume>
          , pp.
          <fpage>63</fpage>
          -
          <lpage>74</lpage>
          . CEUR-WS.org (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Peng</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spielman</surname>
            ,
            <given-names>D.A.</given-names>
          </string-name>
          :
          <article-title>An efficient parallel solver for SDD linear systems</article-title>
          .
          <source>In: Shmoys [24]</source>
          , pp.
          <fpage>333</fpage>
          -
          <lpage>342</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Rettinger</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Lo¨sch, U.,
          <string-name>
            <surname>Tresp</surname>
          </string-name>
          , V.,
          <string-name>
            <surname>d'Amato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fanizzi</surname>
          </string-name>
          , N.:
          <article-title>Mining the Semantic Web: Statistical learning for next generation knowledge bases</article-title>
          .
          <source>Data Min. Knowl. Discov</source>
          .
          <volume>24</volume>
          (
          <issue>3</issue>
          ),
          <fpage>613</fpage>
          -
          <lpage>662</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Rettinger</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nickles</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tresp</surname>
          </string-name>
          , V.:
          <article-title>Statistical Relational Learning with Formal Ontologies</article-title>
          . In: Buntine,
          <string-name>
            <surname>W.L.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML/PKDD'09. LNCS</source>
          , vol.
          <volume>5782</volume>
          , pp.
          <fpage>286</fpage>
          -
          <lpage>301</lpage>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Shadbolt</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>W.:</given-names>
          </string-name>
          <article-title>The Semantic Web Revisited</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          <volume>21</volume>
          (
          <issue>3</issue>
          ),
          <fpage>96</fpage>
          -
          <lpage>101</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Shawe-Taylor</surname>
          </string-name>
          , J.,
          <string-name>
            <surname>Cristianini</surname>
          </string-name>
          , N.:
          <article-title>Kernel Methods for Pattern Analysis</article-title>
          . Cambridge University Press (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Shmoys</surname>
          </string-name>
          , D.B. (ed.):
          <source>Symposium on Theory of Computing, STOC</source>
          <year>2014</year>
          , New York, NY, USA, May 31 - June 03,
          <year>2014</year>
          . ACM (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Spielman</surname>
            ,
            <given-names>D.A.</given-names>
          </string-name>
          :
          <string-name>
            <surname>Algorithms</surname>
          </string-name>
          , Graph Theory, and
          <article-title>Linear Equations in Laplacian Matrices</article-title>
          .
          <source>In: Proceedings of ICM'10</source>
          . pp.
          <fpage>2698</fpage>
          -
          <lpage>2722</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Tresp</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bundschus</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rettinger</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Materializing and querying learned knowledge</article-title>
          .
          <source>In: Proceedings of IRMLeS'09</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.N.</given-names>
          </string-name>
          :
          <article-title>Statistical learning theory</article-title>
          .
          <source>Wiley</source>
          ,
          <volume>1</volume>
          <fpage>edn</fpage>
          .
          <source>(Sep</source>
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28. de Vries,
          <string-name>
            <surname>G.K.D.</surname>
          </string-name>
          :
          <article-title>A Fast Approximation of the Weisfeiler-Lehman Graph Kernel for RDF Data</article-title>
          . In: Blockeel,
          <string-name>
            <surname>H.</surname>
          </string-name>
          , et al. (eds.)
          <article-title>ECML/PKDD (1)</article-title>
          . LNCS, vol.
          <volume>8188</volume>
          , pp.
          <fpage>606</fpage>
          -
          <lpage>621</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          , et al.:
          <article-title>Hyperparameter Learning for Graph Based Semi-supervised Learning Algorithms</article-title>
          . In: Scho¨lkopf,
          <string-name>
            <surname>B.</surname>
          </string-name>
          , et al. (eds.) NIPS. pp.
          <fpage>1585</fpage>
          -
          <lpage>1592</lpage>
          . MIT Press (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghahramani</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lafferty</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          :
          <article-title>Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions</article-title>
          . In: Fawcett,
          <string-name>
            <surname>T.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of ICML'03</source>
          . pp.
          <fpage>912</fpage>
          -
          <lpage>919</lpage>
          . AAAI Press (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>