<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Provision and Usage of Provenance Data in the WebIsALOD Knowledge Graph</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sven Hertling</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heiko Paulheim</string-name>
          <email>heiko@informatik.uni-mannheim.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Data and Web Science Group, University of Mannheim</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The WebIsALOD dataset provides a linked data endpoint to the WebIsA database, which harvests millions of subsumption relations from a large-scale Web crawl using text patterns. For each of the relations, the dataset also contains rich provenance data, such as the text pattern used, the original sentence in which the pattern was found, and the source on the Web. In this paper, we describe several alternatives and design decisions for providing statement-level provenance information at large scale for the WebIsALOD dataset. Furthermore, we show the practical impact of that provenance information for computing confidence scores approximating the correctness of each subsumption relation.</p>
      </abstract>
      <kwd-group>
        <kwd>Provenance</kwd>
        <kwd>Knowledge Graph</kwd>
        <kwd>Reification</kwd>
        <kwd>Singleton Property</kwd>
        <kwd>Named Graph</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <label>1</label>
      <title>The WebIsALOD Knowledge Graph</title>
      <p>
WebIsALOD is a large-scale, cross-domain Semantic Web Knowledge Graph,
which provides subsumption relations between entities recognized on the Web.
The knowledge graph has been created from an initial extraction of this
information, i.e., the WebIsADB dataset [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], in order to provide a service in line
with Linked Data standards and best practices [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        The main idea of the WebIsADB is to extract hypernymy relations from
a huge and fixed web crawl called CommonCrawl<sup>1</sup>. The extraction method is
based on 58 Hearst-like lexico-syntactic patterns [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which are frequent patterns
for describing hypernymy relations. For example, the sentence <italic>Still, people use
Gmail and other Web services</italic> implies the hypernymy relation between <italic>Gmail</italic>
and <italic>Web service</italic>, which can be captured by the pattern <italic>NP and other NP</italic>.<sup>2</sup> The
original dataset contains 400,533,808 relations, 120,992,255 unique hyponyms,
and 107,691,822 unique hypernyms. Thus, the knowledge graph contains many
more instances than popular public knowledge graphs such as DBpedia [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
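      <p>As a rough illustration of how a single lexico-syntactic pattern yields a hypernymy relation, the following Python sketch applies a toy version of the <italic>NP and other NP</italic> pattern to the example sentence. It is not the WebIsADB extraction code: the regular expression, the function names, and the crude plural stripping are assumptions made for this example, and real noun phrase detection would require a proper chunker.</p>
      <preformat>
```python
import re

# Toy stand-in for the "NP and other NP" pattern (an assumption for this
# sketch): a single-word hyponym, then a hypernym phrase that runs up to the
# next punctuation mark or the end of the sentence.
NP_AND_OTHER_NP = re.compile(r"(\w+) and other ((?:\w+ )*?\w+)(?=[.,;!?]|$)")


def naive_singular(phrase):
    """Strip one trailing plural 's' (crude, not real lemmatisation)."""
    return phrase[:-1] if phrase.endswith("s") else phrase


def extract_isa_pairs(sentence):
    """Return (hyponym, hypernym) pairs matched by the toy pattern."""
    return [
        (match.group(1), naive_singular(match.group(2)))
        for match in NP_AND_OTHER_NP.finditer(sentence)
    ]


pairs = extract_isa_pairs("Still, people use Gmail and other Web services")
print(pairs)
```
      </preformat>
      <p>For the example sentence, the sketch yields the pair (Gmail, Web service), i.e., the relation that is later published via skos:broader.</p>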
      <p>
        For providing the WebIsADB as Linked Data, we represent the hypernymy
relations using SKOS<sup>3</sup> via the skos:broader relation. As described in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the original dataset contains a lot of noisy extractions; therefore, we train a machine
learning model to compute a confidence score for each relation (see Section 4).
The final resulting dataset consists of the original 400,533,808 hypernymy
relations, together with a confidence score and provenance metadata (see below), as
well as 2,593,181 instance links to DBpedia [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and 23,771 class links to YAGO
[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. All in all, the dataset consists of 5.4B triples [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The dataset is available
online as a Linked Data service, a SPARQL endpoint, and as an RDF dump.<sup>4</sup>
      </p>
      <p>[Figure 1 (only labels recoverable): the concept isa:concept/Web_service_ with the properties rdfs:label, isa:hasHead, and isa:hasPostModifier and the literals "Web service", "service", and "Web".]</p>
      <p>Copyright © 2018 for this paper by its authors. Copying permitted for private and academic purposes.</p>
      <p>1 https://commoncrawl.org</p>
      <p>2 NP stands for noun phrase.</p>
      <p>3 https://www.w3.org/TR/skos-reference/</p>
    </sec>
    <sec>
      <label>2</label>
      <title>Provenance Information Provided</title>
      <p>For each single relation, the WebIsADB also records how it was
created, i.e., the originating sentence, its source, and the pattern that was used
to find the relation. Furthermore, statistical metadata is computed from that
information, i.e., the overall number of observations, the number of different
patterns, and the number of different sources in which the relation was found.
Additionally, we include a pointer to a scientific literature source for each pattern
(i.e., the paper in which the pattern was proposed). Where possible, we reused
constructs from the PROV ontology<sup>5</sup>, while we created our own properties where
no suitable concepts were defined in that ontology.</p>
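      <p>The statistical metadata described above can be derived directly from the individual observations of a relation. The following Python sketch shows one possible way to do so; the Observation record, its field names, and the sample sources and sentences (other than the Gmail sentence quoted earlier) are hypothetical and only serve the illustration.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One extraction of a relation (hypothetical record layout)."""
    sentence: str  # originating sentence
    source: str    # where on the Web the sentence was found
    pattern: str   # pattern that matched


def provenance_statistics(observations):
    """Compute the three per-relation statistics named in the text."""
    return {
        "observations": len(observations),
        "distinct_patterns": len({o.pattern for o in observations}),
        "distinct_sources": len({o.source for o in observations}),
    }


# Made-up observations for the relation (Gmail, Web service).
observations = [
    Observation("Still, people use Gmail and other Web services", "source-a", "NP and other NP"),
    Observation("Web services such as Gmail are popular", "source-b", "NP such as NP"),
    Observation("Gmail and other Web services were down", "source-a", "NP and other NP"),
]
stats = provenance_statistics(observations)
print(stats)
```
      </preformat>
      <p>Counts of this kind are what later serve as input for rating the correctness of a relation.</p>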
      <p>For each entity (i.e., hypernym or hyponym), we also provide information
generated during the syntactic analysis which is performed to extract the
statement, i.e., the head noun and potential pre- and post-modifiers. The big picture
is shown in Fig. 1. Note that for each relation, multiple sources and patterns can
be provided.</p>
      <p>4 http://webisa.webdatacommons.org/</p>
      <p>5 https://www.w3.org/TR/prov-o/</p>
      <p>[Figure 2: alternatives for attaching the provenance statements prov:wasDerivedFrom isa:source4082 and prov:wasGeneratedBy isa:activity54 to the relation between isa:_Gmail and isa:Web_service via skos:broader. Panels: (a) Reification, (b) Singleton Property, (c) n-ary Relation, (d) NdFluents, (e) Named Graphs.]</p>
    </sec>
    <sec>
      <label>3</label>
      <title>Alternatives for Providing Provenance Metadata</title>
      <p>In the WebIsALOD knowledge graph, we provide provenance information on
statement level, i.e., for each single triple with a skos:broader relation,
metadata has to be attached to that very triple. To that end, we explored different
alternatives, which are shown in Fig. 2. Each of them has its own advantages
and disadvantages.</p>
    </sec>
    <sec id="sec-2">
      <label>3.1</label>
      <title>RDF Reification</title>
      <p>RDF provides a means called reification to make statements about statements.
For each statement to be reified, a single RDF node representing the triple is
created, which has a relation to the subject, the predicate, and the object.</p>
      <p>On the positive side, RDF reification is well understood, since it is rather
intuitive and covered in many Semantic Web documentations, tutorials, and
textbooks<sup>6</sup>. On the negative side, the number of RDF triples is drastically increased:
a single triple has to be replaced by four triples to allow for reification.</p>
      <p>6 e.g., the W3C RDF primer, https://www.w3.org/TR/rdf-primer/</p>
      <preformat>_:43274 a rdf:Statement .
_:43274 rdf:subject isa:_GMail .
_:43274 rdf:predicate skos:broader .
_:43274 rdf:object isa:Web_Service .
_:43274 prov:wasDerivedFrom isa:source4082 .
_:43274 prov:wasGeneratedBy isa:activity54 .</preformat>
      <preformat>SELECT DISTINCT ?label WHERE {
  ?s1 a rdf:Statement ;
      rdf:subject isa:_GMail ;
      rdf:predicate skos:broader ;
      rdf:object ?x .
  ?s2 a rdf:Statement ;
      rdf:subject ?x ;
      rdf:predicate skos:broader ;
      rdf:object ?y .
  ?y rdfs:label ?label .
  ?s1 isaont:hasConfidence ?c1 .
  ?s2 isaont:hasConfidence ?c2 .
  FILTER(?c1 &gt; 0.75 &amp;&amp; ?c2 &gt; 0.75)
}</preformat>
      <p>In our case, this would mean that to represent 400M subsumption relations,
1.6B RDF triples would be required for the statements alone, not including any
provenance information. Likewise, SPARQL queries against such a dataset
involving both the subsumptions as well as the provenance information can become
rather complex.</p>
      <p>In the following, we will use the example of querying for ancestors of a fixed
concept which are two levels up (i.e., broader terms of broader terms of a fixed
concept), where both subsumption relations are required to have a minimum
confidence of 0.75. Using reification, this query would look as in Figure 3b.</p>
    </sec>
    <sec id="sec-3">
      <label>3.2</label>
      <title>Singleton Properties</title>
      <p>
        An alternative proposed in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is to define a singleton property for each
subsumption relation. This property can be made an instance of the desired relation
(in our case: skos:broader) and is then used as a subject of the attached
provenance information. This approach is slightly less verbose than RDF reification.
      </p>
      <p>
        On the downside, in the case of WebIsALOD, the resulting schema with
400M direct subproperties of skos:broader could be regarded as slightly
deteriorated, and there are experience reports with large-scale knowledge graphs
that hint at some triple stores suffering from such large numbers of singleton
properties [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Moreover, with this approach, additional important properties of
the skos:broader relation, in particular transitivity, have to be handled with particular
care when implementing the semantics of rdf:singletonPropertyOf.
      </p>
      <p>The above query reformulated with singleton properties is shown in the accompanying figure.</p>
      <preformat>isa:_GMail isa:broader_43274 isa:Web_Service .
isa:broader_43274 rdf:singletonPropertyOf skos:broader .
isa:broader_43274 prov:wasDerivedFrom isa:source4082 .
isa:broader_43274 prov:wasGeneratedBy isa:activity54 .</preformat>
      <preformat>SELECT DISTINCT ?label WHERE {
  isa:_GMail ?p1 ?x .
  ?x ?p2 ?y .
  ?p1 rdf:singletonPropertyOf skos:broader .
  ?p2 rdf:singletonPropertyOf skos:broader .
  ?y rdfs:label ?label .
  ?p1 isaont:hasConfidence ?c1 .
  ?p2 isaont:hasConfidence ?c2 .
  FILTER(?c1 &gt; 0.75 &amp;&amp; ?c2 &gt; 0.75)
}</preformat>
    </sec>
    <sec>
      <label>3.3</label>
      <title>n-ary Relations</title>
      <p>
        While RDF in its native form only supports binary relations, a pattern for the
representation of n-ary relations has been proposed as well [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], naming the
utilization for representing context information of a relation as a possible use
case. The relation itself is represented as a blank node here, and the original
relation is broken down into two.
      </p>
      <p>The verbosity of this variant is fairly low, requiring only the blank node
for the relation as an additional resource. Transitivity of the skos:broader
relation can also be built into the model by using OWL 2 property chains. However,
exploiting the transitivity of skos:broader then requires an OWL 2 reasoner,
whereas standard OWL Lite reasoning suffices for exploiting the simple transitivity
of the original definition. Moreover, in the original description of the pattern,
OWL constraints for the new relations are also defined, using both universal
and existential quantification, and hence leaving even the tractable OWL EL
fragment.</p>
      <p>Using n-ary relations, the above query would look as in Figure 5b.</p>
    </sec>
    <sec id="sec-4">
      <label>3.4</label>
      <title>NdFluents</title>
      <p>
        NdFluents is an ontology and a set of design patterns proposed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. It is an
extension of the 4dFluents ontology for adding temporal context to statements
without changing the RDF data model [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], which it extends to arbitrary context
information beyond temporal context. The authors argue that this approach is
better suited for preserving inference than using RDF reification.
      </p>
      <p>The NdFluents approach foresees the creation of a “copy” of both the subject
and the object to attach context information to.</p>
      <p>We can observe that the number of triples is even larger than for reification.
Moreover, the new resources need to be created for each single statement
a resource is involved in. For example, the resource isa:_president_ has 1,821
hyponyms and 4,656 hypernyms, which would require the creation of 6,477 new
resources alone for representing the resource isa:_president_. In total, for
WebIsALOD, 400M relations would require the creation of 800M additional
resources, i.e., increasing the number of resources in the dataset by a factor of
more than four.</p>
      <preformat>SELECT DISTINCT ?label WHERE {
  ?gmail1 nd:provenancePartOf isa:_GMail .
  ?gmail1 skos:broader ?x1 .
  ?x1 nd:provenancePartOf ?x .
  ?x2 nd:provenancePartOf ?x .
  ?x2 skos:broader ?y1 .
  ?y1 nd:provenancePartOf ?y .
  ?y rdfs:label ?label .
  ?x1 nd:provenanceExtent ?e1 .
  ?x2 nd:provenanceExtent ?e2 .
  ?x1 isaont:hasConfidence ?c1 .
  ?x2 isaont:hasConfidence ?c2 .
  FILTER(?c1 &gt; 0.75 &amp;&amp; ?c2 &gt; 0.75)
}</preformat>
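      <p>The resource counts quoted for NdFluents can be checked with a few lines of arithmetic. The Python sketch below uses only figures stated in this paper; the one assumption, marked in the code, is that no term occurs both as a hyponym and as a hypernym, which gives an upper bound on the number of original resources and therefore a lower bound on the growth factor.</p>
      <preformat>
```python
# Figures quoted in the paper.
RELATIONS = 400_533_808
UNIQUE_HYPONYMS = 120_992_255
UNIQUE_HYPERNYMS = 107_691_822

# isa:_president_: one new resource per statement it is involved in.
president_copies = 1_821 + 4_656

# NdFluents creates a copy of the subject and of the object per relation.
additional_resources = 2 * RELATIONS

# ASSUMPTION: hyponyms and hypernyms do not overlap. Any overlap makes the
# original resource count smaller and the growth factor larger.
original_upper_bound = UNIQUE_HYPONYMS + UNIQUE_HYPERNYMS
growth_lower_bound = (original_upper_bound + additional_resources) / original_upper_bound

print(f"copies for isa:_president_: {president_copies:,}")
print(f"additional resources: {additional_resources:,}")
print(f"growth factor of at least {growth_lower_bound:.2f}")
```
      </preformat>
      <p>Under that assumption the factor comes out at roughly 4.5, which is consistent with the statement that the number of resources grows by a factor of more than four.</p>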
      <p>The query above, formulated against an NdFluents dataset, is shown in the accompanying figure.</p>
    </sec>
    <sec id="sec-5">
      <label>3.5</label>
      <title>RDF Graphs</title>
      <p>
        RDF named graphs form collections of RDF statements, which are said to belong
to a certain graph. Such a collection of RDF statements in an RDF graph is
assigned a URI (which makes it a named graph) and can be used as a subject
and/or object of other statements [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Often, RDF named graphs are represented
using RDF quads.<sup>7</sup> For WebIsALOD, we turn every subsumption into its own
named graph, which is then used as a subject of further provenance information
(in the WebIsALOD main graph).
      </p>
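      <p>To make the one-graph-per-statement layout concrete, the following Python sketch writes a single subsumption into its own named graph and attaches provenance statements to the graph name. The function name and its parameters are hypothetical; the output uses prefixed names as in the listings of this paper, so it is a readable abbreviation rather than strict N-Quads, which would require full IRIs.</p>
      <preformat>
```python
def provenance_quads(subject, obj, graph, source, activity):
    """Return one quad for the subsumption plus triples about its graph.

    All arguments are prefixed names (e.g. "isa:_GMail"); a real N-Quads
    serialisation would expand them to full IRIs.
    """
    return [
        # The subsumption itself, placed in its own named graph.
        f"{subject} skos:broader {obj} {graph} .",
        # Provenance statements use the graph name as their subject.
        f"{graph} prov:wasDerivedFrom {source} .",
        f"{graph} prov:wasGeneratedBy {activity} .",
    ]


lines = provenance_quads(
    "isa:_GMail", "isa:Web_Service", "isa:prov43274",
    "isa:source4082", "isa:activity54",
)
print("\n".join(lines))
```
      </preformat>
      <p>Each additional subsumption adds exactly one quad for the statement itself, independent of how many provenance statements are attached to its graph.</p>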
      <p>
        Like RDF reification, named graphs are also easily understood, and the RDF
quad notation allows for relatively simple formulation of statements and efficient
SPARQL queries. Furthermore, the use of RDF reification is often discouraged
in the Linked Data context in favor of using graphs and quads instead [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. On the
downside, RDF graphs are originally meant to hold a collection of RDF triples,
whereas creating a single named graph for each triple, as in our case, can be
regarded as a slightly abusive use of named graphs.
      </p>
      <p>The query example would look as in Figure 7b.</p>
    </sec>
    <sec id="sec-6">
      <label>3.6</label>
      <title>Design Decision</title>
      <p>Looking at the considerations above may lead to different conclusions,
depending on which criteria are deemed more important. Our aim was to provide a
dataset which is versatile enough to satisfy different use cases and which offers
good usability and understandability to ease adoption as much as possible.
Additionally, given the sheer data volume, the verbosity should not be too high,
i.e., not multiply the original dataset's size by a large factor. Another important
aim was to allow exploitation of the transitivity of skos:broader, i.e., easily
retrieving all hyponyms or hypernyms of a concept.</p>
      <p>7 https://www.w3.org/TR/n-quads/</p>
      <preformat>isa:_GMail skos:broader isa:Web_Service isa:prov43274 .
isa:prov43274 prov:wasDerivedFrom isa:source4082 .
isa:prov43274 prov:wasGeneratedBy isa:activity54 .</preformat>
      <preformat>SELECT DISTINCT ?label WHERE {
  GRAPH ?g1 {
    isa:_GMail skos:broader ?x .
  }
  GRAPH ?g2 {
    ?x skos:broader ?y .
  }
  ?y rdfs:label ?label .
  ?g1 isaont:hasConfidence ?c1 .
  ?g2 isaont:hasConfidence ?c2 .
  FILTER(?c1 &gt; 0.75 &amp;&amp; ?c2 &gt; 0.75)
}</preformat>
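      <p>The verbosity criterion can be made tangible by counting how many statements each alternative needs just to make a subsumption addressable, before any provenance is attached. The Python sketch below does this for three of the alternatives; the per-relation counts are our reading of the example listings in this paper (four triples for reification, two for a singleton property, one quad for a named graph) and should be taken as an illustration, not as measured dataset sizes.</p>
      <preformat>
```python
RELATIONS = 400_533_808  # hypernymy relations reported for the dataset

# Statements needed to encode ONE subsumption so that metadata can be
# attached, read off the example listings (provenance statements excluded).
STATEMENTS_PER_RELATION = {
    "reification": 4,         # rdf:Statement typing, subject, predicate, object
    "singleton property": 2,  # the statement plus rdf:singletonPropertyOf
    "named graph": 1,         # a single quad
}

totals = {name: n * RELATIONS for name, n in STATEMENTS_PER_RELATION.items()}
for name, total in totals.items():
    print(f"{name}: {total:,} statements ({total / 1e9:.2f}B)")
```
      </preformat>
      <p>The reification row reproduces the figure of about 1.6B triples mentioned for that alternative, while the named graph variant stays at the size of the original relation set.</p>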
      <p>
        Apart from those theoretical aspects, practical considerations also played a role
in the design decision. Due to the sheer volume of the dataset, we had to pick
an RDF triple store which can handle such a large knowledge graph; therefore,
we chose Virtuoso [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which is free software and at the same time has been
shown to be highly scalable [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Consulting the documentation, we found that
Virtuoso also recommends the use of Named Graphs, whereas the documentation
states that “the RDF reification vocabulary can be used [...] It is however very
inefficient and is not supported by any specific optimization.”<sup>8</sup> Therefore, RDF
Named Graphs were ultimately used to implement provenance information in
the WebIsALOD knowledge graph.
      </p>
      <p>8 https://www.openlinksw.com/weblog/oerling/?id=1572</p>
    </sec>
    <sec>
      <label>4</label>
      <title>Exploitation of Provenance Information</title>
      <p>
        Since the extraction of the original WebIsADB dataset was focused on
coverage rather than correctness, it contains quite a few noisy extractions. Hence,
we had to apply some post-refinement of the knowledge graph to be able to
serve a dataset which has a useful quality [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Instead of filtering statements,
we have decided to follow the spirit of the original dataset, i.e., not to reduce
the coverage, but rather to provide confidence values for each statement. That
way, consumers of the dataset can control the trade-off between coverage and
correctness themselves, depending on the use case at hand. At the same time,
the confidence scores are also used to order the results in the dataset's front end,
showing the most trusted statements at the top.
      </p>
      <p>
        As shown in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], frequency alone is not a good indicator
of quality. Basically, each statement observed with more than one pattern and
on more than one source has the same likelihood of being correct, regardless of
the actual frequency. At the same time, this likelihood is fairly low (below 35%),
which makes this approach unsuitable for curating a dataset of high quality.
      </p>
      <p>On the other hand, the information contained in the provenance metadata
can be a useful indicator for rating the correctness of a statement: e.g., some
patterns may be prone to creating more noise than others, and a larger spread
of patterns and sources may be a better indicator for statement correctness.</p>
      <p>
        For the WebIsALOD dataset, we trained a machine learning model to capture
such meta-patterns and exploited it to rate the correctness of all statements in
the dataset. More precisely, we had a ground truth dataset annotated by means
of crowdsourcing, indicating the correctness or incorrectness of a statement.
This dataset was then used to train a classifier to tell correct from incorrect
statements, and the confidence score provided by the classifier is added to the
provenance data as a confidence score of the statement. A RandomForest
classifier has been shown to achieve an area under the ROC curve of up to 0.84, i.e.,
it can assign rather precise confidence scores [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
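      <p>Once every statement carries a confidence score, a consumer can trade coverage against correctness by choosing a threshold. The following Python sketch mirrors the running example query (broader terms of broader terms, both relations above a minimum confidence) on a small in-memory list. The function name, the tuple layout, and all relations and scores in the sample data are invented for the illustration and do not come from the dataset.</p>
      <preformat>
```python
from collections import defaultdict


def two_level_ancestors(relations, concept, threshold=0.75):
    """Broader terms of broader terms, using only confident relations.

    relations: iterable of (narrower, broader, confidence) tuples.
    """
    broader = defaultdict(list)
    for narrower, wider, confidence in relations:
        if confidence > threshold:
            broader[narrower].append(wider)
    return sorted({y for x in broader[concept] for y in broader[x]})


# Invented sample data; the scores are NOT taken from WebIsALOD.
relations = [
    ("Gmail", "Web service", 0.90),
    ("Web service", "service", 0.80),
    ("Web service", "thing", 0.40),
    ("Gmail", "company", 0.30),
    ("company", "organization", 0.95),
]

print(two_level_ancestors(relations, "Gmail"))
print(two_level_ancestors(relations, "Gmail", threshold=0.2))
```
      </preformat>
      <p>Lowering the threshold admits more two-level ancestors at the price of including less trusted relations, which is the trade-off left to the consumer of the dataset.</p>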
      <p>Using those scores, it is possible to set a threshold for the quality of the
relations when querying the knowledge graph, as in the examples above.</p>
    </sec>
    <sec>
      <label>5</label>
      <title>Conclusion</title>
      <p>In this paper, we have shown how provenance information is used in the
WebIsALOD knowledge graph. The dataset contains a large volume of provenance
metadata, which is attached to individual statements.</p>
      <p>Adding statement-level provenance information to a dataset of that size does
not come without challenges. We have explored different alternatives and
decided to use named graphs for providing provenance information. However, this
is a decision that we deemed suitable for the knowledge graph at hand, and other
datasets with other characteristics (e.g., different sizes, larger number of
statements sharing the same provenance information), and/or another underlying
tool stack, might be better served by other approaches.</p>
      <p>
        We hope that this paper can inspire other dataset providers to add
fine-grained provenance information, since provenance information is still not used
by the majority of datasets on the LOD cloud [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and that the experience
shared in this paper might serve as helpful advice for implementing provenance
in a way that suits the dataset at hand.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Carroll</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hayes</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stickler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Named graphs</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          <volume>3</volume>
          (
          <issue>4</issue>
          ),
          <fpage>247</fpage>
          –
          <lpage>267</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Erling</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Virtuoso, a hybrid RDBMS/graph column store</article-title>
          .
          <source>IEEE Data Eng. Bull.</source>
          <volume>35</volume>
          (
          <issue>1</issue>
          ),
          <fpage>3</fpage>
          –
          <lpage>8</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Giménez-García</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zimmermann</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maret</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>NdFluents: An ontology for annotated statements with inference preservation</article-title>
          .
          <source>In: European Semantic Web Conference</source>
          . pp.
          <fpage>638</fpage>
          –
          <lpage>654</lpage>
          . Springer (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Hearst</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>Automatic acquisition of hyponyms from large text corpora</article-title>
          .
          <source>In: Proceedings of COLING '92</source>
          . pp.
          <fpage>539</fpage>
          –
          <lpage>545</lpage>
          (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Hernandez</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Krotzsch, M.:
          <article-title>Reifying rdf: What works well with wikidata</article-title>
          ?
          <source>SSWS ISWC 1457</source>
          ,
          <issue>32</issue>
          {
          <fpage>47</fpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hertling</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>WebIsALOD: providing hypernymy relations extracted from the web as linked open data</article-title>
          .
          <source>In: ISWC</source>
          . pp.
          <fpage>111</fpage>
          –
          <lpage>119</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isele</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jakob</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jentzsch</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontokostas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendes</surname>
            ,
            <given-names>P.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morsey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Kleef</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia</article-title>
          .
          <source>Semantic Web Journal</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          ) (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Morsey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ngomo</surname>
            ,
            <given-names>A.C.N.</given-names>
          </string-name>
          :
          <article-title>DBpedia SPARQL benchmark – performance assessment with real queries on real data</article-title>
          .
          <source>In: ISWC</source>
          . pp.
          <fpage>454</fpage>
          –
          <lpage>469</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Ngomo</surname>
            ,
            <given-names>A.C.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaveri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Introduction to linked data and its lifecycle on the web</article-title>
          .
          <source>In: Reasoning Web International Summer School</source>
          . pp.
          <fpage>1</fpage>
          –
          <lpage>99</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bodenreider</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Don't like RDF reification?: making statements about statements using singleton property</article-title>
          .
          <source>In: Proceedings of the 23rd international conference on World wide web</source>
          . pp.
          <fpage>759</fpage>
          –
          <lpage>770</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Noy</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rector</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Defining n-ary relations on the semantic web</article-title>
          (
          <year>2006</year>
          ), https://www.w3.org/TR/swbp-n-aryRelations/
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Knowledge Graph Refinement: A Survey of Approaches and Evaluation Methods</article-title>
          .
          <source>Semantic Web</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Ringler</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>One knowledge graph to rule them all? Analyzing the differences between DBpedia, YAGO, Wikidata &amp; co</article-title>
          .
          <source>In: 40th German Conference on Artificial Intelligence</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Schmachtenberg</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Adoption of the linked data best practices in different topical domains</article-title>
          .
          <source>In: ISWC</source>
          . pp.
          <fpage>245</fpage>
          –
          <lpage>260</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Seitner</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eckert</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faralli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meusel</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ponzetto</surname>
            ,
            <given-names>S.P.</given-names>
          </string-name>
          :
          <article-title>A large database of hypernymy relations extracted from the web</article-title>
          .
          <source>In: LREC</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Suchanek</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kasneci</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weikum</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>YAGO: A Core of Semantic Knowledge Unifying WordNet and Wikipedia</article-title>
          .
          <source>In: 16th international conference on World Wide Web</source>
          . pp.
          <fpage>697</fpage>
          –
          <lpage>706</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Welty</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fikes</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Makarios</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>A reusable ontology for fluents in OWL</article-title>
          .
          <source>In: FOIS</source>
          . vol.
          <volume>150</volume>
          , pp.
          <fpage>226</fpage>
          –
          <lpage>236</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>