<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Linking Wikidata to the rest of the Semantic Web</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andra Waagmeester</string-name>
          <email>andra@micelio.be</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Egon Willighagen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nuria Queralt Rosinach</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elvira Mitraka</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian Burgstaller-Muehlbacher</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tim E. Putman</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Julia Turner</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lynn M Schriml</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul Pavlidis</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrew I Su</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benjamin M Good</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University</institution>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Molecular and Experimental Medicine, the Scripps Research Institute</institution>
          ,
          <addr-line>La Jolla</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Micelio</institution>
          ,
          <addr-line>Antwerp 2180</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>The University of British Columbia</institution>
          ,
          <addr-line>Vancouver, British Columbia</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Maryland Baltimore</institution>
          ,
          <addr-line>Baltimore, MD</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Keywords: federated queries, SPARQL, Wikidata</p>
        <p>Wikidata is the linked database of Wikipedia and other sister projects of the Wikimedia Foundation. Like Wikipedia, Wikidata is not limited to a select set of knowledge domains; in principle it can capture all knowledge. However, external resources can contain much more granularity on a given topic. Research in the life sciences, for example, can generate vast amounts of data (e.g. high-throughput screening), so much that storing it all in Wikidata will not be possible. It would be more effective to consider Wikidata the central hub between more detailed life-science hubs on the semantic web. We present how Wikidata can be linked to, and used to link, other semantic sources. We compare four federated SPARQL query patterns to create the links. To link Wikidata to the semantic web, we have followed four routes: (1) using the FILTER operator in SPARQL; (2) using IRIs compiled on the fly from existing, value-based shared identifiers with SPARQL's BIND operator; (3) storing Wikidata item identifiers in external data sources as mappings to local identifiers; and (4) storing remote IRIs as Wikibase property values of type URL.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Background</title>
    </sec>
    <sec id="sec-2">
      <title>Methods</title>
      <p><bold>Using the FILTER operator in SPARQL queries.</bold> Wikidata and an external source can be linked through a SPARQL
query. When the shared identifiers are stored as literals in both resources, linking through IRIs is
not an option, and SPARQL filters can be used instead. The FILTER
operator from SPARQL limits the results of a graph pattern based on boolean
or regular expressions. This works only if the results of both graph patterns
are modest enough in size and complexity for the SPARQL endpoint to
process both result sets. We demonstrate this using graph patterns from
Wikidata and WikiPathways.</p>
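      <p>The FILTER route described above can be sketched as a small Python helper that assembles the federated query text. This is an illustration under stated assumptions, not the query used in this work: the WikiPathways endpoint URL, the Wikidata property P2410 (WikiPathways ID), the wp:Pathway class, and the dcterms:identifier predicate are assumptions that do not come from the text, and the function name is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Assumed endpoint URL; replace it with the endpoint you actually query.
WIKIPATHWAYS_ENDPOINT = "https://sparql.wikipathways.org/sparql"


def filter_link_query(remote_endpoint: str = WIKIPATHWAYS_ENDPOINT,
                      limit: int = 10) -> str:
    """Build a federated query that joins two literal identifiers with FILTER.

    Hypothetical helper. Both sides expose the identifier as a literal, so the
    join is an explicit string comparison rather than a shared IRI.
    """
    return f"""
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT ?item ?pathway WHERE {{
  # Wikidata side: identifier stored as a literal (P2410 is an assumption).
  ?item wdt:P2410 ?wikidataId .
  # Remote side: identifier stored as a literal as well.
  SERVICE <{remote_endpoint}> {{
    ?pathway a wp:Pathway ;
             dcterms:identifier ?remoteId .
  }}
  # Keep only the pairs whose identifier strings are equal.
  FILTER (STR(?wikidataId) = STR(?remoteId))
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    print(filter_link_query(limit=5))
```
]]></preformat>
      <p>Because the FILTER is evaluated over the combination of both result sets, a sketch like this is only practical when each graph pattern returns a modest number of rows, which is the limitation noted above.</p>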
      <p><bold>Using the BIND operator in SPARQL queries.</bold>
When, however, the external source contains IRIs for its identifiable concepts,
but the equivalent Wikidata property is still available only as a literal value, a link
can be made using the BIND operator, which composes an IRI from
the property value stored in Wikidata. We demonstrate this with a
federated query that connects Wikidata with UniProt, allowing
protein annotations from both Wikidata and UniProt to be merged.</p>
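      <p>A minimal sketch of the BIND route, again as a Python helper that only assembles the query text. The UniProt endpoint URL, the IRI base http://purl.uniprot.org/uniprot/, the Wikidata property P352 (UniProt protein ID), and the up:annotation and rdfs:comment predicates are assumptions that do not come from the text, and the function name is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Assumed values; neither is stated in the article.
UNIPROT_ENDPOINT = "https://sparql.uniprot.org/sparql"
UNIPROT_IRI_BASE = "http://purl.uniprot.org/uniprot/"


def bind_link_query(remote_endpoint: str = UNIPROT_ENDPOINT,
                    iri_base: str = UNIPROT_IRI_BASE,
                    limit: int = 10) -> str:
    """Build a federated query that composes a remote IRI with BIND.

    Hypothetical helper. Wikidata holds the accession as a literal; the remote
    source identifies the same protein by IRI, so the IRI is built on the fly.
    """
    return f"""
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?item ?protein ?comment WHERE {{
  # Wikidata side: accession as a literal value (P352 is an assumption).
  ?item wdt:P352 ?accession .
  # Compose the remote IRI from the literal before calling the remote endpoint.
  BIND (IRI(CONCAT("{iri_base}", ?accession)) AS ?protein)
  # Remote side: the composed IRI is the subject of the annotation pattern.
  SERVICE <{remote_endpoint}> {{
    ?protein up:annotation ?annotation .
    ?annotation rdfs:comment ?comment .
  }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    print(bind_link_query(limit=5))
```
]]></preformat>
      <p>The BIND clause is placed before the SERVICE block so that the composed IRI is already bound when the remote pattern is evaluated.</p>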
    </sec>
    <sec id="sec-3">
      <title>Remote IRI mappings between Wikidata and a remote resource</title>
      <p>External resources, like WikiPathways and DisGeNET, actively store mappings
between their concepts and the equivalent items in Wikidata. When resources
maintain such mappings, a federated query in which the remote query pattern and
the Wikidata query pattern share the same variables is sufficient to merge results.</p>
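      <p>When the remote resource stores the mapping, the join reduces to reusing one variable in both patterns. The sketch below assembles such a query in Python. The endpoint URL and the mapping predicate wp:bdbWikidata are assumptions that do not come from the text, the label lookup merely stands in for any Wikidata-side pattern, and the function name is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Assumed endpoint URL; replace it with the endpoint you actually query.
WIKIPATHWAYS_ENDPOINT = "https://sparql.wikipathways.org/sparql"


def shared_variable_query(remote_endpoint: str = WIKIPATHWAYS_ENDPOINT,
                          mapping_predicate: str = "wp:bdbWikidata",
                          limit: int = 10) -> str:
    """Build a federated query joined on a shared variable.

    Hypothetical helper. The remote source is assumed to link its nodes to
    Wikidata item IRIs through `mapping_predicate`, so ?item appears in both
    the remote and the Wikidata pattern and no FILTER or BIND is needed.
    """
    return f"""
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?remoteNode ?item ?label WHERE {{
  # Remote side: mapping from a local node to the Wikidata item IRI.
  SERVICE <{remote_endpoint}> {{
    ?remoteNode {mapping_predicate} ?item .
  }}
  # Wikidata side: the same ?item variable joins the two patterns.
  ?item rdfs:label ?label .
  FILTER (LANG(?label) = "en")
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    print(shared_variable_query(limit=5))
```
]]></preformat>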
    </sec>
    <sec id="sec-4">
      <title>Local IRI mappings between Wikidata and a remote resource</title>
      <p>We have proposed a Wikidata property (P2888) that allows
mappings between Wikidata and external resources on the semantic web to be created as Wikidata
statements. This property has been accepted, and we are currently populating
items with statements expressing similarity between Wikidata items and other
concept descriptions on the semantic web. This Wikidata property is based on
skos:exactMatch.</p>
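      <p>With the mapping stored in Wikidata itself, the P2888 value can serve directly as the join variable. The Python sketch below assembles such a query. It assumes that P2888 values are exposed as IRIs through the wdt: namespace, and the function name and the example.org endpoint are hypothetical placeholders.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def exact_match_query(remote_endpoint: str, limit: int = 10) -> str:
    """Build a federated query that follows P2888 statements to a remote source.

    Hypothetical helper. Assumes the P2888 value is an IRI that the remote
    endpoint also uses as a subject, so it can be reused as a shared variable.
    """
    return f"""
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?item ?remoteConcept ?predicate ?value WHERE {{
  # Wikidata side: the mapping is stored locally as a P2888 statement.
  ?item wdt:P2888 ?remoteConcept .
  # Remote side: fetch whatever the remote source states about that IRI.
  SERVICE <{remote_endpoint}> {{
    ?remoteConcept ?predicate ?value .
  }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # Placeholder endpoint for illustration only.
    print(exact_match_query("https://example.org/sparql", limit=5))
```
]]></preformat>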
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>Linking Wikidata to other hubs on the semantic web opens up new routes of
validation between different resources and toward better integration. We have
demonstrated four potential paths to link content from Wikidata with other
resources on the semantic web, using federated SPARQL queries.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>