<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Wikidata WikiProject COVID-19: modelling the pandemic in real time</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tiago LUBIANA</string-name>
          <email>tiago.lubiana.alves@usp.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computational Systems Biology Laboratory, University of São Paulo</institution>
          ,
          <country country="BR">Brazil</country>
        </aff>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        The COVID-19 crisis has led to a surge in related biomedical data.
This wealth of continuously updated information calls for efforts to process,
understand, and interpret these data. Wikidata is an open, public-domain knowledge
graph that represents concepts from a variety of domains and interlinks them through relations. The
interest of the biomedical community in Wikidata is on the rise, and Wikidata is poised
to become a global knowledge graph for the life sciences [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>This work is a case report on the WikiProject COVID-19 on Wikidata and its
possible implications for the biomedical community. The author has helped manage the
project since its creation on 16 March 2020. The project is the fruit of the
effort of its more than 50 participants (spread across nations) in collaboration with the
broader Wikidata community.</p>
      <p>The Wikidata WikiProject COVID-19 is a collaborative, multinational effort to
improve the representation of pandemic-related content on Wikidata. The project has
developed collaborative models using EntitySchemas and natural language descriptions. The
data are accessible over the web via the Wikidata API and through wrapper packages in
Python and R. The contributions of the WikiProject COVID-19 participants have produced
a powerful resource for the life sciences community to parse our collective knowledge.</p>
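      <p>As a minimal sketch of the API access described above, the Python snippet below builds a
request URL for the Wikidata Action API module wbgetentities and reads a label from a response of
the shape that module returns. The function names, the sample response, and the item identifier
Q84263196 (assumed here to be the item for COVID-19) are illustrative assumptions and are not taken
from the project; the identifier should be checked on wikidata.org before use.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import urllib.parse

API_URL = "https://www.wikidata.org/w/api.php"


def build_entity_url(qid, language="en"):
    """Build a wbgetentities URL asking for one item's label and description."""
    params = {
        "action": "wbgetentities",
        "ids": qid,
        "props": "labels|descriptions",
        "languages": language,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_label(response, qid, language="en"):
    """Return the label stored in a parsed wbgetentities response, or None."""
    entity = response.get("entities", {}).get(qid, {})
    return entity.get("labels", {}).get(language, {}).get("value")


# Hand-written sample response (hypothetical content, real layout) so the
# parsing step can be tried without any network access.
sample_response = {
    "entities": {
        "Q84263196": {  # assumed item identifier for COVID-19
            "labels": {"en": {"language": "en", "value": "COVID-19"}}
        }
    }
}

url = build_entity_url("Q84263196")
label = extract_label(sample_response, "Q84263196")
```
]]></preformat>
      <p>For a live request, the URL can be passed to an HTTP client such as urllib.request.urlopen,
ideally with a descriptive User-Agent header, and the JSON body handed to extract_label.</p>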
      <p>The WikiProject is subdivided into branches for ontological representations of
different areas of knowledge. The areas range from the modeling of epidemiological
information (case, death, hospitalization, and recovery counts) to the curation of emergency
measures, hospital bed counts, and concepts directly related to the biology of
SARS-CoV-2.</p>
      <p>Project members have developed a variety of data models to reconcile external data
with Wikidata. These include, but are not limited to, data models and Shape Expression
EntitySchemas for hospitals, preprints, the outbreaks themselves, emergency measures,
macromolecular complexes, and virus strains. These models have been used to integrate datasets
in a semantic format. In Wikidata, these datasets become integrated with OBO ontologies
such as the Gene Ontology and the Disease Ontology, and with bibliometric information on
scholars and institutions.</p>
      <p>
        Wikidata also provides linked information on a range of encyclopedic topics (from
physical constants to demographic information), totaling more than 90 million items.
This integrated information is openly available and can be queried via a SPARQL
endpoint, which enables complex queries that give insight into our collective knowledge of
COVID-19 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Queries such as “Which drugs inhibit proteins that bind SARS-CoV-2 proteins?”
can be run in the user-friendly Wikidata SPARQL query service (https://query.wikidata.org/).
Third-party applications, such as the bibliometrics-oriented Scholia [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], make the data
even more accessible. Notably, as the biomedical community feeds more knowledge into
Wikidata, the results of these queries update automatically, providing an invaluable
source of current, integrated biomedical information on SARS-CoV-2 and the
COVID-19 pandemic.
      </p>
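      <p>The example question above could be approximated along the following lines. This is a
sketch under stated assumptions, not the query used by the project: the identifiers Q82069695
(SARS-CoV-2), P703 (found in taxon), and P129 (physically interacts with) are assumed and should be
verified on wikidata.org, and the generic interaction property stands in for "inhibits" and
"binds", so the results would be broader than the question asks. The helper names are
hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import urllib.parse

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed identifiers; confirm each one on wikidata.org before relying on it.
SARS_COV_2 = "Q82069695"   # assumed item for SARS-CoV-2
FOUND_IN_TAXON = "P703"    # assumed property "found in taxon"
INTERACTS_WITH = "P129"    # assumed property "physically interacts with"


def build_drug_query(taxon_qid=SARS_COV_2, limit=50):
    """Return SPARQL text linking viral proteins, their partners, and interactors."""
    limit = int(limit)  # keep non-numeric input out of the query text
    return f"""
SELECT DISTINCT ?drug ?drugLabel ?hostProtein ?hostProteinLabel WHERE {{
  ?viralProtein wdt:{FOUND_IN_TAXON} wd:{taxon_qid} ;
                wdt:{INTERACTS_WITH} ?hostProtein .
  ?drug wdt:{INTERACTS_WITH} ?hostProtein .
  FILTER(?drug != ?viralProtein)
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()


def build_request_url(query):
    """Encode the query as a GET request that asks the endpoint for JSON."""
    params = {"query": query, "format": "json"}
    return SPARQL_ENDPOINT + "?" + urllib.parse.urlencode(params)


query = build_drug_query(limit=5)
request_url = build_request_url(query)
```
]]></preformat>
      <p>The query text can also be pasted directly into the query service web interface. A
stricter version would restrict ?drug to items typed as medications and read the role qualifiers
on each interaction statement.</p>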
      <p>The WikiProject is open to new participants, and participation does not require any
previous expertise. Alongside editing, project participants discuss data models, automate
integration, and collaborate with other ongoing efforts to curate and improve the usability
of COVID-19 data. Crucially, project participants are improving the inner workings of
the Wikidata system for integrating biomedical knowledge, preparing the community to
handle future crises.</p>
      <p>To sum up, the Wikidata WikiProject COVID-19 is a collaborative international
effort to improve the availability of open, semantically linked data about the virus and the
pandemic. Any individual can contribute to its efforts by adding referenced information
about their topic of expertise or preference. These data are publicly available and provide
a comprehensive resource for SARS-CoV-2-related knowledge that is both
machine-readable and human-readable. It is a significant asset for ontologists, computational
biologists, and life scientists in general.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Waagmeester</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stupp</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burgstaller-Muehlbacher</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Good</surname>
            <given-names>BM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Griffith</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Griffith</surname>
            <given-names>OL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hanspers</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hermjakob</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hudson</surname>
            <given-names>TS</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hybiske</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keating</surname>
            <given-names>SM</given-names>
          </string-name>
          . Science Forum:
          <article-title>Wikidata as a knowledge graph for the life sciences</article-title>
          .
          <source>eLife</source>
          .
          <year>2020</year>
          Mar 17;
          <volume>9</volume>
          :
          <fpage>e52614</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Addshore</surname>
          </string-name>
          ,
          <string-name>
            <surname>Mietchen</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Willighagen</surname>
            <given-names>E</given-names>
          </string-name>
          .
          <article-title>Wikidata Queries around the SARS-CoV-2 virus and pandemic</article-title>
          . Zenodo; 2020. https://doi.org/10.5281/zenodo.3977414
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Nielsen</surname>
            <given-names>FÅ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mietchen</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Willighagen</surname>
            <given-names>E</given-names>
          </string-name>
          .
          <article-title>Scholia, scientometrics and Wikidata</article-title>
          .
          In:
          <source>European Semantic Web Conference</source>
          ;
          <year>2017</year>
          May 28 (pp.
          <fpage>237</fpage>
          -
          <lpage>259</lpage>
          ). Springer, Cham.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>