<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A real-time visual dashboard for Wikidata edits</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Damien Graux</string-name>
          <email>grauxd@tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabrizio Orlandi</string-name>
          <email>orlandif@tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Brian Lynch</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Isobel Mahon</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Odhran Mullen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alex Mahon</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Flora Molnar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lexes Mantiquilla</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ADAPT SFI Research Centre &amp; Trinity College Dublin</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <fpage>41</fpage>
      <lpage>46</lpage>
      <abstract>
        <p>Over the last decades, the Web has seen the development of openly editable datasets to which users can suggest modifications at any moment. Recently, Wikidata has been the first large-scale MediaWiki-based dataset structured according to Semantic Web standards. In this article, we propose the first version of a visual dashboard that allows real-time visualisation of Wikidata changes.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Over the past two decades, many data sources have been published on the Web.
Most of the time, they follow the recommendations and standards promoted by
the World Wide Web Consortium (W3C) within the Semantic Web movement,
driven by the desire to create a “Web of data” from the conventional “Web of
documents”. These datasets, generally represented in the RDF format [<xref ref-type="bibr" rid="ref4">4</xref>]
and accessible via the SPARQL language [<xref ref-type="bibr" rid="ref7">7</xref>], cover subjects ranging from
general knowledge, such as DBpedia [<xref ref-type="bibr" rid="ref3">3</xref>], YAGO [<xref ref-type="bibr" rid="ref5">5</xref>] or Wikidata [<xref ref-type="bibr" rid="ref6">6</xref>], to specific
knowledge, such as legal court cases [<xref ref-type="bibr" rid="ref1">1</xref>], source code [<xref ref-type="bibr" rid="ref2">2</xref>] or medical
information [<xref ref-type="bibr" rid="ref8">8</xref>]. Thus, the amount of semantic data now publicly accessible makes it
possible to create new applications that combine, for instance, several datasets at
once.
      </p>
      <p>Nevertheless, several of the datasets available today are
actually open, meaning that users can contribute new content
directly to the knowledge base. This paradigm therefore allows each user to
correct, amend, or refine the dataset. However, from a dataset maintainer’s
perspective, such a feature makes it harder to keep track of the many
data updates received. In practice, there are various ways to follow changes to
open data: from the textual history logs available, for example, on each Wikipedia
page to the charts associated with each source-code repository on GitHub.</p>
      <p>
        In particular, in 2014, Wikidata [<xref ref-type="bibr" rid="ref6">6</xref>], a collaboratively edited multilingual
knowledge graph hosted by the Wikimedia Foundation, was released. It
is a common source of open data that Wikimedia projects such as Wikipedia,
and anyone else, can use under a public domain license. At the time of writing, Wikidata
contains 88 783 052 items, and 1 258 940 393 edits have been made since
the project launch by at least 23 555 active users<sup>1</sup>. As a consequence, Wikidata
is at the moment the largest collaboratively edited semantic knowledge base.
      </p>
      <p>In this article, we describe our current efforts to visually
present the changes made to Wikidata in (quasi) real time. The proposed interface
keeps track of the edits sent to Wikidata and updates our dashboard on the fly,
giving users access to the latest state of the knowledge base.</p>
    </sec>
    <sec id="sec-2">
      <title>Requirements and Technical Aspects</title>
      <p>We extracted some technical requirements for the design of our application. The
requirements elicitation process was performed with a particular use case in
mind: the end user is a Wikidata ‘expert’ who would like to monitor
Wikidata edits in (quasi) real time in order to identify potential anomalies or
discover interesting editing patterns (e.g. most active users and resources). The
high-level requirements are:</p>
      <list list-type="bullet">
        <list-item><p>The data must be obtained from the Wikidata API.</p></list-item>
        <list-item><p>The visualisations displayed (charts and graphs) should use the data
collected from the API.</p></list-item>
        <list-item><p>The visualisations must be updated in quasi real time (a delay of a few
seconds is acceptable).</p></list-item>
        <list-item><p>The user must be able to navigate through the web app, and select and expand
different visualisations.</p></list-item>
        <list-item><p>The system should differentiate between edits performed by bots and
by humans.</p></list-item>
        <list-item><p>The system should display information about the most active users and
resources (by volume of edits).</p></list-item>
        <list-item><p>The type and time of each edit should be taken into account in the
visualisations, along with contextual links pointing to the original edits on Wikidata.</p></list-item>
      </list>
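      <p>The bot/human distinction and the “most active” rankings in the requirements above can be sketched as a simple tally. This is a minimal illustration under stated assumptions, not the project’s implementation: the record shape (<monospace>user</monospace>, <monospace>title</monospace>, boolean <monospace>bot</monospace>), the sample data and the helper names <monospace>countByEditorKind</monospace> and <monospace>topK</monospace> are all hypothetical.</p>
      <preformat><![CDATA[
```javascript
// Hypothetical edit records; the field names are modelled on what a
// recent-changes feed could provide, but this exact shape is an assumption.
const sampleEdits = [
  { user: "BotA", title: "Q42", bot: true },
  { user: "Alice", title: "Q42", bot: false },
  { user: "BotA", title: "Q64", bot: true },
  { user: "Bob", title: "Q1", bot: false },
  { user: "Alice", title: "Q64", bot: false },
  { user: "BotA", title: "Q42", bot: true },
];

// Count how many edits were made by bots and how many by humans.
function countByEditorKind(edits) {
  const counts = { bot: 0, human: 0 };
  for (const edit of edits) {
    counts[edit.bot ? "bot" : "human"] += 1;
  }
  return counts;
}

// Return the k most frequent values of `key` (e.g. "user" or "title"),
// most active first; ties keep their first-seen order.
function topK(edits, key, k) {
  const tally = new Map();
  for (const edit of edits) {
    tally.set(edit[key], (tally.get(edit[key]) || 0) + 1);
  }
  return [...tally.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, k)
    .map(([name, count]) => ({ name, count }));
}
```
]]></preformat>
      <p>Calling <monospace>topK(sampleEdits, "user", 2)</monospace> ranks users by number of edits, and the same helper with <monospace>"title"</monospace> ranks resources, so one tally could serve both of the “most active” requirements.</p>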
      <p>We selected a live web application as the most suitable form of
presentation and interaction for the system, so that multiple online users can
use our interface simultaneously. In order to deal with the real-time
aspects of the application, we decided to use the ReactJS<sup>2</sup> framework,
a ready-to-use, well-documented and widely used library. Using the endpoints
of the Wikidata API, we created queries to retrieve all the relevant
information. Specifically, we wanted to observe the recent changes that
are provided by the MediaWiki software<sup>3</sup>. The interface with the API was
developed in pure JavaScript, without any additional libraries (e.g. jQuery). We
then used HTML and CSS alongside the ReactJS framework to design a simple
user interface. For the charts, we relied on the Nivo<sup>4</sup> JavaScript library, which
provides React components for graphing data. This yielded
very responsive and customisable graphs.</p>
      <p><sup>1</sup> From https://www.wikidata.org/wiki/Wikidata:Statistics (August 18th 2020)</p>
      <p><sup>2</sup> https://reactjs.org/</p>
      <p><sup>3</sup> https://www.mediawiki.org/wiki/API:RecentChanges</p>
      <p><sup>4</sup> https://nivo.rocks/</p>
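      <p>To illustrate how the recent changes could be requested in pure JavaScript, the following sketch builds a query for the MediaWiki <monospace>recentchanges</monospace> list and fetches one batch. It is an assumption-laden example rather than the project’s code: the chosen <monospace>rcprop</monospace> fields, the default limit, and the function names <monospace>buildRecentChangesUrl</monospace> and <monospace>fetchRecentChanges</monospace> are illustrative choices.</p>
      <preformat><![CDATA[
```javascript
// Wikidata's MediaWiki API endpoint.
const WIKIDATA_API = "https://www.wikidata.org/w/api.php";

// Build a RecentChanges query URL. The selected properties and the
// default limit are illustrative choices, not the authors' settings.
function buildRecentChangesUrl({ limit = 50, since = null } = {}) {
  const params = new URLSearchParams({
    action: "query",
    list: "recentchanges",
    rcprop: "title|ids|sizes|flags|user|timestamp",
    rclimit: String(limit),
    format: "json",
    origin: "*", // lets a browser make anonymous cross-origin requests
  });
  // Results are listed newest first by default, so rcend is the oldest
  // timestamp that should still be returned.
  if (since) params.set("rcend", since);
  return `${WIKIDATA_API}?${params.toString()}`;
}

// Fetch one batch of recent changes (needs a runtime that provides fetch).
async function fetchRecentChanges(options) {
  const response = await fetch(buildRecentChangesUrl(options));
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  const data = await response.json();
  return data.query.recentchanges;
}
```
]]></preformat>
      <p>A quasi-real-time feed could then call <monospace>fetchRecentChanges</monospace> every few seconds, passing the timestamp of the newest edit already seen as <monospace>since</monospace>; the polling interval is left open here because the article only states that a delay of a few seconds is acceptable.</p>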
    </sec>
    <sec id="sec-3">
      <title>Wikidata Live Changes Web App</title>
      <p>As shown in the application walk-through (see Figure 1), the user interface is
made up of three parts. The homepage is the first page the user lands on; it
serves as a navigation hub, offering the user a choice of
where to go next while also showing a few live statistics. From the homepage,
the user may choose between three buttons: the feed, the dashboard (Figure 2),
or the user stats (a subcomponent of the dashboard). The feed gives the user
a clear overview of the incoming data. The dashboard, the main part of
the project, is where all the visualisations based on the incoming stream of data
are located; each plot is interactive and can be made fullscreen
or have its data paused. Making a plot fullscreen gives the user information about
the plot they are looking at and adds labels to it. The user can also hover
the mouse over a data point to see a preview of what that point represents.</p>
      <p>More precisely, the dashboard (Figure 2), which is the central interface of
the web app, presents several visualisations at a glance: the most recent activity
as a list of incoming events, the recent edit sizes, the most active users, the most
active pages, the largest recent edits and the proportion of edit flags. Moreover,
each of these graphics is clickable, leading to a dedicated page that provides more
information. For instance, Figure 3 presents details on the most recent edits,
showing whether the page has been freshly created, its size and who committed
the changes. Similarly, Figure 4 displays additional information on the
most active users (whether they are human beings or bots), such as the size of
the edits they made. Finally, the detailed interfaces also embed a
“hovering” feature that lets users glance quickly, inside sub-windows, at some
Wikidata resources (articles or users) without leaving the application.</p>
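      <p>Some of the quantities behind these plots, namely edit sizes, the largest recent edits and the proportion of edit flags, could be derived as in the sketch below. It assumes hypothetical, already-normalised records with byte lengths before and after the edit (<monospace>oldlen</monospace>, <monospace>newlen</monospace>) and boolean flags; the flag names and helper functions are illustrative and not taken from the project.</p>
      <preformat><![CDATA[
```javascript
// Hypothetical, already-normalised edit records (an assumed shape).
const edits = [
  { title: "Q42", user: "Alice", oldlen: 1000, newlen: 1250, isNew: false, minor: false, bot: false },
  { title: "Q64", user: "BotA", oldlen: 0, newlen: 400, isNew: true, minor: false, bot: true },
  { title: "Q1", user: "Bob", oldlen: 900, newlen: 300, isNew: false, minor: true, bot: false },
];

// Signed size change of one edit, in bytes.
function editSize(edit) {
  return edit.newlen - edit.oldlen;
}

// The n edits with the largest absolute size change, largest first.
function largestEdits(list, n) {
  return [...list]
    .sort((a, b) => Math.abs(editSize(b)) - Math.abs(editSize(a)))
    .slice(0, n);
}

// Share of edits carrying each flag, as fractions between 0 and 1.
function flagProportions(list) {
  const flags = ["isNew", "minor", "bot"];
  const result = {};
  for (const flag of flags) {
    const hits = list.filter((edit) => edit[flag]).length;
    result[flag] = list.length === 0 ? 0 : hits / list.length;
  }
  return result;
}
```
]]></preformat>
      <p>Ranking by absolute size change is a design choice made for this example, so that large removals count as much as large additions; a dashboard could equally rank additions and removals separately.</p>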
      <p>Practically, it is important to note that the visualisations “start” when the
user enters the page: the web app does not keep track of
past events but rather begins “stacking” the edits made on Wikidata
from the moment of connection. In addition, since there are often about a dozen
changes per second, we included a pause functionality that stops the
application from displaying incoming changes in the interface. Once the pause
button is pushed, the interface is frozen while the application keeps reading the
edits in the background, so that users are served fresh data after
releasing the pause.
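A minimal sketch of this pause behaviour, assuming a hypothetical <monospace>PausableFeed</monospace> class and display callback that are not taken from the project’s code base, could buffer incoming edits while the display is frozen and flush them on release:
      <preformat><![CDATA[
```javascript
// Minimal pausable feed: edits keep arriving while paused, but are only
// handed to the display callback once the pause is released.
class PausableFeed {
  constructor(onDisplay) {
    this.onDisplay = onDisplay; // hypothetical render callback
    this.paused = false;
    this.pending = [];
  }

  // Called for every incoming edit, whether paused or not.
  push(edit) {
    if (this.paused) {
      this.pending.push(edit);
    } else {
      this.onDisplay([edit]);
    }
  }

  // Freeze the display; incoming edits are buffered from now on.
  pause() {
    this.paused = true;
  }

  // Release the pause and flush everything collected in the meantime.
  resume() {
    this.paused = false;
    if (this.pending.length > 0) {
      const batch = this.pending;
      this.pending = [];
      this.onDisplay(batch);
    }
  }
}
```
]]></preformat>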
</p>
      <p>In this article, we described and shared our web app for visualising Wikidata’s edits
in (quasi) real time. The presented interface is hosted at
https://isobelm.github.io/Software-Engineering/
and released under an MIT license<sup>5</sup>, giving users a live example of what the application
would look like locally, should they be interested in deploying the interface on their own
premises. The data visualised by our website can help researchers and
Wikidata practitioners identify anomalies or malicious edits to the knowledge base more easily.</p>
      <p>In this short article, we presented the first version of our live interface focused
on Wikidata’s edits. We are currently setting up a user validation
experiment in order to improve the different visual components. We are
also planning to extend the web app with additional features, such as allowing
users to focus on specific Wikidata articles or letting users customise their
dashboard. Moreover, during development we took care not to restrict our
architecture to the specific case of Wikidata, so that other
data sources can be added to our interface by adding calls to an additional API.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>This research was conducted with the financial support of the European Union’s
Horizon 2020 research and innovation programme under the Marie
Skłodowska-Curie Grant Agreements No. 801522 and No. 713567 at the ADAPT SFI
Research Centre at Trinity College Dublin. The ADAPT SFI Centre for Digital
Media Technology is funded by Science Foundation Ireland through the SFI
Research Centres Programme and is co-funded under the European Regional
Development Fund (ERDF) through Grant #13/RC/2106.</p>
      <p><sup>5</sup> Project’s code base: https://github.com/isobelm/Software-Engineering</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Junior</surname>
            ,
            <given-names>A.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orlandi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Graux</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hossari</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Sullivan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dirschl</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Knowledge graph-based legal search over German court cases</article-title>
          . In: ESWC (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Kubitza</surname>
            ,
            <given-names>D.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Böckmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Graux</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Semangit: A linked dataset from git</article-title>
          . In:
          <source>International Semantic Web Conference</source>
          . pp.
          <fpage>215</fpage>
          -
          <lpage>228</lpage>
          . Springer (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isele</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jakob</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jentzsch</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontokostas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendes</surname>
            ,
            <given-names>P.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morsey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Kleef</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>DBpedia - a large-scale, multilingual knowledge base extracted from Wikipedia</article-title>
          .
          <source>Semantic Web Journal</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          ),
          <fpage>167</fpage>
          -
          <lpage>195</lpage>
          (
          <year>2015</year>
          ), http://jens-lehmann.org/files/2014/swj_dbpedia.pdf
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Manola</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McBride</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , et al.:
          <article-title>RDF primer</article-title>
          .
          <source>W3C Recommendation</source>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Suchanek</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kasneci</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weikum</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Yago: A core of semantic knowledge</article-title>
          . In:
          <source>Proceedings of the 16th International Conference on World Wide Web</source>
          . pp.
          <fpage>697</fpage>
          -
          <lpage>706</lpage>
          . WWW '07, ACM, New York, NY, USA (
          <year>2007</year>
          ). https://doi.org/10.1145/1242572.1242667
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
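          6.
          <string-name>
            <surname>Vrandečić</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krötzsch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :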
          <article-title>Wikidata: a free collaborative knowledgebase</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>57</volume>
          (
          <issue>10</issue>
          ),
          <fpage>78</fpage>
          -
          <lpage>85</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <collab>W3C SPARQL Working Group</collab>
          , et al.:
          <article-title>SPARQL 1.1 overview</article-title>
          (
          <year>2013</year>
          ), http://www.w3.org/TR/sparql11-overview/
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Wishart</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knox</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
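          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>A.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,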
          <string-name>
            <surname>Shrivastava</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tzur</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gautam</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hassanali</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>DrugBank: a knowledgebase for drugs, drug actions and drug targets</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>36</volume>
          (suppl 1),
          <fpage>D901</fpage>
          -
          <lpage>D906</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>