=Paper= {{Paper |id=Vol-2778/paper4 |storemode=property |title=A Real-Time Visual Dashboard for Wikidata Edits |pdfUrl=https://ceur-ws.org/Vol-2778/paper4.pdf |volume=Vol-2778 |authors=Damien Graux,Fabrizio Orlandi,Brian Lynch,Isobel Mahon,Odhran Mullen,Alex Mahon,Flora Molnar,Lexes Mantiquilla |dblpUrl=https://dblp.org/rec/conf/semweb/GrauxOLMMMMM20 }} ==A Real-Time Visual Dashboard for Wikidata Edits== https://ceur-ws.org/Vol-2778/paper4.pdf
A real-time visual dashboard for Wikidata edits

    Damien Graux(✉), Fabrizio Orlandi(✉), Brian Lynch, Isobel Mahon,
      Odhran Mullen, Alex Mahon, Flora Molnar, and Lexes Mantiquilla

          ADAPT SFI Research Centre & Trinity College Dublin, Ireland
                        {grauxd,orlandif}@tcd.ie




      Abstract. During the last decades, the Web has seen the development
      of openly editable datasets on which users can suggest modifications at
      any moment. Recently, Wikidata has been the first large-scale MediaWiki-
      based dataset structured according to the Semantic Web standards. In
      this article, we propose the first version of a visual dashboard to allow
      real-time visualisation of Wikidata changes.



1    Introduction

Over the past two decades, many data sources have been published on the Web.
Most of the time, they follow the recommendations and standards promoted by
the World Wide Web Consortium (W3C) within the Semantic Web movement,
driven by the desire to create a “Web of data” from the conventional “Web of
documents”. These datasets, generally represented in the RDF format [4]
and queried with the SPARQL language [7], cover subjects ranging from
general knowledge, such as DBpedia [3], YAGO [5] or Wikidata [6], to specialised
knowledge, such as legal court cases [1], source code [2] or medical
information [8]. The amount of publicly accessible semantic data now makes it
possible to create new applications that combine, for instance, several
datasets at once.
    Among the datasets available today, several are actually open, meaning
that users can contribute new content directly to the knowledge base. This
paradigm allows each user to correct, amend, or refine the dataset. However,
from a dataset maintainer's perspective, such a feature makes it harder to keep
track of the many data updates received. In practice, there are various ways to
follow changes to open data, from the textual history logs available, for
example, on each Wikipedia page to the charts associated with each source-code
repository on GitHub.
    In particular, Wikidata [6], a collaboratively edited multilingual
knowledge graph hosted by the Wikimedia Foundation, was launched in 2012. It
is a common source of open data, released under a public domain license, that
Wikimedia projects such as Wikipedia, and anyone else, can use. At the time of
writing, Wikidata contains 88 783 052 items, and 1 258 940 393 edits have been
made since the project launch by at least 23 555 active users¹. Wikidata is
thus currently the largest collaboratively edited semantic knowledge base.

Copyright © 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).

     In this article, we describe our ongoing efforts to visually present the
 changes made to Wikidata in (quasi) real time. The proposed interface keeps
 track of the edits sent to Wikidata and updates our dashboard on the fly,
 giving users access to the latest state of the knowledge base.

 2    Requirements and Technical Aspects
 We derived the technical requirements for our application with a particular
 use case in mind: the end user is a Wikidata ‘expert’ who wants to monitor
 Wikidata edits in (quasi) real time in order to identify potential anomalies
 or discover interesting editing patterns (e.g. most active users and
 resources). The high-level requirements are:
   – The data must be obtained from the Wikidata API.
   – The visualisations displayed (charts and graphs) should use the data
     collected from the API.
   – The visualisations must be updated in quasi real time (a delay of a few
     seconds is acceptable).
   – The user must be able to navigate through the web app, and select and
     expand different visualisations.
   – The system should differentiate between edits performed by bots and
     humans.
   – The system should display information about the most active users and
     resources (by edit volume).
   – The type and time of each edit should be taken into account in the
     visualisations, along with contextual links pointing to the original
     edits on Wikidata.
     A live web application was selected as the most suitable form of
 presentation and interaction, since it allows multiple online users to use our
 interface simultaneously. To handle the real-time aspects of the application,
 we decided to use the ReactJS² library, which is ready to use, well documented
 and widely adopted. Using the endpoints of the Wikidata API, we created
 queries to retrieve the relevant information. Specifically, we observe the
 recent changes exposed by the MediaWiki software³. The interface with the API
 was developed in pure JavaScript, without any additional libraries (e.g.
 jQuery). We then used HTML and CSS alongside ReactJS to design a simple user
 interface. For the charts, we relied on the Nivo⁴ JavaScript library, which
 provides React components for plotting data and yields responsive and
 customisable graphs.
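
 The kind of query described above can be sketched in plain JavaScript as
 follows. This is a minimal illustration and not the authors' actual code: it
 builds a request URL for the MediaWiki `list=recentchanges` module on
 `https://www.wikidata.org/w/api.php` and discards changes already seen in an
 earlier poll. The function names (`buildRecentChangesUrl`, `keepUnseen`,
 `pollOnce`), the selected `rcprop` fields and the default limit are
 assumptions made for the example.

```javascript
// Hypothetical helpers; names and defaults are illustrative assumptions.
const WIKIDATA_API = "https://www.wikidata.org/w/api.php";

// Builds a recent-changes query URL for the Wikidata (MediaWiki) API.
function buildRecentChangesUrl({ limit = 50, botsOnly = null } = {}) {
  const params = new URLSearchParams({
    action: "query",
    list: "recentchanges",
    rcprop: "title|ids|sizes|flags|user|timestamp",
    rctype: "edit|new",
    rclimit: String(limit),
    format: "json",
    origin: "*", // allows anonymous cross-origin requests from a browser
  });
  if (botsOnly === true) params.set("rcshow", "bot");
  if (botsOnly === false) params.set("rcshow", "!bot");
  return `${WIKIDATA_API}?${params.toString()}`;
}

// Keeps only the changes whose rcid has not been seen yet, and records them.
function keepUnseen(changes, seenIds) {
  const fresh = [];
  for (const change of changes) {
    if (!seenIds.has(change.rcid)) {
      seenIds.add(change.rcid);
      fresh.push(change);
    }
  }
  return fresh;
}

// Performs one poll and returns the changes not seen in previous polls.
// The list of changes is read from body.query.recentchanges.
async function pollOnce(seenIds) {
  const response = await fetch(buildRecentChangesUrl());
  const body = await response.json();
  return keepUnseen(body.query.recentchanges, seenIds);
}
```

 Calling `pollOnce` every few seconds with the same `Set` of seen identifiers
 would give the quasi real-time behaviour listed in the requirements, since
 consecutive polls usually overlap.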
  ¹ From https://www.wikidata.org/wiki/Wikidata:Statistics (August 18th 2020)
  ² https://reactjs.org/
  ³ https://www.mediawiki.org/wiki/API:RecentChanges
  ⁴ https://nivo.rocks/






            Fig. 1. Walk-through presenting the available graphic interfaces.



 3    Wikidata Live Changes Web App

 As shown in the application walk-through (see Figure 1), the user interface is
 made up of three parts. The homepage is the first page the user lands on; it
 serves as a navigation hub offering several options as to where to go next,
 while also showing a few live statistics. From the homepage, the user may
 choose between three buttons: the feed, the dashboard (Figure 2), or the user
 stats (a subcomponent of the dashboard). The feed gives the user a clear
 overview of the incoming data. The dashboard, the main part of the project,
 hosts all the visualisations based on the incoming stream of data; each plot
 is interactive and can be made fullscreen or paused. Making a plot fullscreen
 gives the user information about the plot they are looking at and adds labels
 to the plot. The user can also hover their mouse over a data point to see a
 preview of what that point represents.

                        Fig. 2. The main page of the dashboard.

     More precisely, the dashboard (Figure 2), which is the central interface of
 the web app, presents several visualisations at a glance: the most recent
 activity as a list of incoming events, the recent edit sizes, the most active
 users, the most active pages, the largest recent edits and the proportion of
 edit flags. Each of these graphics is clickable and leads to a dedicated page
 providing more information. For instance, Figure 3 presents details on the
 most recent edits: whether the page has been freshly created, its size and who
 committed the changes. Similarly, Figure 4 displays additional information on
 the most active users (whether they are humans or bots), such as the size of
 the edits they made. Finally, the detailed interfaces also embed a “hovering”
 feature which lets users quickly glance at Wikidata resources (articles or
 users) in sub-windows without leaving the application.
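
 As an illustration of how such views could be derived from a list of recent
 changes, the sketch below computes the most active users, the size of an edit
 and the proportion of edit flags. It is not taken from the project's code
 base: the function names are hypothetical, and it assumes each change carries
 the `user`, `oldlen` and `newlen` fields plus optional `bot`, `minor` and
 `new` flags, as returned by the recent-changes API when the corresponding
 properties are requested.

```javascript
// A flag counts as set when the key is present and not explicitly false,
// which covers both the legacy ("bot": "") and the boolean JSON layouts.
function hasFlag(change, flag) {
  return flag in change && change[flag] !== false;
}

// Returns the n most active users as { user, edits, bot }, busiest first.
function mostActiveUsers(changes, n = 5) {
  const byUser = new Map();
  for (const change of changes) {
    const entry =
      byUser.get(change.user) ?? { user: change.user, edits: 0, bot: false };
    entry.edits += 1;
    entry.bot = entry.bot || hasFlag(change, "bot");
    byUser.set(change.user, entry);
  }
  return [...byUser.values()].sort((a, b) => b.edits - a.edits).slice(0, n);
}

// Size of an edit in bytes: new revision length minus old revision length.
function editSize(change) {
  return change.newlen - change.oldlen;
}

// Proportion of changes carrying each flag, as values between 0 and 1.
function flagProportions(changes, flags = ["bot", "minor", "new"]) {
  const result = {};
  for (const flag of flags) {
    const count = changes.filter((c) => hasFlag(c, flag)).length;
    result[flag] = changes.length === 0 ? 0 : count / changes.length;
  }
  return result;
}
```

 Because the webapp only accumulates edits from the moment of connection,
 these aggregates would be recomputed over the changes collected so far rather
 than over the full edit history.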

     In practice, it is important to note that the visualisations “start” when
 the user enters the page: the web app does not keep track of earlier events
 but rather begins “stacking” the edits made on Wikidata from the moment of
 connection. In addition, since there are often a dozen changes per second, we
 included a pause functionality that stops the application from displaying
 incoming changes in the interface. Once the pause button is pushed, the
 interface is frozen while the application keeps reading the edits in the
 background, so that users are served fresh data after releasing the pause.
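
 One possible way to obtain this pause behaviour is to place a small buffer
 between the incoming edits and the display. The class below is a sketch under
 that assumption, not the authors' implementation; the name `PausableFeed` and
 its methods are hypothetical.

```javascript
// Hypothetical buffer between the incoming edit stream and the display.
class PausableFeed {
  constructor(onDisplay) {
    this.onDisplay = onDisplay; // callback that renders a batch of changes
    this.paused = false;
    this.pending = []; // changes received while the display is frozen
  }

  // Called for every batch read from the API, whether paused or not.
  push(changes) {
    if (this.paused) {
      this.pending.push(...changes);
    } else {
      this.onDisplay(changes);
    }
  }

  // Freezes the display; incoming changes are held back from now on.
  pause() {
    this.paused = true;
  }

  // Unfreezes the display and flushes everything held back during the pause.
  resume() {
    this.paused = false;
    if (this.pending.length > 0) {
      const held = this.pending;
      this.pending = [];
      this.onDisplay(held);
    }
  }
}
```

 With this design, reading from the API never stops: pausing only changes
 where the incoming changes go, so nothing is lost while the interface is
 frozen.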






                          Fig. 3. A chart showing recent edits.


 4      Conclusion
 In this article, we described and shared our web app to visualise Wikidata’s
 edits in (quasi) real time. The presented interface is hosted at:
              https://isobelm.github.io/Software-Engineering/
 under an MIT license⁵, providing users with a live example of what the
 application would look like if they deployed it on their own premises. The
 data visualised by our website should allow researchers and Wikidata
 practitioners to more easily identify anomalies or malicious edits to the
 knowledge base.
     We presented in this short article the first version of our live interface
 focused on Wikidata’s edits. We are currently setting up a user validation
 experiment in order to improve the different interface components. We also
 plan to extend the web app with additional features, such as allowing users to
 focus on specific Wikidata articles or letting users customise their
 dashboard. Moreover, we took care during development not to restrict our
 architecture to the specific case of Wikidata, so that other data sources can
 be added to our interfaces by adding calls to an additional API.

 Acknowledgments
 This research was conducted with the financial support of the European Union’s
 Horizon 2020 research and innovation programme under the Marie
 Skłodowska-Curie Grant Agreements No. 801522 and No. 713567 at the ADAPT SFI
 Research Centre at Trinity College Dublin. The ADAPT SFI Centre for Digital
 Media Technology is funded by Science Foundation Ireland through the SFI
 Research Centres Programme and is co-funded under the European Regional
 Development Fund (ERDF) through Grant #13/RC/2106.

  ⁵ Project’s code base: https://github.com/isobelm/Software-Engineering

                         Fig. 4. Histogram of most active users.


 References
 1. Junior, A.C., Orlandi, F., Graux, D., Hossari, M., O’Sullivan, D., Hartz, C.,
    Dirschl, C.: Knowledge graph-based legal search over German court cases. In:
    ESWC (2020)
 2. Kubitza, D.O., Böckmann, M., Graux, D.: SemanGit: A linked dataset from git.
    In: International Semantic Web Conference. pp. 215–228. Springer (2019)
 3. Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P.N.,
    Hellmann, S., Morsey, M., van Kleef, P., Auer, S., Bizer, C.: DBpedia - a
    large-scale, multilingual knowledge base extracted from Wikipedia. Semantic
    Web Journal 6(2), 167–195 (2015),
    http://jens-lehmann.org/files/2014/swj_dbpedia.pdf
 4. Manola, F., Miller, E., McBride, B., et al.: RDF primer. W3C Recommendation
    (2004)
 5. Suchanek, F.M., Kasneci, G., Weikum, G.: YAGO: A core of semantic
    knowledge. In: Proceedings of the 16th International Conference on World
    Wide Web. pp. 697–706. WWW’07, ACM, New York, NY, USA (2007).
    https://doi.org/10.1145/1242572.1242667
 6. Vrandečić, D., Krötzsch, M.: Wikidata: a free collaborative knowledgebase.
    Communications of the ACM 57(10), 78–85 (2014)
 7. W3C SPARQL Working Group, et al.: SPARQL 1.1 overview (2013),
    http://www.w3.org/TR/sparql11-overview/
 8. Wishart, D.S., Knox, C., Guo, A.C., Cheng, D., Shrivastava, S., Tzur, D.,
    Gautam, B., Hassanali, M.: DrugBank: a knowledgebase for drugs, drug actions
    and drug targets. Nucleic Acids Research 36(suppl 1), D901–D906 (2008)


