                         MEPDaW+LDQ Preface⋆

    Jeremy Debattista1 , Javier D. Fernández2 , Magnus Knuth3 , Dimitris Kontokostas,
                Anisa Rula5 , Jürgen Umbrich2 , and Amrapali Zaveri4
       1 University of Bonn and Fraunhofer IAIS, Bonn, Germany
         debattis@cs.uni-bonn.de
       2 Vienna University of Economics and Business, Vienna, Austria
         {javier.fernandez,juergen.umbrich}@wu.ac.at
       3 Hasso Plattner Institute, University of Potsdam, Germany
         magnus.knuth@hpi.uni-potsdam.de
       4 Stanford Center for Biomedical Informatics Research, Stanford University, USA
         amrapali@stanford.edu
       5 University of Milano-Bicocca, Milan, Italy
         rula@disco.unimib.it



        Abstract. This joint volume of proceedings gathers papers from the 2nd
        Workshop on Managing the Evolution and Preservation of the Data Web
        (MEPDaW) and the 3rd Workshop on Linked Data Quality (LDQ), held on
        30 May 2016 at the 13th ESWC conference in Anissaras, Crete, Greece.


1     Managing the Evolution and Preservation of the Data Web

This workshop targeted one of the emerging and fundamental problems of the Semantic
Web: the preservation of evolving linked datasets. There is a vast and rapidly increasing
quantity of scientific, corporate, government and crowd-sourced data published on the
emerging Data Web. Open Data is expected to play a catalytic role in the way structured
information is exploited at large scale. This offers great potential for building innovative
products and services that create new value from already collected data. It is expected
to foster active citizenship (e.g., around the topics of journalism, greenhouse gas
emissions, food supply-chains, and smart mobility) and world-wide research according to
the “fourth paradigm of science”. The most noteworthy advantage of the Data Web is
that facts, rather than documents, are recorded; these facts become the basis for
discovering new knowledge that is not contained in any individual source and for solving
problems that were not originally anticipated. In particular, Open Data published
according to the Linked Data paradigm are essentially transforming the Web into a
vibrant information ecosystem.
    Published datasets are openly available on the Web. A traditional view of digitally
preserving them by “pickling them and locking them away” for future use, like groceries,
would conflict with their continuous evolution. There are a number of approaches and
frameworks, such as the LOD2 stack, that manage the full life-cycle of data on the Data Web.
⋆
    The MEPDaW+LDQ joint proceedings are publicly available in [1].

More specifically, these techniques are expected to tackle major issues such as the
synchronisation problem (how can we monitor changes?), the curation problem (how can
data imperfections be repaired?), the appraisal problem (how can we assess the quality
of a dataset?), the citation problem (how can we cite a particular version of a linked
dataset?), the archiving problem (how can we retrieve the most recent or a particular
version of a dataset?), and the sustainability problem (how can we spread the
preservation effort to ensure long-term access?).
    Preserving linked open datasets poses a number of challenges, mainly related to
the nature of the LOD principles and the RDF data model. In LOD, datasets representing
real-world entities are structured; thus, when managing and representing facts we need
to take into consideration the constraints that may hold. Since resources might be
interlinked, effective citation measures need to be in place to enable, for example,
the ranking of datasets according to their measured quality. Another challenge is to
determine the consequences that changes to one LOD dataset may have on other datasets
linked to it. The distributed nature of LOD datasets furthermore makes archiving
particularly challenging.
    This workshop aimed at addressing the above-mentioned challenges and issues by
providing a forum for researchers and practitioners who apply linked data technologies
to discuss, exchange and disseminate their work.
    The workshop included an inspiring talk by Dr. Axel Polleres on Archiving Linked
and Open Data, three research papers and one industry paper, and a plenary discussion.
Based on the review scores, the best paper award was given to Ruben Taelman,
Ruben Verborgh, Pieter Colpaert, Erik Mannens and Rik Van de Walle for their work
“Continuously Updating Query Results over Real-Time Linked Data”.


2   Linked Data Quality

The focus of this workshop was to reveal novel methodologies and frameworks for
assessing, monitoring, maintaining, and improving the quality of Linked Data (LD), as
well as to introduce tools and user interfaces that can effectively assist in assessment
and repair. In addition, the workshop sought methodologies that help to identify the
current impediments to building real-world Linked Data applications that leverage data
quality. Addressing Linked Data quality issues would not only help in detecting the
inherent data quality problems currently plaguing Linked Data, but would also provide
the means to fix these problems and maintain quality in the long run.
    To guarantee the full exploitation of published or consumed Linked Data, it is
important to assure LD quality. In this way, it is possible to understand whether data is
appropriate for the task at hand before using it. There are several issues in LD which
hamper the use of datasets in building real-world LD-based applications and research
solutions. One of these issues is how to find the most relevant LD datasets for a
particular application. Generating meaningful associations between LD datasets, at the
ontology, data or property level, is another important issue to be considered when
building such applications. Essentially, the quality of LD is a deciding factor as to
which datasets can be used in building such real-world applications. Currently, there is
no fool-proof method of performing this kind of quality assessment.

     In general, detecting the quality of available datasets and making this information
explicit is a challenge. This entails the (semi-)automatic identification of existing
problems, which is either insensitive to the use case or is limited to identifying only
specific (objective) quality issues. In LD, few efforts are currently available to
standardize how data quality tracking and assurance should be implemented, and this
poses yet other challenges: (i) LD refers to a Web-scale knowledge base consisting of
interlinked data published by a multitude of autonomous information providers (variety
of data). The quality of the provided information may depend on the intention of the
information provider; (ii) the increasing diffusion of the LD paradigm as a standard way
to share knowledge on the Web allows consumers to fully exploit vast amounts of data
that were not available in the past (high volume of data). We are likely to find more
low-quality data in LD than in smaller data sets because, in large data sets, data are
produced by automatic processes which are often error prone; (iii) data sets in LD
formats may often be used by third-party applications in ways not expected by the
original creators of the data set; (iv) LD provides data integration through the
interlinking of data between heterogeneous data sources. The quality of the integrated
data will depend on the quality of the original data sources; (v) last but not least,
Linked Data can be considered a dynamic environment where information can change rapidly
and cannot be assumed to be static (velocity of data). Changes in LD sources should
reflect changes in the real world, otherwise data can soon become outdated. Out-of-date
information can reflect data inaccuracy problems and can deliver invalid information.
For example, more up-to-date information should be preferred over less up-to-date
information in data integration and fusion applications. Moreover, none of the current
approaches use the assessment to ultimately improve the quality of the underlying dataset.
     The workshop included a keynote by Christian Dirschl on data quality assurance
in data-intensive systems, three paper presentations, and a lightning talk session with
subsequent discussions. Based on the review scores, the best paper award was given
to Tomáš Knap for his work on Increasing Quality of Austrian Open Data by Linking
them to Linked Data Sources: Lessons Learned [2].


3     Organisation

3.1    Organising Committees

    – MEPDaW
       • Jeremy Debattista, Enterprise Information Systems, University of Bonn, Ger-
         many / Organized Knowledge, Fraunhofer IAIS, Germany
       • Jürgen Umbrich, Vienna University of Economics and Business, Austria
       • Javier D. Fernández, Vienna University of Economics and Business, Austria
    – LDQ
       • Anisa Rula, University of Milano-Bicocca, Italy
       • Amrapali Zaveri, Stanford University, United States
       • Magnus Knuth, Hasso Plattner Institute, University of Potsdam, Germany
       • Dimitris Kontokostas, AKSW, University of Leipzig, Germany

3.2    Program Committees
    – MEPDaW
       • Judie Attard, University of Bonn/Fraunhofer IAIS, Germany
       • Ioannis Chrysakis, FORTH-ICS, Greece
       • Keith Cortis, University of Passau, Germany
       • Giorgos Flouris, FORTH-ICS, Greece
       • Marios Meimaris, ATHENA R.C., Greece
       • Fabrizio Orlandi, University of Bonn/Fraunhofer IAIS, Germany
       • Fouad Zablith, American University of Beirut, Lebanon
       • Magnus Knuth, Hasso Plattner Institute – University of Potsdam, Germany
       • Anisa Rula, University of Milano-Bicocca, Italy
       • Wouter Beek, VU University Amsterdam, Netherlands
       • Yannis Stavrakas, ATHENA R.C., Greece
       • Amrapali J. Zaveri, Dumontier Lab - Stanford University, USA
       • Mathieu d’Aquin, The Open University, United Kingdom
       • Yannis Roussakis, FORTH-ICS, Greece
       • Kemele M. Endris, University of Bonn, Germany
       • Charlie Abela, University of Malta, Msida, Malta
       • George Papastefanatos, ATHENA R.C., Greece
       • Nandana Mihindukulasooriya, Universidad Politécnica de Madrid (UPM), Spain
       • Niklas Petersen, University of Bonn/Fraunhofer IAIS, Germany
       • Joseph Bonello, University of Malta, Msida, Malta
    – LDQ
       • Maribel Acosta, Karlsruhe Institute of Technology – AIFB, Germany
       • James Anderson, Datagraph, United States
       • Volha Bryl, Springer Science+Business Media, Germany
       • Ioannis Chrysakis, ICS FORTH, Greece
       • Mathieu d’Aquin, Knowledge Media Institute, The Open University, United
         Kingdom
       • Jeremy Debattista, University of Bonn, Fraunhofer IAIS, Germany
       • Anastasia Dimou, MultimediaLab, Ghent University – iMinds, Belgium
       • Suzanne Embury, University of Manchester, United Kingdom
       • Christian Fürber, Information Quality Institute GmbH, Germany
       • Jose Emilio Labra Gayo, University of Oviedo, Spain
       • Markus Graube, Technische Universität Dresden, Germany
       • Tom Heath, The Open Data Institute, United Kingdom
       • Tomáš Knap, Semantic Web Company, Austria, and Charles University in Prague,
         Czech Republic
       • Maristella Matera, Politecnico di Milano, Italy
       • John McCrae, CITEC, University of Bielefeld, Germany
       • Matteo Palmonari, University of Milan-Bicocca, Italy
       • Heiko Paulheim, University of Mannheim, Germany
       • Mariano Rico, Universidad Politécnica de Madrid, Spain
       • Patrick Westphal, AKSW, University of Leipzig, Germany
       • Antoine Zimmermann, École Nationale Supérieure des Mines de Saint-Étienne,
         France

Acknowledgements
We would like to thank the authors for their contributions and active participation in
the workshops, and all the program committee members for reviewing the submissions and
providing valuable feedback. We are also grateful to the organisers of the ESWC 2016
conference for their support, and our keynote speakers, Axel Polleres from the Vienna
University of Economics and Business (Austria), and Christian Dirschl from Wolters
Kluwer (Germany).
    The MEPDaW workshop was co-organised by members funded by the Austrian
Science Fund (FWF): M1720-G11.
    The LDQ workshop was co-organised by members funded by the German Govern-
ment, Federal Ministry of Education and Research under the project: 03WKCJ4D.


References
1. J. Debattista, J. D. F. García, M. Knuth, D. Kontokostas, A. Rula, J. Umbrich, and A. Zaveri,
   editors. Joint proceedings of the 2nd Workshop on Managing the Evolution and Preservation
   of the Data Web (MEPDaW 2016) and the 3rd Workshop on Linked Data Quality (LDQ 2016),
   number 1585 in CEUR Workshop Proceedings, Aachen, May 2016.
2. T. Knap. Increasing quality of Austrian open data by linking them to linked data sources:
   Lessons learned. In Debattista et al. [1].