<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jeremy Debattista</string-name>
          <email>debattis@cs.uni-bonn.de</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Javier D. Ferna´ndez</string-name>
          <email>javier.fernandez@wu.ac.at</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Magnus Knuth</string-name>
          <email>magnus.knuth@hpi.uni-potsdam.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dimitris Kontokostas</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anisa Rula</string-name>
          <email>rula@disco.unimib.it</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ju¨ rgen Umbrich</string-name>
          <email>juergen.umbrich@wu.ac.at</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amrapali Zaveri</string-name>
          <email>amrapali@stanford.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Hasso Plattner Institute, University of Potsdam</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Stanford Center for Biomedical Informatics Research, Stanford University</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Bonn and Fraunhofer IAIS</institution>
          ,
          <addr-line>Bonn</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Milano-Bicocca</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Vienna University of Economics and Business</institution>
          ,
          <addr-line>Vienna</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <abstract>
        <p>This joint volume of proceedings gathers together papers from the 2nd Workshop on Managing the Evolution and Preservation of the Data Web (MEPDaW) and the 3rd Workshop on Linked Data Quality (LDQ), held on the 30th of May of 2016 during the 13th ESWC conference in Anissaras, Crete, Greece. Managing the Evolution and Preservation of the Data Web This workshop targeted one of the emerging and fundamental problems in the Semantic Web, specifically the preservation of evolving linked datasets. There is a vast and rapidly increasing quantity of scientific, corporate, government and crowd-sourced data published on the emerging Data Web. Open Data are expected to play a catalyst role in the way structured information is exploited in the large scale. This offers a great potential for building innovative products and services that create new value from already collected data. It is expected to foster active citizenship (e.g., around the topics of journalism, greenhouse gas emissions, food supply-chains, smart mobility, etc.) and world-wide research according to the “fourth paradigm of science”. The most noteworthy advantage of the Data Web is that, rather than documents, facts are recorded, which become the basis for discovering new knowledge that is not contained in any individual source, and solving problems that were not originally anticipated. In particular, Open Data published according to the Linked Data Paradigm are essentially transforming the Web into a vibrant information ecosystem. Published datasets are openly available on the Web. A traditional view of digitally preserving them by “pickling them and locking them away” for future use, like groceries, would conflict with their evolution. There are a number of approaches and frameworks, such as the LOD2 stack, that manage a full life-cycle of the Data Web.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        ? MEPDaW+LDQ join proceedings are publicly available in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
More specifically, these techniques are expected to tackle major issues such as the
synchronisation problem (how can we monitor changes), the curation problem (how can
data imperfections be repaired), the appraisal problem (how can we assess the quality
of a dataset), the citation problem (how can we cite a particular version of a linked
dataset), the archiving problem (how can we retrieve the most recent or a particular
version of a dataset), and the sustainability problem (how can we spread preservation
ensuring long-term access).
      </p>
      <p>Preserving linked open datasets poses a number of challenges, mainly related to
the nature of the LOD principles and the RDF data model. In LOD, datasets
representing real-world entities are structured; thus, when managing and representing facts
we need to take into consideration possible constraints that may hold. Since resources
might be interlinked, effective citation measures are required to be in place to enable,
for example, the ranking of datasets according to their measured quality. Another
challenge is to determine the consequences that changes to one LOD dataset may have to
other datasets linked to it. The distributed nature of LOD datasets furthermore makes
archiving a headache.</p>
      <p>This workshop aimed at addressing the above mentioned challenges and issues by
providing a forum for researchers and practitioners who apply linked data technologies
to discuss, exchange and disseminate their work.</p>
      <p>The workshop included an inspiring talk by Dr. Axel Polleres on Archiving Linked
and Open Data, three research papers and one industry paper, and a plenary discussion.
Based on the review scores, the best paper award has been given to Ruben Taelman,
Ruben Verborgh, Pieter Colpaert, Erik Mannens and Rik Van de Walle for their work
“Continuously Updating Query Results over Real-Time Linked Data”.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Linked Data Quality</title>
      <p>The focus of this workshop was to reveal novel methodologies and frameworks in
assessing, monitoring, maintaining, and improving the quality of Linked Data (LD) as
well as introduce tools and user interfaces which can effectively assist in the
assessment and repair. In addition, the workshop sought methodologies that help to identify
the current impediments in building real-world Linked Data applications leveraging
data quality. The benefits of addressing Linked Data quality issues would not only help
in detecting inherent data quality problems currently plaguing Linked Data, but also
provide the means to fix these problems and maintain the quality in the long run.</p>
      <p>To guarantee the full exploitation of the published or consumed Linked Data, it is
important to assure LD quality. In this way, it is possible to understand whether data is
appropriate for the task at hand before using it. There are several issues in LD which
hampers the use of datasets in building real-world LD-based applications and research
solutions. One of issues is the method to find the most relevant LD data for a particular
application. Also, generating meaningful associations between the LD datasets, at the
ontology, data or property level is an important issue to be considered when building
such applications. Essentially, the quality of LD is a deciding factor as to which datasets
can be used in building such real-world applications. Currently, there is no full-proof
method of performing this kind of quality assessment.</p>
      <p>In general detecting the quality of datasets available and making this information
explicit is a challenge. This entails the (semi-)automatic identification of existing
problems, which is either insensitive to the use case or is limited in identifying only specific
(objective) quality issues. In LD few efforts are currently available to standardize how
data quality tracking and assurance should be implemented and it poses yet other
challenges: (i) LD refers to a Web-scale knowledge base consisting of interlinked published
data from a multitude of autonomous information providers (variety of data). The
quality of provided information may depend on the intention of the information provider;
(ii) the increasing diffusion of the LD paradigm as a standard way to share knowledge
on the Web allows consumers to fully exploit vast amount of data that were not
available in the past (high volume of data). We are likely to find more low quality in LD
than in smaller data sets because in large data sets data are produced with automatic
information processes which are often error prone; (iii) data sets in LD formats may
often be used by third-party applications in ways not expected by the original creators
of the data set; (iv) LD provides data integration through interlinking of data between
heterogeneous data sources. The quality of integrated data will depend on the quality of
original data sources; (v) last but not least relevant, Linked Data can be considered as
a dynamic environment where information can change rapidly and cannot be assumed
to be static (velocity of data). Changes in LD sources should reflect changes in the real
world, otherwise data can soon become out-dated. Out-of-date information can reflect
data inaccuracy problems and can deliver invalid information. For example, more
upto-date information should Data be preferred over less up-to-date information in data
integration and fusion applications. Moreover, none of the current approaches use the
assessment to ultimately improve the quality of the underlying dataset.</p>
      <p>
        The workshop included a keynote by Christian Dierschl on Data quality assurance
in data-intensive systems, three paper presentations and a lightning talk session with
following discussions. Based on the review scores, the best paper award has been given
to Toma´sˇ Knap for his work on Increasing Quality of Austrian Open Data by Linking
them to Linked Data Sources: Lessons Learned [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
3
3.1
      </p>
    </sec>
    <sec id="sec-3">
      <title>Organisation</title>
      <p>Organising Committees
– MEPDaW</p>
      <p>Jeremy Debattista, Enterprise Information Systems, University of Bonn,
Germany / Organized Knowledge, Fraunhofer IAIS, Germany
Ju¨rgen Umbrich, Vienna University of Economics and Business, Austria
Javier D. Ferna´ndez, Vienna University of Economics and Business, Austria
– LDQ</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>We would like to thank the authors for their contribution and active participation in the
workshops, and all the program committee members for reviewing the submissions and
provide valuable feedback. We are also grateful to the organisers of the ESWC 2016
conference for their support, and our keynote speakers, Axel Polleres from the Vienna
University of Economics and Business (Austria), and Christian Dirschl from Wolters
Kluwer (Germany).</p>
      <p>The MEPDaW workshop was co-organised by members funded by the Austrian
Science Fund (FWF): M1720-G11.</p>
      <p>The LDQ workshop was co-organised by members funded by the German
Government, Federal Ministry of Education and Research under the project: 03WKCJ4D.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>J.</given-names>
            <surname>Debattista</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D. F.</given-names>
            <surname>Garc</surname>
          </string-name>
          ´ıa,
          <string-name>
            <given-names>M.</given-names>
            <surname>Knuth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kontokostas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Umbrich</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <surname>A</surname>
          </string-name>
          . Zaveri, editors.
          <source>Joint proceedings of the 2nd Workshop on Managing the Evolution and Preservation of the Data Web (MEPDaW 2016) and the 3rd Workshop on Linked Data Quality (LDQ</source>
          <year>2016</year>
          ), number 1585 in CEUR Workshop Proceedings, Aachen, May
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>T.</given-names>
            <surname>Knap</surname>
          </string-name>
          .
          <article-title>Increasing quality of austrian open data by linking them to linked data sources: Lessons learned</article-title>
          .
          <source>In Debattista et al. [1].</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>