MEPDaW 2017 and LDQ 2017 Preface*

Jeremy Debattista¹, Javier D. Fernández², Jürgen Umbrich², Anisa Rula³,
Amrapali Zaveri⁴, Anastasia Dimou⁵, and Wouter Beek⁶

¹ University of Bonn and Fraunhofer IAIS, Bonn, Germany
  debattis@cs.uni-bonn.de
² Vienna University of Economics and Business, Vienna, Austria
  {javier.fernandez,juergen.umbrich}@wu.ac.at
³ University of Milano-Bicocca, Milan, Italy
  rula@disco.unimib.it
⁴ Maastricht University, The Netherlands
  amrapali.zaveri@maastrichtuniversity.nl
⁵ Ghent University - imec, Belgium
  anastasia.dimou@ugent.be
⁶ VU Amsterdam, The Netherlands
  w.g.j.beek@vu.nl


         Abstract. This joint volume of proceedings gathers papers from the 3rd
         Workshop on Managing the Evolution and Preservation of the Data Web
         (MEPDaW 2017) and the 4th Workshop on Linked Data Quality (LDQ 2017),
         held on 28 and 29 May 2017 during the 14th ESWC conference in Portorož,
         Slovenia.


1     Managing the Evolution and Preservation of the Data Web
There is a vast and rapidly increasing quantity of scientific, corporate, government,
and crowd-sourced data published on the emerging Data Web. Open Data are expected
to act as a catalyst for the way structured information is exploited on a large scale.
This offers great potential for building innovative products and services that create
new value from already collected data. It is also expected to foster active citizenship
(e.g., around the topics of journalism, greenhouse gas emissions, food supply chains,
and smart mobility) and world-wide research according to the “fourth paradigm of science”.
    Published datasets are openly available on the Web. The traditional view of digital
preservation, in which datasets are pickled and locked away for future use like groceries,
conflicts with their evolution. There are a number of approaches and frameworks, such
as the Linked Data Stack, that manage the full life-cycle of the Data Web. More
specifically, these techniques are expected to tackle major issues such as the
synchronisation problem (how to monitor changes), the curation problem (how to repair
data imperfections), the appraisal problem (how to assess the quality of a dataset), the
citation problem (how to cite a particular version of a linked dataset), the archiving
problem (how to retrieve the most recent or a particular version of a dataset), and the
sustainability problem (how to support preservation at scale, ensuring long-term access).
* Joint proceedings are publicly available in [3].
2        MEPDaW and LDQ 2017 organizers

    Preserving linked open datasets poses a number of challenges, mainly related to the
nature of the Linked Data principles and the RDF data model. Since resources are
globally interlinked, effective citation measures are required. Another challenge is to
determine the consequences that changes to one LOD dataset may have for other datasets
linked to it. The distributed nature of LOD datasets introduces additional complexity,
since external sources that are being linked to may change or become unavailable.
A final challenge is to identify means of continuously assessing the quality of
dynamic datasets.
    At last year’s workshop [2], the keynote and discussions raised a number of open
research questions:
 1. How can we represent archives of continuously evolving linked datasets? (effi-
    ciency vs. compact representation)
 2. How can we measure the performance of systems for archiving evolving datasets,
    in terms of representation, efficiency and compactness?
 3. How can we improve completeness of archiving?
 4. How can emerging retrieval demands in archiving (e.g. time-traversing and trace-
    ability) be satisfied? What type of data analytics can we perform on top of the
    archived Web of data?
 5. How can certain time-specific queries over archives be answered? Can we re-use
    existing technologies (e.g. SPARQL or temporal extensions)? What is the right
    query language for such queries?
 6. Is there an actual and urgent need in the community for handling the dynamicity of
    the Data Web?
 7. Is there a need for a killer app to kick-start the management of the evolving Web
    of Data?
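To make questions 1 and 5 more concrete, the following minimal Python sketch shows one
possible change-based representation of an archive: the first version is stored in full,
each later version only as the sets of added and removed triples, and a time-specific
query first materialises the requested version. The class DeltaArchive, its method names,
and the ex: identifiers are hypothetical and chosen purely for illustration; they are not
taken from any system presented at the workshop.

```python
from typing import FrozenSet, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), all plain strings


class DeltaArchive:
    """Toy change-based archive (hypothetical, for illustration only).

    Version 0 is stored in full; every later version is stored as a pair
    (added triples, removed triples) relative to its predecessor.
    """

    def __init__(self, initial: Set[Triple]) -> None:
        self._base: FrozenSet[Triple] = frozenset(initial)
        self._deltas: List[Tuple[FrozenSet[Triple], FrozenSet[Triple]]] = []
        self._latest: Set[Triple] = set(initial)

    def commit(self, new_version: Set[Triple]) -> int:
        """Store a new version as a delta and return its version number."""
        added = frozenset(new_version - self._latest)
        removed = frozenset(self._latest - new_version)
        self._deltas.append((added, removed))
        self._latest = set(new_version)
        return len(self._deltas)

    def materialize(self, version: int) -> Set[Triple]:
        """Rebuild the full triple set of the given version by replaying deltas."""
        if not 0 <= version <= len(self._deltas):
            raise IndexError(f"unknown version {version}")
        snapshot = set(self._base)
        for added, removed in self._deltas[:version]:
            snapshot -= removed
            snapshot |= added
        return snapshot

    def diff(self, v_from: int, v_to: int) -> Tuple[Set[Triple], Set[Triple]]:
        """Return (added, removed) triples between two arbitrary versions."""
        old, new = self.materialize(v_from), self.materialize(v_to)
        return new - old, old - new


if __name__ == "__main__":
    archive = DeltaArchive({("ex:a", "ex:knows", "ex:b")})
    archive.commit({("ex:a", "ex:knows", "ex:b"), ("ex:a", "ex:knows", "ex:c")})
    print(archive.materialize(1))
    print(archive.diff(0, 1))
```

Materialising a version replays every delta up to it, so this layout saves storage at the
cost of retrieval time, which is one instance of the efficiency versus compactness tension
named in question 1. Storing full snapshots for every version would reverse that trade-off.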
Last year’s workshop discussions and papers were summarised in a SIGIR Forum report [1].
This year’s workshop showcases six papers, split into two main sessions: (1)
Managing and Querying Evolving Data; and (2) Computing and Exploiting Changes in
Evolving Data. These papers address most of the questions raised at last year’s workshop.
Furthermore, in this year’s workshop, the keynote by Prof. Dr. Maria-Esther Vidal
discusses the challenges of semantic data management in Big Data.


2     Linked Data Quality
The 4th Linked Data Quality Workshop⁷ focuses on novel methodologies and frameworks
for assessing, monitoring, maintaining, and improving the quality of Linked Data, and
highlights tools and user interfaces that can effectively assist in its assessment and
repair. In addition, the workshop seeks methodologies that help to identify the
current impediments in building real-world Linked Data applications leveraging data
and ontology quality, and use cases that reveal success stories or aspects that have been
neglected so far. Addressing Linked Data quality issues will not only help to detect the
inherent data quality problems currently plaguing Linked Data, but also provide the
means to fix these problems and maintain quality in the long run.
 ⁷ ldq.semanticmultimedia.org

    In this year’s contributions we see a focus on quality assessment and validation ser-
vices, rather than client-side solutions. Since Software-as-a-Service (SaaS) has known
benefits, such as reduced consumer cost and increased ease of installation and use, the
rise of Quality-Assessment-as-a-Service is promising.
    Linked Data validation has been difficult so far because Linked Data schemas are
not used as constraints, but for deriving new facts (i.e., entailment). The ongoing
standardization of Linked Data validation languages such as SHACL⁸ and ShEx⁹ provides
new opportunities for automating Linked Data quality assessment. It is promising to
see that these standardization efforts have already resulted in novel Linked Data
Quality approaches. Finally, the recent publication of the Data Quality Vocabulary (DQV)
by W3C¹⁰ makes it possible to represent and disseminate the results of quality assessment
as Linked Open Data, which opens up new approaches as well.
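The difference between the two readings of a schema can be illustrated with a small
Python sketch. It treats the same rdfs:domain statement once as an entailment rule, which
silently derives a missing type, and once as a constraint, which reports the missing type
as a violation. The function names and the ex: identifiers are hypothetical, triples are
plain string tuples, and the constraint check is only loosely analogous to what validation
languages such as SHACL and ShEx express; the code uses neither syntax.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def _domains(triples: Set[Triple]) -> Dict[str, Set[str]]:
    """Collect the declared domain classes of each property."""
    result: Dict[str, Set[str]] = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            result.setdefault(s, set()).add(o)
    return result


def entail_domain(triples: Set[Triple]) -> Set[Triple]:
    """Entailment reading: using a property derives the subject's type."""
    domains = _domains(triples)
    inferred = set(triples)
    for s, p, _ in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


def check_domain(triples: Set[Triple]) -> List[Triple]:
    """Constraint reading: the subject must already be typed explicitly.

    Returns a sorted list of (subject, property, missing class) violations.
    """
    domains = _domains(triples)
    violations = []
    for s, p, _ in triples:
        for cls in domains.get(p, ()):
            if (s, RDF_TYPE, cls) not in triples:
                violations.append((s, p, cls))
    return sorted(violations)


if __name__ == "__main__":
    data = {
        ("ex:author", RDFS_DOMAIN, "ex:Book"),
        ("ex:doc1", "ex:author", "ex:alice"),  # ex:doc1 has no explicit type
    }
    print(entail_domain(data) - data)  # facts added by the entailment reading
    print(check_domain(data))          # violations under the constraint reading
```

Under the entailment reading the untyped resource never surfaces as a problem, because
the schema itself supplies the missing fact. Only the constraint reading turns the same
statement into something a quality assessment can report.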
    This year we accepted three papers and invited a keynote speaker, all of which we
describe in brief. Mihindukulasooriya et al. [6] present Loupe, a data profiling service for
Linked Data. Data profiling is a common approach for assessing data quality in relational
databases, but has not yet been applied to Linked Data. Loupe builds on recent
standardization efforts for Linked Data validation such as SHACL and ShEx.
    Mc Gurk et al. [5] present a systematic overview of existing ontology and Linked
Data quality metrics, by categorizing them according to data quality standards estab-
lished by ISO. Building on the quality assessment framework Luzzu and the ontology
visualization library VOWL, they present a new approach for visualizing the extent to
which the identified quality metrics apply to a given ontology.
    Hashimoto et al. [4] focus on the use of Linked Data ontologies to automatically
detect and resolve conflicts in a manufacturing design process. This poses
challenges for the data, which must be of sufficient quality in order to reliably model
the design process, but also provides opportunities when conflicts can be detected and
mitigated at an early stage.
    Péter Király’s keynote discusses how metadata quality assurance was carried out in the
Europeana use case. His talk covers the process of metadata quality assurance in big digital
libraries such as Europeana, the findings of the functional requirement analyses of
Europeana records, the data quality analysis framework that was built, the general and
specific metrics considered, and the scalability issues raised.


Acknowledgments

We would like to thank the authors for their contributions and active participation in the
workshops, and all the program committee members for reviewing the submissions and
providing valuable feedback. We are also grateful to the organisers of the ESWC 2017
conference for their support, and our keynote speakers, Prof. Dr. Maria-Esther Vidal
from the University of Bonn and Fraunhofer IAIS (Germany) and Péter Király from the
Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen (Germany).
 ⁸ https://www.w3.org/TR/shacl/
 ⁹ http://shex.io/
 ¹⁰ https://www.w3.org/TR/vocab-dqv/

    The MEPDaW workshop was co-organised by members funded by the Austrian
Science Fund (FWF): M1720-G11 and supported by the European Union’s Horizon
2020 research and innovation programme under grant 731601.


References
1. J. Debattista, J. D. Fernández, and J. Umbrich. Report on the 2nd Workshop on Managing the
   Evolution and Preservation of the Data Web (MEPDaW 2016). SIGIR Forum, 50(2):82–88, Feb.
   2017.
2. J. Debattista, J. D. F. García, M. Knuth, D. Kontokostas, A. Rula, J. Umbrich, and A. Zaveri,
   editors. Joint proceedings of the 2nd Workshop on Managing the Evolution and Preservation
   of the Data Web (MEPDaW 2016) and the 3rd Workshop on Linked Data Quality (LDQ 2016),
   number 1585 in CEUR Workshop Proceedings, Aachen, May 2016.
3. J. Debattista, J. D. F. García, J. Umbrich, A. Rula, A. Zaveri, A. Dimou, and W. Beek, editors.
   Joint proceedings of the 3rd Workshop on Managing the Evolution and Preservation of
   the Data Web (MEPDaW 2017) and the 4th Workshop on Linked Data Quality (LDQ 2017),
   number 1824 in CEUR Workshop Proceedings, Aachen, May 2017.
4. K. Hashimoto, Y. Yamane, S. Suzuki, M. Takaai, M. Watanabe, and H. Umemoto. Towards
   Ontology Quality Assessment. In 4th Workshop on Linked Data Quality (LDQ2017), 2017.
5. S. Mc Gurk, C. Abela, and J. Debattista. Towards Ontology Quality Assessment. In 4th
   Workshop on Linked Data Quality (LDQ2017), 2017.
6. N. Mihindukulasooriya, R. García-Castro, F. Priyatna, E. Ruckhaus, and N. Saturno. A Linked
   Data Profiling Service for Quality Assessment. In 4th Workshop on Linked Data Quality
   (LDQ2017), 2017.