<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Towards a Benchmark Dataset for the Digital Humanities</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Felix Ernst</string-name>
          <email>felix.ernst@kit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicolas Blumenröhr</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Ontology Matching, Reference Dataset, Digital Humanities, Multilingual, FAIR Digital Objects</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Karlsruhe Institute of Technology</institution>
          ,
          <addr-line>Karlsruhe</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Applications of ontology matching such as ontology engineering, information sharing or query answering are of growing importance in the field of Digital Humanities (DH). To identify suitable ontology matching tools for DH research, a sound evaluation of these tools is crucial. Unfortunately, no existing reference alignment dataset addresses DH-specific requirements such as support for multiple (historic) languages, domain-specific terms and an easily applicable data format. Therefore, we propose the creation of a dataset that uses knowledge bases and expert surveys as reference alignment sources and can serve as the basis for a future DH OAEI track. Using surveys leads to a graduated scale of term similarity and association, which makes advanced evaluation metrics possible.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Due to the growing amount of data and its increasing complexity, fields such as the Digital
Humanities are now in the process of applying semantic tools, including ontologies, to gain
new research insights [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Automatic ontology matching plays a crucial role in creating or
extending ontologies [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], facilitates collaboration and reduces the workload of researchers.
The Ontology Alignment Evaluation Initiative (OAEI) is a reputable source for high-quality
benchmark datasets and the evaluation results of matching tools. However, the scarcity of
datasets for application within the DH remains a significant obstacle in conducting reliable
evaluations of ontology matchers for the DH use case. Existing OAEI tracks do not adequately
address all the following specific DH requirements: (a) a wide range of (historical) languages
and writing systems; (b) domain-specific terms; (c) use of a data model, such as SKOS, that is
suitable for easily creating knowledge organization systems. Moreover, the proposed dataset will
take account of the distinction between association (e.g. cup; coffee) and similarity (e.g. cup; mug).
This distinction is missing from several popular gold standards for similarity ratings [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which limits their use for
comprehensive evaluations of ontology matchers for the DH.
      </p>
      <p>
        We propose addressing these challenges by creating a new benchmark dataset tailored to
the DH requirements, with its prospective application as a newly established OAEI track. In
addition, this dataset will be represented as a FAIR Digital Object (FDO) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], a concept that
implements the FAIR principles [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] with a particular focus on machine actionability,
which makes the data ready for ML matchers and applications.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2. Criteria for Dataset Construction</title>
      <p>To ensure the dataset’s effectiveness, several criteria were established. The dataset will
encompass both similarity and association relations, allowing for a comprehensive evaluation.
Furthermore, it will strike a balance between general terms and specialized domain-specific
terminology, enabling broader applicability across different disciplines within the humanities.
When establishing the ground truth, objectivity is of high importance, and only terms with the
same part-of-speech classification will be compared. Additionally, the dataset will encompass various
languages used within the humanities, with adequate translation of terms.</p>
      <p>To make these specific data characteristics accessible to applications outside of ontology
evaluation toolkits as well, the information has to be represented in a standardized way and be
actionable by operations that can be performed on the data. The concept of FDOs enables this: the
data is assigned a Persistent Identifier (PID) together with a set of typed metadata attributes that
are associated with operations, which makes the data machine-actionable.</p>
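      <p>The relationship between a PID, typed metadata attributes and operations can be illustrated with a minimal Python sketch. It is not taken from any FDO specification or implementation: the PID value, the attribute type identifiers, the operation names and the classes TypedAttribute and FairDigitalObject are all hypothetical placeholders chosen for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Dict, List

# NOTE: every identifier below (PID, type ids, operation names) is a
# hypothetical placeholder, not part of any FDO specification.


@dataclass(frozen=True)
class TypedAttribute:
    type_id: str  # identifier of the attribute's registered type
    value: str


@dataclass
class FairDigitalObject:
    pid: str  # persistent identifier assigned to the data
    attributes: List[TypedAttribute] = field(default_factory=list)

    def get(self, type_id: str) -> List[str]:
        """Return all values stored under one attribute type."""
        return [a.value for a in self.attributes if a.type_id == type_id]


# Operations are associated with attribute types, so a client can
# discover what it may do with an object from its metadata alone.
OPERATIONS: Dict[str, List[str]] = {
    "example/data-format": ["parse_alignment"],
    "example/language": ["filter_by_language"],
}


def applicable_operations(fdo: FairDigitalObject) -> List[str]:
    """List operations applicable to an object, without duplicates."""
    ops: List[str] = []
    for attr in fdo.attributes:
        for op in OPERATIONS.get(attr.type_id, []):
            if op not in ops:
                ops.append(op)
    return ops


record = FairDigitalObject(
    pid="example-prefix/dh-benchmark-0001",
    attributes=[
        TypedAttribute("example/data-format", "SKOS"),
        TypedAttribute("example/language", "de"),
        TypedAttribute("example/language", "grc"),
    ],
)
```
]]></preformat>
      <p>In this sketch, a client that only knows the PID record can call applicable_operations(record) to find out which operations the typed attributes make available, without inspecting the data itself.</p>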
    </sec>
    <sec id="sec-4">
      <title>3. Methodology</title>
      <p>Two methods will be employed to construct the reference alignment dataset. The first method
will use DH-specific terms within freely available knowledge graphs such as WordNet<sup>1</sup>,
GermaNet<sup>2</sup>, Wikidata<sup>3</sup> and specialized terminology resources like Loterre<sup>4</sup>. Only
synonym relations such as skos:exactMatch will be used, disregarding relations like skos:related
due to their subjectivity and dependency on the depth within the hierarchy. The reference
alignment will be manually created using synonyms as matches between terms, including
translations coming from the knowledge graphs or from language dictionaries. A benefit of
this method is that many people have contributed to the knowledge graphs, which makes
it less sensitive to bias in synonym classification. In addition, the dataset will have a broad
domain coverage and support various (historic) languages that are relevant for the specific
domains. The major drawback is that only synonyms and no associations can be covered,
and discipline-specific terms of several small disciplines might be missing entirely. This is
particularly regrettable since smaller disciplines are in high need of digital research tools, but
are often neglected when applications of ontology matching are tailored to research needs.</p>
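      <p>The selection step of the first method can be sketched as follows, assuming the knowledge graph content has already been exported as subject, predicate, object string triples. The ex: term identifiers and the function name reference_alignment are hypothetical; only the namespace and the two property names come from the SKOS vocabulary. The sketch covers the automatic pre-filtering only, not the manual curation described above.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import FrozenSet, Iterable, Set, Tuple

SKOS = "http://www.w3.org/2004/02/skos/core#"
EXACT_MATCH = SKOS + "exactMatch"
RELATED = SKOS + "related"

Triple = Tuple[str, str, str]


def reference_alignment(triples: Iterable[Triple]) -> Set[FrozenSet[str]]:
    """Keep only skos:exactMatch links, stored as unordered term pairs.

    Unordered pairs are used so that a link stated in both directions
    is counted once.
    """
    pairs: Set[FrozenSet[str]] = set()
    for subject, predicate, obj in triples:
        if predicate == EXACT_MATCH and subject != obj:
            pairs.add(frozenset((subject, obj)))
    return pairs


# Hypothetical example triples (the ex: identifiers are placeholders).
triples = [
    ("ex:cup", EXACT_MATCH, "ex:Tasse"),
    ("ex:Tasse", EXACT_MATCH, "ex:cup"),  # same link, other direction
    ("ex:cup", RELATED, "ex:coffee"),     # association, disregarded here
]

alignment = reference_alignment(triples)
```
]]></preformat>
      <p>Here the skos:related link between cup and coffee is dropped, which mirrors the limitation noted above: this method yields synonym matches only and cannot represent associations.</p>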
      <p><sup>1</sup> https://wordnet.princeton.edu/; <sup>2</sup> https://uni-tuebingen.de/en/142806; <sup>3</sup> https://www.wikidata.org/; <sup>4</sup> https://www.loterre.fr/</p>
      <p>To address the aforementioned drawbacks, the second method for constructing a reference
alignment uses surveys conducted among domain experts. They will quantify the
similarity and association (while respecting the distinct nature of both) of preselected terms
coming both from the first method and from controlled vocabularies created by collaborating
researchers of different domains. This approach introduces varying degrees of similarity, which
can be used to complement the F1-score with an additional metric. This metric would
capture the graded nature of term matching: a matching tool’s failure to align two
terms may be considered less critical if the researchers rated the terms as
‘somewhat similar’. Conversely, if the rating is ‘very similar’,
the absence of a match becomes more significant. With this refined measure, the
evaluation of ontology matchers can better account for the subtleties in term matches, leading
to a more comprehensive and accurate assessment of their performance. Furthermore, the
surveys focusing on specific domains will yield ontologies containing highly domain-specific
terms, presenting a novel challenge for ontology matching tools.</p>
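      <p>The text does not define the additional metric. One possible formalization, given here purely as an assumption, is a similarity-weighted recall in which each reference pair contributes its expert rating instead of a constant 1. The numeric ratings (1.0 for ‘very similar’, 0.5 for ‘somewhat similar’), the term pairs and the function name weighted_recall are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Dict, FrozenSet, Set

Pair = FrozenSet[str]


def weighted_recall(reference: Dict[Pair, float], found: Set[Pair]) -> float:
    """Share of the total expert-rated similarity that a matcher recovered.

    `reference` maps each reference pair to its rating; `found` is the
    set of pairs returned by the matcher.
    """
    total = sum(reference.values())
    if total == 0:
        return 0.0
    recovered = sum(w for pair, w in reference.items() if pair in found)
    return recovered / total


# Hypothetical ratings: 1.0 = 'very similar', 0.5 = 'somewhat similar'.
reference = {
    frozenset(("cup", "mug")): 1.0,
    frozenset(("cup", "beaker")): 0.5,
}

# Matcher A misses only the weakly rated pair ...
score_a = weighted_recall(reference, {frozenset(("cup", "mug"))})
# ... while matcher B misses the strongly rated pair.
score_b = weighted_recall(reference, {frozenset(("cup", "beaker"))})
```
]]></preformat>
      <p>Both matchers find one of two reference pairs, so their plain recall is identical, but under this weighting matcher A scores higher than matcher B because the pair it missed was rated only ‘somewhat similar’.</p>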
      <p>To achieve high quality, the dataset is restricted to domains in which contact with scholars
is already established through ongoing DH projects, including Greek studies,
Egyptology, and philology. Multilingual terms are only introduced if confidence in the translation
is high, e.g. when the translation is provided by scholars of the respective field.</p>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusion</title>
      <p>This abstract proposes the creation of a multilingual benchmark dataset that addresses the
limitations of existing resources when applied to the DH domain. By incorporating both similarity
and association relations, focusing on domain-specific terminology, and combining knowledge
bases and expert surveys as sources, this dataset has the potential to contribute significantly to
the evaluation of ontology matching tools. Additionally, its prospective integration as an OAEI
track offers opportunities for widespread evaluation and comparison. As a further stage, the
proposed methodology can also be applied to other disciplines beyond the humanities, thus
facilitating cross-disciplinary evaluation of matchers.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research is funded by the German Research Foundation (DFG)—CRC 980 Episteme in
Motion, Project-ID 191249397, and the Helmholtz Metadata Collaboration Platform (HMC) and
supported by the German National Research Data Infrastructure (NFDI4Ing, NFDI-MatWerk).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Hyvönen</surname>
          </string-name>
          ,
          <article-title>Using the Semantic Web in digital humanities: Shift from data publishing to data-analysis and serendipitous knowledge discovery</article-title>
          ,
          <source>Semantic Web</source>
          <volume>11</volume>
          (
          <year>2020</year>
          )
          <fpage>187</fpage>
          -
          <lpage>193</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shvaiko</surname>
          </string-name>
          , <source>Ontology Matching</source>, Springer, Berlin, Heidelberg,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Reichart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Korhonen</surname>
          </string-name>
          ,
          <article-title>SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation</article-title>
          ,
          <source>Computational Linguistics</source>
          <volume>41</volume>
          (
          <year>2015</year>
          )
          <fpage>665</fpage>
          -
          <lpage>695</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Schultes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wittenburg</surname>
          </string-name>
          ,
          <article-title>FAIR Principles and Digital Objects: Accelerating Convergence on a Data Infrastructure</article-title>
          , CCIS Series, Springer International Publishing,
          <year>2019</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumontier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. J.</given-names>
            <surname>Aalbersberg</surname>
          </string-name>
          , et al.,
          <article-title>The FAIR Guiding Principles for scientific data management and stewardship</article-title>
          ,
          <source>Scientific Data</source>
          <volume>3</volume>
          (
          <year>2016</year>
          )
          <fpage>160018</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>