<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Results of SemTab 2021⋆</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vincenzo Cutrona</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jiaoyan Chen</string-name>
          <email>jiaoyan.chen@cs.ox.ac.uk</email>
          <xref ref-type="aff" rid="aff7">7</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vasilis Efthymiou</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oktie Hassanzadeh</string-name>
          <email>hassanzadeh@us.ibm.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ernesto Jiménez-Ruiz</string-name>
          <email>ernesto.jimenez-ruiz@city.ac.uk</email>
          <email>ernestoj@uio.no</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan Sequeda</string-name>
          <email>juan@data.world</email>
          <xref ref-type="aff" rid="aff8">8</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kavitha Srinivas</string-name>
          <email>kavitha.srinivas@ibm.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nora Abdelmageed</string-name>
          <email>nora.abdelmageed@uni-jena.de</email>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Madelon Hulsebos</string-name>
          <email>m.hulsebos@uva.nl</email>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniela Oliveira</string-name>
          <email>dpoliveira@fc.ul.pt</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Catia Pesquita</string-name>
          <email>clpesquita@fc.ul.pt</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff9">
          <label>9</label>
          <institution>SUPSI</institution>
          ,
          <country country="CH">Switzerland</country>
          .
          <email>vincenzo.cutrona@supsi.ch</email>
        </aff>
        <aff id="aff0">
          <label>0</label>
          <institution>City, University of London</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>FORTH-ICS</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>IBM Research</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>LASIGE, Faculdade de Ciências, Universidade de Lisboa</institution>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>SIRIUS, University of Oslo</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>University of Amsterdam</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          <institution>University of Jena</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff7">
          <label>7</label>
          <institution>University of Oxford</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff8">
          <label>8</label>
          <institution>data.world</institution>
          ,
          <country country="US">US</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>SemTab 2021 was the third edition of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching, successfully collocated with the 20th International Semantic Web Conference (ISWC) and the 16th Ontology Matching (OM) Workshop. SemTab provides a common framework to conduct a systematic evaluation of state-of-the-art systems.</p>
      </abstract>
      <kwd-group>
        <kwd>Tabular data</kwd>
        <kwd>Knowledge Graphs</kwd>
        <kwd>Matching</kwd>
        <kwd>SemTab</kwd>
        <kwd>Semantic Web Challenge</kwd>
        <kwd>Semantic Table Interpretation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Data in tabular format are the most frequent input to data analytics pipelines, thanks to
their high storage and processing efficiency. The tabular format also allows users to
represent information compactly, by exploiting the clear data structure
defined by rows and columns. However, such a clear structure does not imply a clear
understanding of the semantic structure (e.g., relationships between columns) or of the
meaning of the content (e.g., whether the data are about a specific topic). This lack of
understanding hinders data analytics processes, requiring additional effort to properly understand
the data first. Gaining this semantic understanding is valuable for many applications,
including data cleaning, data mining, data integration, data analysis, machine
learning, and knowledge discovery. For example, semantic understanding can help in
assessing which transformations are most appropriate for a dataset, or which
datasets can be integrated to enable new analytics (e.g., marketing analysis) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
⋆ Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons
License Attribution 4.0 International (CC BY 4.0).
      </p>
      <p>In addition to their efficiency, the huge availability of tabular data on the Web makes
Web tables a valuable source to consider for data miners (e.g., open data CSV files).
Adding semantic information to Web tables is useful for a wide range of applications,
including web search, question answering, and knowledge base construction.</p>
      <p>Tabular data to Knowledge Graph (KG) matching is the process of clarifying the
semantic meaning of a table by mapping its elements (i.e., cells, columns, rows) to
semantic tags (i.e., entities, classes, properties) from KGs (e.g., Wikidata, DBpedia).
The task difficulty increases when table metadata (e.g., table captions, table descriptions,
or column names) are missing, incomplete, or ambiguous.</p>
      <p>The tabular data to KG matching process is typically broken down into the following
tasks: (i) cell to KG entity matching (CEA task), (ii) column to KG class matching (CTA
task), and (iii) column pair to KG property matching (CPA task).</p>
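      <p>As an illustration of the three tasks, the following minimal sketch shows one possible way to represent a CEA, a CTA, and a CPA annotation for a toy table whose first column lists countries and whose second column lists their capitals. The record types and field names are hypothetical and are not the challenge's submission format; only the Wikidata identifiers (Q38 for Italy, Q6256 for country, P36 for capital) refer to real resources.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass

# Hypothetical record types for the three matching tasks; the field names
# are illustrative assumptions, not the challenge's submission format.


@dataclass(frozen=True)
class CellEntityAnnotation:
    """CEA: one table cell mapped to one KG entity."""
    table_id: str
    row: int
    col: int
    entity: str


@dataclass(frozen=True)
class ColumnTypeAnnotation:
    """CTA: one table column mapped to one KG class."""
    table_id: str
    col: int
    kg_class: str


@dataclass(frozen=True)
class ColumnPropertyAnnotation:
    """CPA: one ordered column pair mapped to one KG property."""
    table_id: str
    subject_col: int
    object_col: int
    kg_property: str


# Toy table "t1": column 0 lists countries, column 1 lists their capitals.
cea = CellEntityAnnotation("t1", row=1, col=0,
                           entity="http://www.wikidata.org/entity/Q38")
cta = ColumnTypeAnnotation("t1", col=0,
                           kg_class="http://www.wikidata.org/entity/Q6256")
cpa = ColumnPropertyAnnotation("t1", subject_col=0, object_col=1,
                               kg_property="http://www.wikidata.org/prop/direct/P36")
```
]]></preformat>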
      <p>
        Over the last decade, several approaches have made advances in addressing one or several
of the above tasks, and have also constructed benchmark datasets ([
        <xref ref-type="bibr" rid="ref11 ref17 ref18 ref22">18, 22, 17, 11</xref>
        ]). The creation of
SemTab<sup>1</sup> [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ] aimed at putting this significant amount of work into a common
framework, enabling the systematic evaluation of state-of-the-art systems. The ambition
is to make SemTab the reference challenge in the Semantic Web community,
in the same way the OAEI<sup>2</sup> is for the Ontology Matching community.<sup>3</sup>
      </p>
    </sec>
    <sec id="sec-2">
      <title>The Challenge</title>
      <p>The SemTab 2021 challenge was organised into 3 different tracks: the Accuracy
Track, which is the standard track proposed in previous editions; the Usability Track,
a new track addressing the lack of publicly available, easy-to-use and generic solutions;
and the Applications Track, which focuses on applications in real-world settings where
the output of matching systems can contribute. The Applications Track was also open to
the submission of novel benchmark datasets.</p>
      <sec id="sec-2-1">
        <title>Accuracy Track</title>
        <p>
          The Accuracy Track included 3 rounds, running from June 30 to October 15. Different
target KGs were used across rounds (see Table 1):
– DBpedia [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]: http://downloads.dbpedia.org/wiki-archive/
(version 2016-10)
– Wikidata [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]: https://zenodo.org/record/6153449
– Schema.org [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]: https://gittables.github.io/downloads/schema_20210528.pkl
        </p>
        <p>
          The different rounds of SemTab 2021 have been organised to evaluate participating
systems on different datasets with variable difficulty. All the rounds were run with the
support of AIcrowd;<sup>4</sup> SemTab 2021 also used the STILTool system [
          <xref ref-type="bibr" rid="ref5 ref8">8, 5</xref>
          ] for getting
additional insights about the submitted solutions.
        </p>
        <p>1 http://www.cs.ox.ac.uk/isg/challenges/sem-tab/
2 http://oaei.ontologymatching.org/
3 http://ontologymatching.org/
4 https://www.aicrowd.com/</p>
        <p>Datasets. The datasets used to run the SemTab 2021 rounds are reported in
Table 1, with some statistics available in Table 2. All the datasets are available on Zenodo:
– Tough Tables (2T): a dataset featuring high-quality, manually curated tables with
non-obviously linkable cells, i.e., where values are ambiguous names, typos, and
misspelled entity names. These challenges are particularly relevant for the
annotation of structured legacy sources to existing KGs.
        </p>
        <p>Link: https://doi.org/10.5281/zenodo.6211551</p>
        <p>– BioTable: a dataset focused on molecular biology data covering different entities.
It has the largest number of rows per table in the challenge.
Link: https://doi.org/10.5281/zenodo.5606585</p>
        <p>– Automatically Generated (AG):<sup>5</sup> a synthetic dataset with tables generated
automatically by means of SPARQL queries. AG is the largest dataset used in SemTab.
Link: https://zenodo.org/record/6154708</p>
        <p>– BiodivTab: a dataset with tables from real-world biodiversity research datasets.
Original tables have been adapted for the SemTab challenge.
Link: https://doi.org/10.5281/zenodo.5584180</p>
        <p>– GitTables: a large-scale corpus of relational tables extracted from CSV files in
GitHub. The main purpose of this dataset is to facilitate learning table
representation models and applications in, e.g., data management. A subset of tables has
been curated for benchmarking column type detection methods in SemTab.
Link: https://doi.org/10.5281/zenodo.5706316</p>
        <p>
Evaluation measures. As per the previous editions, systems have been evaluated on a
single annotation for each provided target, for all the tasks; i.e., in CEA, target cells are
to be annotated with a single entity from the target KG; in CTA, target columns are to
be annotated with a single type from the target KG (as fine-grained as possible).
Precision (P), Recall (R), and F1-score (F1) are computed as
          <disp-formula>
            <label>(1)</label>
            <tex-math><![CDATA[P = \frac{|\text{Correct Annotations}|}{|\text{System Annotations}|}, \quad R = \frac{|\text{Correct Annotations}|}{|\text{Target Annotations}|}, \quad F_1 = \frac{2 \times P \times R}{P + R}]]></tex-math>
          </disp-formula>
where target annotations refer to the target cells for CEA, the target columns for CTA,
and the target column pairs for CPA. We consider an annotation as correct when it is
included within the ground truth set (a target cell usually has multiple annotations in
the ground truth, because of redirect and same-as links in KGs).</p>
        <p>5 In SemTab 2021, also referred to as Hard Tables.
6 AIcrowd leaderboard scores 23 participants because of test submissions.</p>
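        <p>The following minimal sketch illustrates how Precision, Recall, and F1-score can be computed when each target receives a single annotation and the ground truth holds a set of acceptable annotations per target. It assumes the usual definitions (correct annotations over system annotations, and correct annotations over target annotations); the function name and the dictionary-based inputs are illustrative assumptions, not the official SemTab evaluator.</p>
        <preformat><![CDATA[
```python
def precision_recall_f1(system, ground_truth):
    """Score single annotations against ground-truth sets.

    system: dict mapping a target (e.g., a cell id) to the one annotation
        submitted for it (hypothetical input format).
    ground_truth: dict mapping every target to the set of annotations that
        count as correct (several may exist because of redirect and
        same-as links).
    """
    # An annotation is correct when it belongs to the ground-truth set
    # of its target.
    correct = sum(
        1 for target, annotation in system.items()
        if annotation in ground_truth.get(target, set())
    )
    precision = correct / len(system) if system else 0.0
    recall = correct / len(ground_truth) if ground_truth else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Toy example: three target cells, two of them annotated, one correctly.
gt = {
    ("t1", 1, 0): {"Q38"},
    ("t1", 2, 0): {"Q142"},
    ("t1", 3, 0): {"Q183"},
}
submitted = {("t1", 1, 0): "Q38", ("t1", 2, 0): "Q145"}
p, r, f1 = precision_recall_f1(submitted, gt)
```
]]></preformat>
        <p>In this toy example one of the two submitted annotations is correct, so Precision is 1/2, while Recall is 1/3 because three targets were requested.</p>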
        <p>Given the fine-grained type hierarchy in Wikidata, we adopted approximations of
Precision and Recall in the CTA evaluation. Approximations adapt their numerators to
consider partially correct annotations, i.e., annotations that are ancestors or descendants
of the ground truth (GT) classes. The correctness score cscore of a CTA annotation α
considers the distance between the annotation and the GT classes in the type hierarchy,
and it is defined as</p>
        <p>0.8d(α ), if α is in GT, or an ancestor of the GT, with d(α ) ≤ 5
cscore(α ) = 0.7d(α ), if α is a descendant of the GT, with d(α ) ≤ 3
0, otherwise;
(2)
where d(α ) is the shortest distance to one of the GT classes (as for CEA, also CTA
GT columns may have multiple classes). For example, d(α ) = 0 if α is a class in the
ground truth (cscore(α ) = 1), and d(α ) = 2 if α is a grandchild of a class in the ground
truth (cscore(α ) = 0.49). Types in the higher level(s) of the KG type hierarchy are not
considered in the GT (e.g., Q35120 [entity] in Wikidata). Given the correctness
score cscore, approximated Precision (AP), Recall (AR), and F1-score (AF1) for the
CTA evaluation are as follows:
AP =</p>
        <sec id="sec-2-1-1">
          <title>P cscore(α )</title>
          <p>|System Annotations|
, AR =</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>P cscore(α )</title>
          <p>|Target Annotations|
, AF 1 =
2 × AP × AR</p>
          <p>AP + AR
(3)
Results. Table 4 contains the average F1-score achieved by the 11 participating systems.
The Tough Tables dataset still represents a challenge for almost all the systems, especially
considering that the dataset is the same as in SemTab 2020. The BiodivTab
and GitTables datasets brought additional complexity in Round 3, highlighting that
real-world tables are challenging.</p>
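          <p>The correctness score and the approximated measures used for the CTA evaluation (Equations 2 and 3) can be sketched as follows. This is a minimal illustration under stated assumptions: the distance d(α) and the relation of the annotation to the closest GT class are taken as already computed inputs, and the function names and the relation labels are hypothetical, not part of the official evaluator.</p>
          <preformat><![CDATA[
```python
def cscore(distance, relation):
    """Correctness score of one CTA annotation.

    distance: d(alpha), the shortest number of hierarchy steps between the
        annotation and one of the GT classes (0 for an exact match).
    relation: "exact", "ancestor" or "descendant" with respect to that GT
        class (hypothetical labels); anything else scores 0.
    """
    if relation in ("exact", "ancestor") and distance <= 5:
        return 0.8 ** distance
    if relation == "descendant" and distance <= 3:
        return 0.7 ** distance
    return 0.0


def approximate_scores(scores, n_targets):
    """Approximated Precision, Recall and F1-score.

    scores: cscore of every annotation submitted by the system.
    n_targets: number of target columns in the task.
    """
    total = sum(scores)
    ap = total / len(scores) if scores else 0.0
    ar = total / n_targets if n_targets else 0.0
    af1 = 2 * ap * ar / (ap + ar) if ap + ar else 0.0
    return ap, ar, af1


# Three submitted annotations for four target columns: an exact match,
# a grandchild of a GT class, and an unrelated class.
scores = [cscore(0, "exact"), cscore(2, "descendant"), cscore(1, "unrelated")]
ap, ar, af1 = approximate_scores(scores, n_targets=4)
```
]]></preformat>
          <p>Because partially correct annotations contribute a fraction of a point, a system that returns a slightly too specific or too generic class is penalised less than one that returns an unrelated class.</p>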
          <p>CEA task. Results for the CEA task are reported in Figure 1 for all the datasets.
Round 1 used the same 2T tables as last year's edition,<sup>7</sup> raising the difficulty bar
at the very beginning. Most of the systems faced important challenges when dealing
with 2T tables, with only 2 systems managing to achieve an F1-score over 0.8 and
several of them participating in only one of the tasks. It is worth noting the work of the
DAGOBAH team, which improved their system over the last year and achieved
higher scores on 2T this year. Starting from Round 2, systems have been evaluated on
datasets never seen before. The AG datasets aimed at bringing new challenges in each
round, and we can observe that only the best systems managed to maintain almost the
same score on the two different versions of this dataset. Concerning bio-related datasets,
performance in Round 2 was positive (slightly below 0.9 on average), confirming that
tables with many rows (∼2,500 on average) do not represent a problem for most of
the systems. Instead, the complexity brought by the (relatively small) tables in the
BiodivTab dataset represented a new problem to solve, showing significantly reduced
performance (none of the systems scored over 0.6). The JenTab system ranked 1st on
this very difficult dataset. It is worth noting, however, that members of the JenTab team
are also the providers of the BiodivTab dataset.</p>
          <p>7 The Wikidata targets have been updated to the current Wikidata live version.</p>
          <p>[Figure: F1-score per system (MTab, MAGIC, DAGOBAH, MantisTable V, JenTab, Kepler-aSI) for R1 2T-DBP, R1 2T-WD, R2 AG, R2 BIO, R3 AG, and R3 BIODIV.]</p>
          <p>CTA task. As shown in Figure 2, the results in the CTA tasks resemble the trend already
seen from the CEA results. This is an indicator that most of the systems solve the CTA
tasks based on annotations found in the CEA. Additional challenges have been included
in Round 3 with the GitTables dataset, where we can see a critical performance drop
for all the involved systems. It is worth emphasising that, given the general picture
provided by the results in CTA, more research is needed to make existing systems able
to deal with real-world tables, where the cells may be missing a correspondence to the
target KG.</p>
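          <p>To illustrate why CTA results tend to follow CEA results, the sketch below derives a column type by majority vote over the classes of the entities linked in that column. This is only one plausible strategy, shown as an assumption; it is not the method of any specific participating system, and the function name and the entity-to-classes lookup are hypothetical.</p>
          <preformat><![CDATA[
```python
from collections import Counter


def column_type_from_cells(cell_entities, entity_types):
    """Pick a column type by majority vote over linked cell entities.

    cell_entities: entity assigned to each cell of one column, or None
        when the cell has no correspondence in the KG.
    entity_types: hypothetical lookup from an entity to its KG classes.
    """
    votes = Counter()
    for entity in cell_entities:
        if entity is None:
            continue  # unlinked cell: contributes no evidence
        votes.update(set(entity_types.get(entity, ())))
    if not votes:
        # No linked cell at all: this strategy cannot type the column.
        return None
    best = max(votes.values())
    # Ties are broken alphabetically so the result is deterministic.
    return min(t for t, n in votes.items() if n == best)


types = {
    "Q38": ["country"],
    "Q142": ["country", "republic"],
    "Q64": ["city"],
}
linked = column_type_from_cells(["Q38", "Q142", None, "Q64"], types)
unlinked = column_type_from_cells([None, None], types)
```
]]></preformat>
          <p>With this kind of strategy, a column whose cells cannot be linked to the target KG yields no type at all, which is consistent with the performance drop observed on real-world tables.</p>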
          <p>CPA task. Results for the CPA tasks are plotted in Figure 3. Currently, only BioTables
and the AG datasets provide a GT for CPA. Results are overall positive for all the tasks,
with a general improvement from Round 2 to Round 3 for all the involved systems,
except for MAGIC, whose performance dropped slightly during the last round.</p>
          <p>[Figure 3: F1-score per system (MTab, MAGIC, DAGOBAH, MantisTable V, JenTab, Kepler-aSI) for R2 AG, R2 BIO, and R3 AG.]</p>
          <p>Usability Track. Starting from SemTab 2021, the organisation committee agreed to include a new track
focusing on system usability. The main goal of this track is to mitigate a pain point in
the community: the lack of publicly available, easy-to-use, and generic solutions that
address the needs of a variety of applications and settings.</p>
          <p>
            Evaluation measures. Thoroughly evaluating the usability of a system requires user studies
to monitor different parameters [
            <xref ref-type="bibr" rid="ref21">21</xref>
            ]. Within the SemTab scope, we decided to simply
verify the overall usability of tools as judged by a review panel. Participants’ solutions
were examined for the following criteria:
– Open source: open-source solutions make a great contribution to the community,
especially when released with a permissive license. Publicly available resources
can be used as a starting point for new tools or research investigations, and make
experiments easily reproducible.
– System dependencies: some tools may require specific platforms to be executed on
premises, or have a huge resource consumption that may affect their use in common
settings. For example, requiring many indexes/databases may prevent the usage of
a tool by users with limited access to hardware.
– Model generality: a tool may be considered general when it applies to different
(and new) applications/domains, requiring near-zero adaptations; for example, tools
employing machine learning techniques should not require extensive training and
tuning to be adapted to different contexts.
– Availability: tools may not be released as open source, but offered as a publicly
available service. In this case, a tool served as a public service supports further
research activities, and represents a big contribution to the community.
– User experience: the purpose of a tool is to help people in solving a task; for this
reason, semantic table to graph matching tools should come with a well-designed
user interface that makes the tool usable also by practitioners with limited
experience in semantic matching. That is, the tool should not require extensive training
to be mastered.
          </p>
          <p>Results. Almost all the core participants obtained good results in this track, by
performing well on one or more of the above evaluation criteria. Evaluation details are reported
in Table 5. We exclude system dependencies and model generality because of the
insufficient available evidence, which resulted in these two criteria not impacting the overall
assessment strongly. Indeed, available data about system performance (i.e., accuracy)
with reference to the different datasets and target KGs used in SemTab rounds do not
allow us to draw any consistent conclusions. For example, it is not clear if tools were
customized or tweaked (e.g., changing the lookup function for noisy data) to increase
their accuracy in different rounds; we are not able to assess how easily a system adapts
to a different context (e.g., changing the target KG).
          </p>
          <p>The evaluation panel concluded that most of the tools are pre-configured and can
potentially be used out of the box: for example, JenTab has been packaged in Docker
containers to ease the deployment and execution of the tool on local premises. In
general, tool requirements vary in complexity, but they are reasonable overall (e.g.,
preprocessing required, like creating new indexes or embeddings).</p>
          <p>Considering the other criteria, JenTab is the only system released as open source
under a permissive license (Apache 2.0). The MTab tool has been made publicly
available as a Web service, free to use (MIT license), but the back-end application has not
been disclosed. However, having a public API enables MTab to serve third-party
applications (with no rate limit), and this was a key point in declaring MTab the most usable
tool. Systems like DAGOBAH and MantisTable delivered a framework with impressive
GUIs, while others (e.g., MAGIC) opted for a lightweight application.</p>
          <p>Applications Track. This new track aims at addressing applications in real-world settings that take advantage
of the output of the matching systems. Challenging dataset proposals have also been
accepted and included within the SemTab 2021 rounds.</p>
          <p>Results. A specific application has been identified within the biological domain, where
new data are constantly produced thanks to the advances in the field. The domain is
particularly challenging from the semantics standpoint because of the complexity
of the biological relations between entities. Within SemTab, the data representation
significantly impacts system performance since entities are usually represented by
codes (e.g., chemical formulas or gene names). Two different datasets related to the
biological domain have been submitted: the first one, BioTables, is a dataset focused
on molecular biology data; the second, BiodivTab, is a dataset focused on biodiversity
research data and data augmentation.</p>
          <p>Alongside the above domain, a different dataset has been submitted to this track and
also included in Round 3, GitTables. This dataset includes relational tables extracted
from CSV files hosted on GitHub, and it comes with a peculiarity: the GT for CTA
uses a mixture of classes and properties to annotate columns (both for the DBpedia and
Schema.org versions).</p>
          <p>The three datasets brought new complexity and contributed to increasing the data
diversity among the SemTab benchmark datasets.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>Prizes</title>
        <p>As in previous editions, IBM Research<sup>8</sup> sponsored SemTab 2021 and awarded the best
systems in each track with the following prizes:
– Accuracy Track: DAGOBAH (1st prize) was the top system in most of the tasks,
showing appreciable improvements over the last years. Honorary mention to MTab.
– Usability Track: MTab team (1st prize), for providing the easy-to-use MTab tool<sup>9</sup>
along with Web services to look up entities and annotate tables; JenTab (2nd prize),
for being the only open-source system with a permissive license. Honorary
mentions to DAGOBAH, MAGIC and MantisTable.
– Applications Track: BiodivTab dataset (1st prize), for having brought new
challenges in CEA and CTA tasks. Honorary mention to GitTables.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Lessons Learned and Future Work</title>
      <p>Avoiding over-fitting to AG. We have been using the same automated dataset generation
process, with some variations that make it more challenging, since the first SemTab
challenge. This may result in participating systems that explicitly target datasets
with characteristics similar to those of the AG datasets. This becomes evident from the
almost perfect results shown in Table 4. For that reason, this year we have introduced
several new datasets, and we are also planning to use real data as much as possible,
rather than synthetic data, in the future versions of the challenge.</p>
      <p>
        System generalizability beyond KGs. Many systems currently rely on matching table
values to entities in KGs. In this version of SemTab, we challenged the participating
systems on their ability to detect the semantic types of table columns even when their
values are not linkable to KG entities. We conclude that most systems do not generalize
well in this scenario, as indicated by the performance drop on the CTA task for GitTables
(see Section 2.1). Improving systems to this end would make them useful for expanding
KG coverage by matching tables from novel data sources to KGs in order to populate the
“unknown unknowns” [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. This generalizability would also benefit the applicability of
the systems in offline databases. We plan to encourage and evaluate systems on their
generalizability towards novel data sources in future versions of SemTab.
      </p>
      <p>CTA vs CPA: the case of GitTables. Since the first edition of SemTab, we have
considered CTA and CPA as two separate tasks: the first focuses on ontology classes, and
the latter is dedicated to properties. However, GitTables annotations for CTA also include
properties from DBpedia and Schema.org. The rationale behind this choice lies
in the relational nature of the considered tables: columns typically correspond to the
attributes of an entity, which are reflected by properties in DBpedia and Schema.org,
for example. Also, this choice is very useful when annotating literal columns (i.e.,
columns not containing mentions of entities), avoiding annotations based on datatypes
(e.g., xsd:string). Therefore, GitTables introduced a new technical challenge, which
potentially contributed to the complexity observed from the results in Figure 2. The case
of GitTables may result in a new task to accomplish in the future, given that it
enables table-to-KG matching with tables from alternative data sources and contexts (e.g.,
database dumps from industry).
      </p>
      <p>8 https://www.research.ibm.com/
9 https://github.com/phucty/mtab_tool</p>
      <p>
        Usability track. We believe that the introduction of the usability track has contributed
to making participating systems publicly accessible. Our goal was exactly to encourage
this, despite the competitive nature that a challenge may have. Thus, we consider this
new track to be a very important one and we are planning to keep it in the next
challenges. Future SemTab editions may improve the evaluation of this track,
for example by adopting the System Usability Scale (SUS) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] to score the overall user
experience. In particular, developing a systematic way to evaluate systems’ generality
and dependencies would improve the evaluation of this track.
      </p>
      <p>
Applications track. We believe that the call for the Applications Track attracted more
attention from the community by allowing participants to introduce their own datasets. Contributions from
the community like BiodivTab, BioTable and GitTables help in extending the SemTab
benchmark with new real-world challenges that are hard to reproduce in synthetic
datasets such as AG. Thus, this new track has been an important addition to SemTab.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>We would like to thank the challenge participants, the ISWC &amp; OM organisers, the
AIcrowd team, and our sponsor IBM Research that played a key role in the success of
SemTab. We also thank Paul Groth and Çağatay Demiralp for their contributions to
GitTables. Moreover, we would like to thank Sirko Schindler and Birgitta König-Ries
for their contribution to BiodivTab. This work was also supported by the SIRIUS
Centre for Scalable Data Access (Research Council of Norway), Samsung Research UK,
the EPSRC projects UK FIRES and ConCur, and the HFRI project ResponsibleER (No
969). DO and CP were supported by FCT through LASIGE (UIDB/00408/2020 and
UIDP/00408/2020). We would also like to acknowledge that the work of the challenge
organisers was greatly simplified by using the EasyChair conference management
system and the CEUR-WS.org open-access publication service.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>N.</given-names>
            <surname>Abdelmageed</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Schindler</surname>
          </string-name>
          .
          <article-title>JenTab Meets SemTab 2021's New Challenges. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab)</article-title>
          .
          <source>CEUR-WS.org</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>N.</given-names>
            <surname>Abdelmageed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schindler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>König-Ries</surname>
          </string-name>
          .
          <article-title>BiodivTab: A Tabular Benchmark based on Biodiversity Research Data. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab)</article-title>
          .
          <source>CEUR-WS.org</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kobilarov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ives</surname>
          </string-name>
          .
          <article-title>DBpedia: A Nucleus for a Web of Open Data</article-title>
          .
          <source>In The Semantic Web</source>
          , pages
          <fpage>722</fpage>
          -
          <lpage>735</lpage>
          . Springer Berlin Heidelberg,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>R.</given-names>
            <surname>Avogadro</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Cremaschi</surname>
          </string-name>
          .
          <article-title>MantisTable V: A novel and efficient approach to Semantic Table Interpretation. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab)</article-title>
          .
          <source>CEUR-WS.org</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>R.</given-names>
            <surname>Avogadro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cremaschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rula</surname>
          </string-name>
          .
          <article-title>A Framework for Quality Assessment of Semantic Annotations of Tabular Data</article-title>
          .
          <source>In 20th International Semantic Web Conference (ISWC)</source>
          , pages
          <fpage>528</fpage>
          -
          <lpage>545</lpage>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>W.</given-names>
            <surname>Baazouzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kachroudi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Faiz</surname>
          </string-name>
          .
          <article-title>Kepler-aSI at SemTab 2021</article-title>
          . In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab).
          <source>CEUR-WS.org</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>J.</given-names>
            <surname>Brooke</surname>
          </string-name>
          .
          <article-title>SUS: a 'quick and dirty' usability scale</article-title>
          .
          <source>Usability evaluation in industry</source>
          ,
          <volume>189</volume>
          (
          <issue>3</issue>
          ),
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>M.</given-names>
            <surname>Cremaschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Siano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Avogadro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Maurino</surname>
          </string-name>
          .
          <article-title>STILTool: A Semantic Table Interpretation evaLuation Tool</article-title>
          .
          <source>In ESWC 2020 Satellite Events</source>
          , pages
          <fpage>61</fpage>
          -
          <lpage>66</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>V.</given-names>
            <surname>Cutrona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bianchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Palmonari</surname>
          </string-name>
          .
          <article-title>Tough Tables: Carefully Evaluating Entity Linking for Tabular Data</article-title>
          .
          <source>In 19th International Semantic Web Conference (ISWC)</source>
          , pages
          <fpage>328</fpage>
          -
          <lpage>343</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>V.</given-names>
            <surname>Cutrona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. D.</given-names>
            <surname>Paoli</surname>
          </string-name>
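          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Košmerlj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nikolov</surname>
          </string-name>
          ,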
          <string-name>
            <given-names>M.</given-names>
            <surname>Palmonari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Perales</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Roman</surname>
          </string-name>
          .
          <article-title>Semantically-Enabled Optimization of Digital Marketing Campaigns</article-title>
          .
          <source>In International Semantic Web Conference (ISWC)</source>
          , pages
          <fpage>345</fpage>
          -
          <lpage>362</lpage>
          . Springer,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>V.</given-names>
            <surname>Efthymiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hassanzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rodriguez-Muro</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Christophides</surname>
          </string-name>
          .
          <article-title>Matching Web Tables with Knowledge Base Entities: From Entity Lookups to Entity Embeddings</article-title>
          .
          <source>In ISWC</source>
          , volume
          <volume>10587</volume>
          , pages
          <fpage>260</fpage>
          -
          <lpage>277</lpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>R. V.</given-names>
            <surname>Guha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Brickley</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Macbeth</surname>
          </string-name>
          .
          <article-title>Schema.Org: Evolution of Structured Data on the Web</article-title>
          .
          <source>Commun. ACM</source>
          ,
          <volume>59</volume>
          (
          <issue>2</issue>
          ):
          <fpage>44</fpage>
          -
          <lpage>51</lpage>
          , Jan.
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>M.</given-names>
            <surname>Hulsebos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ç.</given-names>
            <surname>Demiralp</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Groth</surname>
          </string-name>
          .
          <article-title>GitTables: A Large-Scale Corpus of Relational Tables</article-title>
          .
          <source>CoRR</source>
          , abs/2106.07258,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>V.-P.</given-names>
            <surname>Huynh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chabot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Deuzé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Labbé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Monnin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Troncy</surname>
          </string-name>
          .
          <article-title>DAGOBAH: Table and Graph Contexts For Efficient Semantic Annotation Of Tabular Data</article-title>
          . In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab).
          <source>CEUR-WS.org</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>E.</given-names>
            <surname>Jimenez-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hassanzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Efthymiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Srinivas</surname>
          </string-name>
          .
          <article-title>SemTab 2019: Resources to Benchmark Tabular Data to Knowledge Graph Matching Systems</article-title>
          .
          <source>In The Semantic Web: ESWC</source>
          . Springer International Publishing,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>E.</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hassanzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Efthymiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Srinivas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Cutrona</surname>
          </string-name>
          .
          <article-title>Results of SemTab 2020</article-title>
          .
          <source>In Proceedings of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching co-located with the 19th International Semantic Web Conference (ISWC 2020)</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>O.</given-names>
            <surname>Lehmberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ritze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Meusel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <article-title>A large public corpus of web tables containing time and context metadata</article-title>
          .
          <source>In WWW</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>G.</given-names>
            <surname>Limaye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sarawagi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Chakrabarti</surname>
          </string-name>
          .
          <article-title>Annotating and searching web tables using entities, types and relationships</article-title>
          .
          <source>VLDB Endowment</source>
          ,
          <volume>3</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>1338</fpage>
          -
          <lpage>1347</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>P.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Yamada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kertkeidkachorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ichise</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Takeda</surname>
          </string-name>
          .
          <article-title>SemTab 2021: Tabular Data Annotation with MTab Tool</article-title>
          . In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab)
          .
          <source>CEUR-WS.org</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>D.</given-names>
            <surname>Oliveira</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          .
          <article-title>SemTab 2021 BioTable Dataset</article-title>
          . doi:
          <pub-id pub-id-type="doi">10.5281/zenodo.5606585</pub-id>
          , Oct.
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ivanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lohmann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Lambrix</surname>
          </string-name>
          .
          <article-title>A framework to conduct and report on empirical user studies in semantic web contexts</article-title>
          .
          <source>In European Knowledge Acquisition Workshop</source>
          , pages
          <fpage>567</fpage>
          -
          <lpage>583</lpage>
          . Springer,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>D.</given-names>
            <surname>Ritze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lehmberg</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <article-title>Matching HTML Tables to DBpedia</article-title>
          .
          <source>In Proceedings of the 5th International Conference on Web Intelligence, Mining and Semantics, WIMS</source>
          , pages
          <fpage>10:1</fpage>
          -
          <lpage>10:6</lpage>
          . ACM,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>B.</given-names>
            <surname>Steenwinckel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. D.</given-names>
            <surname>Turck</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Ongenae</surname>
          </string-name>
          .
          <article-title>MAGIC: Mining an Augmented Graph using INK, starting from a CSV</article-title>
          . In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab).
          <source>CEUR-WS.org</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <given-names>D.</given-names>
            <surname>Vrandecic</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          .
          <article-title>Wikidata: a free collaborative knowledge base</article-title>
          .
          <source>Commun. ACM</source>
          ,
          <volume>57</volume>
          (
          <issue>10</issue>
          ):
          <fpage>78</fpage>
          -
          <lpage>85</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <given-names>G.</given-names>
            <surname>Weikum</surname>
          </string-name>
          .
          <article-title>Knowledge Graphs 2021: A Data Odyssey</article-title>
          .
          <source>Proc. VLDB Endow.</source>
          ,
          <volume>14</volume>
          (
          <issue>12</issue>
          ):
          <fpage>3233</fpage>
          -
          <lpage>3238</lpage>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ding</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Jin</surname>
          </string-name>
          .
          <article-title>GBMTab: A Graph-Based Method for Interpreting Semantic Table to Knowledge Graph</article-title>
          . In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab).
          <source>CEUR-WS.org</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>