<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SemTab 2021: Tabular Data Annotation with MTab Tool</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Phuc Nguyen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ikuya Yamada</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Natthawut Kertkeidkachorn</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ryutaro Ichise</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hideaki Takeda</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Japan Advanced Institute of Science and Technology</institution>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Institute of Informatics</institution>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Studio Ousia</institution>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents MTab, an automatic tool for tabular data annotation with knowledge graphs. MTab provides helpful information about tabular data, such as structural annotations (e.g., table headers and the subject column) and semantic annotations with knowledge graph concepts from Wikidata, DBpedia, and Wikipedia (e.g., cells with entities, columns with types, and column pairs with properties). The tool supports multilingual tables and can process many table formats, such as Excel, CSV, TSV, markdown tables, or pasted table content. MTab achieves impressive empirical performance on many datasets: 1st on the HardTable CEA, CTA, and CPA tasks, the BioTable CTA and CPA tasks, and the HardTablesR3 CPA task. Additionally, the system ranked 1st in the usability track thanks to its ease of use, generic solution, and well-designed user interface. MTab's graphical interface, public APIs, and documentation are available at https://github.com/phucty/mtab_tool.</p>
      </abstract>
      <kwd-group>
        <kwd>tabular data annotation</kwd>
        <kwd>knowledge graph</kwd>
        <kwd>semantic annotation</kwd>
        <kwd>structural annotation</kwd>
        <kwd>Wikidata</kwd>
        <kwd>Wikipedia</kwd>
        <kwd>DBpedia</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The Open Data movement has made many valuable tabular resources available
on the Internet and Open Data Portals. However, due to insufficient data
descriptions, various data formats, and terminology issues, the use of tabular data
in applications is constrained. Many tabular resources lack a description, or the
description does not adequately describe the data. Information about table structure
and layout is also missing in many tabular resources. Furthermore, many tables do not
employ conventional vocabularies: they contain multilingual expressions, abbreviations,
ambiguous terms, misspellings, and encoding issues. To improve tabular data
usability, it is necessary to have a tabular data annotation system capable of
providing explicit information about table content.</p>
      <p>This paper introduces MTab, an automatic tool that generates structural
and semantic annotations for tabular data. The MTab tool is illustrated in Figure 1.</p>
      <p>Copyright © 2021 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-2">
      <title>Header</title>
    </sec>
    <sec id="sec-3">
      <title>Subject</title>
    </sec>
    <sec id="sec-4">
      <title>Column</title>
      <sec id="sec-4-1">
        <title>Structural Annotations</title>
      </sec>
      <sec id="sec-4-2">
        <title>Semantic Annotations</title>
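        <p>Figure 1 (labels recovered from the figure): CEA annotates a cell with an entity, e.g., Q1490 (Tokyo); CTA annotates a column with a type, e.g., Q5119 (capital); CPA annotates a relation between columns with a property, e.g., P1082 (population).</p>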
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Knowledge Graphs</title>
      <p>MTab provides helpful information about tabular data, such as structural annotations
(e.g., table headers and the subject column) and semantic annotations with knowledge
graph concepts from Wikidata, DBpedia, and Wikipedia: a cell with an entity
annotation (CEA task), a column with a type (or class) annotation (CTA task),
and a column pair with a property annotation (CPA task). The tool supports
multilingual tables and can process many table formats, such as Excel, CSV,
TSV, markdown tables, or pasted table content.</p>
      <p>MTab achieves impressive performance on many datasets: 1st on the HardTable
CEA, CTA, and CPA tasks, the BioTable CTA and CPA tasks, and the HardTablesR3 CPA task.
Additionally, the system ranked 1st in the usability track thanks to its
ease of use, generic solution, and well-designed user interface. Users can
access MTab's graphical interface, APIs, and documentation at
https://github.com/phucty/mtab_tool.</p>
      <sec id="sec-5-1">
        <title>Related Work</title>
        <p>
          an aggregation of multiple cross-lingual lookup services and probabilistic
graphical models [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. CSV2KG (IDLab) also uses multiple lookup services to improve
matching performance [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. Tabular ISI implements the lookup part with
the Wikidata API and Elasticsearch on DBpedia labels and aliases [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. ADOG [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]
system also uses Elasticsearch to index the knowledge graph. LOD4ALL first checks
whether an entity with a label similar to the table cell exists
using an ASK SPARQL query; otherwise, it performs a DBpedia entity search [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. DAGOBAH system
performs entity linking with a lookup on Wikidata and DBpedia; the authors
also used Wikidata entity embedding to estimate the entity type candidates [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
Mantis Table provides a Web interface and API for tabular data matching [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>
          In SemTab 2020, the target knowledge graph was Wikidata, which introduced a
new set of difficulties, such as larger-scale data, graph shifting, and the rich and
complex data schema of Wikidata. Besides the tabular data generated from
Wikidata, there was a new manually curated dataset (Tough Tables [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]). The winning
system, MTab4Wikidata, proposed new fuzzy entity and statement search
methods to improve entity candidate generation (with 99.89% coverage) [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. The
bbw system [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] is based on contextual matching and meta-lookup with the SearX
metasearch engine to deal with spelling mistakes. LinkingPark [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], DAGOBAH
[
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], JenTab [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], MantisTable SE [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], SSL [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], AMALGAM [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] systems proposed
new scoring functions to rank the matching results.
        </p>
        <p>
          However, most solutions or systems are not publicly available, or they require
extensive configuration and setup, high computing power, or long running times [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ].
We implement the MTab tool and release the public APIs and interfaces to
address the usability issue of the current annotation systems.
        </p>
      </sec>
      <sec id="sec-5-2">
        <title>MTab Tool</title>
        <p>This section describes the MTab tool, starting with the system assumptions in Section
3.1, followed by the overall framework in Section 3.2.</p>
        <p>3.1 Assumptions</p>
        <p>Assumption 1: The MTab tool is built on a closed-world assumption.
This means that the tool could return incorrect answers if table elements are not
available in the knowledge graph.</p>
        <p>Assumption 2: We assume that the input tables are horizontal relational tables.
A horizontal relational table encodes knowledge graph triples of the form
[subject, predicate, object]. The table has a subject column containing entity
names, and the relation between the subject column and each other column
represents the predicate linking the entities (subject) to their attribute values
(object).</p>
        <p>Assumption 3: We assume that all cell values in the same column have the
same data type, and that the entities related to these cell values are of the same type.</p>
        <p>Assumption 4: The MTab tool treats input tables independently.</p>
        <p>Figure 2 (labels recovered from the figure): input as text (string), file (CSV, TSV, Excel), or table object; preprocessing.</p>
        <p>In this paper, we focus on the usability of the annotation system. We therefore
implement the MTab tool to support multilingual tables and to process
various table formats. System efficiency is also an important concern of
the implementation, so we optimized the annotation run time to about 1.52
seconds per table on average (tested on the SemTab 2020 dataset). Moreover, we provide
graphical interfaces to visualize the annotation results, as described in Section 4.</p>
        <p>The overall framework of the MTab tool is described in Fig. 2. We build
WikiGraph, an integrated knowledge graph from Wikidata, DBpedia, and
Wikipedia, as described in Section 3.2.1. The annotation procedure starts with data
preprocessing (Section 3.2.2). Then, the system performs data type prediction,
header prediction, and subject column prediction as structural
annotations (Section 3.2.3). Finally, MTab performs semantic annotations
(Section 3.2.4).</p>
        <p>3.2.1 Knowledge Graph</p>
        <p>We build WikiGraph from the dump data of
Wikidata, Wikipedia, and DBpedia as the target knowledge graph for the
annotation tasks. From the dumps of 1 January 2021, we extracted 91.2 million
entities and 249.3 million multilingual entity labels, including entity labels,
aliases, other names, redirect entity labels, and disambiguation entities. We also
extracted 3.5 billion triples into WikiGraph. WikiGraph will be
updated frequently as new dumps of the knowledge graphs
(Wikidata, Wikipedia, and DBpedia) are released.</p>
        <p>3.2.2 Preprocessing</p>
        <p>
          Data Type Prediction: The system first classifies each table cell as
non-cell (empty cell), literal, or named-entity (NE). We use the
pretrained spaCy models [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] (trained on the OntoNotes 5 dataset) to
identify named entities (PERSON, NORP, FAC, ORG, GPE, LOC, PRODUCT,
EVENT, WORK_OF_ART, LAW, LANGUAGE) and date-time and numeric
entities (DATE, TIME, PERCENT, MONEY, QUANTITY, ORDINAL,
CARDINAL). We map the named entities to the NE type, and the date-time and numeric
entities to the literal type. If spaCy assigns no entity label to a cell,
we treat the cell as NE, because the spaCy model may fail to recognize
named entities in table cells.
        </p>
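        <p>As a minimal sketch of the mapping described above (not the authors' implementation), the following Python function maps the entity labels that an NER model such as spaCy returned for a cell to one of the three cell data types. The name cell_data_type, the labels argument, and the precedence of named-entity labels over literal labels when both occur are assumptions made for illustration.</p>
        <preformat>
```python
from typing import Iterable

# spaCy OntoNotes 5 labels, grouped as described in the text.
NAMED_ENTITY_LABELS = {
    "PERSON", "NORP", "FAC", "ORG", "GPE", "LOC", "PRODUCT",
    "EVENT", "WORK_OF_ART", "LAW", "LANGUAGE",
}
LITERAL_LABELS = {
    "DATE", "TIME", "PERCENT", "MONEY", "QUANTITY", "ORDINAL", "CARDINAL",
}


def cell_data_type(text: str, labels: Iterable[str]) -> str:
    """Return "non-cell", "literal", or "named-entity" for one table cell.

    `labels` stands for the entity labels an NER model returned for `text`;
    how they are obtained is outside this sketch.
    """
    if not text or not text.strip():
        return "non-cell"
    labels = list(labels)
    # Assumption: a named-entity label wins over a literal label.
    if any(label in NAMED_ENTITY_LABELS for label in labels):
        return "named-entity"
    if any(label in LITERAL_LABELS for label in labels):
        return "literal"
    # No usable label: default to named-entity, since the NER model
    # may simply have missed the entity.
    return "named-entity"
```
        </preformat>
        <p>For instance, cell_data_type("Tokyo", ["GPE"]) yields "named-entity", and a non-empty cell with no recognized label also falls back to "named-entity".</p>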
        <p>
          Next, the system classifies each table column as a non-match
column (empty column), a literal column, or a named-entity column. The column data
type is derived by majority voting over the data types of all cells in the column.
        </p>
        <p>Header Prediction: We use simple heuristics to predict table headers as follows.</p>
        <list list-type="bullet">
          <list-item><p>Table headers could be located in some of the first rows of a table.</p></list-item>
          <list-item><p>If the list of data types of the header candidate row differs from the majority data types of the remaining rows, the candidate is the table header. For example, the list of data types of the header candidate row is [named-entity, named-entity, named-entity], while the list of majority data types of the remaining rows is [named-entity, literal, literal].</p></list-item>
          <list-item><p>We also found empirically that header text tends to be shorter or longer than the values in the data rows. If the length of the values in the header candidate row is less than the 0.05 quantile or larger than the 0.95 quantile of the value lengths of the remaining rows, the candidate is the table header.</p></list-item>
        </list>
        <p>
          Subject Column Prediction: We adopt the heuristics proposed by Ritze et al. [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]
with a simple modification to predict the subject column of a table as
follows.
        </p>
        <list list-type="bullet">
          <list-item><p>A column is a subject column candidate only when its data type is the named-entity type.</p></list-item>
          <list-item><p>The average cell value length is between 3.5 and 200. We also add the restriction that only non-header cells are considered, since the length of table headers could differ from that of the remaining cells.</p></list-item>
          <list-item><p>The subject column is determined by a uniqueness score, which increases for columns with many unique values and decreases for columns with many missing values. The subject column is the column with the highest uniqueness score. If several columns have the same score, the left-most column is chosen.</p></list-item>
        </list>
        <p>3.2.4 Semantic Annotations</p>
        <p>Matching Target Prediction: MTab automatically predicts the matching targets
based on data types when the input does not specify matching targets. The CEA
matching targets are the table cells whose data types are named-entity types. The
CTA matching targets are the columns whose data types are named-entity
types. The CPA matching targets are the relations between the subject
column and the remaining table columns.</p>
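        <p>The column-level steps above (majority voting for the column data type, subject column selection, and matching target prediction) can be sketched in Python as follows. This is an illustration under stated assumptions, not the MTab code: the paper does not give the uniqueness score formula, so the score used here (fraction of unique values minus fraction of missing values) is hypothetical, as are all function names and the choice to average lengths over non-empty cells only.</p>
        <preformat>
```python
from collections import Counter
from typing import Dict, List, Optional

NE = "named-entity"


def column_data_type(cell_types: List[str]) -> str:
    """Majority vote over the cell data types of one column."""
    votes = Counter(cell_types)
    if not votes:
        return "non-match"
    winner = votes.most_common(1)[0][0]
    return "non-match" if winner == "non-cell" else winner


def uniqueness_score(cells: List[str]) -> float:
    """Assumed score: fraction of unique values minus fraction of missing values."""
    if not cells:
        return 0.0
    filled = [c.strip() for c in cells if c.strip()]
    unique_fraction = len(set(filled)) / len(cells)
    missing_fraction = (len(cells) - len(filled)) / len(cells)
    return unique_fraction - missing_fraction


def subject_column(rows: List[List[str]], column_types: List[str]) -> Optional[int]:
    """Pick the subject column among named-entity columns; `rows` excludes headers."""
    best_index, best_score = None, float("-inf")
    for index, column_type in enumerate(column_types):
        cells = [row[index] for row in rows]
        filled = [c for c in cells if c.strip()]
        if column_type != NE or not filled:
            continue
        average_length = sum(len(c) for c in filled) / len(filled)
        if 3.5 > average_length or average_length > 200:
            continue
        score = uniqueness_score(cells)
        if score > best_score:  # strict comparison keeps the left-most column on ties
            best_index, best_score = index, score
    return best_index


def matching_targets(cell_types: List[List[str]], column_types: List[str],
                     subject: Optional[int]) -> Dict[str, list]:
    """CEA: named-entity cells; CTA: named-entity columns; CPA: subject-column pairs."""
    cea = [(r, c) for r, row in enumerate(cell_types)
           for c, cell_type in enumerate(row) if cell_type == NE]
    cta = [c for c, column_type in enumerate(column_types) if column_type == NE]
    cpa = [] if subject is None else [
        (subject, c) for c in range(len(column_types)) if c != subject]
    return {"CEA": cea, "CTA": cta, "CPA": cpa}
```
        </preformat>
        <p>For a two-column table of city names and population figures, this sketch selects column 0 as the subject column and returns the single CPA target (0, 1).</p>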
        <p>
          Entity Search: We generate entity candidates for each table cell with
the entity search modules. The MTab tool provides three entity search
modules: keyword search, fuzzy search, and aggregation search<sup>1</sup>. We
implement the keyword search using the BM25 algorithm with the hyper-parameters
b = 0.75 and k<sub>1</sub> = 1.2. The fuzzy search is implemented using the Damerau-Levenshtein
edit distance. We perform candidate filtering and hashing by pre-calculating
deletes of entity labels, as in the Symmetric Delete algorithm [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], which reduces the number
of pairwise edit-distance calculations and supports up to six edits.
In the aggregation search, we combine the results of the keyword search and the fuzzy
search. In our experiments, we use the aggregation search as the default entity
search.
        </p>
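        <p>To make the fuzzy search idea concrete, here is a small Python sketch of candidate generation with precomputed deletes in the spirit of the Symmetric Delete algorithm, followed by verification with the optimal string alignment variant of the Damerau-Levenshtein distance. It is a simplified illustration, not MTab's implementation: the function names and the in-memory dictionary index are assumptions, and the example uses a small edit bound rather than the six edits mentioned above.</p>
        <preformat>
```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def deletes(term: str, max_edits: int) -> Set[str]:
    """All strings obtained by deleting up to `max_edits` characters from `term`."""
    results = {term}
    frontier = {term}
    for _ in range(max_edits):
        frontier = {w[:i] + w[i + 1:] for w in frontier for i in range(len(w))}
        results |= frontier
    return results


def damerau_levenshtein(a: str, b: str) -> int:
    """Optimal string alignment distance (an adjacent transposition is one edit)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def build_index(labels: List[str], max_edits: int) -> Dict[str, Set[str]]:
    """Map every delete variant of every label back to the labels producing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for label in labels:
        for variant in deletes(label, max_edits):
            index[variant].add(label)
    return index


def fuzzy_search(query: str, index: Dict[str, Set[str]],
                 max_edits: int) -> List[Tuple[str, int]]:
    """Look up delete variants of the query, then verify with the edit distance."""
    candidates: Set[str] = set()
    for variant in deletes(query, max_edits):
        candidates |= index.get(variant, set())
    scored = [(label, damerau_levenshtein(query, label)) for label in candidates]
    kept = [item for item in scored if max_edits >= item[1]]
    return sorted(kept, key=lambda item: (item[1], item[0]))


index = build_index(["tokyo", "kyoto", "osaka"], max_edits=2)
print(fuzzy_search("tokio", index, max_edits=2))  # [('tokyo', 1)]
```
        </preformat>
        <p>Only labels that share a delete variant with the query are compared, which is what keeps the number of pairwise distance computations small; the index grows quickly with the edit bound, so a production system would need a more compact representation.</p>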
        <p>
          Post-Processing: We calculate context similarities by value-based matching
between the statements of entity candidates in the subject column and the table row
values. Finally, we generate the annotations for entities, properties, and types based
on majority voting over the context similarities [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
        </p>
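        <p>The post-processing idea can be illustrated with a deliberately simplified Python sketch. The paper does not specify the similarity function, so this example assumes exact matching of normalized strings, a hypothetical candidate shape (an entity identifier plus a dictionary from property identifiers to string values), and hypothetical function names; type annotation (CTA) is omitted. Only Q1490 and P1082 are taken from the paper's example, the other identifiers are placeholders.</p>
        <preformat>
```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Hypothetical candidate shape: (entity_id, {property_id: [values as strings]})
Candidate = Tuple[str, Dict[str, List[str]]]


def normalize(value: str) -> str:
    """Lower-case and collapse whitespace before comparing values."""
    return " ".join(value.lower().split())


def score_candidate(candidate: Candidate,
                    row_values: List[str]) -> Tuple[int, List[Optional[str]]]:
    """Count row values found among the candidate's statements.

    Returns the number of matched values and, for each row value,
    the property whose statement matched it (or None).
    """
    _, statements = candidate
    matched: List[Optional[str]] = []
    for value in row_values:
        target = normalize(value)
        hit = next((prop for prop, values in statements.items()
                    if target in {normalize(v) for v in values}), None)
        matched.append(hit)
    return sum(hit is not None for hit in matched), matched


def annotate(rows: List[List[str]], candidates_per_row: List[List[Candidate]]):
    """`rows` holds the non-subject values of each row; returns (entities, properties)."""
    entities: List[Optional[str]] = []
    votes = [Counter() for _ in rows[0]] if rows else []
    for row_values, candidates in zip(rows, candidates_per_row):
        best = None
        for candidate in candidates:
            score, matched = score_candidate(candidate, row_values)
            if best is None or score > best[0]:
                best = (score, candidate[0], matched)
        entities.append(best[1] if best else None)
        if best:
            for column, prop in enumerate(best[2]):
                if prop is not None:
                    votes[column][prop] += 1
    # Majority vote per column over the properties matched by the chosen entities.
    properties = [v.most_common(1)[0][0] if v else None for v in votes]
    return entities, properties


rows = [["13960000"], ["1475000"]]
candidates_per_row = [
    [("Q1490", {"P1082": ["13960000"]}), ("CANDIDATE_B", {"P1082": ["500"]})],
    [("CANDIDATE_C", {"P1082": ["1475000"]})],
]
print(annotate(rows, candidates_per_row))
# (['Q1490', 'CANDIDATE_C'], ['P1082'])
```
        </preformat>
        <p>Real tables would need tolerant comparison for numbers and dates rather than exact string equality; the sketch only shows how row context can both disambiguate the subject entity and vote for a column property.</p>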
      </sec>
      <sec id="sec-5-3">
        <title>Interfaces</title>
        <p>4.1 Entity Search</p>
        <p>The entity search interface is available at https://mtab.app/mtabes. Fig. 3
depicts an example of entity search with the query "2MASS J10540655-0031018".
The MTab tool supports multilingual search, so users can type an entity name
in any language.</p>
        <p>1 Entity Search Documents: https://mtab.app/mtabes/docs</p>
        <p>The table annotation interface is available at https://mtab.app. Users can
submit table files in various formats and in any language to the MTab
API, or copy the table content and paste it into the interface. Then, users can click
the "Annotate" button to get the annotation results.</p>
        <p>Fig. 4 illustrates an annotation example for a table from a SemTab dataset. MTab
took 0.49 seconds to annotate a table pasted into the text box (left picture).
The picture on the right shows the annotation results. The table header is in the first
row, and the subject column is the first column. Entity annotations are in
red and located below the table cell values. The type annotations are in green and
located in the "Type" column. Finally, the relations between the subject column
and other columns are in blue and located in the property column.</p>
        <p>This paper presents the MTab tool for table annotation with the Wikidata, DBpedia,
and Wikipedia knowledge graphs. The MTab tool achieves promising performance on
many datasets of SemTab 2021. Moreover, the system ranked first in the
usability track.</p>
        <p>In future work, we will focus on improving the efficiency of the MTab
tool by processing only a small part of the table content and expanding it
until the annotation results no longer change. Another direction is
building downstream applications based on MTab's annotations, such as question
answering and data analysis.</p>
      </sec>
      <sec id="sec-5-4">
        <title>Acknowledgements</title>
        <p>The research was supported by the Cross-ministerial Strategic Innovation
Promotion Program (SIP) Second Phase, "Big-data and AI-enabled Cyberspace
Technologies" by the New Energy and Industrial Technology Development
Organization (NEDO).</p>
        <p>2 SemTab 2021 Leaderboards: https://www.aicrowd.com/challenges/semtab-2021/leaderboards</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abdelmageed</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schindler</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          : Jentab:
          <article-title>Matching tabular data to knowledge graphs</article-title>
          .
          <source>In: SemTab@ ISWC</source>
          . pp.
          <volume>40</volume>
          –
          <issue>49</issue>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Azzi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diallo</surname>
          </string-name>
          , G.:
          <article-title>Amalgam: A matching approach to fairify tabular data with knowledge graph model</article-title>
          .
          <source>arXiv preprint arXiv:2101.06637</source>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chabot</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Labbe</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Troncy</surname>
          </string-name>
          , R.:
          <article-title>Dagobah: an end-to-end context-free tabular data semantic annotation system</article-title>
          .
          <source>In: SemTab@ ISWC</source>
          . pp.
          <volume>41</volume>
          –
          <issue>48</issue>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karaoglu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Negreanu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>J.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gordon</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.Y.</given-names>
          </string-name>
          :
          <article-title>Linkingpark: An integrated approach for semantic table interpretation</article-title>
          .
          <source>In: SemTab@ ISWC</source>
          . pp.
          <volume>65</volume>
          –
          <issue>74</issue>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cremaschi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Avogadro</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barazzetti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chieregato</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Mantistable se: an efficient approach for the semantic table interpretation</article-title>
          .
          <source>In: SemTab@ ISWC</source>
          . pp.
          <volume>75</volume>
          –
          <issue>85</issue>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Cremaschi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Avogadro</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chieregato</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Mantistable: an automatic approach for the semantic table interpretation</article-title>
          .
          <source>In: SemTab@ ISWC</source>
          . pp.
          <volume>15</volume>
          –
          <issue>24</issue>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Cremaschi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Paoli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rula</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spahiu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>A fully automated approach to a complete semantic table interpretation</article-title>
          .
          <source>Future Generation Computer Systems</source>
          <volume>112</volume>
          ,
          <fpage>478</fpage>
          –
          <fpage>500</fpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Cutrona</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bianchi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jimenez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palmonari</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Tough tables: Carefully evaluating entity linking for tabular data</article-title>
          .
          <source>In: ISWC</source>
          . pp.
          <volume>328</volume>
          –
          <fpage>343</fpage>
          . Springer (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Garbe</surname>
          </string-name>
          , W.:
          <article-title>Symspell: Symmetric delete algorithm</article-title>
          . https://github.com/wolfgarbe/SymSpell (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Honnibal</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montani</surname>
          </string-name>
          , I.:
          <article-title>spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing (</article-title>
          <year>2017</year>
          ), https://spacy.io/, to appear
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Huynh</surname>
            ,
            <given-names>V.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chabot</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Labbe</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monnin</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Troncy</surname>
          </string-name>
          , R.: Dagobah:
          <article-title>Enhanced scoring algorithms for scalable annotations of tabular data</article-title>
          .
          <source>In: SemTab@ ISWC</source>
          . pp.
          <volume>27</volume>
          –
          <issue>39</issue>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Jimenez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hassanzadeh</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Efthymiou</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srinivas</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Semtab 2019: Resources to benchmark tabular data to knowledge graph matching systems</article-title>
          .
          <source>In: ESWC</source>
          . vol.
          <volume>12123</volume>
          , pp.
          <volume>514</volume>
          –
          <fpage>530</fpage>
          . Springer (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Jimenez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hassanzadeh</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Efthymiou</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srinivas</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cutrona</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Results of semtab 2020</article-title>
          .
          <source>In: SemTab@ISWC</source>
          . vol.
          <volume>2775</volume>
          , pp.
          <volume>1</volume>
          –
          <issue>8</issue>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Park</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>J.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Generating conceptual subgraph from tabular data for knowledge graph matching</article-title>
          .
          <source>In: SemTab@ ISWC</source>
          . pp.
          <volume>96</volume>
          –
          <issue>103</issue>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Morikawa</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Semantic table interpretation using LOD4ALL</article-title>
          .
          <source>In: SemTab@ISWC</source>
          . pp.
          <fpage>49</fpage>
          –
          <lpage>56</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kertkeidkachorn</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ichise</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Takeda</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>MTab: Matching tabular data to knowledge graph using probability models</article-title>
          .
          <source>In: SemTab@ISWC 2019</source>
          . vol.
          <volume>2553</volume>
          , pp.
          <fpage>7</fpage>
          –
          <lpage>14</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kertkeidkachorn</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ichise</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Takeda</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>TabEAno: Table to knowledge graph entity annotation</article-title>
          .
          <source>CoRR</source>
          abs/2010.01829
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yamada</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kertkeidkachorn</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ichise</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Takeda</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>MTab4Wikidata at SemTab 2020: Tabular data annotation with Wikidata</article-title>
          .
          <source>In: SemTab@ISWC</source>
          . vol.
          <volume>2775</volume>
          , pp.
          <fpage>86</fpage>
          –
          <lpage>95</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Oliveira</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>d'Aquin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>ADOG - annotating data with ontologies and graphs</article-title>
          .
          <source>In: SemTab@ISWC</source>
          . pp.
          <fpage>1</fpage>
          –
          <lpage>6</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Ritze</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmberg</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Matching HTML tables to DBpedia</article-title>
          .
          <source>In: Proceedings of the 5th International Conference on Web Intelligence, Mining and Semantics, WIMS 2015</source>
          . pp.
          <fpage>10:1</fpage>
          –
          <lpage>10:6</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Shigapov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zumstein</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kamlah</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Oberlander, L.,
          <string-name>
            <surname>Mechnich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schumm</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>bbw: Matching CSV to Wikidata via meta-lookup</article-title>
          .
          <source>In: SemTab@ISWC</source>
          . vol.
          <volume>2775</volume>
          , pp.
          <fpage>17</fpage>
          –
          <lpage>26</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Speer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>ftfy</article-title>
          .
          <source>Zenodo</source>
          (
          <year>2019</year>
          ), https://github.com/LuminosoInsight/python-ftfy, version 5.5
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Thawani</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zafar</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Divvala</surname>
            ,
            <given-names>N.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qasemi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Szekely</surname>
            ,
            <given-names>P.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pujara</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Entity linking to knowledge graphs to infer column types and properties</article-title>
          .
          <source>In: SemTab@ISWC</source>
          . pp.
          <fpage>25</fpage>
          –
          <lpage>32</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Vandewiele</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steenwinckel</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Turck</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ongenae</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>CSV2KG: Transforming tabular data into semantic knowledge</article-title>
          .
          <source>In: SemTab@ISWC</source>
          . pp.
          <fpage>33</fpage>
          –
          <lpage>40</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shiralkar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lockard</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>X.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>TCN: table convolutional network for web table interpretation</article-title>
          .
          <source>In: WWW '21</source>
          . pp.
          <fpage>4020</fpage>
          –
          <lpage>4032</lpage>
          . ACM / IW3C2 (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>