<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Generating Table Vector Representations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aneta Koleva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Ringsquandl</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mitchell Joblin</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Volker Tresp</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ludwig Maximilian University of Munich</institution>
          ,
          <addr-line>Geschwister-Scholl-Platz 1, 80539 Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Siemens</institution>
          ,
          <addr-line>Otto-Hahn-Ring 6, 81739 Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>High-quality Web tables are rich sources of information that can be used to populate Knowledge Graphs (KG). The focus of this paper is an evaluation of methods for table-to-class annotation, which is a sub-task of Table Interpretation (TI). We provide a formal definition for table classification as a machine learning task. We propose an experimental setup and we evaluate 5 fundamentally different approaches to find the best method for generating vector table representations. Our findings indicate that although transfer learning methods achieve a high F1 score on the table classification task, dedicated table encoding models are a promising direction as they appear to capture richer semantics.</p>
      </abstract>
      <kwd-group>
        <kwd>table interpretation</kwd>
        <kwd>table classification</kwd>
        <kwd>representation learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Tabular data is one of the most prevalent data representations. The effort by Cafarella et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
known as WebTables, identified and extracted more than 200 million high-quality tables from
HTML pages. The availability of such a large corpus of structured data initiated several directions
of research related to the different applications of tabular data, such as table search [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], table
improvement [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], question answering [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and semantic annotation of columns [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. As a result of
the increasing adoption of KGs, which are often populated from tabular data, the task of aligning
tables with KGs, also referred to as table interpretation (TI), has become a highly relevant task.
In contrast to information extraction from unstructured documents, TI should leverage the
explicit relational structure. The unique table structure with rows and columns of cells and
other metadata can be exploited for discovery and disambiguation of the meaning captured in
the table. The task of TI entails three different sub-tasks. The first sub-task, which is the focus of
this paper, is the classification of tables according to classes in a given KG schema. The second
sub-task is related to linking rows from tables to existing entities in the KG. The annotation of
columns as entity attributes and the discovery of binary relations between columns is the third
sub-task of TI. While there have been several works focusing on the row-to-entity [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6, 7, 8</xref>
        ], and
column-to-attribute sub-tasks [
        <xref ref-type="bibr" rid="ref5 ref9">5, 9</xref>
        ], the task of linking a table to a class has been neglected.
However, in the case of entity tables, where one column (the core column) is associated with the
name of the entity and the remaining columns are attributes of this entity, discovering the class
of the table as a first step can greatly simplify the other two sub-tasks. It is often
the case that the column names are missing or incorrect, therefore finding the name of the core
column does not imply finding the class of the table. Moreover, when two tables have the same
column names and similar content (e.g., one table of class Country and one of class City), it is
not trivial to disambiguate the entities and column types based only on the table content. Once
a table has been interpreted, its content can be used for extracting new triples for enriching the
KG, a task known as KG completion, or for extracting missing facts for the KG, which is the task
of slot filling.
      </p>
      <p>ISWC 2021 Workshop DL4KG, CEUR</p>
      <p>Due to the inherent scarcity of labelled data for the first sub-task (class-annotated tables),
a table classification model must either be of low complexity (few parameters) or leverage
pre-trained models. Using pre-trained models in TI has been studied only to a very limited
extent. Hence, we explore two promising directions for making learning-based approaches
more efficient: (a) using transfer learning, and (b) considering additional inductive biases that
are unique to tabular data representations.</p>
      <p>
        We propose an experimental setup with the intention of finding the best method for
generating a representation which captures the information from the table but also the row and column
structure, so that it can be later used towards solving the remaining sub-tasks of TI: row-to-entity
linking, column type annotation and relation extraction. We are interested in understanding
how pre-trained language models, such as BERT [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and their dedicated table-based
counterparts, for instance TaBERT [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], can be utilized for generating vector representations of tables.
Surprisingly, our experiments show that a transfer learning method with a rich vocabulary
of pre-trained word embeddings achieves an F1 score similar to that of more sophisticated
pre-trained language models (LM). Another interesting finding is that the inductive bias for
tabular structure in the LM pre-trained on tabular data does not improve on the LM
pre-trained on text only. However, the confusion matrix for this method shows that its
misclassifications are justifiable and reasonable. Our main contributions are:
        <list list-type="bullet">
          <list-item><p>A formal definition of table classification as a machine learning task and a protocol for
evaluating performance on this task.</p></list-item>
          <list-item><p>A setup for table encoding using 5 fundamentally different approaches covering a spectrum
of paradigms from general-purpose document encoders to specialized pre-trained models
designed for tabular data.</p></list-item>
          <list-item><p>An extensive empirical evaluation of the different approaches.</p></list-item>
        </list>
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>
        In this section, we review prior work related to solving the different sub-tasks of TI. We also
give a short overview of methods for generating vector representations of tables.
Table Interpretation. The three sub-tasks of TI were first introduced in the paper by Ritze
et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. That paper also introduced the T2K Matcher, a method for iterative value-based
matching, which solves the TI tasks by matching values from the tables to values of retrieved
candidates from the KG. Limaye et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] proposed a probabilistic graphical
method which attempts to jointly solve the two sub-tasks of finding entity-to-row and
column-to-attribute alignments. Deng et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] exploited word embeddings for representing the
contents of tables and utilized them for the discovery of new entities. The SemTab challenge
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] has also motivated new approaches [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ]. However, the task of table-to-class annotation
is not part of this challenge.
      </p>
      <p>To the best of our knowledge, the T2K Matcher is the only existing
method for solving the table-to-class task. Namely, the class of the table is chosen by ranking
the sum of the similarity scores of the column-to-property correspondences aggregated per
class. Since this method requires querying of the KG for candidate retrieval and first solving the
column-to-property alignment in order to find the correct class of a table, we do not consider it
during our experiments. In contrast to the T2K Matcher, we consider a closed-book scenario,
where the instances of the KG are not available, only the classes in the KG schema.</p>
      <p>
        Representation Learning on Tables. Based on powerful LMs, dedicated deep learning models
have recently been proposed to exploit tabular data structures, e.g., in table-based question
answering [
        <xref ref-type="bibr" rid="ref17 ref4">4, 17</xref>
        ] and KG completion from tables [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. One benefit of using pre-trained LMs
is that they handle synonyms and abbreviations well, e.g., NY for New York, which
occur frequently in tables because of the limited space in the cells. The other benefit
is that, due to the exposure to large textual corpora during the pre-training phase, an LM
can store implicit information learned from the data in the form of model
parameters [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. TaBERT [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] by Yin et al. is a model which was pre-trained to jointly
learn representations of a natural language question, called an utterance, and of tables. An example of
an utterance for the entity table shown in Figure 1 is the question "What is the population
of New York?". During encoding, instead of using the full table, TaBERT samples 1 or 3 rows,
referred to as a content snapshot. First, each row from the snapshot, concatenated with the
utterance, is encoded by BERT [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Second, the row encodings are stacked and a vertical self-attention mechanism
is used to generate a vector representation for each of the columns.
Finally, a representation for the table is generated by pooling the column representations.
A similar method is TAPAS by Herzig et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], which is also pre-trained on tables
and text segments. Deng et al. proposed TURL [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] as a framework for pre-training, also on
tabular data, which uses the same objectives as TaBERT for learning representations of the
content of the tables. Additionally, they proposed task-specific fine-tuning of the framework
for solving the row-to-entity and column-to-attribute annotation tasks. Wang et al. [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] presented a
novel method which exploits information within one table but also aggregates the contextual
information shared across similar tables in order to generate a vector representation that can be
used for column-to-class annotation and relation prediction tasks.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Problem Description</title>
      <p>
        We focus on the task of table-to-class annotation. The task has been introduced together with
the two other TI sub-tasks in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], however without a formal definition. The goal of
table-to-class annotation is to label a table with its corresponding class according to the given KG
schema. We now provide a definition of this task as a machine learning task.
      </p>
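      <p>An entity table <inline-formula><tex-math>T</tex-math></inline-formula> is an <inline-formula><tex-math>n \times m</tex-math></inline-formula> matrix, where <inline-formula><tex-math>n</tex-math></inline-formula> and <inline-formula><tex-math>m</tex-math></inline-formula> are the number of rows and columns of
the table <inline-formula><tex-math>T</tex-math></inline-formula>. Each element of the matrix <inline-formula><tex-math>T</tex-math></inline-formula>, <inline-formula><tex-math>t_{i,j}</tex-math></inline-formula>, contains one or more tokens, where each token
is a sequence of characters. We denote with <inline-formula><tex-math>t_{i,\ast}</tex-math></inline-formula> and <inline-formula><tex-math>t_{\ast,j}</tex-math></inline-formula> the <inline-formula><tex-math>i</tex-math></inline-formula>-th row and the <inline-formula><tex-math>j</tex-math></inline-formula>-th column of
the matrix <inline-formula><tex-math>T</tex-math></inline-formula>, respectively. The header of the table is the first row <inline-formula><tex-math>H = t_{0,\ast}</tex-math></inline-formula>. The content of the
table consists of the rows <inline-formula><tex-math>t_{1,\ast}, t_{2,\ast}, \dots, t_{n,\ast}</tex-math></inline-formula>.</p>
      <p>Let <inline-formula><tex-math>D = \{(T_1, y_1), \dots, (T_N, y_N)\}</tex-math></inline-formula> be the set of labeled tables, where <inline-formula><tex-math>N</tex-math></inline-formula> is the number of tables and each label
<inline-formula><tex-math>y_i \in C</tex-math></inline-formula> is in the set of classes defined in the KG schema, <inline-formula><tex-math>C = \{c_1, \dots, c_k\}</tex-math></inline-formula>. A table encoder <inline-formula><tex-math>f_\theta : \{T\} \to \mathbb{R}^d</tex-math></inline-formula>
is a model, with a parameter vector <inline-formula><tex-math>\theta</tex-math></inline-formula>, which encodes each table <inline-formula><tex-math>T_i</tex-math></inline-formula> to a vector
<inline-formula><tex-math>f_\theta(T_i) = x_i</tex-math></inline-formula>, and <inline-formula><tex-math>X = \{x_1, \dots, x_N\}</tex-math></inline-formula> is the set of feature vectors for every <inline-formula><tex-math>T_i \in D</tex-math></inline-formula>. The final
task is to train a classification model <inline-formula><tex-math>g_\phi : \mathbb{R}^d \to C</tex-math></inline-formula> so that each table vector is assigned to one
of the class labels. The problem is defined in the multi-class setting. Formally, our setting is
<inline-formula><tex-math>g_\phi \circ f_\theta : \{T\} \to C</tex-math></inline-formula>, where only the parameters <inline-formula><tex-math>\phi</tex-math></inline-formula> are trained on the table classification task, i.e.,
no gradient updates are performed on <inline-formula><tex-math>\theta</tex-math></inline-formula>.</p>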
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>textual corpora and an approach for question answering which has been pre-trained on tabular
data (Figure 1 (b)). The code for the experiments is accessible online.<sup>1</sup></p>
      <sec id="sec-4-1">
        <title>4.1. Dataset</title>
        <p>
          For evaluation we used the second version of the T2D gold standard dataset [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], T2Dv2. To
the best of our knowledge, the T2D sets are the only publicly available datasets which have
been annotated with table-to-class correspondences. The second version of the dataset<sup>2</sup> contains
237 such annotations. In our experiments, we consider only those classes which have at least two
annotated tables, which results in 27 unique classes. The mean number of rows in the dataset is 119.2 and the mean
number of columns is 7.7.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Models compared</title>
        <p>In the evaluation we used 5 different models as table encoders, ranging from general-purpose
document encoders to more sophisticated LMs pre-trained on tabular data.</p>
        <p>TF-IDF, or term frequency-inverse document frequency, is a term-weighting scheme which
generates a vector representation for a document based on the frequency of each word in the
document, discounted by how many documents in the collection contain that word. It is the
simplest method which we used as a table encoder.</p>
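        <p>As a minimal sketch of this weighting scheme, the following pure-Python code computes TF-IDF vectors for tables that have already been flattened into whitespace-separated strings. The function name tfidf_vectors, the lower-casing and whitespace tokenization, and the smoothed inverse document frequency log((1 + N) / (1 + df)) + 1 are assumptions made for illustration; the experiments may rely on a different implementation and normalization.</p>
        <preformat><![CDATA[
```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) for a list of whitespace-tokenized strings."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed inverse document frequency (illustrative choice, see lead-in).
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocabulary}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)  # raw term counts within this document
        vectors.append([tf[tok] * idf[tok] for tok in vocabulary])
    return vocabulary, vectors


# Two toy tables, each flattened into one string (hypothetical data).
tables_as_text = [
    "city country population berlin germany 3645000",
    "country capital population germany berlin 83000000",
]
vocab, vecs = tfidf_vectors(tables_as_text)
```
]]></preformat>
        <p>In this toy example the term "city" occurs in only one of the two documents and therefore receives a larger weight than "country", which occurs in both.</p>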
        <sec id="sec-4-2-1">
          <p><sup>1</sup> https://github.com/anetakoleva/tableClassification <sup>2</sup> http://webdatacommons.org/webtables/goldstandardV2.html</p>
          <p>
            Spacy provides word vectors pre-trained on text extracted from blogs, news and comments. We used
the vectorizer from the medium-sized English pipeline<sup>3</sup>, which contains a vocabulary of size 684,830.
Word2Vec denotes pre-trained word vectors trained with FastText<sup>4</sup> on a Wikipedia text corpus. The
model used for learning the vectors [
            <xref ref-type="bibr" rid="ref22">22</xref>
            ] is an extension of the original word2vec model.
It is skip-gram based and trained to learn representations for character n-grams. This model
has a vocabulary of size 2.5 million.
          </p>
          <p>
            BERT is a widely used, Transformer-based LM [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ]. During the pre-training phase, the model
is exposed to a large corpus of unstructured text with the objectives of predicting masked
words and predicting the next sentence. This enables the model to learn correlations between
words and to generate different vector representations for a word depending on its context.
TaBERT is a table encoding method [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ], pre-trained on Web tables with the objective of being
used in question-answering tasks on tables. Since the model expects an utterance, i.e., a natural
language question, as input together with a table, in our experiments we provided an empty
space “ ”. We conducted further experiments to evaluate the influence of the utterance on the
generated table representation and we discuss these results in Section 5.
          </p>
          <p><sup>3</sup> https://spacy.io/models/en#en_core_web_md <sup>4</sup> https://fasttext.cc/docs/en/pretrained-vectors.html</p>
        </sec>
        <sec id="sec-4-2-2">
          <title>4.3. Setup</title>
          <p>
To systematically evaluate the quality of the representations generated with the different table
encoders, we compare their performance on the classification task under different scenarios. It
is important to note that we did not train or fine-tune any of the methods for table encoding, i.e.,
we used them off-the-shelf. Since the tables can be large, in order to avoid scalability issues, we
resort to sampling of rows. Namely, we first shuffle the rows in the tables and then we sample
the first <inline-formula><tex-math>n</tex-math></inline-formula> rows. The shuffling of the rows is done only once. For the experiments, we sampled
<inline-formula><tex-math>n \in \{1, 3, 5, 7\}</tex-math></inline-formula> rows from each of the tables and used these sampled tables as input to the table
encoders.</p>
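          <p>The sampling procedure described above can be sketched as follows. The function name sampled_views, the fixed random seed, and the choice to keep the header row out of the shuffle are assumptions made for this illustration and are not taken from the experimental code.</p>
          <preformat><![CDATA[
```python
import random


def sampled_views(table, sizes=(1, 3, 5, 7), seed=0):
    """Shuffle the content rows once and return one sampled table per size.

    `table` is a list of rows; row 0 is assumed to be the header.
    """
    header, content = table[0], list(table[1:])  # copy, the input stays intact
    random.Random(seed).shuffle(content)         # the single shuffle
    # Every sampled table takes a prefix of the same shuffled order.
    return {n: [header] + content[:n] for n in sizes}


# Hypothetical table with a header and ten content rows.
table = [["City", "Country"]] + [[f"city_{i}", f"country_{i}"] for i in range(10)]
views = sampled_views(table)
```
]]></preformat>
          <p>Because the rows are shuffled only once, the table sampled with 3 rows is a prefix of the table sampled with 7 rows, which keeps results for different sample sizes comparable.</p>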
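          <p>When using TF-IDF as the table encoder, the input is a set of sequences, where each sequence
corresponds to a table from the set of tables <inline-formula><tex-math>D</tex-math></inline-formula>. More formally, a table sequence for table <inline-formula><tex-math>T_i</tex-math></inline-formula> is a
sequence of rows <inline-formula><tex-math>s_{T_i} = (t_{0,\ast}, t_{1,\ast}, \dots, t_{n,\ast})</tex-math></inline-formula>, such that <inline-formula><tex-math>n \in \{1, 3, 5, 7\}</tex-math></inline-formula>, and the set of sequences is the
set <inline-formula><tex-math>S = \{s_{T_1}, \dots, s_{T_N}\}</tex-math></inline-formula>. The table encoder TF-IDF transforms the set of table sequences to the set of
feature vectors, <inline-formula><tex-math>f_{\text{tf-idf}} : S \to X</tex-math></inline-formula>.</p>
          <p>Word2Vec and Spacy generate the vector representation for table <inline-formula><tex-math>T_i</tex-math></inline-formula> in 3 steps. First, the
sequence <inline-formula><tex-math>s_H</tex-math></inline-formula>, representing the header of the table <inline-formula><tex-math>T_i</tex-math></inline-formula>, is encoded as the mean over the word
vectors in the sequence <inline-formula><tex-math>s_H</tex-math></inline-formula>, represented as <inline-formula><tex-math>x_H</tex-math></inline-formula>. Second, the content of the table is transformed
into a table sequence <inline-formula><tex-math>s_C = (t_{1,\ast}, \dots, t_{n,\ast})</tex-math></inline-formula> and encoded as the vector <inline-formula><tex-math>x_C</tex-math></inline-formula>, which represents the mean
over all the word vectors in <inline-formula><tex-math>s_C</tex-math></inline-formula>. Finally, the vector representations for the header and for the
table content are concatenated into one vector <inline-formula><tex-math>x_{T_i} = x_H \,\Vert\, x_C</tex-math></inline-formula>.</p>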
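          <p>The three steps for the word-vector encoders (mean over the header tokens, mean over the content tokens, concatenation) can be illustrated with the sketch below. The two-dimensional toy_embeddings dictionary is a hypothetical stand-in for the pre-trained Spacy or FastText vectors, and skipping out-of-vocabulary tokens is an assumption rather than a documented detail of the experiments.</p>
          <preformat><![CDATA[
```python
def mean_vector(tokens, embeddings, dim):
    """Average the embeddings of all known tokens (zero vector if none is known)."""
    vectors = [embeddings[tok] for tok in tokens if tok in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def encode_table(table, embeddings, dim):
    """Concatenate the mean header vector and the mean content vector."""
    header_tokens = [tok for cell in table[0] for tok in cell.lower().split()]
    content_tokens = [
        tok for row in table[1:] for cell in row for tok in cell.lower().split()
    ]
    x_header = mean_vector(header_tokens, embeddings, dim)
    x_content = mean_vector(content_tokens, embeddings, dim)
    return x_header + x_content  # list concatenation, length 2 * dim


# Hypothetical 2-dimensional embeddings, for illustration only.
toy_embeddings = {
    "city": [1.0, 0.0],
    "country": [0.0, 1.0],
    "berlin": [0.5, 0.5],
    "germany": [0.0, 2.0],
}
table = [["City", "Country"], ["Berlin", "Germany"]]
x = encode_table(table, toy_embeddings, dim=2)
```
]]></preformat>
          <p>The resulting table vector has twice the dimensionality of the word vectors, since the header and content means are kept in separate halves.</p>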
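          <p>Considering that there is a limit on the length of the sequence that BERT can encode in one
step, we used a different transformation for the last two methods. BERT encodes each table row
by row, i.e., a sequence <inline-formula><tex-math>s_{i,\ast}</tex-math></inline-formula> is generated for each of the rows <inline-formula><tex-math>t_{i,\ast}</tex-math></inline-formula> of table <inline-formula><tex-math>T</tex-math></inline-formula>, where <inline-formula><tex-math>0 \le i \le n</tex-math></inline-formula>.
BERT generates row-wise vectors, so for each sequence <inline-formula><tex-math>s_{i,\ast}</tex-math></inline-formula>
the output is a vector <inline-formula><tex-math>r_{i,\ast}</tex-math></inline-formula>. The
vector representation for table <inline-formula><tex-math>T</tex-math></inline-formula> is the vector <inline-formula><tex-math>x_T</tex-math></inline-formula>, which is the result of mean-pooling over
the set of BERT’s output vectors <inline-formula><tex-math>\{r_{0,\ast}, \dots, r_{n,\ast}\}</tex-math></inline-formula> that correspond to the table rows. In the
same manner, the TaBERT model also first generates an encoding for each of the rows of table
<inline-formula><tex-math>T</tex-math></inline-formula>, resulting in a set of vectors. This model then applies vertical self-attention over the vertically
stacked vectors <inline-formula><tex-math>\{r_{0,\ast}, \dots, r_{n,\ast}\}</tex-math></inline-formula>. Because of the vertically aligned vectors, the output of the model
is a column-wise vector representation <inline-formula><tex-math>\{c_{\ast,1}, \dots, c_{\ast,m}\}</tex-math></inline-formula>, with one vector for each of the columns in table <inline-formula><tex-math>T</tex-math></inline-formula>. Finally,
mean-pooling over the column representations generates the table encoding <inline-formula><tex-math>x_T</tex-math></inline-formula>.</p>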
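          <p>The two pooling schemes can be contrasted with a small sketch that operates on already computed vectors. The encoders themselves are not reproduced here: the input vectors are hypothetical numbers, and in pool_columns a plain mean across rows is used as a simplified stand-in for TaBERT's vertical self-attention, which is a learned operation.</p>
          <preformat><![CDATA[
```python
def mean_pool(vectors):
    """Component-wise mean of a non-empty list of equally sized vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def pool_rows(row_vectors):
    """BERT-style pooling: one vector per row, averaged into a table vector."""
    return mean_pool(row_vectors)


def pool_columns(cell_vectors):
    """TaBERT-style pooling sketch.

    cell_vectors[i][j] is the vector for row i and column j. The mean across
    rows replaces vertical self-attention here (a simplification).
    """
    column_vectors = [mean_pool(list(column)) for column in zip(*cell_vectors)]
    return mean_pool(column_vectors)


# Hypothetical encoder outputs for a table with 2 rows and 2 columns.
row_vectors = [[1.0, 3.0], [3.0, 5.0]]
cell_vectors = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[3.0, 0.0], [0.0, 4.0]],
]
x_rows = pool_rows(row_vectors)
x_cols = pool_columns(cell_vectors)
```
]]></preformat>
          <p>In both cases the table vector has the same dimensionality as a single encoder output, independent of the number of sampled rows.</p>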
          <p>We then use a Multi-layer Perceptron (MLP) with one hidden layer of size 500, the tanh
activation function and the Adam optimizer as the classifier <inline-formula><tex-math>g_\phi</tex-math></inline-formula> from Figure 1. The hyperparameters
were chosen after an extensive search and are fixed for all of the experiments. Since the
available dataset is small, instead of splitting it once into a training set and a test set, we use
stratified K-fold cross-validation with <inline-formula><tex-math>K = 20</tex-math></inline-formula> splits. Considering that the dataset is imbalanced, we
report the macro-averaged F1 score. The reported scores are the averages of the results on the
test folds over the cross-validation splits. To explore the effect of the column names, we also encoded
the tables with their column names masked. Specifically, for all of the tables, we substitute their
column names with the token [UNK].</p>
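          <p>To make the evaluation protocol concrete, the sketch below shows a dependency-free version of stratified fold assignment and of the macro-averaged F1 score. The function names and the round-robin assignment without shuffling are illustrative assumptions; library implementations such as scikit-learn's StratifiedKFold and f1_score with average="macro" would typically be used in practice, and which implementation the experiments used is not stated here.</p>
          <preformat><![CDATA[
```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign example indices to k folds, spreading every class round-robin."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0  # keeps running across classes so fold sizes stay balanced
    for label in sorted(by_class):
        for index in by_class[label]:
            folds[position % k].append(index)
            position += 1
    return folds


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: an imbalanced toy dataset with three classes.
labels = ["City"] * 4 + ["Country"] * 4 + ["Plant"] * 2
folds = stratified_folds(labels, k=2)
score = macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```
]]></preformat>
          <p>Macro averaging gives every class the same weight regardless of how many tables it contains, which is why it is a natural choice for an imbalanced dataset.</p>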
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>TaBERT Analysis. To get a better understanding of the (under-)performance of TaBERT,
we analyse the influence of the utterance and its interplay with column names. In addition to
the empty string “ ” used in the previous experiments, we also used a randomly generated string
of 10 characters (unique per table), and one constant string, Thing, for all tables. Moreover,
we experimented with adding the correct class of the tables as utterance, as well as a wrong
class (for instance, all the tables of class Country are encoded with the class Plant as utterance).
Figure 3 shows the results of these experiments, where the input tables had <inline-formula><tex-math>n = 3</tex-math></inline-formula> rows.
The horizontal axis shows the different options that we passed as utterance to the model and
the vertical axis shows the achieved F1 score. The masking of column names has a significant
influence on the generated table representation. The reason for this might lie in the way
a row is transformed into a string, i.e., each table entry is linearized by concatenating
its column name with its value. Observing the results with the different utterances, we
see that the choice of utterance does not affect the performance of the model when the column
names are not masked. Nevertheless, when the column names are masked, the influence of the
utterance is more pronounced. Both when the utterance is the wrong class and when it is the correct
class, the achieved score is much higher, which might be attributed to a class-wide shift in the
vector space because of the grouping that these utterances cause.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion and Future work</title>
      <p>In this paper we explored different types of table encoders for generating vector representations
for tabular data. Specifically, we focused on evaluating different methods for table encoding on
one sub-task of TI, table-to-class annotation. Despite the increasing interest in the problem
of TI, so far only one approach towards this specific sub-task has been proposed. In this
direction, we provided a formal definition for the table-to-class annotation task as a machine
learning task. We conducted an empirical study with five different methods for generating vector
representations of a table and evaluated their performance on the table-to-class annotation task.
The results from our experiments show that transfer learning methods with large vocabularies
of pre-trained word embeddings perform on par with more complex and expensive models
such as LMs pre-trained on tables. An interesting finding is that the inductive bias for tabular
structure in TaBERT did not improve on the performance of the BERT model. A possible
explanation for this is the absence of a meaningful utterance, which the TaBERT model expects as input.
Nonetheless, the misclassifications made by this model are reasonable, suggesting that the
vector representations capture the semantics of the tables. Future work should target closing
the gap between existing general-purpose models and models specific to encoding tabular data.
To further our work, we plan to explore other existing methods for table encoding for solving
the table-to-class task, as well as for solving the entity-to-row and column-to-property tasks.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Cafarella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Halevy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. Z.</given-names>
            <surname>Wang</surname>
          </string-name>
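          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,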
          <article-title>Webtables: exploring the power of tables on the web</article-title>
          ,
          <source>VLDB</source>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Venetis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Halevy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Madhavan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pasca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Miao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>Recovering semantics of tables on the web</article-title>
          ,
          <source>VLDB</source>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chakrabarti</surname>
          </string-name>
          ,
          <article-title>Infogather+: semantic matching and annotation of numeric and time-varying attributes in web tables</article-title>
          ,
          <source>in: SIGMOD</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <article-title>Table cell search for question answering</article-title>
          ,
          <source>in: WWW</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sutton</surname>
          </string-name>
          ,
          <article-title>Learning semantic annotations for tabular data</article-title>
          ,
          <source>in: IJCAI</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Efthymiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hassanzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rodriguez-Muro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Christophides</surname>
          </string-name>
          ,
          <article-title>Matching web tables with knowledge base entities: From entity lookups to entity embeddings</article-title>
          ,
          <source>in: ISWC</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kertkeidkachorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ichise</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Takeda</surname>
          </string-name>
          ,
          <article-title>Tabeano: Table to knowledge graph entity annotation</article-title>
          ,
          <source>CoRR</source>
          (
          <year>2020</year>
          ).
          <pub-id pub-id-type="arxiv">arXiv:2010.01829</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Meij</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Reinanda</surname>
          </string-name>
          ,
          <article-title>Novel entity discovery from web tables</article-title>
          ,
          <source>in: WWW</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G.</given-names>
            <surname>Limaye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sarawagi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chakrabarti</surname>
          </string-name>
          ,
          <article-title>Annotating and searching web tables using entities, types and relationships</article-title>
          ,
          <source>VLDB</source>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>in: NAACL-HLT</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Neubig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Riedel</surname>
          </string-name>
          ,
          <article-title>TaBERT: Pretraining for joint understanding of textual and tabular data</article-title>
          ,
          <source>in: ACL</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Ritze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lehmberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <article-title>Matching HTML tables to DBpedia</article-title>
          ,
          <source>in: WIMS</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
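          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <article-title>Table2Vec: Neural word and entity embeddings for table population and retrieval</article-title>
          ,
          <source>in: SIGIR</source>
          ,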
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>E.</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hassanzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Efthymiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Srinivas</surname>
          </string-name>
          ,
          <article-title>SemTab 2019: Resources to benchmark tabular data to knowledge graph matching systems</article-title>
          ,
          <source>in: ESWC</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Karaoglu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Negreanu</surname>
          </string-name>
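          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,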
          <string-name>
            <given-names>J.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gordon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <article-title>LinkingPark: An integrated approach for semantic table interpretation</article-title>
          ,
          <source>in: SemTab@ISWC</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Yamada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kertkeidkachorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ichise</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Takeda</surname>
          </string-name>
          ,
          <article-title>MTab4Wikidata at SemTab 2020: Tabular data annotation with Wikidata</article-title>
          ,
          <source>in: SemTab@ISWC</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>X.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lees</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>TURL: table understanding through representation learning</article-title>
          ,
          <source>VLDB</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>B.</given-names>
            <surname>Kruit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Boncz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Urbani</surname>
          </string-name>
          ,
          <article-title>Extracting novel facts from tables for knowledge graph completion</article-title>
          ,
          <source>in: ISWC</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Raffel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <article-title>How much knowledge can you pack into the parameters of a language model?</article-title>
          ,
          <source>in: EMNLP</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Herzig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Nowak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piccinno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Eisenschlos</surname>
          </string-name>
          ,
          <article-title>TaPas: Weakly supervised table parsing via pre-training</article-title>
          ,
          <source>in: ACL</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shiralkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lockard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. L.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <article-title>TCN: table convolutional network for web table interpretation</article-title>
          ,
          <source>in: WWW</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <article-title>Enriching word vectors with subword information</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>