=Paper= {{Paper |id=Vol-3784/short2 |storemode=property |title=THoRR: Complex Table Retrieval and Refinement for RAG |pdfUrl=https://ceur-ws.org/Vol-3784/short2.pdf |volume=Vol-3784 |authors=Kihun Kim,Mintae Kim,Hokyung Lee,Seongik Park,Youngsub Han,Byoung-Ki Jeon |dblpUrl=https://dblp.org/rec/conf/ir-rag/KimKLPHJ24 }} ==THoRR: Complex Table Retrieval and Refinement for RAG== https://ceur-ws.org/Vol-3784/short2.pdf
                         THoRR: Complex Table Retrieval and Refinement for RAG
                         Kihun Kim∗ , Mintae Kim, Hokyung Lee, Seongik Park, Youngsub Han and Byoung-Ki Jeon
                         LG UPLUS, 71, Magokjungang 8-ro, Gangseo-gu, Seoul, Republic of Korea


                                          Abstract
Recent advancements in the contextual understanding and generation capabilities of Large Language Models (LLMs) have sparked increasing interest in the application of Retrieval-Augmented Generation (RAG) to specific domains and industry documents. Retrieving and understanding the tables within these documents is crucial for generating correct answers in RAG systems. This study focuses on documents containing large and complex tables, such as statistical and industry reports, which present two major challenges: 1) processing large tables and 2) understanding complex tables. Previous studies faced difficulties because they considered all elements of tabular data, such as cells, headers, and titles. In contrast, we designed the Table Header for Retrieval and Refinement (THoRR) method to address these issues. THoRR performs two tasks: table retrieval and table refinement. In the retrieval phase, we propose a table header representation approach that uses headers and titles without considering cells. In the refinement phase, the model selects relevant table headers from the retrieved tables and processes them into refined tables containing only the information necessary to answer the question. By reorganizing information, this approach aids in understanding complex tables without chunking. Our models outperform existing approaches such as DTR and DPR-table. Moreover, we experimentally demonstrate that our refinement model can reduce hallucinations. To the best of our knowledge, our table refinement approach for RAG systems is the first of its kind in the field.

                                          Keywords
                                          table retrieval, complex table, retrieval-augmented generation (RAG), table refinement, table representation



1. Introduction

Recent advancements in the contextual understanding and generative capabilities of Large Language Models (LLMs) have heightened interest in Retrieval-Augmented Generation (RAG)[1, 2] for specific domains such as open-domain or industry-specific documents. Documents in the industry and finance domains often contain large and complex tables, the understanding of which is critical for a RAG system to produce accurate responses. However, this task presents several challenges. Our research seeks solutions to two primary challenges.

The first challenge involves processing large and complex tables. Previous studies, such as DTR[3] and DPR-table[4], were designed with relatively simple open-domain tables in mind, such as those found in the nq-table[5] dataset, and thus de-emphasized the processing of large tables. Similar to the processing of text documents, previous methods divided tables into fixed-length segments (chunking), or even cut off parts that exceeded a maximum input length. The chunking method complicates retrieval by not only increasing the number of retrieval targets but also making it difficult to compare values across segmented tables. Moreover, discarding the overflowing sections risks losing table information, diminishing the probability of obtaining a sufficient table representation. These problems can ultimately affect table retrieval performance.

The second challenge is the difficulty of understanding tables due to their complex structure. Complex tables typically feature hierarchical headers and numerous values, making it hard for the generator to consider such a vast amount of information. Insufficient table comprehension can lead to incorrect answers (hallucinations). Figure 1 demonstrates an example where GPT-3.5-turbo[6] is used to perform TableQA on a hierarchical table: the original table leads to an incorrect response, whereas the refined table, as processed by our proposed model, yields the correct answer.

Figure 1: Example of TableQA with GPT-3.5-turbo, comparing the result of the original complex table (top) and the refined table (bottom).

In this paper, we propose Table Header for Retrieval and Refinement (THoRR) to solve these problems. The method is grounded in the heuristic assumption that, when finding and understanding tables, headers are more critical than values. THoRR consists of two models, a retriever and a refinement model, applied sequentially, each of which differs from previous work. THoRR:Retriever uses a table header representation: it performs table retrieval using only the headers, without considering the cells of the table. THoRR:Refinement performs relevant table header detection on the retrieved tables, selecting the headers that are relevant to the question and refining the tables into simple tables that contain only the necessary information, reducing the amount of information the generator needs to consider.

We compare THoRR with DTR[3] and DPR-table[4] and show that it achieves better retrieval performance in fine-tuning and zero-shot experiments on the HiTab[7] and AIT-QA[8]

IR-RAG'24: Information Retrieval's Role in RAG Systems (IR-RAG), July 18, 2024, Washington D.C.
∗ Corresponding author.
kimkihun@lguplus.co.kr (K. Kim); iammt@lguplus.co.kr (M. Kim); hogay88@lguplus.co.kr (H. Lee); spark32@lguplus.co.kr (S. Park); yshan042@lguplus.co.kr (Y. Han); bkjeon@lguplus.co.kr (B. Jeon)
ORCID: 0009-0005-9453-7443 (K. Kim)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).



          Figure 2: Architecture of the RAG system with LLM for table data, featuring our proposed THoRR method.



datasets, while reducing the information (number of cells) that must be input to the generator. Furthermore, our proposed methodology enables an efficient reduction in the number of tokens required for table inputs to the generator.


2. Method

In this section, we present the Table Header for Retrieval and Refinement (THoRR) method, designed to retrieve and refine tables within the RAG system [2]. THoRR is divided into two phases, retrieval and refinement, as shown in Figure 2. The two phases are trained separately and serve distinct purposes. The retrieval phase is used for embedding and indexing tables; when a question is input, it retrieves the pre-indexed tables. In the refinement phase, the retrieved tables are processed to extract the necessary information, refining them into smaller tables.

The goal of this method is to obtain the Top_K refined tables T_r relevant to a given question Q from the M target tables T. We denote the components of T as title, header_row, and header_col, representing the title, row headers, and column headers, respectively. The comparative experiments between THoRR and the existing table retrieval baselines are explained in Section 3.1.

2.1. Table Retriever

Given M target tables T, our THoRR:retrieval model aims to retrieve the Top_K candidate tables containing information relevant to the question Q. In this paper, we follow the structure of DPR[9] for comparison with DPR-table[4]. We use two different encoders, a table header encoder (Enc_T) and a question encoder (Enc_Q), both using the base model of [10]. Enc_T maps the M target tables to table header representations t and builds an index over t that is used for retrieval. The input x_t to Enc_T is defined in Equation 1. Given a question Q, we obtain a question representation q using Enc_Q and then select the Top_K candidate tables closest to q from the indexed t. The similarity between t and q is defined using the dot product, as in [9] (Equation 2).

    x_t = {[CLS] title [SEP] header_col [SEP] header_row [SEP]}    (1)

    Sim(q, t) = Enc_Q(Q)^T · Enc_T(x_t)    (2)

In this process, the aspect of our retriever that differs from previous research is the table header representation t. Our method, which uses the table's headers and title without considering every cell, is relatively free from the input-length limitations of the encoder. The chunking method and our comparative experiments are explained in Section 3.2.

The training objective is to minimize the distance between a question q_i and its positive table t_i^+ while maximizing the distance between the question and its n negative tables t_{i,j}^- in a given training dataset D = {(q_i, t_i^+, t_{i,1}^-, t_{i,2}^-, ..., t_{i,n}^-)}_{i=1}^{M}. The loss function is optimized as the Negative Log-Likelihood (NLL):

    L_retriever(q_i, t_i^+, t_{i,1}^-, ..., t_{i,n}^-)
        = -log [ exp(sim(q_i, t_i^+)) / ( exp(sim(q_i, t_i^+)) + Σ_{j=1}^{n} exp(sim(q_i, t_{i,j}^-)) ) ]    (3)

2.2. Table Refinement Model

This paper introduces a new task called Table Refinement, defined as simplifying a table while preserving specific information. Accordingly, our THoRR:refinement model aims to obtain refined tables, denoted t_r, for the Top_K candidate tables from the retrieval phase. The input x_r is defined by Equation 4, where header ∈ [header_row, header_col]. As shown in Equation 5, x_r is input to the refinement encoder Enc_R to obtain hidden states. A linear layer then takes these hidden states and outputs the relevant-header scores, denoted h. Using h, we obtain the Top_C relevant column header indices (I_col) and the Top_R relevant row header indices (I_row), as specified in Equation 6. Subsequently, we refine the candidate tables using the selected row and column indices to obtain t_r.

    x_r = {[CLS] Q [SEP] header [SEP]}    (4)

    h = Enc_R(x_r)    (5)

    I_row = argmax(h_row, Top_R),    I_col = argmax(h_col, Top_C)    (6)

The learning objective is to identify the header indices relevant to the question, increasing the score h_y of the answer's header index. The loss function, described in Equation 7, is optimized as the Cross-Entropy loss, where N is the number of tokens in the input x_r and y is the gold relevant header index.

    L_refinement(h, y) = -log [ exp(h_y) / Σ_{n=1}^{N} exp(h_n) ]    (7)
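As a concrete illustration of the two phases above, the sketch below linearizes a table's title and headers into the retriever input (Equation 1), ranks indexed tables by dot-product similarity (Equation 2), and slices a retrieved table down to its Top_C/Top_R relevant headers (Equation 6). This is a minimal toy, not the authors' implementation: the BERT-style encoders are abstracted away as pre-computed vectors, and the " | " header separator is an assumption.

```python
def header_representation_input(title, col_headers, row_headers):
    # Build x_t (Equation 1): title and headers only, no cell values.
    # The " | " separator between individual headers is an illustrative assumption.
    return ("[CLS] " + title +
            " [SEP] " + " | ".join(col_headers) +
            " [SEP] " + " | ".join(row_headers) + " [SEP]")

def top_k_tables(question_vec, table_vecs, k):
    # Rank tables by dot-product similarity sim(q, t) (Equation 2)
    # and return the indices of the Top_K closest tables.
    scores = [sum(q * t for q, t in zip(question_vec, tv)) for tv in table_vecs]
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]

def refine_table(table, col_scores, row_scores, top_c, top_r):
    # Keep the Top_C / Top_R highest-scoring headers (Equation 6) and slice
    # the cells accordingly; in the paper the scores come from Enc_R, here
    # they are passed in directly.
    cols = sorted(range(len(col_scores)), key=lambda i: -col_scores[i])[:top_c]
    rows = sorted(range(len(row_scores)), key=lambda i: -row_scores[i])[:top_r]
    cols.sort(); rows.sort()  # preserve the original header order
    return {"col_headers": [table["col_headers"][c] for c in cols],
            "row_headers": [table["row_headers"][r] for r in rows],
            "cells": [[table["cells"][r][c] for c in cols] for r in rows]}
```

Refining a 2x3 table with Top_C = 2 and Top_R = 1, for example, leaves only 1x2 cells, mirroring the cell-count reduction reported in Section 3.3.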
     Table 1
Comparison of retrieval accuracy of our THoRR method and the baselines. Fine-tuning denotes training with the
HiTab dataset; Zero-shot denotes evaluation on AIT-QA using the fine-tuned model.
                   Refinement                   HiTab Fine-tuning                                 AIT-QA Zero-shot
 Model           𝑇 𝑜𝑝_𝐶   𝑇 𝑜𝑝_𝑅   HIT@1    HIT@5 HIT@10 HIT@20           HIT@50     HIT@1    HIT@5 HIT@10 HIT@20           HIT@50
 DTR[3]            -        -       19.00    40.97     51.96      64.27     77.53      8.74    20.39     28.16     41.75         71.07
 DPR-table[4]      -        -       40.40    69.51     77.15      84.03     90.66     19.61    41.75     55.15     71.26         89.51
 THoRR             5        -      45.39     74.75     82.83      87.31    91.60     22.52     47.38     62.91     74.95         92.82
 (Ours)            5       10      43.50     71.84     79.55      84.03    88.07     21.75     44.27     59.03     69.51         84.85
                   7        -      45.77     75.51     83.59      88.07    92.49     23.50     48.54     64.47     76.89         94.76
                   7       10      43.88     72.60     80.30      84.79    88.95     22.72     45.44     60.58     71.46         86.80
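The HIT@K numbers in Table 1 can be computed as sketched below; a minimal illustration assuming each question has a single gold table and retrieval results are given as ranked lists of table ids (the function names are ours, not from the paper's code).

```python
def hit_at_k(ranked_table_ids, gold_table_id, k):
    # 1 if the gold table appears among the Top_K retrieved tables, else 0.
    return int(gold_table_id in ranked_table_ids[:k])

def hits_accuracy(rankings, gold_ids, k):
    # Ratio of questions whose gold table is within the Top_K results;
    # this is the 'Hits accuracy' (HIT@K) reported in Table 1.
    return sum(hit_at_k(r, g, k) for r, g in zip(rankings, gold_ids)) / len(gold_ids)
```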



3. EXPERIMENTS                                                       domain. In this process, a fine-tuned model using the HiTab
                                                                     [7] dataset is used to make predictions on the AIT-QA[8]
Dataset We conduct experiments on two complex table                  dataset without any additional training, and the results are
benchmark datasets. HiTab[7] is a Table QA dataset with              evaluated. Through this experiment, we aim to demonstrate
a hierarchical structure. This dataset consists of questions         that the proposed models can handle complex table retrieval
that require complex numerical calculations, including ta-           in previous unseen domains. Table 1 presents the results of
bles from Wikipedia and statistical reports. It contains a           this experiment, showing superior performance compared
total of 10,672 question-answer pairs, with 7,417 for train-         to the baselines and indicating well THoRR works on com-
ing, 1,671 for validation, and 1,584 for testing. There are          plex tables in different domains.
a total of 3,597 tables in this dataset. We use this dataset
for fine-tuning. AIT-QA[8] is a Table QA dataset specific to
the Airline industry, composed of tables extracted from the
                                                                     3.2. Retrieval Result
U.S. public SEC filings. It includes specialized vocabulary
terms for a specific domain and also has a hierarchical struc-
ture like HiTab[7]. It consists of 515 questions and answers,
with a total of 116 tables. In this paper, this dataset is used
to evaluate the zero-shot performance of the fine-tuning
model.
   Baseline In order to demonstrate the performance of our
method, we compare it with baseline methods. DTR[3] is
a table encoder that uses a table-specific structure. DPR-
table[4], on the other hand, processes tables linearly, similar
to understanding text passages. Both of these baselines
have been trained on the nq-dataset[5] and their pretrained
models are publicly available. We fine-tune these pre-train
models as backbones and compare them with our model.
                                                                                                  (a)
3.1. Main Result : THoRR
The experiments in this paper evaluate the proposed mod-
els, THoRR, in a two-phase process as shown in Figure 1
(THoRR:retrieval and THoRR:refinement). The performance
of the models is evaluated using the ’Hits accuracy’ as the
main evaluation metric. This metric measures the ratio of
correct answers included in the 𝑇 𝑜𝑝_𝐾 selected tables by
the models. Where, 𝑇 𝑜𝑝_𝐾 takes values 1, 5, 10, 20, 50 to
evaluate the accuracy of the models.
   Fine-tuning To compare fine-tuning experiments on the
complex table dataset, we train THoRR and baselines using
the HiTab [7] training set. Table 1 presents the performance
of the THoRR method compared to baseline models. The
experimental results indicate that the proposed models out-
performed baselines in most cases. When 𝑇 𝑜𝑝_𝐶 = 7 and                                            (b)
𝑇 𝑜𝑝_𝐾 <= 10, the proposed models exhibit an accuracy
improvement of more than 5% compared to the baseline’s               Figure 3: (a) Retriever accuracy with DPR-table’s chunking
best accuracy. The superior performance at a small 𝑇 𝑜𝑝_𝐾            method vs THoRR:retrival’s table header representation method.
indicates the importance in the RAG system, as it indicates          (b) Comprison between the number of chunks by the max token
                                                                     length.
effective utilization of a limited number of reference pieces
of information, which is common when the 𝑇 𝑜𝑝_𝐾 is less
than 10.                                                                We compare our proposed table header representation
   Zero-shot The zero-shot experiment intend to observe              method and chunking method in terms of retrieval accuracy.
how the model performs on complex table data from a new              Figure 3(a) illustrates the performance of [4] with the chunk-
ing method and our method. (”inf” refers to the use of the        learns over 26 million natural language questions and ta-
original table without chunking.) As shown in Figure 3(b),        bles. TURL[18] introducing a structure-aware Transformer
we observe a decrease in retrieval accuracy as the lower max      encoder and Masked Entity Recovery (MER) objective for
token length, indicating that the number of retrieval targets     pre-training. StruG[14] proposes a semi-supervised learn-
affects the performance significantly in retrieval tasks. Our     ing framework for learning the connection between text
approach demonstrates superior performance compared to            and SQL. MATE[15] demonstrates the efficient restriction of
methods that consider all values. This highlights the effec-      Transformer attention flow on tabular data, enabling train-
tiveness of our method, which relies solely on table headers      ing with larger sequence lengths. Tableformer[16] learns
for table representation, especially in retrieving large and      from tables using attention biases, making it better at un-
complex tables. Moreover, our method demonstrates su-             derstanding tabular data. TABBIE[17] introduces a method
perior performance compared to existing approaches that           to improve performance on table-based prediction tasks by
consider all values, thereby experimentally validating our        pre-training only tabular data.
heuristic assumption that headers are crucial elements in            Research on table retrieval includes methodologies such
table retrieval.                                                  as [19, 3, 4, 20, 21]. Table2vec[19] proposes a method for
                                                                  obtaining table embeddings by considering various table el-
3.3. Refinement Result                                            ements such as captions, headers, cells, and entities. DTR[9]
                                                                  introduces a table-specific model suitable for open-domain
                                                                  table question answering. DPR-table[4] linearizes tables
                                                                  to handle them similar to text passages, instead of using
                                                                  table-specific models. GTR[20] introduces a model that
                                                                  transforms tables into graphs, capturing both cell and lay-
                                                                  out structures. [21] introduces a method for enhancing
                                                                  the similarity between queries and tables for table retrieval,
                                                                  employing various semantic spaces and similarity measure-
                                                                  ment methods.


                                                                  5. Conclusion
                                                                  We propose the THoRR method, which uses the table head-
                                                                  ers to retrieve and help understand the complex and large
Figure 4: Comparison of human evaluation performance on           tables. We use the table header representations in the re-
TableQA and the number of refined table cells.                    triever that can retrieve tables without chunking them. Ad-
                                                                  ditionally, we propose a novel methodology for refining
                                                                  tables by detecting the table headers that are relevant to the
   In this section, we experiment with our refinement model
                                                                  questions within the table. This approach aims to simplify
to reduce cell information in mitigating hallucinations. In
                                                                  the tables in which an excessive amount of information is
Figure 4, the green line indicates a decreasing trend in the
                                                                  present, particularly in complex tables. THoRR is capable
number of cells in tables when using our model. Further-
                                                                  of handling large and complex tables without dividing them
more, Figure 4 illustrates the human evaluation accuracy on
the results obtained by feeding the refined tables into Llama2 [11]
7B-Chat, where "(-, -)" denotes the original table. We randomly
sample 300 questions from the HiTab test dataset for human
evaluation. Llama2 [11] 7B-Chat takes a gold table as input to
generate responses. If the generated response contains the answer
exactly and is correct, we mark it as correct; otherwise, we
consider it a hallucination. Three master's students in the field
of AI evaluated the generated results. To ensure the reliability of
the evaluations, one evaluator and two validators were assigned
roles in the evaluation process. As a result, by setting Top_C = 7
and Top_R = 10, we demonstrate that our refinement model reduces
the average number of table cells from 153.88 to 58.03, a 62.2%
decrease compared to the original table. Additionally, we observe a
9.33% reduction in hallucinations. This validates the effectiveness
of our refinement approach.

4. Related Works

Research on table encoders has focused on pre-training on tabular
data with table-specific architectures [12, 13, 14, 15, 16, 17].
TAPAS [12] introduces a pre-training method that applies Masked
Language Modeling to the cells of tabular data. TaBERT [13]
introduces a pre-training model that jointly learns representations
of textual and tabular data.

into smaller chunks, reducing the information required for
preventing hallucinations in the LLM generator. Furthermore, the
Table Refinement task is the first of its kind in this field;
therefore, it is expected to contribute significantly to future
research in this field. Our future work involves exploring methods
to detect table headers. Additionally, we aim to prevent the
potential loss of question-relevant information when fewer headers
are selected during the refinement phase.

References

 [1] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin,
     N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel,
     S. Riedel, D. Kiela, Retrieval-augmented generation for
     knowledge-intensive NLP tasks, in: Proceedings of the 34th
     International Conference on Neural Information Processing
     Systems, NIPS'20, Curran Associates Inc., Red Hook, NY, USA,
     2020.
 [2] Y. Gao, Y. Xiong, X. Gao, K. Jia, J. Pan, Y. Bi, Y. Dai,
     J. Sun, Q. Guo, M. Wang, H. Wang, Retrieval-augmented
     generation for large language models: A survey, ArXiv
     abs/2312.10997 (2023). URL:
     https://api.semanticscholar.org/CorpusID:266359151.
 [3] J. Herzig, T. Müller, S. Krichene, J. Eisenschlos, Open domain
     question answering over tables via dense retrieval, in:
     K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tur,
     I. Beltagy, S. Bethard, R. Cotterell, T. Chakraborty, Y. Zhou
     (Eds.), Proceedings of the 2021 Conference of the North
     American Chapter of the Association for Computational
     Linguistics: Human Language Technologies, Association for
     Computational Linguistics, Online, 2021, pp. 512–519. URL:
     https://aclanthology.org/2021.naacl-main.43.
     doi:10.18653/v1/2021.naacl-main.43.
 [4] Z. Wang, Z. Jiang, E. Nyberg, G. Neubig, Table retrieval may
     not necessitate table-specific model design, in: W. Chen,
     X. Chen, Z. Chen, Z. Yao, M. Yasunaga, T. Yu, R. Zhang (Eds.),
     Proceedings of the Workshop on Structured and Unstructured
     Knowledge Integration (SUKI), Association for Computational
     Linguistics, Seattle, USA, 2022, pp. 36–46. URL:
     https://aclanthology.org/2022.suki-1.5.
     doi:10.18653/v1/2022.suki-1.5.
 [5] T. Kwiatkowski, J. Palomaki, O. Redfield, M. Collins,
     A. Parikh, C. Alberti, D. Epstein, I. Polosukhin, J. Devlin,
     K. Lee, K. Toutanova, L. Jones, M. Kelcey, M.-W. Chang,
     A. M. Dai, J. Uszkoreit, Q. Le, S. Petrov, Natural questions:
     A benchmark for question answering research, Transactions of
     the Association for Computational Linguistics 7 (2019)
     452–466. URL: https://aclanthology.org/Q19-1026.
     doi:10.1162/tacl_a_00276.
 [6] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan,
     P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell,
     S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan,
     R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter,
     C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess,
     J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever,
     D. Amodei, Language models are few-shot learners, in:
     Proceedings of the 34th International Conference on Neural
     Information Processing Systems, NIPS'20, Curran Associates
     Inc., Red Hook, NY, USA, 2020.
 [7] Z. Cheng, H. Dong, Z. Wang, R. Jia, J. Guo, Y. Gao, S. Han,
     J.-G. Lou, D. Zhang, HiTab: A hierarchical table dataset for
     question answering and natural language generation, in:
     S. Muresan, P. Nakov, A. Villavicencio (Eds.), Proceedings of
     the 60th Annual Meeting of the Association for Computational
     Linguistics (Volume 1: Long Papers), Association for
     Computational Linguistics, Dublin, Ireland, 2022,
     pp. 1094–1110. URL: https://aclanthology.org/2022.acl-long.78.
     doi:10.18653/v1/2022.acl-long.78.
 [8] Y. Katsis, S. Chemmengath, V. Kumar, S. Bharadwaj, M. Canim,
     M. Glass, A. Gliozzo, F. Pan, J. Sen, K. Sankaranarayanan,
     S. Chakrabarti, AIT-QA: Question answering dataset over
     complex tables in the airline industry, in: A. Loukina,
     R. Gangadharaiah, B. Min (Eds.), Proceedings of the 2022
     Conference of the North American Chapter of the Association
     for Computational Linguistics: Human Language Technologies:
     Industry Track, Association for Computational Linguistics,
     Hybrid: Seattle, Washington + Online, 2022, pp. 305–314. URL:
     https://aclanthology.org/2022.naacl-industry.34.
     doi:10.18653/v1/2022.naacl-industry.34.
 [9] V. Karpukhin, B. Oguz, S. Min, P. Lewis, L. Wu, S. Edunov,
     D. Chen, W.-t. Yih, Dense passage retrieval for open-domain
     question answering, in: B. Webber, T. Cohn, Y. He, Y. Liu
     (Eds.), Proceedings of the 2020 Conference on Empirical
     Methods in Natural Language Processing (EMNLP), Association
     for Computational Linguistics, Online, 2020, pp. 6769–6781.
     URL: https://aclanthology.org/2020.emnlp-main.550.
     doi:10.18653/v1/2020.emnlp-main.550.
[10] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT:
     Pre-training of deep bidirectional transformers for language
     understanding, in: J. Burstein, C. Doran, T. Solorio (Eds.),
     Proceedings of the 2019 Conference of the North American
     Chapter of the Association for Computational Linguistics:
     Human Language Technologies, Volume 1 (Long and Short Papers),
     Association for Computational Linguistics, Minneapolis,
     Minnesota, 2019, pp. 4171–4186. URL:
     https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.
[11] H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi,
     Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale,
     et al., Llama 2: Open foundation and fine-tuned chat models,
     arXiv preprint arXiv:2307.09288 (2023).
[12] J. Herzig, P. K. Nowak, T. Müller, F. Piccinno, J. Eisenschlos,
     TaPas: Weakly supervised table parsing via pre-training, in:
     D. Jurafsky, J. Chai, N. Schluter, J. Tetreault (Eds.),
     Proceedings of the 58th Annual Meeting of the Association for
     Computational Linguistics, Association for Computational
     Linguistics, Online, 2020, pp. 4320–4333. URL:
     https://aclanthology.org/2020.acl-main.398.
     doi:10.18653/v1/2020.acl-main.398.
[13] P. Yin, G. Neubig, W.-t. Yih, S. Riedel, TaBERT: Pretraining
     for joint understanding of textual and tabular data, in:
     D. Jurafsky, J. Chai, N. Schluter, J. Tetreault (Eds.),
     Proceedings of the 58th Annual Meeting of the Association for
     Computational Linguistics, Association for Computational
     Linguistics, Online, 2020, pp. 8413–8426. URL:
     https://aclanthology.org/2020.acl-main.745.
     doi:10.18653/v1/2020.acl-main.745.
[14] X. Deng, A. H. Awadallah, C. Meek, O. Polozov, H. Sun,
     M. Richardson, Structure-grounded pretraining for text-to-SQL,
     in: K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tur,
     I. Beltagy, S. Bethard, R. Cotterell, T. Chakraborty, Y. Zhou
     (Eds.), Proceedings of the 2021 Conference of the North
     American Chapter of the Association for Computational
     Linguistics: Human Language Technologies, Association for
     Computational Linguistics, Online, 2021, pp. 1337–1350. URL:
     https://aclanthology.org/2021.naacl-main.105.
     doi:10.18653/v1/2021.naacl-main.105.
[15] J. Eisenschlos, M. Gor, T. Müller, W. Cohen, MATE: Multi-view
     attention for table transformer efficiency, in: M.-F. Moens,
     X. Huang, L. Specia, S. W.-t. Yih (Eds.), Proceedings of the
     2021 Conference on Empirical Methods in Natural Language
     Processing, Association for Computational Linguistics, Online
     and Punta Cana, Dominican Republic, 2021, pp. 7606–7619. URL:
     https://aclanthology.org/2021.emnlp-main.600.
     doi:10.18653/v1/2021.emnlp-main.600.
[16] J. Yang, A. Gupta, S. Upadhyay, L. He, R. Goel, S. Paul,
     TableFormer: Robust transformer modeling for table-text
     encoding, in: S. Muresan, P. Nakov, A. Villavicencio (Eds.),
     Proceedings of the 60th Annual Meeting of the Association for
     Computational Linguistics (Volume 1: Long Papers), Association
     for Computational Linguistics, Dublin, Ireland, 2022,
     pp. 528–537. URL: https://aclanthology.org/2022.acl-long.40.
     doi:10.18653/v1/2022.acl-long.40.
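The two quantities reported above can be illustrated with a minimal sketch. This is not the authors' code; the function names (`is_correct`, `reduction_rate`) are ours, and the exact-match rule below is only the mechanical part of the criterion (the final correctness judgment was made by the human evaluators).

```python
def is_correct(response: str, gold_answer: str) -> bool:
    """A response is a candidate for "correct" only if it contains the
    gold answer exactly; otherwise it is treated as a hallucination."""
    return gold_answer in response

def reduction_rate(original_cells: float, refined_cells: float) -> float:
    """Percentage decrease in the number of table cells after refinement."""
    return (original_cells - refined_cells) / original_cells * 100

# Average cell counts reported in the text: 153.88 before vs. 58.03 after.
print(round(reduction_rate(153.88, 58.03), 1))        # → 62.3
print(is_correct("The total was 42 units.", "42"))    # → True
```

Note that substring matching is deliberately strict: a paraphrased but correct answer would still be flagged for human review, which is why validators were assigned alongside the evaluator.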
[17] H. Iida, D. Thai, V. Manjunatha, M. Iyyer, TABBIE: Pretrained
     representations of tabular data, in: K. Toutanova,
     A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tur, I. Beltagy,
     S. Bethard, R. Cotterell, T. Chakraborty, Y. Zhou (Eds.),
     Proceedings of the 2021 Conference of the North American
     Chapter of the Association for Computational Linguistics:
     Human Language Technologies, Association for Computational
     Linguistics, Online, 2021, pp. 3446–3456. URL:
     https://aclanthology.org/2021.naacl-main.270.
     doi:10.18653/v1/2021.naacl-main.270.
[18] X. Deng, H. Sun, A. Lees, Y. Wu, C. Yu, TURL: Table
     understanding through representation learning, Proc. VLDB
     Endow. 14 (2020) 307–319. URL:
     https://doi.org/10.14778/3430915.3430921.
     doi:10.14778/3430915.3430921.
[19] L. Zhang, S. Zhang, K. Balog, Table2Vec: Neural word and
     entity embeddings for table population and retrieval, in:
     Proceedings of the 42nd International ACM SIGIR Conference on
     Research and Development in Information Retrieval, SIGIR'19,
     Association for Computing Machinery, New York, NY, USA, 2019,
     pp. 1029–1032. URL: https://doi.org/10.1145/3331184.3331333.
     doi:10.1145/3331184.3331333.
[20] F. Wang, K. Sun, M. Chen, J. Pujara, P. Szekely, Retrieving
     complex tables with multi-granular graph representation
     learning, SIGIR '21, Association for Computing Machinery,
     New York, NY, USA, 2021, pp. 1472–1482. URL:
     https://doi.org/10.1145/3404835.3462909.
     doi:10.1145/3404835.3462909.
[21] S. Zhang, K. Balog, Semantic table retrieval using keyword and
     table queries, ACM Trans. Web 15 (2021). URL:
     https://doi.org/10.1145/3441690. doi:10.1145/3441690.