Understanding Tables in Financial Documents
Shared Tasks for Table Retrieval and Table QA on Japanese Annual Securities Reports

Yasutomo Kimura¹,∗, Eisaku Sato¹, Kazuma Kadowaki² and Hokuto Ototake³

¹ Otaru University of Commerce, Hokkaido, Japan
² The Japan Research Institute, Limited, Tokyo, Japan
³ Fukuoka University, Fukuoka, Japan

Abstract
This paper presents a framework for the “NTCIR-18 U4” and “SIG-FIN UFO-2024” shared tasks, which focus on tables within annual securities reports. Annual securities reports are critical documents that provide insights into a company’s financial status and business performance. However, challenges remain in accurately and efficiently analyzing the data they contain. To address these issues, we propose two sub-tasks for the above shared tasks: Table Retrieval and Table QA tasks, which utilize datasets from TOPIX100 and TOPIX500 annual securities reports. Participants are tasked with developing systems (programs) that automatically process data for the two tasks and compete for top performance on a leaderboard. Accuracy scores and rankings are determined by submitting the task’s output, in JSON format, to the leaderboard. Through these shared tasks, we aim to enhance the utility of annual securities reports and advance natural language processing technologies for financial data analysis.

Keywords
annual securities report, shared task, table retrieval, table question-answering

1. Introduction
In recent years, financial disclosures have become essential for investors seeking to make informed decisions based on reliable corporate data. In Japan, listed companies are required to submit an annual securities report, a statutory disclosure document that provides comprehensive information on business operations, financial data, risk factors, corporate governance, and shareholder information.
These reports, accessible via the Electronic Disclosure for Investors’ NETwork (EDINET)¹, serve as a critical information source for investors aiming to compare companies effectively.

These securities reports are structured in XBRL (eXtensible Business Reporting Language), an XML-based format designed to standardize and facilitate the production, distribution, and reuse of financial information. By incorporating “taxonomies” that define the structure and meaning of data, XBRL enables automated processing, potentially streamlining financial analysis. However, practical challenges arise due to the presence of untagged data and the existence of unique taxonomies created by different report submitters, complicating the identification of comparable elements across reports.

To this end, we propose two tasks that aim to facilitate cross-company comparisons by focusing on the tables and text within annual securities reports. The first task is the NTCIR-18 U4 task, adopted by Japan’s National Institute of Informatics (NII) as part of NTCIR-18². The second is the SIG-FIN UFO-2024 task, organized by the Financial Informatics Study Group (SIG-FIN) under the Japanese Society for Artificial Intelligence (JSAI). The former focuses on TOPIX100 annual securities reports submitted between April 1, 2020, and March 31, 2021, while the latter focuses on TOPIX500 annual securities reports submitted between July 1, 2023, and June 30, 2024.

EMTCIR ’24: The First Workshop on Evaluation Methodologies, Testbeds and Community for Information Access Research, December 12, 2024, Tokyo, Japan
∗ Corresponding author.
Email: kimura@res.otaru-uc.ac.jp (Y. Kimura); ouc0149eisa@gmail.com (E. Sato); kadowaki.kazuma@jri.co.jp (K. Kadowaki); ototake@fukuoka-u.ac.jp (H. Ototake)
ORCID: 0000-0003-1849-1816 (Y. Kimura); 0009-0009-2930-0713 (K. Kadowaki); 0000-0002-6502-5570 (H. Ototake)
© 2024 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

Figure 1: Overview of the Table Retrieval Task and Table QA Task

We organized these shared tasks in collaboration with the NII Testbeds and Community for Information Access Research (NTCIR) [1], which specializes in information retrieval, and the SIG-FIN group of the JSAI, which focuses on financial technology. Through these initiatives, we aim to attract researchers and practitioners interested in these fields and contribute to further advancing technologies at the intersection of finance and information retrieval.

In each shared task, we conducted two sub-tasks: Table Retrieval, which involves searching for tables, and Table Question Answering (Table QA), which involves answering questions by identifying the target cells within the tables, as illustrated in Figure 1. We designed each sub-task and constructed datasets for each task.

The contributions of this study are as follows:
• Design of two tasks, Table Retrieval and Table QA, targeting securities reports.
• Construction of datasets for Table Retrieval and Table QA, and their release on GitHub³.
• Organization of the NTCIR-18 U4 task and the SIG-FIN UFO-2024 task.

2. Related Work
2.1. Research on Tables
A table is a data format with a two-dimensional structure used to organize and manage knowledge or information, and it is widely utilized in various contexts. However, not all tables have a highly structured database format; they are often represented as semi-structured data. Furthermore, the data contained in table cells is not limited to numerical values; it frequently includes strings and other non-numerical data. Numerous methods have been proposed to accommodate such diverse table data. Zhang and Balog [2] surveyed research on web tables and classified approaches to accessing table data into six main categories:
1. Table extraction
2. Table interpretation
3. Table search
4. Table question answering
5. Knowledge base augmentation
6. Table augmentation

¹ https://disclosure2.edinet-fsa.go.jp/
² https://research.nii.ac.jp/ntcir/ntcir-18/index-en.html
³ https://github.com/nlp-for-japanese-securities-reports/ntcir18-u4, https://github.com/nlp-for-japanese-securities-reports/ufo-2024

In addition, table-related tasks include table fact verification [3, 4], table detection (searching for tables within documents) [5, 6], spreadsheet manipulation [7], column type annotation [8], and entity linking (linking to knowledge bases) [9, 10]. These tasks are critical in information retrieval and data analysis based on table data, and they are particularly anticipated in fields where handling large-scale data and automation are required. Recently, approaches utilizing large language models (LLMs) and visual language models (VLMs) have been increasing, and research on learning methods, prompt engineering, and agents is also gaining attention [11]. Our shared tasks (NTCIR-18 U4 and SIG-FIN UFO-2024) are related to table search, table detection, and table question answering (Table QA).

2.2. Table Retrieval and Table QA
Table retrieval aims to identify appropriate tables from vast datasets [12]. In this task, methods typically assign relevance scores based on the relationship between natural language queries and individual tables.
Table QA refers to the technology that provides appropriate answers from tables in response to user questions. Approaches to Table QA include semantic parsing-based, generation-based, extraction-based, matching-based, and retrieval-based methods [13]. The difficulty in Table QA lies in the need to handle semi-structured or unstructured data, as it also involves non-database tables.
Compared to existing tabular QA datasets such as FinQA [14] and TAT-QA [15], which are primarily in English and designed for numerical reasoning in financial contexts, our proposed shared tasks specifically target the Japanese language. Japanese tabular and textual data often exhibit unique linguistic and structural features distinct from those in English datasets. These features may include variations in numerical data formats, context-dependent expressions, and implicit relational cues.

2.3. Tables in the Financial Domain
Hybrid data, which includes both tables and text, such as in financial reports, is quite prevalent in the real world [16]. Zhu et al. constructed a question-answering benchmark dataset focused on the hybrid content of tabular and textual data in the financial domain [15]. Pan et al. proposed CLTR, an architecture for end-to-end table retrieval at the cell level [17]. While CLTR can be applied to open-domain datasets, including finance and healthcare ones, its performance specifically within the financial domain has not been clarified, nor does it target the Japanese language.
One of the tasks focused on Japanese financial table structure analysis is the UFO (Understanding of non-Financial Objects in Financial Reports) task [18]. The UFO task aims to extract structured information from tables and text found in annual securities reports and consists of two sub-tasks: the Table Data Extraction (TDE) task and the Text-to-Table Relationship Extraction (TTRE) task. The TDE task classifies cells in tables into four categories with the goal of identifying the type of each cell: metadata, header, attribute, and data [19]. The main focus of TDE was on cell classification, and additional processing to enable inter-company comparisons remained unexplored.

3. NTCIR-18 U4 and SIG-FIN UFO-2024 Tasks
Both the NTCIR-18 U4 and SIG-FIN UFO-2024 tasks consist of two sub-tasks: the Table Retrieval task, which involves searching for tables, and the Table Question Answering (Table QA) task, which involves answering questions by identifying the target cells within the tables [20]. Figure 1 illustrates the concept of these two sub-tasks.

3.1. Table Retrieval: Table Search Task
Table Retrieval is a task that involves searching for a “table” containing the values that answer a given question from the tables included in a company’s annual securities report. On average, a company’s annual securities report contains 221.9 tables [21], so the specific table that contains the answer to the question needs to be identified. The input, output, and evaluation criteria for this task are as follows.

Input
1. Question
2. HTML file of the annual securities report
Output
Table (Table ID)
Evaluation
Accuracy

An example of input and output is shown below.

Input
1. For Bandai Namco Holdings Inc., what were the “net assets and key management indicators” as of 2020?
2. S100ISF1-0000000.html, S100ISF1-0101010.html, ...
Output
S100ISF1-0101010-tab2

For the input, HTML files downloaded from EDINET are used. Each table element (
<td> and <th> tags) within the table in the HTML file is assigned a unique Cell ID. When outputting the value corresponding to the answer to the given question, this Cell ID is used. In the output example above, the Cell ID is the Table ID of the table containing the cell, with “-r{row number}c{column number}” appended, so the Cell ID “S100ISF1-0101010-tab2-r8c1” refers to the cell in the 8th row and 1st column of the table “S100ISF1-0101010-tab2”.

For evaluation, similar to the Table Retrieval task, accuracy is calculated by dividing the number of correct outputs by the total number of inputs in the test dataset. However, discrepancies between the value contained in the HTML cell and the expected answer are frequently observed. For example, if the expected answer is “4448000000”, the corresponding cell in the HTML might contain the string “4,448”, while another cell, such as in the top right or column name, might indicate “(in millions of yen)”. In this case, the system answering the task must reference both cells to generate the answer “4,448 million yen”. While this is equivalent to the correct answer, comparing the two accurately requires replacing the string “million yen” with “000000” and removing the comma.
For this reason, both the response and the correct answer are normalized before calculating accuracy. The normalization specification was continually revised during the “dry run” period, considering feedback from participants.

We evaluated a few baseline methods using our validation datasets, which contain 3,132 and 1,534 questions for NTCIR-18 U4 and SIG-FIN UFO-2024, respectively. These baseline methods involved converting the target table into text format and inputting it, along with the question, into an LLM to generate the desired values. The results are shown in Table 2. For the NTCIR-18 U4 task, the highest accuracy of 0.7471 was achieved by using the Claude 3 Opus model.
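The Cell ID convention and the normalization step described above can be illustrated with a minimal Python sketch. The organizers' actual normalization specification is not reproduced in this paper, so the unit table `_UNITS` and the function names `parse_cell_id`, `normalize_value`, and `accuracy` are assumptions made purely for illustration; only the “-r{row number}c{column number}” suffix and the “4,448 million yen” example come from the text.

```python
import re
from typing import NamedTuple, Sequence


class CellRef(NamedTuple):
    table_id: str
    row: int
    col: int


# "<Table ID>-r<row>c<col>", as in "S100ISF1-0101010-tab2-r8c1".
_CELL_ID = re.compile(r"^(?P<table>.+)-r(?P<row>\d+)c(?P<col>\d+)$")

# Assumed unit suffixes (hypothetical); longer suffixes are listed first
# so that "million yen" is tried before the bare "yen".
_UNITS = (("million yen", 10**6), ("thousand yen", 10**3), ("yen", 1))


def parse_cell_id(cell_id: str) -> CellRef:
    """Split a Cell ID into its Table ID, row number, and column number."""
    match = _CELL_ID.match(cell_id)
    if match is None:
        raise ValueError(f"not a Cell ID: {cell_id!r}")
    return CellRef(match["table"], int(match["row"]), int(match["col"]))


def normalize_value(text: str) -> str:
    """Remove commas and expand a trailing unit into plain digits."""
    cleaned = text.strip().replace(",", "")
    for suffix, factor in _UNITS:
        if cleaned.endswith(suffix):
            number = cleaned[: -len(suffix)].strip()
            if number.isdigit():
                return str(int(number) * factor)
    return cleaned


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of answers that match after both sides are normalized."""
    if not gold or len(predictions) != len(gold):
        raise ValueError("predictions and gold must be non-empty and aligned")
    correct = sum(
        normalize_value(p) == normalize_value(g)
        for p, g in zip(predictions, gold)
    )
    return correct / len(gold)
```

Normalizing both the system response and the reference answer with the same function keeps the comparison symmetric, so “4,448 million yen” and “4448000000” count as the same answer.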
For the SIG-FIN UFO-2024 task, a top accuracy of 0.5750 was obtained using the GPT-4o model.

Table 2: Validation results for the Table QA task

Task              Baseline method                                  Accuracy (value)
NTCIR-18 U4       GPT-4o (gpt-4o-2024-05-13)                       0.6475
                  GPT-3.5-turbo (gpt-3.5-turbo-0125)               0.3493
                  Gemini 1.5 Pro (gemini-1.5-pro-001)              0.5744
                  Gemini 1.5 Flash (gemini-1.5-flash-001)          0.4898
                  Claude 3 Opus (claude-3-opus-20240229)           0.7471
                  Claude 3 Haiku (claude-3-haiku-20240307)         0.3209
                  Claude 3.5 Sonnet (claude-3-5-sonnet-20240620)   0.7216
SIG-FIN UFO-2024  GPT-4o (gpt-4o-2024-05-13)                       0.5750
                  GPT-4o-mini (gpt-4o-mini-2024-07-18)             0.3957

4. Dataset
4.1. Securities Reports Used in Our Dataset
The NTCIR-18 U4 task focuses on analyzing securities reports from companies in the TOPIX100 index. The dataset consists of securities reports from companies that are part of the TOPIX100, submitted between April 1, 2020, and March 31, 2021.

The SIG-FIN UFO-2024 task, on the other hand, focuses on analyzing securities reports from companies in the TOPIX500 index. The annual securities reports used in this task are drawn from the TOPIX500, which represents publicly listed companies with high market capitalization and liquidity. For this task, we target the annual securities reports of 497 companies⁴ constituting the TOPIX500 as of April 30, 2024.
⁴ Despite the name TOPIX500, as of 30 April 2024, it only includes 497 companies. To be more precise, the TOPIX500 is a
The dataset includes 494 financial statements submitted to EDINET between July 1, 2023, and June 30, 2024. To account for differences in the structure of annual securities reports across industries, we ensure that the dataset is balanced by industry. The annual securities reports are distributed across the training, validation, and test sets with minimal industry bias.
Specifically, we use the ten major categories from the Tokyo Stock Exchange’s 33 industry classifications (service industry, transportation and communications, finance and insurance, construction, mining, commerce, fisheries, agriculture and forestry, manufacturing, electricity and gas, and real estate). The data is divided such that the ratio of train:validation:test is approximately 6:1:3 within each industry category. This results in 289 companies’ reports being used for training, 52 for validation, and 153 for testing.

We retrieve the financial data using the EDINET API v2, utilizing the XBRL, HTML, and CSV files available through the API. The XBRL files contain tabular data, such as taxonomies and instances, referred to as “XBRL information” below, which is also embedded in the corresponding HTML files. The CSV files, referred to below as “annual securities report CSVs,” provide a more accessible format for the XBRL data for easier handling in the study.

4.2. Question Creation
Questions are created using annual securities report CSVs and question templates. In the annual securities report CSV, each row represents data, and each column shows the corresponding XBRL information (element ID, item name, context ID, relevant year, consolidated or individual, period or point in time, unit ID, unit, value). Among this XBRL information, the element ID and context ID are crucial for data extraction. The element ID indicates what the data represents, but it is not unique within a single annual securities report. Therefore, combining the element ID with the context ID, which represents the period and dimension, enables data within a report to be uniquely identified and the desired information to be extracted. Thus, the question must include both element ID and context ID. On the basis of this, the initial version of the question is created as follows:

Question (Initial Version)
What is the value of “{Element ID}” for {Company Name} in {Context ID}?
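The role of the (element ID, context ID) pair as a unique key, and the way the initial template is filled, can be sketched as follows. This is only an illustration under stated assumptions: the column names `element_id`, `context_id`, and `value`, the placeholder rows in `SAMPLE_CSV`, and the helper names are hypothetical and do not reflect the actual layout of the annual securities report CSVs.

```python
import csv
import io

# Placeholder rows (not real report data). "ElementA" appears twice, so the
# element ID alone cannot identify a row; the context ID disambiguates it.
SAMPLE_CSV = """element_id,context_id,value
ElementA,Context1,100
ElementA,Context2,90
ElementB,Context1,7
"""


def index_rows(csv_text: str) -> dict:
    """Index CSV rows by the (element ID, context ID) pair."""
    index = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["element_id"], row["context_id"])
        if key in index:
            raise ValueError(f"key is not unique: {key}")
        index[key] = row
    return index


def initial_question(element_id: str, company_name: str, context_id: str) -> str:
    """Fill the initial-version question template from the text above."""
    return (
        f"What is the value of “{element_id}” "
        f"for {company_name} in {context_id}?"
    )


rows = index_rows(SAMPLE_CSV)
question = initial_question("ElementA", "Example Corp.", "Context2")
answer = rows[("ElementA", "Context2")]["value"]
```

Raising an error on a duplicate key makes the uniqueness assumption explicit rather than silently overwriting an earlier row.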
However, if the element ID and context ID are used as they appear in the annual securities report CSV, the question will not be meaningful in Japanese as they are simply IDs. Therefore, the context ID is represented using the relative year, consolidated or individual, and period or point in time, while the element ID is expressed as the item name. The final version of the question is defined as follows:

Question (Detailed Version)
What is the value of “{Item Name}” in the {Year} {Period or Point in Time} {Consolidated or Individual (optional)} annual securities report of {Company Name} for {Member Element (optional)}?

The explanations for each part are as follows:
• Year: Calculated on the basis of the relevant year, and the string is included in the question.
• Period or Point in Time: If it is a point in time, the word “point” is added right after the year.
• Consolidated or Individual: If it is consolidated or individual, the corresponding string is included; otherwise, “annual securities report of” is omitted.
• Member Element: If the context ID contains a member element, the string is included. This element is not translated into Japanese to ensure uniqueness and is used as-is from the annual securities report CSV (ensuring uniqueness is a future challenge).
• Item Name: This is essentially the Japanese translation of the element ID, so the string is included.

An example of a question created using the template is as follows:

Example Created with Question Template
What is the value of “Building (net amount)” in the 2020 individual annual securities report of Daiwa House Industry Co., Ltd. for NonConsolidatedMember?

⁴ (continued) stock price index composed of the TOPIX Core30, TOPIX Large70 and TOPIX Mid400, but of these, only 397 companies are included in the TOPIX Mid400.

When creating questions for the SIG-FIN UFO-2024 dataset, we also performed data sampling to avoid generating too many similar questions.
For data sampling, a unique list of item names is created for each company, and random sampling is performed so that 1/10th of the entire dataset is selected. Additionally, the number of samples per item name is adjusted on the basis of the number of data entries for each item name⁵.

As a result of these procedures, we constructed the NTCIR-18 U4 dataset consisting of 32,587 entries for the Table Retrieval task and 32,589 entries for the Table QA task, and the SIG-FIN UFO-2024 dataset consisting of 14,410 entries for the Table Retrieval task and 14,412 entries for the Table QA task, as shown in Table 3⁶.

Table 3: Breakdown of Each Dataset for Dry Run

Task                               Train    Validation  Test    Total
NTCIR-18 U4       Table Retrieval  22,982   3,131       6,474   32,587
                  Table QA         22,982   3,132       6,475   32,589
SIG-FIN UFO-2024  Table Retrieval  8,390    1,533       4,487   14,410
                  Table QA         8,390    1,534       4,488   14,412

5. Schedule
To encourage broad participation from those interested in finance, the organizers have introduced two complementary tasks: the NTCIR-18 U4 and the SIG-FIN UFO-2024. The SIG-FIN community includes researchers and practitioners actively engaged in finance, while NTCIR attracts participants interested in shared tasks, especially those with a focus on information retrieval and natural language processing.

⁵ If there is only one data entry for an item name, one sample is taken; if there are two to five, two samples are taken; if there are six or more, three samples are randomly selected. Data with submitter-specific taxonomies that do not include an item name in the annual securities report CSV, or data not from tables (i.e., data not inside HTML table tags), are excluded.
⁶ These are the breakdowns of the initial datasets used for the Dry Run. The datasets used in the Formal Run phase have been modified to fix a couple of issues, and as a result, they contain a slightly different number of entries.
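The per-item-name sample counts given in footnote 5 can be sketched in a few lines of Python. This is a hedged illustration, not the organizers' script: the function names `samples_for` and `sample_entries`, the `(item_name, entry)` input shape, and the fixed seed are assumptions, and the sketch covers only the footnote's counting rule, not the 1/10th selection of item names or the exclusion of submitter-specific taxonomies.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def samples_for(count: int) -> int:
    """Number of samples for an item name with `count` data entries."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        return 1
    if count <= 5:
        return 2
    return 3


def sample_entries(
    entries: Iterable[Tuple[Hashable, object]], seed: int = 0
) -> Dict[Hashable, List[object]]:
    """Group entries by item name and draw the footnote-5 number of samples."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    by_item = defaultdict(list)
    for item_name, entry in entries:
        by_item[item_name].append(entry)
    return {
        name: rng.sample(group, samples_for(len(group)))
        for name, group in by_item.items()
    }


picked = sample_entries([("A", i) for i in range(10)] + [("B", 0)])
```

Capping the count at three limits how many near-identical questions a frequently occurring item name can contribute.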
By running similar tasks across these two distinct communities, we aim to foster interaction among participants with diverse expertise and perspectives on finance, creating an opportunity for knowledge exchange and collaboration. We look forward to welcoming a diverse group of participants to build a comprehensive and impactful competition.

The schedule for each task is outlined in Table 4, showing the parallel timelines and key phases for both NTCIR-18 U4 and SIG-FIN UFO-2024. As illustrated in the table, both tasks share similar phases such as a dry run, formal run, and evaluation period, which will allow participants to apply and refine their approaches across both tasks seamlessly. This alignment ensures that participants will have the opportunity to benefit from complementary insights across the two tasks and fosters collaborative learning within the community.

Table 4: Timeline for NTCIR-18 U4 and SIG-FIN UFO-2024 Tasks

Task Phase                         NTCIR-18 U4 Task               SIG-FIN UFO-2024 Task
Preparation: Dataset Release       July 20, 2024                  August 15, 2024
Initial Briefing: Online Session   July 20, 2024                  -
Dry Run                            July 20–October 31, 2024       August 15–October 31, 2024
Formal Run                         November 1–December 28, 2024   November 1–December 28, 2024
Evaluation: Results Return         February 1, 2025               Mid-January, 2025
Publication: Paper Submission      May 1, 2025                    Mid-February, 2025
Presentation: Final Conference     June 10–13, 2025               Early March, 2025

5.1. NTCIR-18 U4 Task
The schedule for the NTCIR-18 U4 task is as follows: The dataset for the NTCIR-18 U4 shared task was released in July 2024, followed by an online briefing session on July 20, 2024, where participants received essential information about the task. The dry run phase ran from July 2024 to October 31, 2024, during which participants worked on the dataset and refined their methods. Any issues identified in the dataset during this phase were addressed and resolved to ensure a smooth formal run.
The formal run phase is scheduled from November 1, 2024, to December 28, 2024. Throughout the NTCIR-18 U4 task, a leaderboard will be used to provide participants with real-time feedback on their performance. Similar to the SIG-FIN UFO-2024 task, the NTCIR-18 U4 leaderboard will display a Public score on the basis of a subset of the test data during the task period, allowing participants to gauge their progress. Evaluation results and final rankings are scheduled to be returned to participants on February 1, 2025, along with a partial publication of the task overview paper summarizing key outcomes.

5.2. SIG-FIN UFO-2024 Task
The schedule for the SIG-FIN UFO-2024 task is as follows: The dataset for this shared task was released on August 15, 2024, with the dry run phase extending until October 31, 2024. During this phase, participants worked on developing their methods using the dataset, and any data issues identified during this period were addressed and corrected. The formal run phase is scheduled from November 1, 2024, to December 28, 2024.
The shared task ranking will be determined on the basis of the evaluation method used in Kaggle⁷, incorporating both Public and Private scores. Throughout the shared task, the Public score (calculated from a subset of the test data) will be displayed on the leaderboard. After the shared task concludes, the Private score (evaluated on the remaining portion of the test data) will be calculated. The final results, based on the Private score, are scheduled to be announced at the 34th SIG-FIN in March 2025.

⁷ https://www.kaggle.com/

6. Conclusion
This paper proposed a framework for two shared tasks, NTCIR-18 U4 and SIG-FIN UFO-2024, which focus on tables within annual securities reports. In these shared tasks, two sub-tasks are conducted: Table Retrieval and Table Question Answering (Table QA), which target the annual securities reports of companies belonging to the TOPIX100 or TOPIX500 indexes.
Acknowledgments
This research was supported by JSPS KAKENHI Grant Number 21H03769. We would also like to express our gratitude to everyone at the National Institute of Informatics, Japan, the NTCIR Co-chairs, the members of the SIG-FIN Research Group, and our corporate sponsor, Preferred Networks, Inc., for their valuable cooperation in planning these shared tasks.

References
[1] T. Sakai, D. W. Oard, N. Kando, Evaluating Information Retrieval and Access Tasks: NTCIR’s Legacy of Research Impact, The Information Retrieval Series, Springer Nature, 2021. doi:10.1007/978-981-15-5554-1.
[2] S. Zhang, K. Balog, Web table extraction, retrieval, and augmentation: A survey, in: ACM Transactions on Intelligent Systems and Technology (TIST), volume 11, issue 2, 2020, pp. 1–35. doi:10.1145/3372117.
[3] W. Chen, H. Wang, J. Chen, Y. Zhang, H. Wang, S. Li, X. Zhou, W. Y. Wang, TabFact: A large-scale dataset for table-based fact verification, in: 8th International Conference on Learning Representations, ICLR 2020, 2020. URL: https://openreview.net/forum?id=rkeJRhNYDH.
[4] F. Wang, K. Sun, J. Pujara, P. Szekely, M. Chen, Table-based fact verification with salience-aware learning, in: Findings of the Association for Computational Linguistics: EMNLP 2021, 2021, pp. 4025–4036. doi:10.18653/v1/2021.findings-emnlp.338.
[5] L. Chen, C. Huang, X. Zheng, J. Lin, X. Huang, TableVLM: Multi-modal pre-training for table structure recognition, in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023, pp. 2437–2449. doi:10.18653/v1/2023.acl-long.137.
[6] D. Prasad, A. Gadpal, K. Kapadni, M. Visave, K. Sultanpure, CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents, in: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2020, pp. 2439–2447. doi:10.1109/CVPRW50498.2020.00294.
[7] Z. Ma, B. Zhang, J. Zhang, J. Yu, X. Zhang, X. Zhang, S. Luo, X. Wang, J. Tang, SpreadsheetBench: Towards challenging real world spreadsheet manipulation, 2024. doi:10.48550/arXiv.2406.14991.
[8] P. Li, Y. He, D. Yashar, W. Cui, S. Ge, H. Zhang, D. Rifinski Fainman, D. Zhang, S. Chaudhuri, Table-GPT: Table fine-tuned GPT for diverse table tasks, in: Proceedings of the ACM on Management of Data, volume 2, issue 3, 2024, pp. 1–28. doi:10.1145/3654979.
[9] X. Deng, H. Sun, A. Lees, Y. Wu, C. Yu, TURL: table understanding through representation learning, in: Proceedings of the VLDB Endowment, volume 14, issue 3, 2020, pp. 307–319. doi:10.14778/3430915.3430921.
[10] T. Zhang, X. Yue, Y. Li, H. Sun, TableLlama: Towards open large generalist models for tables, in: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024, pp. 6024–6044. doi:10.18653/v1/2024.naacl-long.335.
[11] W. Lu, J. Zhang, J. Fan, Z. Fu, Y. Chen, X. Du, Large language model for table processing: A survey, 2024. doi:10.48550/arXiv.2402.05121.
[12] A. S. Sundar, L. Heck, cTBLS: Augmenting large language models with conversational tables, in: Proceedings of the 5th Workshop on NLP for Conversational AI (NLP4ConvAI 2023), 2023, pp. 59–70. doi:10.18653/v1/2023.nlp4convai-1.6.
[13] N. Jin, J. Siebert, D. Li, Q. Chen, A survey on table question answering: Recent advances, in: Knowledge Graph and Semantic Computing: Knowledge Graph Empowers the Digital Economy, Springer Nature Singapore, 2022, pp. 174–186. doi:10.1007/978-981-19-7596-7_14.
[14] Z. Chen, W. Chen, C. Smiley, S. Shah, I. Borova, D. Langdon, R. Moussa, M. Beane, T.-H. Huang, B. Routledge, W. Y. Wang, FinQA: A dataset of numerical reasoning over financial data, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021, pp. 3697–3711. doi:10.18653/v1/2021.emnlp-main.300.
[15] F. Zhu, W. Lei, Y. Huang, C. Wang, S. Zhang, J. Lv, F. Feng, T.-S. Chua, TAT-QA: A question answering benchmark on a hybrid of tabular and textual content in finance, in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021, pp. 3277–3287. doi:10.18653/v1/2021.acl-long.254.
[16] N. Romanus Myrberg, S. Danielsson, Question-Answering in the Financial Domain, Master’s thesis, Department of Computer Science, Lund University, 2023. URL: http://lup.lub.lu.se/student-papers/record/9126226.
[17] F. Pan, M. Canim, M. Glass, A. Gliozzo, P. Fox, CLTR: An end-to-end, transformer-based system for cell-level table retrieval and table question answering, in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, 2021, pp. 202–209. doi:10.18653/v1/2021.acl-demo.24.
[18] Y. Kimura, T. Kondo, K. Kadowaki, M. Kato, UFO: Proposal for an information extraction task for tables in annual securities reports (in Japanese), in: JSAI Technical Report, Type 2 SIG, volume FIN-029, The Japanese Society for Artificial Intelligence, 2022, pp. 32–38. doi:10.11517/jsaisigtwo.2022.FIN-029_32.
[19] K. Kadowaki, Y. Kimura, M. Kato, T. Kondo, H. Ototake, Toward the construction of a dataset for table structure analysis for annual securities reports (in Japanese), in: JSAI Technical Report, Type 2 SIG, volume FIN-030, The Japanese Society for Artificial Intelligence, 2023, pp. 100–105. doi:10.11517/jsaisigtwo.2023.FIN-030_100.
[20] E. Sato, Y. Kimura, Creating a question-answering dataset for securities reports and evaluation of the method using LLM (in Japanese), in: IEICE Technical Report, volume 124, no. 173, The Institute of Electronics, Information and Communication Engineers, 2024, pp. 93–98. URL: https://www.ieice.org/publications/search/summary.php?id=132450&tbl=ken&lang=jp.
[21] E. Sato, Y. Kaji, Y. Kimura, Analysis of tabular data contained in the TOPIX100 annual securities report (in Japanese), The 21st Forum on Information Technology (FIT2022) E-021 (2022). URL: https://www.ieice.org/publications/conferences/summary.php?id=FIT0000015362&ConfCd=F&conf_type=F&year=2022.
[22] K. Okuyama, Y. Kimura, Analysis of machine-unreadable table structures in securities reports (in Japanese), The 30th Annual Meeting of the Association for Natural Language Processing (NLP2024) P3-20 (2024). URL: https://www.anlp.jp/proceedings/annual_meeting/2024/pdf_dir/P3-20.pdf.