Results of SemTab 2021⋆

Vincenzo Cutrona1, Jiaoyan Chen2, Vasilis Efthymiou3, Oktie Hassanzadeh4, Ernesto Jiménez-Ruiz5,6, Juan Sequeda7, Kavitha Srinivas4, Nora Abdelmageed8, Madelon Hulsebos9, Daniela Oliveira10, and Catia Pesquita10

1 SUPSI, Switzerland. vincenzo.cutrona@supsi.ch
2 University of Oxford, UK. jiaoyan.chen@cs.ox.ac.uk
3 FORTH-ICS, Greece. vefthym@ics.forth.gr
4 IBM Research, USA. hassanzadeh@us.ibm.com, kavitha.srinivas@ibm.com
5 City, University of London, UK. ernesto.jimenez-ruiz@city.ac.uk
6 SIRIUS, University of Oslo, Norway. ernestoj@uio.no
7 data.world, US. juan@data.world
8 University of Jena, Germany. nora.abdelmageed@uni-jena.de
9 University of Amsterdam, The Netherlands. m.hulsebos@uva.nl
10 LASIGE, Faculdade de Ciências, Universidade de Lisboa, Portugal. dpoliveira@fc.ul.pt, clpesquita@fc.ul.pt

Abstract. SemTab 2021 was the third edition of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching, successfully collocated with the 20th International Semantic Web Conference (ISWC) and the 16th Ontology Matching (OM) Workshop. SemTab provides a common framework to conduct a systematic evaluation of state-of-the-art systems.

Keywords: Tabular data · Knowledge Graphs · Matching · SemTab · Semantic Web Challenge · Semantic Table Interpretation

1 Motivation

Data in tabular format are the most frequent input to data analytics pipelines, thanks to their high storage and processing efficiency. The tabular format also allows users to represent information in a compact way, by exploiting the clear data structure defined by rows and columns. However, such a clear structure does not imply a clear understanding of the semantic structure (e.g., the relationships between columns), or of the meaning of the content (e.g., whether the data are about a specific topic). This lack of understanding hinders data analytics processes, requiring additional effort to properly understand the data first.
Gaining this semantic understanding is valuable for many applications, including data cleaning, data mining, data integration, data analysis and machine learning, and knowledge discovery. For example, the semantic understanding can help in assessing what kind of transformations are more appropriate for a dataset, or which datasets can be integrated to enable new analytics (e.g., marketing analysis) [10].

⋆ Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

In addition to their efficiency, the huge availability of tabular data on the Web makes Web tables a valuable source for data miners (e.g., open data CSV files). Adding semantic information to Web tables is useful for a wide range of applications, including web search, question answering, and knowledge base construction. Tabular data to Knowledge Graph (KG) matching is the process of clarifying the semantic meaning of a table by mapping its elements (i.e., cells, columns, rows) to semantic tags (i.e., entities, classes, properties) from KGs (e.g., Wikidata, DBpedia). The task becomes harder when table metadata (e.g., table captions, table descriptions, or column names) are missing, incomplete, or ambiguous. The tabular data to KG matching process is typically broken down into the following tasks: (i) cell to KG entity matching (CEA task), (ii) column to KG class matching (CTA task), and (iii) column pair to KG property matching (CPA task). Over the last decade, several approaches have made advances in addressing one or more of the above tasks, also constructing benchmark datasets ([18, 22, 17, 11]). The creation of SemTab1 [15, 16] aimed at putting this significant amount of work into a common framework, enabling the systematic evaluation of state-of-the-art systems.
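To make the three tasks concrete, the toy example below sketches how a two-column table about cities could be annotated. The Wikidata identifiers are real, but the data structures and helper function are our own illustration and do not reflect the challenge's actual submission format.

```python
# Illustrative sketch of the CEA, CTA, and CPA tasks over a toy table.
# The Wikidata IDs are real; the structures below are hypothetical and
# do NOT correspond to the official SemTab submission format.

table = [
    ["Rome", "Italy"],
    ["Paris", "France"],
]

# CEA: cell (row, col) -> KG entity
cea = {
    (0, 0): "Q220",   # Rome
    (0, 1): "Q38",    # Italy
    (1, 0): "Q90",    # Paris
    (1, 1): "Q142",   # France
}

# CTA: column index -> KG class
cta = {0: "Q515", 1: "Q6256"}  # city, country

# CPA: (subject column, object column) -> KG property
cpa = {(0, 1): "P17"}  # P17 = "country"

def annotate_cell(row, col):
    """Return the entity annotation for a cell, or None if unannotated."""
    return cea.get((row, col))

print(annotate_cell(1, 1))  # Q142
```

A system solving all three tasks would produce one such annotation per target cell, column, or column pair.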
The ambition is to make SemTab the reference challenge in the Semantic Web community, in the same way the OAEI2 is for the Ontology Matching community.3

2 The Challenge

The SemTab 2021 challenge has been organised into 3 different tracks: the Accuracy Track, which is the standard track proposed in previous editions; the Usability Track, a new track addressing the lack of publicly available, easy-to-use, and generic solutions; and the Applications Track, which focuses on applications in real-world settings where the output of matching systems can contribute. The Applications Track was also open to the submission of novel benchmark datasets.

2.1 Accuracy Track

The Accuracy Track included 3 rounds, running from June 30 to October 15. Different target KGs were used across rounds (see Table 1):
– DBpedia [3]: http://downloads.dbpedia.org/wiki-archive/ (version 2016-10)
– Wikidata [24]: https://zenodo.org/record/6153449
– Schema.org [12]: https://gittables.github.io/downloads/schema_20210528.pkl

The different rounds of SemTab 2021 have been organised to evaluate participating systems on different datasets with variable difficulty. All the rounds were run with the support of AIcrowd;4 SemTab 2021 also used the STILTool system [8, 5] for getting additional insights about the submitted solutions.

1 http://www.cs.ox.ac.uk/isg/challenges/sem-tab/
2 http://oaei.ontologymatching.org/
3 http://ontologymatching.org/
4 https://www.aicrowd.com/

Table 1: Datasets used across SemTab 2021 rounds.

                 Rounds     | Tasks        | Target KGs
                 R1 R2 R3   | CTA CPA CEA  | DBpedia Wikidata Schema.org
2T [9]           ✓          | ✓       ✓    | ✓       ✓
BioTable [20]       ✓       | ✓   ✓   ✓    |         ✓
AG [15]             ✓  ✓    | ✓   ✓   ✓    |         ✓
BiodivTab [2]          ✓    | ✓       ✓    |         ✓
GitTables [13]         ✓    | ✓            | ✓                ✓

Datasets The different datasets used to run SemTab 2021 rounds are reported in Table 1, with some statistics available in Table 2.
All the datasets are available in Zenodo:
– Tough Tables (2T): a dataset featuring high-quality, manually-curated tables with non-obviously linkable cells, i.e., cells whose values include ambiguous names, typos, and misspelled entity names. These challenges are particularly relevant for the annotation of structured legacy sources to existing KGs. Link: https://doi.org/10.5281/zenodo.6211551
– BioTable: a dataset focused on molecular biology data covering different entities. It has the largest number of rows per table in the challenge. Link: https://doi.org/10.5281/zenodo.5606585
– Automatically Generated (AG):5 a synthetic dataset with tables generated automatically by means of SPARQL queries. AG is the largest dataset used in SemTab. Link: https://zenodo.org/record/6154708
– BiodivTab: a dataset with tables from real-world biodiversity research datasets. The original tables have been adapted for the SemTab challenge. Link: https://doi.org/10.5281/zenodo.5584180
– GitTables: a large-scale corpus of relational tables extracted from CSV files in GitHub. The main purpose of this dataset is to facilitate learning table representation models and applications in, e.g., data management. A subset of tables has been curated for benchmarking column type detection methods in SemTab. Link: https://doi.org/10.5281/zenodo.5706316

Table 3 shows the participation per round. Compared with previous editions, we had 11 participants (vs 28 in 2020) submitting to at least one round.6 We identified 6 core participants (vs 8 in 2020), which completed ∼14 tasks on average (out of 17 tasks). Seven participants submitted a system paper to the challenge: MTab [19], MAGIC [23], MantisTable V [4], JenTab [1], GBMTab [26], Kepler-aSI [6], and DAGOBAH [14].
Evaluation measures As in previous editions, systems have been evaluated on a single annotation for each provided target, for all the tasks; i.e., in CEA, target cells are to be annotated with a single entity from the target KG; in CTA, target columns are to be annotated with a single type from the target KG (as fine-grained as possible).

5 In SemTab 2021, also referred to as Hard Tables.
6 The AIcrowd leaderboard scores 23 participants because of test submissions.

Table 2: Statistics of the datasets in each SemTab 2021 round. For target values: W=Wikidata; D=DBpedia; S=Schema.org.

                           AG (R2)  AG (R3)  2T (R1)                 BioTable (R2)  BiodivTab (R3)  GitTables (R3)
Tables #                   1,750    7,207    180                     110            50              1,101
Avg. Rows # (total)        16.73    8.18     1,080.21                2,449.08       259.06          58.20
Avg. Cols # (total)        3.19     2.48     4.46                    5.97           23.96           15.87
Avg. Rows # (target CEA)   16.73 W  8.18 W   1,080.19 D; 1,080.21 W  2,449.08 W     258.28 W        –
Avg. Cols # (target CEA)   1.65 W   1.00 W   3.00 D; 3.00 W          5.97 W         13.60 W         –
Avg. Cols # (target CTA)   1.25 W   1.00 W   3.00 D; 3.00 W          5.97 W         12.28 W         3.08 D; 2.62 S
Avg. Cols # (target CPA)   3.19 W   2.48 W   –                       5.97 W         –               –

Table 3: Participation in the SemTab 2021 challenge.

        2T (R1)    BioTable (R2)  AG (R2)  AG (R3)  BiodivTab (R3)  GitTables (R3)
CEA     5 D; 7 W   6              6        5        5               –
CTA     3 D; 7 W   7              6        6        6               4 D; 2 S
CPA     –          6              6        5        –               –
Total   11         7              6        6        6               4

The evaluation measures for CEA, CPA, and CTA (DBpedia and Schema.org) are the standard Precision, Recall, and F1-score, as defined in Equation 1:

    P = |Correct Annotations| / |System Annotations|
    R = |Correct Annotations| / |Target Annotations|
    F1 = (2 × P × R) / (P + R)                                          (1)

where target annotations refer to the target cells for CEA, the target columns for CTA, and the target column pairs for CPA. We consider an annotation as correct when it is included within the ground truth set (a target cell usually has multiple annotations in the ground truth, because of redirect and same-as links in KGs).
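As an illustrative sketch, Equation 1 can be computed as follows for the CEA task. The function and sample data below are our own (hypothetical targets, one entity ID standing in for a redirect), not the official AIcrowd evaluator.

```python
# Sketch of the Equation 1 metrics for the CEA task; the function and the
# sample data are illustrative, not the official AIcrowd evaluator.

def evaluate(system_annotations, ground_truth):
    """system_annotations: {target: annotation} (one annotation per target);
    ground_truth: {target: set of accepted annotations} (a target may accept
    several entities, e.g. via redirect or same-as links)."""
    correct = sum(
        1 for target, ann in system_annotations.items()
        if ann in ground_truth.get(target, set())
    )
    p = correct / len(system_annotations) if system_annotations else 0.0
    r = correct / len(ground_truth) if ground_truth else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Hypothetical targets: (table, row, column) -> accepted Wikidata entities.
gt = {
    ("tab1", 0, 0): {"Q220"},             # Rome
    ("tab1", 1, 0): {"Q90", "Q1234567"},  # Paris (second ID: a hypothetical redirect)
}
submitted = {("tab1", 0, 0): "Q220"}      # only one of the two targets annotated

p, r, f1 = evaluate(submitted, gt)
print(round(p, 2), round(r, 2), round(f1, 2))  # 1.0 0.5 0.67
```

Annotating only a subset of the targets can thus yield perfect Precision but low Recall, which is why the rankings are based on the F1-score.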
Given the fine-grained type hierarchy in Wikidata, we adopted approximations of Precision and Recall in the CTA evaluation. The approximations adapt their numerators to consider partially correct annotations, i.e., annotations that are ancestors or descendants of the ground truth (GT) classes. The correctness score cscore of a CTA annotation α considers the distance between the annotation and the GT classes in the type hierarchy, and it is defined as

    cscore(α) = 0.8^d(α)  if α is in the GT, or an ancestor of the GT, with d(α) ≤ 5
                0.7^d(α)  if α is a descendant of the GT, with d(α) ≤ 3        (2)
                0         otherwise

where d(α) is the shortest distance to one of the GT classes (as for CEA, CTA GT columns may also have multiple classes). For example, d(α) = 0 if α is a class in the ground truth (cscore(α) = 1), and d(α) = 2 if α is a grandchild of a class in the ground truth (cscore(α) = 0.49). Types in the higher level(s) of the KG type hierarchy are not considered in the GT (e.g., Q35120 [entity] in Wikidata).

Given the correctness score cscore, the approximated Precision (AP), Recall (AR), and F1-score (AF1) for the CTA evaluation are as follows:

    AP = Σ cscore(α) / |System Annotations|
    AR = Σ cscore(α) / |Target Annotations|
    AF1 = (2 × AP × AR) / (AP + AR)                                            (3)

Table 4: Average F1-scores considering the 11 participating systems. We included MTab results for 2T from SemTab 2020.

        2T (R1)         BioTable (R2)  AG (R2)  AG (R3)  BiodivTab (R3)  GitTables (R3)
CEA     0.51 D; 0.52 W  0.82           0.91     0.90     0.41            –
CTA     0.35 D; 0.53 W  0.78           0.91     0.80     0.23            0.04 D; 0.19 S
CPA     –               0.88           0.96     0.95     –               –

Results Table 4 contains the average F1-score achieved by the 11 participating systems. The Tough Tables dataset still represents a challenge for almost all the systems, especially considering that the dataset is the same as in SemTab 2020. The BiodivTab and GitTables datasets brought additional complexity in Round 3, highlighting that real-world tables are challenging.

CEA task.
Results for the CEA task are reported in Figure 1 for all the datasets. Round 1 used the same 2T tables as last year's edition,7 raising the difficulty bar at the very beginning. Most of the systems faced important challenges when dealing with 2T tables, with only 2 systems managing to achieve an F1-score over 0.8 and several of them participating in only one of the tasks. It is worth noting the work of the DAGOBAH team, which improved their system over the last year and achieved higher scores on 2T this year. Starting from Round 2, systems have been evaluated on datasets never seen before. The AG datasets aimed at bringing new challenges in each round, and we can observe that only the best systems managed to maintain almost the same score on the two different versions of this dataset. Concerning bio-related datasets, performance in Round 2 was positive (slightly below 0.9 on average), confirming that tables with many rows (∼2,500 on average) do not represent a problem for most of the systems. Instead, the complexity brought by the (relatively small) tables in the BiodivTab dataset represented a new problem to solve, with significantly reduced performance (none of the systems scored over 0.6). The JenTab system ranked 1st on a very difficult dataset. It is worth noting, however, that members of the JenTab team are also the providers of the BiodivTab dataset.

7 The Wikidata targets have been updated to the current Wikidata live version.

Fig. 1: Results in the CEA task for the core participants (F1-score per dataset: 2T-DBP and 2T-WD in R1; AG and BIO in R2; AG and BIODIV in R3). MTab results on 2T are from 2020.

Fig. 2: Results in the CTA task for the core participants (F1-score per dataset: 2T-DBP and 2T-WD in R1; AG and BIO in R2; AG, BIODIV, GIT-DBP, and GIT-SCH in R3).

CTA task.
As shown in Figure 2, the results in the CTA task resemble the trend already seen in the CEA results. This is an indicator that most of the systems solve the CTA task based on annotations found in the CEA. Additional challenges have been included in Round 3 with the GitTables dataset, where we can see a critical performance drop for all the involved systems. It is worth emphasising that, given the general picture provided by the results in CTA, more research is needed to make existing systems able to deal with real-world tables, where cells may be missing a correspondence to the target KG.

CPA task. Results for the CPA task are plotted in Figure 3. Currently, only the BioTable and AG datasets provide a GT for CPA. Results are overall positive for all the tasks, with a general improvement from Round 2 to Round 3 for all the involved systems, except for MAGIC, whose performance dropped slightly during the last round.

Fig. 3: Results in the CPA task for the core participants (F1-score per dataset: AG and BIO in R2; AG in R3).

2.2 Usability Track

Starting from SemTab 2021, the organisation committee agreed to include a new track focusing on system usability. The main goal of this track is to mitigate a pain point in the community: the lack of publicly available, easy-to-use, and generic solutions that address the needs of a variety of applications and settings.

Evaluation measures Thoroughly evaluating the usability of a system requires user studies to monitor different parameters [21]. Within the SemTab scope, we decided to simply verify the overall usability of tools as judged by a review panel. Participants' solutions were examined for the following criteria:
– Open source: open-source solutions make a great contribution to the community, especially when released with a permissive license.
Publicly available resources can be used as a starting point for new tools or research investigations, and make experiments easily reproducible.
– System dependencies: some tools may require specific platforms to be executed on premises, or have a resource consumption so high that it may affect their use in common settings. For example, requiring many indexes/databases may prevent the usage of a tool by users with limited access to hardware.
– Model generality: a tool may be considered general when it applies to different (and new) applications/domains, requiring near-zero adaptations; for example, tools employing machine learning techniques should not require extensive training and tuning to be adapted to different contexts.
– Availability: tools may not be released as open source, but offered as publicly available services. In this case, a tool served as a public service supports further research activities, and represents a big contribution to the community.
– User experience: the purpose of a tool is to help people in solving a task; for this reason, semantic table to graph matching tools should come with a well-designed user interface that makes the tool usable also by practitioners with limited experience in semantic matching. That is, the tool should not require extensive training to be mastered.

Table 5: Usability evaluation details.

                Open source   Availability as a Service   User Experience (GUI)
MTab                          ✓                           ✓
MAGIC           ✓                                         ✓
DAGOBAH                                                   ✓
MantisTable V                 ✓                           ✓
JenTab          ✓
Kepler-aSI

Results Almost all the core participants obtained good results in this track, performing well on one or more of the above evaluation criteria. Evaluation details are reported in Table 5. We exclude system dependencies and model generality because of the insufficient available evidence, which resulted in these two criteria not strongly impacting the overall assessment.
Indeed, the available data about system performance (i.e., accuracy) with reference to the different datasets and target KGs used in the SemTab rounds do not allow us to draw any consistent conclusions. For example, it is not clear if tools were customized or tweaked (e.g., changing the lookup function for noisy data) to increase their accuracy in different rounds; we are not able to assess how easily a system adapts to a different context (e.g., changing the target KG).

The evaluation panel concluded that most of the tools are pre-configured and can potentially be used out of the box: for example, JenTab has been packaged in Docker containers to ease the deployment and execution of the tool on local premises. In general, tool requirements vary in complexity, but they are reasonable overall (e.g., the required pre-processing, like creating new indexes or embeddings).

Considering the other criteria, JenTab is the only system released as open source under a permissive license (Apache 2.0). The MTab tool has been made publicly available as a Web service, free to use (MIT license), but the back-end application has not been disclosed. However, having a public API enables MTab to serve third-party applications (with no rate limit), and this was a key point in declaring MTab the most usable tool. Systems like DAGOBAH and MantisTable delivered frameworks with impressive GUIs, while others (e.g., MAGIC) opted for a lightweight application.

2.3 Applications Track

This new track aims at addressing applications in real-world settings that take advantage of the output of the matching systems. Challenging dataset proposals have also been accepted and included within the SemTab 2021 rounds.

Results A specific application has been identified within the biological domain, where new data are constantly produced thanks to the advances in the field. The domain is particularly challenging from the semantics standpoint because of the complexity of the biological relations between entities.
Within SemTab, the data representation significantly impacts the systems' performance, since entities are usually represented by codes (e.g., chemical formulas or gene names). Two different datasets related to the biological domain have been submitted: the first one, BioTable, is a dataset focused on molecular biology data; the second, BiodivTab, is a dataset focused on biodiversity research data and data augmentation. Alongside the above domain, a different dataset has been submitted to this track and also included in Round 3: GitTables. This dataset includes relational tables extracted from CSV files hosted on GitHub, and it comes with a peculiarity: the GT for CTA uses a mixture of classes and properties to annotate columns (both for the DBpedia and Schema.org versions). The three datasets brought new complexity and contributed to increasing the data diversity among the SemTab benchmark datasets.

2.4 Prizes

As in previous editions, IBM Research8 sponsored SemTab 2021 and awarded the best systems in each track with the following prizes:
– Accuracy Track: DAGOBAH (1st prize) was the top system in most of the tasks, showing appreciable improvements over the last years. Honorary mention to MTab.
– Usability Track: MTab team (1st prize), for providing the easy-to-use MTab tool9 along with Web services to look up entities and annotate tables; JenTab (2nd prize), for being the only open-source system with a permissive license. Honorary mentions to DAGOBAH, MAGIC, and MantisTable.
– Applications Track: BiodivTab dataset (1st prize), for having brought new challenges in the CEA and CTA tasks. Honorary mention to GitTables.

3 Lessons Learned and Future Work

Avoiding over-fitting to AG. We have been using the same automated dataset generation process, with some variations that make it more challenging, since the first SemTab challenge. This may result in participating systems that explicitly target datasets with characteristics similar to those of the AG datasets.
This becomes evident from the almost perfect results shown in Table 4. For that reason, this year we have introduced several new datasets, and we are also planning to use real rather than synthetic data as much as possible in future versions of the challenge.

System generalizability beyond KGs. Many systems currently rely on matching table values to entities in KGs. In this version of SemTab, we challenged the participating systems on their ability to detect the semantic types of table columns even when their values are not linkable to KG entities. We conclude that most systems do not generalize well in this scenario, as indicated by the performance drop on the CTA task for GitTables (see Section 2.1). Improving systems to this end would make them useful for expanding KG coverage by matching tables from novel data sources to KGs in order to populate the

8 https://www.research.ibm.com/
9 https://github.com/phucty/mtab_tool
Therefore, GitTables introduced a new technical challenge, which po- tentially contributed to the complexity observed from the results in Figure 2. The case of GitTables may result in a new task to accomplish in the future, given that it en- ables table-to-KG matching with tables from alternative data sources and contexts (e.g., database dumps from industry). Usability track. We believe that the introduction of the usability track has contributed to making participating systems publicly accessible. Our goal was exactly to encourage this, despite the competitive nature that a challenge may have. Thus, we consider this new track to be a very important one and we are planning to keep it in the next chal- lenges. Next SemTab editions may consider to improve the evaluation of this track, for example by adopting the System Usability Scale (SUS) [7] to score the overall user experience. In particular, developing a systematic way to evaluate systems’ generality and dependencies would definitely improve the evaluation of this track. Applications track. We believe that the call of the application track has grasped more attention from the community by introducing their own datasets. Contributions from the community like BiodivTab, BioTable and GitTables help in extending the SemTab benchmark with new real-world challenges that are hard to reproduce in synthetic datasets as AG. Thus, this new track has been an important addition to SemTab. Acknowledgements We would like to thank the challenge participants, the ISWC & OM organisers, the AIcrowd team, and our sponsor IBM Research that played a key role in the success of SemTab. We also thank Paul Groth and Çağatay Demiralp for their contributions to GitTables. Moreover, we would like to thank Sirko Schindler and Birgitta König-Ries for their contribution to BiodivTab. 
This work was also supported by the SIRIUS Centre for Scalable Data Access (Research Council of Norway), Samsung Research UK, the EPSRC projects UK FIRES and ConCur, and the HFRI project ResponsibleER (No 969). DO and CP were supported by FCT through LASIGE (UIDB/00408/2020 and UIDP/00408/2020). We would also like to acknowledge that the work of the challenge organisers was greatly simplified by using the EasyChair conference management system and the CEUR-WS.org open-access publication service.

References

1. N. Abdelmageed and S. Schindler. JenTab Meets SemTab 2021's New Challenges. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab). CEUR-WS.org, 2021.
2. N. Abdelmageed, S. Schindler, and B. König-Ries. BiodivTab: A Tabular Benchmark based on Biodiversity Research Data. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab). CEUR-WS.org, 2021.
3. S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives. DBpedia: A Nucleus for a Web of Open Data. In The Semantic Web, pages 722–735. Springer Berlin Heidelberg, 2007.
4. R. Avogadro and M. Cremaschi. MantisTable V: A novel and efficient approach to Semantic Table Interpretation. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab). CEUR-WS.org, 2021.
5. R. Avogadro, M. Cremaschi, E. Jiménez-Ruiz, and A. Rula. A Framework for Quality Assessment of Semantic Annotations of Tabular Data. In 20th International Semantic Web Conference (ISWC), pages 528–545, 2021.
6. W. Baazouzi, M. Kachroudi, and S. Faiz. Kepler-aSI at SemTab 2021. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab). CEUR-WS.org, 2021.
7. J. Brooke. SUS: a 'quick and dirty' usability scale. Usability evaluation in industry, 189(3), 1996.
8. M. Cremaschi, A. Siano, R. Avogadro, E. Jiménez-Ruiz, and A. Maurino. STILTool: A Semantic Table Interpretation evaLuation Tool. In ESWC 2020 Satellite Events, pages 61–66, 2020.
9. V.
Cutrona, F. Bianchi, E. Jiménez-Ruiz, and M. Palmonari. Tough Tables: Carefully Evaluating Entity Linking for Tabular Data. In 19th International Semantic Web Conference (ISWC), pages 328–343, 2020.
10. V. Cutrona, F. D. Paoli, A. Košmerlj, N. Nikolov, M. Palmonari, F. Perales, and D. Roman. Semantically-Enabled Optimization of Digital Marketing Campaigns. In International Semantic Web Conference (ISWC), pages 345–362. Springer, 2019.
11. V. Efthymiou, O. Hassanzadeh, M. Rodriguez-Muro, and V. Christophides. Matching Web Tables with Knowledge Base Entities: From Entity Lookups to Entity Embeddings. In ISWC, volume 10587, pages 260–277. Springer, 2017.
12. R. V. Guha, D. Brickley, and S. Macbeth. Schema.Org: Evolution of Structured Data on the Web. Commun. ACM, 59(2):44–51, Jan. 2016.
13. M. Hulsebos, Ç. Demiralp, and P. Groth. GitTables: A Large-Scale Corpus of Relational Tables. CoRR, abs/2106.07258, 2021.
14. V.-P. Huynh, J. Liu, Y. Chabot, F. Deuzé, T. Labbé, P. Monnin, and R. Troncy. DAGOBAH: Table and Graph Contexts For Efficient Semantic Annotation Of Tabular Data. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab). CEUR-WS.org, 2021.
15. E. Jiménez-Ruiz, O. Hassanzadeh, V. Efthymiou, J. Chen, and K. Srinivas. SemTab 2019: Resources to Benchmark Tabular Data to Knowledge Graph Matching Systems. In The Semantic Web: ESWC. Springer International Publishing, 2020.
16. E. Jiménez-Ruiz, O. Hassanzadeh, V. Efthymiou, J. Chen, K. Srinivas, and V. Cutrona. Results of SemTab 2020. In Proceedings of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching co-located with the 19th International Semantic Web Conference (ISWC 2020), pages 1–8, 2020.
17. O. Lehmberg, D. Ritze, R. Meusel, and C. Bizer. A large public corpus of web tables containing time and context metadata. In WWW, 2016.
18. G. Limaye, S. Sarawagi, and S. Chakrabarti. Annotating and searching web tables using entities, types and relationships.
VLDB Endowment, 3(1-2):1338–1347, 2010.
19. P. Nguyen, I. Yamada, N. Kertkeidkachorn, R. Ichise, and H. Takeda. SemTab 2021: Tabular Data Annotation with MTab Tool. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab). CEUR-WS.org, 2021.
20. D. Oliveira and C. Pesquita. SemTab 2021 BioTable Dataset. doi:10.5281/zenodo.5606585, Oct. 2021.
21. C. Pesquita, V. Ivanova, S. Lohmann, and P. Lambrix. A framework to conduct and report on empirical user studies in semantic web contexts. In European Knowledge Acquisition Workshop, pages 567–583. Springer, 2018.
22. D. Ritze, O. Lehmberg, and C. Bizer. Matching HTML Tables to DBpedia. In Proceedings of the 5th International Conference on Web Intelligence, Mining and Semantics, WIMS, pages 10:1–10:6. ACM, 2015.
23. B. Steenwinckel, F. D. Turck, and F. Ongenae. MAGIC: Mining an Augmented Graph using INK, starting from a CSV. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab). CEUR-WS.org, 2021.
24. D. Vrandečić and M. Krötzsch. Wikidata: a free collaborative knowledge base. Commun. ACM, 57(10):78–85, 2014.
25. G. Weikum. Knowledge Graphs 2021: A Data Odyssey. Proc. VLDB Endow., 14(12):3233–3238, 2021.
26. L. Yang, S. Shen, J. Ding, and J. Jin. GBMTab: A Graph-Based Method for Interpreting Semantic Table to Knowledge Graph. In Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab). CEUR-WS.org, 2021.