
Qanary Builder: Addressing the Reproducibility Crisis in Question Answering over Knowledge Graphs

Aleksandr Perevalov (0, 1), Andreas Both (1, 2), Florian Gudat, Paul Bräuning, Johannes Meesters, Lennart Gründel, Marie-Susann Bachmann, Salem Zin Iden Naser

0 DICE Group, University of Paderborn, Warburger Str. 100, 33098 Paderborn, Germany
1 Leipzig University of Applied Sciences, Karl-Liebknecht-Straße 132, 04277 Leipzig, Germany
2 Technology Innovation Unit, DATEV eG, Nuremberg, Germany

This paper discusses the challenge of reproducibility in the field of Question Answering over Knowledge Graphs (KGQA). To address this challenge, the Qanary Builder has been developed as a tool to facilitate the creation and evaluation of component-based KGQA systems. The Qanary Builder is a full-stack Web application that enables no-code development of KGQA systems by configuring them from pre-defined components and by providing evaluation functionality. Based on the Qanary Framework, it provides visual insights and instant explainability of a KGQA process through semantic annotations. We present the effectiveness of the Qanary Builder in addressing the reproducibility crisis and demonstrate how the tool can improve the efficiency of KGQA system development and evaluation.

Keywords: Qanary Builder, Qanary Framework, Question Answering, Evaluation, Reproducibility

1. Introduction

questions. It acts as an orchestrator of different pre-defined components that can be combined in a KGQA system. The SAs are used to persist the outputs of all components of a Qanary-based system; hence, each KGQA process can be traced by following the SAs. In this regard, the KGQA systems and their components may store their confidence score, execution time, identified resources, and other information in SAs. In a nutshell, the Qanary Builder provides its users (researchers) with a full-cycle development process for KGQA systems by letting them interactively (re)configure systems from pre-defined components and by providing built-in evaluation functionality, all without writing code. In this demo paper, we present the aforementioned features of the Qanary Builder and describe how it addresses the reproducibility crisis and enables more efficient and reliable research in this field.
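To illustrate how a KGQA process can be traced by following the SAs, the following minimal Python sketch models an annotation as a plain record and orders the records by creation. The class `Annotation`, its field names, and the component names in the sample data are hypothetical simplifications introduced for illustration only; they are not the RDF vocabulary actually used by Qanary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified stand-in for one semantic annotation (SA)."""
    component: str                      # component that created the SA
    created_order: int                  # stand-in for a creation timestamp
    resource: Optional[str]             # identified resource, if any
    confidence: Optional[float]         # confidence score, if stored
    execution_time_ms: Optional[float]  # execution time, if stored


def trace(annotations):
    """Return the annotations ordered by creation, i.e. the process trace."""
    return sorted(annotations, key=lambda a: a.created_order)


def time_per_component(annotations):
    """Sum the stored execution time per component, in trace order."""
    totals = {}
    for a in trace(annotations):
        totals[a.component] = totals.get(a.component, 0.0) + (a.execution_time_ms or 0.0)
    return totals


# Made-up sample data: two components, stored out of order.
sample = [
    Annotation("QueryBuilder", 2, None, 0.7, 120.0),
    Annotation("NER", 1, "http://dbpedia.org/resource/Leipzig", 0.9, 35.0),
]
print([a.component for a in trace(sample)])
print(time_per_component(sample))
```

Because every component writes its output as a record of this kind, the order and the cost of each step can be recovered after the fact without instrumenting the components themselves.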

2. Related Work

Reproducibility is a general challenge in many research communities. In the KGQA field in particular, a researcher may not be able to reproduce results presented a few years ago, or even the most recent ones [1]. Therefore, a number of solutions have been proposed to address this problem. The authors of [2, 3] introduce Qanary as a knowledge-based methodology for orchestrating component-based KGQA systems distributed over the Web. It employs its own RDF ontology (based on the Web Annotation Data Model) as an exchange format (Semantic Annotations, SAs) for components, so that KGQA systems can be built in a more flexible and standardized way. GERBIL [4] has been introduced as an evaluation framework for semantic entity annotation and KGQA (cf. GERBIL-QA [5]). This framework generates data in a machine-readable format and provides persistent URIs for each experiment, ensuring the reproducibility and archiving of the corresponding evaluation results. Furthermore, there have been several initiatives to provide standardized benchmarks [6] and leaderboards [1] for different KGQA tasks.

3. Qanary Builder’s Use Cases

The use cases that demonstrate the effectiveness of Qanary Builder in addressing the reproducibility crisis are: (1) researchers may create their own KGQA system from available Qanary components and evaluate it on a provided dataset; (2) researchers may take existing KGQA systems, each represented as a single Qanary component, run the evaluation, and compare the obtained results. Neither use case requires any coding because everything is pre-defined; this standardizes the evaluation process and decreases the chance of mistakes along the way. For a better understanding, we provide a video (https://drive.google.com/file/d/10DT9UfgjFUObbhE6fsbT4EcjxRahl2Yc/view) that covers Qanary Builder’s use cases and encourage readers to test the application online (live demo: https://builder.qanary.net/, login: “iswc2023”, password: “dem0”).

Figure 1 presents the designer module of Qanary Builder. The designer enables users to manage the available Qanary-based systems and the corresponding configurations, i.e., a sequential order of components that forms the process of a KGQA system. The workspace of the designer allows a user to select components, try single questions to test the functionality, and see the answer as well as the SAs created by each component during the KGQA process. Thus, the designer contributes to both the first and the second use case. Instant explainability is provided through the SAs viewer (Element 6 of Figure 1), which enables users to directly observe what a particular KGQA component has identified.

The datasets’ manager is responsible for managing custom datasets that are later used for the evaluation. The accepted data format is a .csv file that contains two fields: “question” and “answer”. An “answer” may be represented in different forms: a textual answer, a SPARQL query, a named entity’s URI, and many more. Hence, the datasets’ manager is a crucial component for establishing a reproducible evaluation process in both the first and the second use case.

The tester facilitates evaluation runs given a specified configuration and a dataset. Each run contains information on the run time, configuration, dataset, and a question-wise accuracy score. The tester takes a dataset created with the datasets’ manager and iteratively sends its questions to a KGQA system configuration defined in the designer. The result for each question appears as soon as that question has been processed. Therefore, the tester addresses both the first and the second use case as well.

4. Qanary Builder’s Technical Overview

The Qanary Builder is split into front-end and back-end subsystems. It connects to a specified Qanary KGQA system instance (Qanary itself was developed outside of this work) and monitors the currently registered components. Hence, Qanary Builder always has up-to-date information on which KGQA components can be used for configuring a system. A configured KGQA system can be directly evaluated in the Qanary Builder by selecting a specific test dataset. The test datasets, in turn, are custom and are managed by a dedicated module. The overview of the architecture of Qanary Builder is presented in Figure 2.
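To make the evaluation flow concrete (a .csv dataset with the fields “question” and “answer”, sent question by question to a configured system), the following Python sketch shows one possible shape of such a run. It is an illustration under stated assumptions and not the Qanary Builder implementation: the function names `load_dataset` and `run_test`, the callable `ask`, the exact-match scoring rule, and the sample questions and component names are all hypothetical.

```python
import csv
import io
import time


def load_dataset(csv_text):
    """Parse a dataset with the two fields "question" and "answer"."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["question"], row["answer"]) for row in reader]


def run_test(dataset, configuration, ask):
    """Send each question to the configured system and score it.

    `ask(question, configuration)` is a placeholder for the call to a KGQA
    system; exact string match is an assumed scoring rule.
    """
    started = time.perf_counter()
    scores = []
    for question, expected in dataset:
        predicted = ask(question, configuration)
        scores.append({"question": question,
                       "accuracy": 1.0 if predicted == expected else 0.0})
    mean = sum(s["accuracy"] for s in scores) / len(scores) if scores else 0.0
    return {
        "configuration": list(configuration),
        "run_time_s": time.perf_counter() - started,
        "scores": scores,
        "mean_accuracy": mean,
    }


# Made-up two-question dataset in the accepted two-column format.
DATASET = """question,answer
What is the capital of Germany?,http://dbpedia.org/resource/Berlin
Who wrote Faust?,http://dbpedia.org/resource/Johann_Wolfgang_von_Goethe
"""


def fake_ask(question, configuration):
    # Stub system that always returns the same URI.
    return "http://dbpedia.org/resource/Berlin"


result = run_test(load_dataset(DATASET), ["NER", "QueryBuilder"], fake_ask)
print(result["mean_accuracy"])
```

Keeping the configuration and the per-question scores inside the run record mirrors the information that each run is described as containing, which is what makes two runs comparable later on.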

[Figure 2: Architecture of the Qanary Builder. Elements shown: generated Web pages for a user; Qanary Builder's front-end subsystem (Next.js, Axios); Qanary Builder's back-end subsystem (Spring Boot, Apache Jena, database driver) with RESTful API, database, SPARQL interface, and Question-Answering ports; metadata database (e.g., MongoDB); external Qanary System subsystem with the Qanary Pipeline, a component registration interface, an exposed SPARQL endpoint that encapsulates the triplestore (e.g., Stardog), and Qanary Components 1 (e.g., Query Builder) to N (e.g., Named Entity Recognition).]

The front-end subsystem of the Qanary Builder is a Web application written with Next.js. It contains three functional modules (designer, datasets’ manager, and tester), which were described in the previous section. Thus, it helps users manage their KGQA system configurations, datasets, and test runs. The back-end subsystem is a RESTful API written using the Spring Boot framework. It handles the logic for managing the metadata about Qanary Systems, KGQA system configurations, datasets, and test runs. This metadata is stored in MongoDB via the corresponding database driver. The back-end calls the Qanary System via its Question-Answering interface to trigger the processing of a question with a given set of components. The back-end communicates with the Qanary System’s SPARQL endpoint via the Apache Jena library, which provides a Java interface on one side and connects to the SPARQL endpoint on the other. This connection is used to fetch the SAs and present them in the front-end subsystem.
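As a rough illustration of fetching SAs over SPARQL, the sketch below builds a query that selects all annotations stored in one named graph. It is written in Python for brevity, whereas the back-end itself uses Java with Apache Jena. The function name `build_annotation_query` and the example graph URI are hypothetical, and the assumption that the SAs of one KGQA process sit in a single named graph is ours. Only the `oa:hasTarget` and `oa:hasBody` properties of the Web Annotation vocabulary are used; the sketch does not reproduce the query that Qanary Builder actually sends.

```python
import re

OA_NAMESPACE = "http://www.w3.org/ns/oa#"  # Web Annotation vocabulary

# Reject characters that are not allowed inside a SPARQL IRI reference.
_SAFE_IRI = re.compile(r'[^\s<>"{}|\\^`]+')


def build_annotation_query(graph_uri):
    """Build a SPARQL SELECT for all annotations in one named graph."""
    if not _SAFE_IRI.fullmatch(graph_uri):
        raise ValueError(f"not a usable graph URI: {graph_uri!r}")
    return "\n".join([
        f"PREFIX oa: <{OA_NAMESPACE}>",
        "SELECT ?annotation ?type ?target ?body WHERE {",
        f"  GRAPH <{graph_uri}> {{",
        "    ?annotation a ?type ;",
        "                oa:hasTarget ?target .",
        "    OPTIONAL { ?annotation oa:hasBody ?body }",
        "  }",
        "}",
    ])


# Hypothetical graph identifier for one processed question.
print(build_annotation_query("urn:graph:example-question"))
```

Validating the graph URI before interpolating it keeps a malformed or malicious identifier from changing the structure of the query; with Apache Jena, a parameterized query would serve the same purpose.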

5. Conclusion

In conclusion, the reproducibility crisis in the KGQA field has been a major concern for researchers. The Qanary Builder offers a solution to this challenge by allowing the creation and evaluation of component-based KGQA systems without the need for coding. With built-in development and evaluation functionality, the Qanary Builder provides visual insights and instant explainability. By utilizing this tool, researchers can improve the reproducibility of KGQA system development and evaluation, leading to more efficient and reliable research in the field of KGQA. The source code of the whole project is published online (footnote 5) as open source (MIT License).

Acknowledgments

This research has been partially funded by the Federal Ministry of Education and Research (BMBF) under grant 01IS17046 as part of the Software Campus project “LASS KG: Language Agnostic Semantic Search driven by Knowledge Graphs”, and by grants for the ITZBund-funded (footnote 6) research project “Entwicklung und Erforschung von IT-basierten Lösungen im Rahmen des ChatBot-Frameworks des Bundes (Question-Answering-Komponenten zur Erweiterung des ChatBot-Frameworks)” at the Leipzig University of Applied Sciences.

[1] Perevalov, Yan, Kovriguina, Jiang, Both, Usbeck, Knowledge graph question answering leaderboard: A community resource to prevent a replication crisis, in: Proceedings of the Thirteenth Language Resources and Evaluation Conf., 2022, pp. 2998-3007.

[2] Both, Diefenbach, Singh, Shekarpour, Cherix, Lange, Qanary - a methodology for vocabulary-driven open question answering systems, in: European Semantic Web Conference, Springer, 2016, pp. 625-641.

[3] Diefenbach, Singh, Both, Cherix, Lange, Auer, The Qanary ecosystem: getting new insights by composing question answering pipelines, in: ICWE 2017, Rome, Italy, June 5-8, 2017, Proceedings 17, Springer, 2017, pp. 171-189.

[4] Usbeck, Röder, A.-C. Ngonga Ngomo, C. Baron, A. Both, M. Brümmer, D. Ceccarelli, M. Cornolti, D. Cherix, B. Eickmann, P. Ferragina, C. Lemke, A. Moro, R. Navigli, F. Piccinno, G. Rizzo, H. Sack, R. Speck, R. Troncy, J. Waitelonis, L. Wesemann, GERBIL: General entity annotator benchmarking framework, in: Proceedings of the 24th International Conference on World Wide Web, WWW '15, 2015, pp. 1133-1143.

[5] Verborgh, Usbeck, Röder, Hoffmann, Conrads, Huthmann, A.-C. Ngonga Ngomo, C. Demmler, C. Unger, A.-C. Ngonga Ngomo, I. Fundulaki, Krithara, Benchmarking question answering systems, Semantic Web 10 (2019) 293-304.

[6] Usbeck, R. H. Gusmita, A. N. Ngomo, M. Saleem, 9th challenge on question answering over linked data (QALD-9), in: Joint proc. of the 4th Workshop on Semantic Deep Learning (SemDeep-4) and NLIWoD4 and QALD-9 co-located with ISWC 2018, 2018, pp. 58-64.