<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Qanary Builder: Addressing the Reproducibility Crisis in Question Answering over Knowledge Graphs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aleksandr Perevalov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Both</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Florian Gudat</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul Bräuning</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Johannes Meesters</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lennart Gründel</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marie-Susann Bachmann</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Salem Zin Iden Naser</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DICE Group, University of Paderborn</institution>
          ,
          <addr-line>Warburger Str. 100, 33098 Paderborn</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Leipzig University of Applied Sciences</institution>
          ,
          <addr-line>Karl-Liebknecht-Straße 132, 04277 Leipzig</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Technology Innovation Unit, DATEV eG</institution>
          ,
          <addr-line>Nuremberg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper discusses the challenge of reproducibility in the field of Question Answering over Knowledge Graphs (KGQA). To address this challenge, the Qanary Builder has been developed as a tool to facilitate the creation and evaluation of component-based KGQA systems. The Qanary Builder is a full-stack Web application that enables a no-code development process of KGQA systems by configuring them from pre-defined components and providing evaluation functionality. Based on the Qanary Framework, it provides visual insights and instant explainability of a KGQA process through semantic annotations. The authors aim to present the effectiveness of the Qanary Builder in addressing the reproducibility crisis and demonstrate how this tool can improve the efficiency of KGQA system development and evaluation.</p>
      </abstract>
      <kwd-group>
        <kwd>Qanary Builder</kwd>
        <kwd>Qanary Framework</kwd>
        <kwd>Question Answering</kwd>
        <kwd>Evaluation</kwd>
        <kwd>Reproducibility</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>questions. It acts as an orchestrator of different pre-defined components that can be combined
in a KGQA system. Semantic Annotations (SAs) are used to persist the outputs of all components of a
Qanary-based system; hence, each KGQA process can be traced by following the SAs. In this regard,
the KGQA systems and their components may store their confidence score, execution time,
identified resources, and other information in SAs. In a nutshell, the Qanary Builder provides its
users (researchers) with a full-cycle development process for KGQA systems by interactively
(re)configuring them from pre-defined components and providing built-in evaluation functionality
without writing code. In this demo paper, we present the aforementioned features of the Qanary
Builder and describe how it addresses the reproducibility crisis and enables more efficient
and reliable research in this field.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Reproducibility is a general challenge in many research communities. In particular, for the
KGQA field, a researcher may not be able to reproduce results presented a few years ago or even
the most recent ones [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Therefore, a number of solutions have been proposed to address
this problem. The authors of [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ] introduce Qanary as a knowledge-based methodology for
orchestrating component-based KGQA systems distributed over the Web. It employs its own
RDF ontology (based on the Web Annotation Data Model) as an exchange format (Semantic
Annotations) for components to build KGQA systems in a more flexible and standardized way.
GERBIL [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] has been introduced as an evaluation framework for semantic entity annotation
and KGQA (cf. GERBIL-QA [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]). This framework generates data in a machine-readable format
and provides persistent URIs for each experiment, ensuring the reproducibility and archiving of
the corresponding evaluation results. Furthermore, there were several initiatives to provide
standardized benchmarks [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and leaderboards [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for different KGQA tasks.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Qanary Builder’s Use Cases</title>
      <p>Two use cases demonstrate the effectiveness of Qanary Builder in addressing the
reproducibility crisis: (1) Researchers may create their own KGQA system from available Qanary
components and evaluate it on a provided dataset; (2) Researchers may take existing KGQA
systems, each represented as a single Qanary component, run the evaluation, and compare
the obtained results. Neither use case requires any coding because everything is pre-defined.
This standardizes the evaluation process and decreases the chance of making mistakes
along the way. For a better understanding, we provide a video<sup>2</sup> that covers Qanary Builder’s
use cases and encourage readers to test the application online<sup>3</sup>.</p>
      <p>Figure 1 presents the designer module of Qanary Builder. The designer enables users to manage
the available Qanary-based systems and the corresponding configurations, i.e., a sequential
order of components that forms the process of a KGQA system. The workspace of the designer
allows a user to select components, try single questions to test the functionality, and see the
answer as well as the SAs created by each component during the KGQA process. Thus, the
designer contributes to both the first and second use cases. Instant explainability is provided
through the SAs viewer (Element 6 of Figure 1), which enables users to directly observe what a particular
KGQA component has identified. The datasets’ manager is responsible for managing custom
datasets that are then used for the evaluation. The accepted data format is a .csv file that
contains two fields: “question” and “answer”. An “answer” may be represented in different
forms: a textual answer, a SPARQL query, a named entity’s URI, and many more. Hence, the
datasets’ manager is a crucial component for establishing a reproducible evaluation process
for the first and second use cases. The tester performs evaluation runs for a specified
configuration and dataset. Each run contains information on the run time, configuration,
dataset, and a question-wise accuracy score. The tester takes a dataset created with the datasets’
manager and iteratively sends its questions to a KGQA system configuration defined in the designer.
The result for each question appears as soon as that question has been processed. Therefore, the tester
also addresses both the first and second use cases.</p>
      <p><sup>2</sup> https://drive.google.com/file/d/10DT9UfgjFUObbhE6fsbT4EcjxRahl2Yc/view
<sup>3</sup> Live demo link: https://builder.qanary.net/. Login: “iswc2023”, Password: “dem0”.</p>
    </sec>
    <sec id="sec-tech-overview">
      <title>4. Qanary Builder’s Technical Overview</title>
      <p>The Qanary Builder is split into front-end and back-end subsystems. It connects to a specified
Qanary KGQA system instance<sup>4</sup> and monitors the currently registered components. Hence, Qanary
Builder always has up-to-date information on which KGQA components can be used for
configuring a system. A configured KGQA system can be evaluated directly in the Qanary Builder by
selecting a specific test dataset. The test datasets, in turn, are custom and are managed by a
dedicated module. An overview of the architecture of Qanary Builder is presented in Figure 2.</p>
      <p><sup>4</sup> Qanary was developed outside of this work.</p>
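      <p>The dataset format and the tester loop described in Section 3 can be illustrated with a minimal Python sketch. It assumes a two-column .csv file with the fields “question” and “answer” and scores each question by exact string match. The scoring rule, the sample rows, and the names <monospace>load_dataset</monospace>, <monospace>run_test</monospace>, and <monospace>fake_system</monospace> are illustrative assumptions and are not taken from the Qanary Builder code base.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import csv
import io
from typing import Callable, Dict, List


def load_dataset(csv_text: str) -> List[Dict[str, str]]:
    """Parse a dataset with the two fields named in the text: question, answer."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"question": r["question"], "answer": r["answer"]} for r in reader]


def run_test(dataset: List[Dict[str, str]],
             ask: Callable[[str], str]) -> List[Dict[str, object]]:
    """Send the questions one by one and score each of them.

    Exact string match is an assumption; the paper does not define the metric.
    """
    results = []
    for row in dataset:
        predicted = ask(row["question"])
        correct = predicted.strip() == row["answer"].strip()
        results.append({"question": row["question"], "correct": correct})
    return results


def accuracy(results: List[Dict[str, object]]) -> float:
    """Share of correctly answered questions (0.0 for an empty run)."""
    if not results:
        return 0.0
    return sum(1 for r in results if r["correct"]) / len(results)


# Made-up sample rows, for illustration only.
SAMPLE = (
    "question,answer\n"
    "What is the capital of Germany?,http://dbpedia.org/resource/Berlin\n"
    "Who wrote Faust?,http://dbpedia.org/resource/Johann_Wolfgang_von_Goethe\n"
)


def fake_system(question: str) -> str:
    """Hypothetical stand-in for a configured KGQA system."""
    return "http://dbpedia.org/resource/Berlin"


dataset = load_dataset(SAMPLE)
results = run_test(dataset, fake_system)
print(accuracy(results))
```
]]></preformat>
      <p>In an actual run, the callable passed as <monospace>ask</monospace> would be replaced by a request to a configured KGQA system. Exact match is only one possible scoring rule, since an answer may also be a SPARQL query or a URI.</p>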
      <p>[Figure 2: Architecture of the Qanary Builder. Labels recoverable from the diagram: Qanary Builder’s front-end subsystem (NextJS, Axios; generated Web pages for a user); Qanary Builder’s back-end subsystem (Spring Boot, Apache Jena; RESTful API, database, SPARQL interface, and QuestionAnswering ports); metadata database (e.g., MongoDB); external Qanary System subsystem (Qanary pipeline, QuestionAnswering interface, SPARQL endpoint encapsulating the triplestore (e.g., Stardog), component registration interface); Qanary components 1 to N (e.g., Query Builder, Named Entity Recognition).]</p>
      <p>The front-end subsystem of the Qanary Builder is a Web application written with Next.js. It
contains three functional modules (designer, datasets’ manager, and tester), which were described
in Section 3. Thus, it helps users manage their KGQA system configurations,
datasets, and test runs. The back-end subsystem is a RESTful API written using the Spring
Boot framework. It handles the logic for managing the metadata about Qanary Systems, KGQA
system configurations, datasets, and test runs. This metadata is stored in
MongoDB via the corresponding database driver. The back-end calls the Qanary System via
its Question-Answering interface to trigger the processing of a question by a given set of components.
The back-end communicates with the Qanary System’s SPARQL endpoint via the Apache Jena
library, which provides a Java interface on one side and connects to the SPARQL endpoint on
the other. This connection is used to fetch the SAs and present them in the front-end subsystem.</p>
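      <p>How the SAs of one KGQA process might be retrieved and grouped per component can be sketched as follows. The sketch uses Python rather than the Java and Apache Jena stack of the back-end. The <monospace>oa:</monospace> namespace, the property names <monospace>oa:annotatedBy</monospace> and <monospace>oa:hasBody</monospace>, the graph URI, and the sample response are assumptions made for illustration. Only the response layout follows a standard, the SPARQL 1.1 Query Results JSON Format.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import json
from collections import defaultdict
from typing import Dict, List

# Assumed namespace of the annotation vocabulary (not confirmed by the paper).
OA = "http://www.w3.org/ns/openannotation/core/"


def build_annotation_query(graph_uri: str) -> str:
    """Return a SELECT query listing the annotations stored in one named graph."""
    return (
        f"PREFIX oa: <{OA}>\n"
        "SELECT ?annotation ?creator ?body\n"
        f"FROM <{graph_uri}>\n"
        "WHERE {\n"
        "  ?annotation oa:annotatedBy ?creator .\n"
        "  OPTIONAL { ?annotation oa:hasBody ?body }\n"
        "}"
    )


def group_by_component(sparql_json: str) -> Dict[str, List[str]]:
    """Group annotation URIs by the component that created them.

    Expects a document in the SPARQL 1.1 Query Results JSON Format.
    """
    grouped = defaultdict(list)
    for binding in json.loads(sparql_json)["results"]["bindings"]:
        creator = binding["creator"]["value"]
        grouped[creator].append(binding["annotation"]["value"])
    return dict(grouped)


# Made-up response, for illustration only.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["annotation", "creator", "body"]},
    "results": {"bindings": [
        {"annotation": {"type": "uri", "value": "urn:anno:1"},
         "creator": {"type": "uri", "value": "urn:component:NER"}},
        {"annotation": {"type": "uri", "value": "urn:anno:2"},
         "creator": {"type": "uri", "value": "urn:component:QueryBuilder"}},
        {"annotation": {"type": "uri", "value": "urn:anno:3"},
         "creator": {"type": "uri", "value": "urn:component:NER"}},
    ]},
})

query = build_annotation_query("urn:graph:example")
grouped = group_by_component(SAMPLE_RESPONSE)
print(query)
print(grouped)
```
]]></preformat>
      <p>Grouping by creator mirrors the idea that each KGQA process can be traced component by component. A front-end could render one such group per pipeline step.</p>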
    </sec>
    <sec id="sec-4">
      <title>5. Conclusion</title>
      <p>In conclusion, the reproducibility crisis in the KGQA field has been a major concern for
researchers. The Qanary Builder offers a solution to this challenge by allowing
the creation and evaluation of component-based KGQA systems without the need for coding.
With built-in development and evaluation functionality, the Qanary Builder provides visual
insights and instant explainability. By utilizing this tool, researchers can improve the
reproducibility of KGQA system development and evaluation, leading to more efficient and reliable
research in the field of KGQA. The source code of the whole project is published online<sup>5</sup> as
open source (MIT License).</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This research has been partially funded by the Federal Ministry of Education and Research
(BMBF) under grant 01IS17046 as part of the Software Campus project “LASS KG: Language
Agnostic Semantic Search driven by Knowledge Graphs”, and by grants for the ITZBund<sup>6</sup>-funded
research project “Entwicklung und Erforschung von IT-basierten Lösungen im Rahmen des
ChatBot-Frameworks des Bundes (Question-Answering-Komponenten zur Erweiterung des
ChatBot-Frameworks)” at the Leipzig University of Applied Sciences.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Perevalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kovriguina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Both</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Usbeck</surname>
          </string-name>
          ,
          <article-title>Knowledge graph question answering leaderboard: A community resource to prevent a replication crisis</article-title>
          ,
          <source>in: Proceedings of the Thirteenth Language Resources and Evaluation Conf</source>
          .,
          <year>2022</year>
          , pp.
          <fpage>2998</fpage>
          -
          <lpage>3007</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Both</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Diefenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shekarpour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cherix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <article-title>Qanary – a methodology for vocabulary-driven open question answering systems</article-title>
          ,
          <source>in: European Semantic Web Conference</source>
          , Springer,
          <year>2016</year>
          , pp.
          <fpage>625</fpage>
          -
          <lpage>641</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Diefenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Both</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cherix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <article-title>The Qanary ecosystem: getting new insights by composing question answering pipelines</article-title>
          ,
          <source>in: ICWE 2017, Rome, Italy, June 5-8, 2017, Proceedings 17</source>
          , Springer,
          <year>2017</year>
          , pp.
          <fpage>171</fpage>
          -
          <lpage>189</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Usbeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Röder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-C.</given-names>
            <surname>Ngonga Ngomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Baron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Both</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brümmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ceccarelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cornolti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cherix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Eickmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ferragina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lemke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piccinno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Rizzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Sack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Speck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Troncy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Waitelonis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wesemann</surname>
          </string-name>
          ,
          <article-title>GERBIL: General entity annotator benchmarking framework</article-title>
          ,
          <source>in: Proceedings of the 24th International Conference on World Wide Web, WWW '15</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>1133</fpage>
          -
          <lpage>1143</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Verborgh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Usbeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Röder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Conrads</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Huthmann</surname>
          </string-name>
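          ,
          <string-name>
            <given-names>A.-C.</given-names>
            <surname>Ngonga Ngomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Demmler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Unger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-C.</given-names>
            <surname>Ngonga Ngomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Fundulaki</surname>
          </string-name>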
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krithara</surname>
          </string-name>
          ,
          <article-title>Benchmarking question answering systems</article-title>
          ,
          <source>Semantic Web</source>
          <volume>10</volume>
          (
          <year>2019</year>
          )
          <fpage>293</fpage>
          -
          <lpage>304</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Usbeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. H.</given-names>
            <surname>Gusmita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Ngomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Saleem</surname>
          </string-name>
          ,
          <article-title>9th challenge on question answering over linked data (QALD-9)</article-title>
          ,
          <source>in: Joint proc. of the 4th Workshop on Semantic Deep Learning (SemDeep-4) and NLIWoD4 and QALD-9 co-located with ISWC 2018</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>58</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>