=Paper=
{{Paper
|id=Vol-1963/paper607
|storemode=property
|title=QAestro Framework - Semantic Composition of QA Pipelines
|pdfUrl=https://ceur-ws.org/Vol-1963/paper607.pdf
|volume=Vol-1963
|authors=Kuldeep Singh,Ioanna Lytra,Kunwar Abhinav Aditya,Maria Esther Vidal
|dblpUrl=https://dblp.org/rec/conf/semweb/SinghLAV17
}}
==QAestro Framework - Semantic Composition of QA Pipelines==
<pdf width="1500px">https://ceur-ws.org/Vol-1963/paper607.pdf</pdf>
<pre>
    QAestro Framework - Semantic Composition of QA
                      Pipelines

 Kuldeep Singh (1,2), Ioanna Lytra (1,2), Kunwar Abhinav Aditya (2), Maria-Esther Vidal (1,2)
     (1) Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Germany
     (2) Institute for Applied Computer Science, University of Bonn, Germany
                             kuldeep.singh@iais.fraunhofer.de
                                {lytra,vidal}@cs.uni-bonn.de
                                      s6kuadit@uni-bonn.de
           Abstract. Many question answering systems and related components have been
           developed in recent years. Since question answering involves several tasks and
           subtasks that are common to many systems, existing components can be combined
           in various ways to build tailored question answering pipelines. The QAestro
           framework provides tools to semantically describe question answering components
           and to automatically generate possible pipelines given developer requirements.
           We demonstrate the functionality of the QAestro framework for building question
           answering pipelines that include different tasks and components. Attendees will
           be able to semantically describe question answering pipelines and integrate them
           into existing frameworks. A video of the demonstration is available online
           (footnote 3).

           Keywords: Question Answering, Software Reusability, SAT solver


1        Introduction
Question answering (QA) systems allow users to extract useful information from linked
open data sets such as DBpedia. At an abstract level, QA systems perform tasks, for ex-
ample, question analysis, query generation, and answer generation as part of their QA
pipeline [3]. A QA system developer implements these tasks either by dedicating a
separate QA component to each task or by combining a few tasks in one component [3].
Hence, a component that performs one QA task in one QA system can be reused in other
QA systems. The Qanary ecosystem [2] supports the reusability of such QA components.
However, there is no systematic way to describe existing QA components, either
standalone or parts of other QA systems, based on their functionality, i.e., the task
they perform. Therefore, with the increasing number of QA components, identifying all
viable combinations of components when creating new QA pipelines requires a manual
and time-consuming search in the large combinatorial space of solutions.
    We demonstrate QAestro, a framework that addresses the QA pipeline composition
problem by casting it as the query rewriting problem [1] and leveraging state-of-the-art
SAT solvers [3]. QAestro helps QA developers to semantically describe QA components
and to express developer requirements based on these semantic descriptions; a
controlled vocabulary models QA tasks and is used in the descriptions of the QA
components. Attendees will be able to semantically compose QA pipelines and reuse
them in frameworks like Qanary, OKBQA, or QALL-ME [3]. Moreover, they will observe
how QA component descriptions can be exploited to enhance reusability.
Footnote 3: https://www.youtube.com/watch?v=9lhamebx7JM&feature=youtu.be
[Figure 1 diagram. Labels: QA Developer Requirement; UI; QAestro Encoder; LAV Rules
(QACM); MCDSAT (Encoder, CNF Theory, Finder, Models, Decoder); Input/Output Validator;
QAestro Decoder; Compositions of QA Components.]

Fig. 1: The QAestro Architecture. QAestro receives as input a QA developer requirement
and a set of rules describing QA components, and produces all the valid compositions
that implement this requirement.

2     The QAestro Architecture

QAestro is built on top of MCDSAT [3], which relies on state-of-the-art SAT solvers to
efficiently enumerate compositions of QA components. QAestro allows a developer to
build a QA pipeline on demand; it uses a controlled vocabulary that encodes the
properties of generic QA tasks and serves to semantically describe QA components.
Further, it enumerates the compositions of QA components that implement a given QA
developer requirement. Fig. 1 illustrates the architecture of QAestro. It accepts as
input a QA developer requirement, which is a conjunctive query over the QA tasks that
the developer wants to implement. In addition, the mappings of existing QA components
to QA tasks are given as input to the QAestro encoder, which translates these inputs
into an instance of the Query Rewriting Problem (QRP) and passes it to MCDSAT. MCDSAT
encodes the QRP instance into a CNF theory such that the models of this theory
correspond to solutions of the QRP. A SAT solver is used in MCDSAT to enumerate the
models of the CNF theory, each of which corresponds to a valid query rewriting. The
output of QAestro is the set of valid compositions of QA components for the given
developer requirement. For a detailed account of the semantic descriptions of QA
components and developer requirements, and for an empirical evaluation of QAestro,
the reader can refer to [3].
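
The idea that models of a CNF theory correspond to valid compositions can be
illustrated with a small sketch. It is not the MCDSAT encoding: the three variables,
the clauses, and the component name OtherNED are hypothetical, and a brute-force
enumeration stands in for a real SAT solver.

```python
from itertools import product


def enumerate_models(num_vars, clauses):
    """Yield every truth assignment that satisfies all clauses.

    Clauses use DIMACS-style literals: k means variable k is true,
    -k means variable k is false (variables are numbered from 1).
    """
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            yield bits


# Hypothetical toy theory (NOT the MCDSAT encoding):
#   var 1: StanfordNER is chosen for recognition
#   var 2: AIDA is chosen for disambiguation
#   var 3: OtherNED (made-up component) is chosen for disambiguation
NAMES = ["StanfordNER", "AIDA", "OtherNED"]
CLAUSES = [
    [1],        # recognition must be covered
    [2, 3],     # disambiguation must be covered by some component
    [-2, -3],   # ... but by at most one component
]

MODELS = list(enumerate_models(len(NAMES), CLAUSES))
# Decode each model into the tuple of chosen component names.
COMPOSITIONS = [tuple(n for n, chosen in zip(NAMES, m) if chosen) for m in MODELS]
print(COMPOSITIONS)
```

Each satisfying assignment decodes into one composition, which mirrors the role of
the decoder stage in the architecture; a real SAT solver would replace the
exhaustive loop for anything beyond a handful of variables.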


3     Demonstration of Use Case

We motivate our demonstration by considering the problem of semantic composition of
a QA pipeline. Let us consider two semantically described QA components:

AIDA($y, z) :- disambiguation(x, y, z, t), question(y), disEntity(z)
StanfordNER($y, x) :- recognition(y, x), question(y), entity(x)
These rules state the following functionalities of the AIDA and Stanford NER
components: (i) AIDA (footnote 4) implements the QA task of disambiguation; a question
is received as input (marked with $), and a disambiguated entity is produced as output;
(ii) Stanford NER (footnote 5) implements the QA task of entity recognition; it takes a
question as input and outputs the entities spotted in the question. In these rules,
disEntity, recognition, disambiguation, etc. are controlled vocabulary terms of
QAestro. Using similar rules, we have semantically described 51 QA components from 20
QA systems.

Fig. 2: QAestro Demo. A QA developer wants to implement Named Entity Recognition
(NER) and Named Entity Disambiguation (NED) tasks together with a constraint that the
NED component accepts question and entity as input.

Footnote 4: https://gate.d5.mpi-inf.mpg.de/webaida/

Consider a QA system developer requirement to implement two QA tasks, for instance,
NER and NED. Such a requirement can be semantically described as:

QADevReq :- recognition, disambiguation
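
As a rough illustration of how such rules could be held in memory, the sketch below
models the two component descriptions as plain data and looks up which components
mention a required task. The class names Atom and Rule and the helper functions are
assumptions made for this example; they are not part of QAestro.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    predicate: str   # controlled vocabulary term, e.g. "recognition"
    args: tuple      # variable names


@dataclass(frozen=True)
class Rule:
    head: str        # component name
    head_args: tuple # a leading "$" marks an input variable
    body: tuple      # tuple of Atom


AIDA = Rule(
    "AIDA", ("$y", "z"),
    (Atom("disambiguation", ("x", "y", "z", "t")),
     Atom("question", ("y",)),
     Atom("disEntity", ("z",))),
)
STANFORD_NER = Rule(
    "StanfordNER", ("$y", "x"),
    (Atom("recognition", ("y", "x")),
     Atom("question", ("y",)),
     Atom("entity", ("x",))),
)
RULES = [AIDA, STANFORD_NER]


def input_vars(rule):
    """Return the head variables marked as inputs, without the "$" prefix."""
    return tuple(a[1:] for a in rule.head_args if a.startswith("$"))


def components_for(task, rules):
    """Return the names of components whose description mentions the task."""
    return [r.head for r in rules if any(a.predicate == task for a in r.body)]


# The developer requirement from the text, as a list of required tasks.
REQUIREMENT = ["recognition", "disambiguation"]
CANDIDATES = {task: components_for(task, RULES) for task in REQUIREMENT}
print(CANDIDATES)
```

With only the two rules above, each task has exactly one candidate; with more
described components, each task would map to a list of alternatives.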
   Using 51 QA components from 20 QA systems (footnote 6) and 30 different QA system
developer requirements, we will demonstrate the following use cases:
 – QA developer requirement to implement one QA task. Developers will be able
   to express requirements similar to the ones in the previous example and implement
   a pipeline of one QA task. QAestro allows a developer to choose which task she
   wants to implement and returns the list of components that are associated with the
   chosen task. Component semantic descriptions are presented as well.
Footnote 5: http://nlp.stanford.edu:8080/ner/
Footnote 6: http://wdaqua.eu/QAestro/qasystems/
 – QA developer requirement to implement two QA tasks. We further demonstrate
     how QAestro semantically composes the possible combinations of components
     when a developer wants to implement two tasks together (e.g., named entity
     recognition and named entity disambiguation). This can be done with and without
     constraints. With the “with constraint” option, the developer states a specific
     requirement, e.g., “only return the components which accept a particular input in
     one of the tasks”. For example, while implementing the NER and NED tasks together,
     the developer wants to know the compositions of the QA pipeline in which the NED
     component accepts an entity as input from the NER task. In this case, QAestro will
     not return all the components implementing these two tasks, but only the QA
     components which produce an entity as output in the NER task and the QA components
     which accept this entity as input in the NED task. Hence, by demonstrating this use
     case, we illustrate the power of the SAT solver in QAestro to check whether the
     interpretation of a developer’s request holds.
 – QA developer requirement to implement three, four, and five QA tasks. In this
     use case, we demonstrate how QAestro returns valid compositions of QA components
     implementing three or more QA tasks based on developer requirements. For example,
     if a developer seeks to implement NER, NED, and query generation (i.e., the
     components which build a SPARQL query from disambiguated resources) together and
     to see the possible compositions of QA components, QAestro will return all viable
     compositions. If the developer has constraints on a particular task (e.g., on its
     input or output), QAestro respects them as well. Similarly, we demonstrate QAestro
     for four and five QA tasks.
All the use cases of QAestro are publicly available (footnote 7).
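
The difference between the "without constraint" and "with constraint" options in the
use cases above can be sketched as follows. This is a brute-force stand-in and not
the SAT-based procedure used by QAestro; the components HypotheticalNER and
HypotheticalNED and the compose function are invented for illustration, and only
the AIDA and StanfordNER entries follow the rules given in the text.

```python
from itertools import product

# (name, task, input types, output types). Only the first two entries follow
# the rules in the text; the other two are hypothetical placeholders.
COMPONENTS = [
    ("StanfordNER", "recognition", {"question"}, {"entity"}),
    ("AIDA", "disambiguation", {"question"}, {"disEntity"}),
    ("HypotheticalNER", "recognition", {"question"}, {"entity"}),
    ("HypotheticalNED", "disambiguation", {"question", "entity"}, {"disEntity"}),
]


def compose(tasks, components, chained=False):
    """List one-component-per-task pipelines for the given task order.

    With chained=True, keep only pipelines in which every component accepts
    at least one output type of the component before it.
    """
    per_task = [[c for c in components if c[1] == task] for task in tasks]
    pipelines = []
    for combo in product(*per_task):
        links_ok = all(prev[3] & nxt[2] for prev, nxt in zip(combo, combo[1:]))
        if chained and not links_ok:
            continue
        pipelines.append(tuple(c[0] for c in combo))
    return pipelines


TASKS = ["recognition", "disambiguation"]
UNCONSTRAINED = compose(TASKS, COMPONENTS)
CONSTRAINED = compose(TASKS, COMPONENTS, chained=True)
print(len(UNCONSTRAINED), CONSTRAINED)
```

In this toy setting the constraint removes every pipeline that ends in AIDA, because
AIDA is described as taking only a question as input. The same function accepts
three or more tasks, but the number of candidate pipelines grows multiplicatively
with each task, which is the combinatorial space the text refers to.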

4     Conclusion
We demonstrate the QAestro framework, which allows developers to semantically describe
QA components as well as developer requirements, and to compose QA pipelines from
reusable QA components. We illustrate the functionality of QAestro using 51
semantically described QA components and different developer requirements. Moreover,
attendees will be able to specify QA pipelines and integrate them into existing QA
frameworks.
Acknowledgement. Parts of this work received funding from the European Union’s Horizon 2020
research and innovation programme under the Marie Skłodowska-Curie grant agreement No.
642795, project: Answering Questions using Web Data (WDAqua).


Bibliography
[1] A. Y. Halevy. Answering queries using views: A survey. VLDB J., 10(4), 2001.
[2] K. Singh, A. Both, D. Diefenbach, S. Shekarpour, D. Cherix, and C. Lange. Qanary
    - the fast track to creating a question answering system with linked data technology.
    In ESWC 2016 Satellite Events.
[3] K. Singh, I. Lytra, M.-E. Vidal, D. Punjani, H. Thakkar, C. Lange, and S. Auer.
    QAestro – semantic-based composition of question answering pipelines. In DEXA
    2017.
Footnote 7: http://wdaqua.eu/QAestro/

</pre>