<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Let the Database Talk Back: Natural Language Explanations for SQL</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stavroula Eleftherakis</string-name>
          <email>seleftheraki@athenarc.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Orest Gkini</string-name>
          <email>orestg@athenarc.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Georgia Koutrika</string-name>
          <email>georgia@athenarc.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Athena Research Center</institution>
          ,
          <addr-line>Athens</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Database interaction is often characterized as a non-trivial and time-consuming process due to users' inexperience with the data or the query language. Therefore, there is a need for databases to be able to "talk back" in order to assist users during data exploration and eventually lead them to the desired results. In this paper, we tackle the problem of SQL-to-NL by extending the graph-based model of Logos [3]. Our novel extensions include improvements in terms of the system's translation capabilities and the fluency of the generated explanations. Finally, we report several challenges, highlighted by experiments on different use cases, i.e., astronomy and policy making.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Nowadays, the availability of data coming from many different
application fields urges the deployment of sophisticated tools which
help us effectively explore and extract knowledge out of this data.
However, the variety of the data sources (astronomical, biomedical,
etc.) as well as the complexity of query languages such as SQL
and SPARQL often pose obstacles during data exploration due to
the users’ unfamiliarity with the database content or the query
language. Explaining queries in text can help tackle both problems
as it enables users to understand the SQL queries that are used to
retrieve the answers through the data exploration process [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        The problem of translating SQL queries to natural language, or
SQL-to-NL in short, appears to be deceptively simple, as SQL queries
use a restricted vocabulary, comprising SQL elements (clauses,
operators, etc.) and database elements (i.e., relations, attributes), and
there is no ambiguity in interpreting such elements. Hence, initially
a straightforward translation appears adequate. Reality teaches
us quite the opposite, as the resulting text should be accurate in
capturing the respective SQL query, and effective, allowing fast
and unique interpretation of it. Achieving both of these qualities is
very difficult and raises several technical challenges that need to be
addressed [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Furthermore, several different explanations can be
generated for the same SQL query, making evaluation challenging.
      </p>
      <p>Copyright © 2021 for the individual papers by the papers’ authors. Copyright © 2021
for the volume as a collection by its editors. This volume and its papers are published
under the Creative Commons License Attribution 4.0 International (CC BY 4.0).
Published in the Proceedings of the 2nd Workshop on Search, Exploration, and
Analysis in Heterogeneous Datastores, co-located with VLDB 2021 (August 16-20, 2021,
Copenhagen, Denmark) on CEUR-WS.org.</p>
      <p>
        Logos [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is a well-established system which translates SQL
queries into narratives. In contrast to neural approaches, Logos’
model can be interpreted. This gives us the advantage of offering
precise translations, knowing exactly how translations are
produced. Nevertheless, the system’s translation capabilities are highly
dependent on the query’s syntax, meaning that for each type of
query the system should have the necessary tools to model it.
Moreover, depending on the query’s complexity (number of attributes,
bridge tables, etc.), translations can become unnatural.
      </p>
      <p>
        In our work, we provide NL explanations as part of a data
exploration platform used by our collaborators representing different
fields, including astrophysics and policy making [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In this
platform, NL explanations are provided for explaining the internal SQL
queries generated by different system components used for
recommendations, NL-to-SQL translation, and data exploration. Thus, we
have to translate a variety of queries containing different clauses
and operators. Furthermore, there is a need for these translations
to be as natural as possible.
      </p>
      <p>In an effort to tackle the above-mentioned challenges, we
extend Logos in two important directions: (a) translation capabilities,
where we focus on the system’s ability to translate different types
of queries, and (b) fluent explanations, where we focus on improving
the fluency of the system’s translations. Furthermore, we present experimental
results as well as results of a user study over two different databases:
astronomical data and policy-making data. Our evaluation shows
the effectiveness of our approach and provides several insights
regarding challenges that arise due to the ambiguous nature of
NL explanations, including: (a) scoring textual explanations, and (b)
generating explanations suitable for different groups of people.</p>
      <p>The rest of the paper is organised as follows. In Section 2, related
work is discussed. In Section 3, we provide background information
on Logos. In Section 4, we present our novel extensions. In Section
5, the effectiveness of the system is explored. Finally, concluding
remarks are provided in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
      <p>
        Existing approaches for SQL-to-NL can be broadly divided into two
categories. Template-/rule-based approaches (e.g., [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]) require
the design of special templates and rules that are used to compose
sentences. Due to the non-ambiguous nature of the SQL queries,
templates can help limit the syntactic variability of their output
and smooth explanations out [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Logos [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] falls into this category.
      </p>
      <p>
        The second line of research tackles the problem as neural machine
translation and uses sequence-to-sequence (Seq2Seq) models (e.g.,
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]) which treat both the natural language description and
the query as sequences. In contrast to the previous category, those
models automatically learn how to translate queries without the
need for predefined query patterns. However, they require plenty of
training data and fine-tuning, and their effectiveness is still very low.
      </p>
      <p>
        Lately, hybrid models (e.g., [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]) have been proposed to
solve a similar problem to ours, i.e., data to natural language. Those
models automatically learn templates and use them in order
to generate textual explanations.
      </p>
    </sec>
    <sec id="sec-3">
      <title>BACKGROUND</title>
      <p>Logos’ approach to NL generation comprises graph representations
of the database and the queries, template phrases associated with
parts of the graph, and graph traversal in particular directions to
compose the templates found on the way into the final text
formation. We provide an overview of the main ingredients: the database
graph, the query graph, labels, templates, and the algorithms.</p>
      <p>The database graph captures the relationships between the
database relations and attributes (nodes of the graph). Its edges are
divided into three types: (a) membership edges (attributes to
relations), representing attribute projections in queries, (b) selection
edges (relations to attributes), representing attributes in predicates,
and (c) join edges (attribute to attribute), representing joins.</p>
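      <p>As a rough illustration of the three edge types, the following Python sketch builds the edge list of a database graph from a toy schema. The Edge class, the build_database_graph function, and the attribute names are hypothetical and are not taken from the Logos implementation; only the relation names PEOPLE and PROJECTS are borrowed from the CORDIS database used in this paper.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    kind: str    # "membership", "selection" or "join"
    source: str
    target: str


def build_database_graph(relations, joins):
    """Return the edge list of a database graph.

    relations maps a relation name to its attribute names;
    joins lists pairs of qualified attributes that can be joined.
    """
    edges = []
    for relation, attributes in relations.items():
        for attribute in attributes:
            qualified = relation + "." + attribute
            # Membership edge: attribute to relation (projection).
            edges.append(Edge("membership", qualified, relation))
            # Selection edge: relation to attribute (predicate).
            edges.append(Edge("selection", relation, qualified))
    for left, right in joins:
        # Join edge: attribute to attribute.
        edges.append(Edge("join", left, right))
    return edges


# Toy schema; the attribute names are assumptions made for illustration.
relations = {"PEOPLE": ["id", "name"], "PROJECTS": ["id", "title", "pi_id"]}
joins = [("PROJECTS.pi_id", "PEOPLE.id")]
graph = build_database_graph(relations, joins)
```
      </preformat>
      <p>In this sketch every attribute contributes one membership edge and one selection edge, so a query graph could be obtained by keeping only the edges that a given query touches.</p>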
      <p>The query graph of a query is the part of the database graph
that the query refers to, extended with additional nodes (e.g., for
functions) and edges (e.g., for group-by’s) in order to capture the
entire query meaning. Each node or edge of the query graph can
be annotated with a label that signifies its meaning in natural
language. There are default labels that can be used for any database.
Additionally, labels can be provided by a domain expert.</p>
      <p>Example. Let us consider the CORDIS<fn><label>1</label><p>https://data.europa.eu/euodp/en/data/dataset/cordisH2020projects</p></fn> database, which stores
information about research projects funded by the European Union.
In Figure 1, we see a subgraph of the corresponding database graph.
Note that for simplicity, we do not show the attributes used to
join the various relations. By default, the name of each node is
also its label. We also see how edges are annotated with default
labels. We can easily override these. For instance, for the relation
PROJECT_MEMBERS, instead of the system’s default label “project
members”, we can use the short label “participants”.</p>
      <p>Figure 2 zooms in on the join path connecting relation PEOPLE
with relation PROJECTS and provides a more detailed view. A
possible label which expresses this connection is “principal investigators
of”, while the default one is “associated with”. Such designer labels
are stored in special tables called designer tables. □</p>
      <p>
        NL explanations are created by traversing the query graph
accompanied with a template mechanism [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The template mechanism
uses the provided labels (designer or default) to form meaningful
phrases. A template label  () or  ( (,  )) is assigned to a node 
or a path (,  ), respectively. For instance, a join template label
may have the form:  ( (,  )) =  () +  +  (,  ) +  ( ), where
 and  are relation nodes of the query graph. Using the example
of the previous paragraph (see Figure 2), a template for the path
connecting the tables PEOPLE and PROJECTS is the following:
 ( (PEOPLE, PROJECTS)) =  (PEOPLE) + " as " + “principal investigators
of” + (PROJECTS) = "people as principal investigators of projects" .
      </p>
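      <p>The join template can be sketched in a few lines of Python: a node label, a connecting expression, an edge label, and a second node label are concatenated. The function join_template, its parameters, and the dictionaries are hypothetical names chosen for this illustration; the labels themselves come from the PEOPLE and PROJECTS example.</p>
      <preformat>
```python
DEFAULT_EDGE_LABEL = "associated with"


def join_template(node_labels, edge_labels, u, v, connective=" "):
    """Compose node label + connective + edge label + node label."""
    # Fall back to the default label when no designer label is stored.
    edge_label = edge_labels.get((u, v), DEFAULT_EDGE_LABEL)
    return node_labels[u] + connective + edge_label + " " + node_labels[v]


node_labels = {"PEOPLE": "people", "PROJECTS": "projects"}
designer_labels = {("PEOPLE", "PROJECTS"): "principal investigators of"}

with_designer = join_template(
    node_labels, designer_labels, "PEOPLE", "PROJECTS", connective=" as "
)
with_default = join_template(node_labels, {}, "PEOPLE", "PROJECTS")
print(with_designer)
print(with_default)
```
      </preformat>
      <p>With the designer label the sketch yields "people as principal investigators of projects", and without it the default label gives "people associated with projects".</p>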
      <p>Synthesis of the NL explanations is performed as graph traversal
of the query graph. There are three traversal strategies: (a) Binary
Search Tree (BST) algorithm, where the translation consists of
a composition of clauses each one of them focusing on specific
query semantics, (b) Multiple Reference Points (MRP) algorithm,
where information from all parts of the query graph is blended in
the translation, (c) Template composition (TMT) algorithm, where
predefined, richer templates corresponding to different query parts
are used in an effort to produce more concise translations. In what
follows, our examples were created using the MRP traversal strategy
and correspond to queries targeting the CORDIS database.</p>
    </sec>
    <sec id="sec-4">
      <title>TOWARDS RICHER TRANSLATIONS</title>
      <p>Logos was extended in two different directions. The first direction
aimed to enable the translation of more query types, while the other
one aimed to create fluent explanations. For this purpose, we have
upgraded the query parsing, graph generation, traversal strategies,
and template mechanism.</p>
      <p>Translation Capabilities. Our first goal was to extend the
translation capabilities of Logos towards translating more clauses
and operators. To that end, Logos should be able to analyze those
new types of queries, enrich (if necessary) the query graph with new
types of nodes and edges, and translate those new elements using
any of the traversal strategies (BST, MRP, TMT). Thus, we have
implemented changes in the system’s parser, introduced new graph
elements, and created new translation rules. As a result, the system
is now capable of translating queries with the SELECT TOP and
LIMIT clauses, as well as the (NOT) IN and (NOT) LIKE operators.</p>
      <p>Once an input query is given to the system, the parser analyzes
it and generates a parsing tree that stores important information for
the creation of the query graph. For the parser, we have introduced
to the system several new parsing nodes (e.g., for limit) capturing
information about all the aforementioned clauses and operators.</p>
      <p>Queries having SELECT TOP or LIMIT clauses are marked as
"limited" and their associated limit value is temporarily stored in
the system. The translation algorithm (BST, MRP, TMT) checks if
the input query is limited and if that is the case an explanation is
separately created and added to the end of the translation. As an
example, we provide that of query 1 in Table 1.</p>
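      <p>A minimal sketch of this step, assuming a regular-expression check in place of the actual Logos parser: the query is inspected for a SELECT TOP or LIMIT clause, the number is kept, and a separate sentence is appended to the translation. The function names and the wording of the appended sentence are assumptions made for illustration and do not reproduce the system's output.</p>
      <preformat>
```python
import re

# Matches "SELECT TOP 10" or "LIMIT 10" and captures the number.
LIMIT_PATTERN = re.compile(r"\b(?:SELECT\s+TOP|LIMIT)\s+(\d+)\b", re.IGNORECASE)


def limit_number(sql):
    """Return the limit of a query, or None if the query is not limited."""
    match = LIMIT_PATTERN.search(sql)
    return int(match.group(1)) if match else None


def add_limit_explanation(translation, sql):
    """Append a separately created sentence when the query is limited."""
    n = limit_number(sql)
    if n is None:
        return translation
    # Hypothetical wording, not the phrasing used by Logos.
    return translation + " Return only the first " + str(n) + " results."


print(add_limit_explanation(
    "Find the titles of projects.",
    "SELECT title FROM projects LIMIT 5",
))
```
      </preformat>
      <p>Keeping the limit sentence separate from the rest of the translation means the same step can follow any of the three traversal strategies.</p>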
      <p>In order for Logos to translate queries having IN and NOT IN
operators, new types of query graph elements were created: the in
edge, the not-in edge, and the value-list node. An illustrative example
of this case is query 2 in Table 1 whose query graph is given in
Figure 3. The translation algorithm detects those new elements of
the query graph and produces a textual explanation.</p>
      <p>Regarding the translation of queries having the (NOT) LIKE
operator, we created two new types of edges: the like edge and the
not-like edge. An example of this case is that of query 3 in Table 1.</p>
      <p>Our last improvement, in terms of translation capabilities, is the
creation of a new query graph node, the star node. In combination
with the function node that represents the COUNT operator, we now
get explanations like that of query 4 in Table 1.</p>
      <p>Fluent Explanations. Our second goal was to improve the
translations generated by the system in terms of fluency. Changes
of this kind require only the modification of the traversal strategies
and the template mechanism.</p>
      <p>While exploring the system’s capabilities, we noticed that many
database schemas include bridge tables, i.e., special tables used to
transform many-to-many relationships into one-to-many
relationships. Bridge tables (usually) appear in queries in order to join two
tables. Although those tables play an important role in terms of
data modeling, they cause "noisy" explanations. For instance, let
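us consider query 1 of Table 2. Evidently, the explanation previously
produced by the system is unnatural. Bridge tables are therefore
manually stored in a designer table and excluded from the
translation. The template mechanism has been modified so that if <inline-formula><tex-math>u</tex-math></inline-formula>,
<inline-formula><tex-math>w</tex-math></inline-formula>, and <inline-formula><tex-math>v</tex-math></inline-formula> are table nodes with <inline-formula><tex-math>w</tex-math></inline-formula> corresponding to a bridge table,
the template label of the path connecting <inline-formula><tex-math>u</tex-math></inline-formula> to <inline-formula><tex-math>v</tex-math></inline-formula> is of the form
<inline-formula><tex-math>l(p(u, v)) = l(u) + l(u, v) + l(v)</tex-math></inline-formula>.</p>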
      <p>Moreover, a mini dictionary has been developed providing the
translation process with the plural form of all the attribute labels.
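Essentially, for every attribute node <inline-formula><tex-math>u</tex-math></inline-formula> of the query graph we take
(if it exists) the plural form <inline-formula><tex-math>pl</tex-math></inline-formula> of its label <inline-formula><tex-math>l(u)</tex-math></inline-formula>, i.e., <inline-formula><tex-math>pl(l(u))</tex-math></inline-formula>.</p>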
      <p>In addition, we have modified the way the MRP algorithm
translates the group by clause. Previously the translation for this part
would result in a separate sentence. Leveraging the nature of the
algorithm, which aims at generating translations by blending different
parts of the query, we now blend this translation more naturally.</p>
      <p>An example that captures all the aforementioned changes (bridge
tables, plural form, group by on MRP) is that of query 2 in Table 2.</p>
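      <p>Two of these changes, skipping bridge tables and looking up plural forms, can be sketched as follows. The contents of BRIDGE_TABLES and PLURALS, the INSTITUTIONS relation name, the edge label, and the helper functions are all assumptions made for illustration; in Logos this information lives in designer tables whose layout is not shown here.</p>
      <preformat>
```python
# Hypothetical designer-table contents, assumed for illustration only.
BRIDGE_TABLES = {"PROJECT_MEMBERS"}
PLURALS = {"title": "titles", "country": "countries"}


def plural(label):
    """Return the plural form of a label if the dictionary has one."""
    return PLURALS.get(label, label)


def path_phrase(path, node_labels, edge_labels, default_edge="associated with"):
    """Verbalize a join path after dropping bridge tables from it."""
    visible = [table for table in path if table not in BRIDGE_TABLES]
    words = [node_labels[visible[0]]]
    for u, v in zip(visible, visible[1:]):
        # Edge label between the two remaining tables, then the target label.
        words.append(edge_labels.get((u, v), default_edge))
        words.append(node_labels[v])
    return " ".join(words)


node_labels = {
    "PROJECTS": "projects",
    "PROJECT_MEMBERS": "project members",
    "INSTITUTIONS": "institutions",
}
edge_labels = {("PROJECTS", "INSTITUTIONS"): "carried out by"}
phrase = path_phrase(
    ["PROJECTS", "PROJECT_MEMBERS", "INSTITUTIONS"], node_labels, edge_labels
)
print(phrase)
print(plural("title"), plural("acronym"))
```
      </preformat>
      <p>The bridge table never reaches the output, and a label without a dictionary entry is returned unchanged, which mirrors the "if it exists" condition on plural forms.</p>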
      <p>
        Lastly, we have improved the translation of queries which
include only heading attributes [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] in the SELECT clause. The heading
attribute is the most characteristic attribute of its relation. As an
example, we offer that of query 3 in Table 2.
      </p>
    </sec>
    <sec id="sec-5">
      <title>EXPERIMENTAL RESULTS</title>
      <p>
        The evaluation of Logos is divided into two parts: (a) the automated
evaluation part, where we evaluate our results using the Bilingual
Evaluation Understudy (BLEU) automated metric [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and (b) the
human evaluation part, where we evaluate our results using the
help of SQL experts. The purpose of the first part is to use a
well-established metric to show how good the NL explanations are, while
the second part aims at evaluating qualitative aspects of the NL
explanations, such as clarity and fluency.
      </p>
      <p>For both types of evaluation, we created 28 queries (14 for the
CORDIS database, and 14 for the SDSS<sup>2</sup> database; see Tables 6
and 7, respectively). Using the MRP algorithm, we translated those
queries twice: once using the default version of the system,
which works by considering only the database schema (internal
knowledge), and once using both the database schema and the
designer tables (internal and external knowledge), which, as
mentioned in the previous sections, store information about heading
attributes, bridge tables, and node or edge labels. First of all, we
want to investigate the effect of the designer tables on the
translations. Moreover, we want to know how close to the ground truth
(textual explanations given by SQL experts of the databases) the
system’s explanations are. In what follows, we denote the default
version of the system as Logos v.1 and the version that takes
advantage of the designer tables as Logos v.2.</p>
    </sec>
    <sec id="sec-6">
      <title>Automated Evaluation</title>
      <p>
        Automated evaluation was carried out to compare the generated
explanations to the ground truth, i.e., textual explanations of SQL
queries given by SQL experts of the databases, members of the
INODE project [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The quality of the results is measured using the
BLEU-4 score. BLEU is a score for comparing a candidate translation
of text to one or more reference translations. The scores range
between 0% and 100%. A score of 100% means that the explanation
generated by the system matches the ground truth completely.
      </p>
      <p>The results are summarized in Table 3. We also report the
minimum and the maximum BLEU score. Furthermore, we noticed a
large variation between the scores; thus, we decided to report the
median BLEU score per translation system instead of the average.
For both databases, median scores are under 10%. Looking at the
medians, we conclude that Logos v.2 produces translations closer
to the ground truth than those obtained from Logos v.1. Especially
for the CORDIS database, the median BLEU score of Logos v.2 is
more than two times higher than that of Logos v.1.</p>
      <p>The low scores do not indicate that the translations produced
by Logos are incorrect. BLEU works by matching
n-grams in the candidate translation against n-grams in the reference
text, where a 1-gram or unigram is a single token and a bigram
is a pair of consecutive words. That means that the score
is higher the more n-grams an NL explanation shares with the
ground truth. Manual examination of the NL explanations that the
two versions of Logos generated versus the ground truth showed
that the automatically generated translations were in fact correct.
However, they looked very different from the ground truth.
Indicatively, we show the translations of query 9 (from Table 6).</p>
      <p>CORDIS query 9 explanations:
– Logos v.1: “Find the institutions names of institutions associated
with countries whose country name is France.”
– Logos v.2: “Find institutions located in countries whose name is
France.”
– Ground truth: “Show names of institutions from France.”</p>
      <p>We see that Logos would not necessarily produce translations the
way a human would. Moreover, even different people
would provide different explanations for the same SQL query (albeit
all correct). This shows the opportunity of enhancing the translation
capabilities of the system with learning that not only leverages the
database schema but is also performed on previously defined human
translations. It also shows the challenge of creating automated
evaluation metrics which do not judge the explanations’ quality
only by estimating their similarity with the ground truth. These
observations lead to the need for conducting human evaluation as
well, which is presented in the next subsection.</p>
      <p>Focusing now on the results of CORDIS, we see that there is
significant improvement in the translations of queries with ids 3-6,
11, and 12. This is mainly due to the exclusion of bridge tables from
the translation procedure and the addition of heading attributes. For
queries with ids 9 and 10, we noticed a score reduction. Indicatively,
looking at the translations of query 9 above, we observe that
although the translation of Logos v.2 is more natural than that of
Logos v.1, the presence of the phrase “names of institutions” in
the translation of the latter leads to a higher BLEU score.</p>
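      <p>The effect on query 9 can be reproduced with a small Python sketch of clipped n-gram precision, the building block of BLEU. This is not the full BLEU-4 score (it omits the brevity penalty and the geometric mean over n-gram orders), and the tokenization and function names are assumptions made for illustration.</p>
      <preformat>
```python
import re
from collections import Counter


def tokens(text):
    """Lowercase a sentence and keep only alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(words, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def clipped_precision(candidate, reference, n):
    """Return (matched, total) candidate n-grams, clipped by reference counts."""
    cand = ngrams(tokens(candidate), n)
    ref = ngrams(tokens(reference), n)
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched, sum(cand.values())


reference = "Show names of institutions from France."
logos_v1 = ("Find the institutions names of institutions associated "
            "with countries whose country name is France.")
logos_v2 = "Find institutions located in countries whose name is France."

for name, text in (("Logos v.1", logos_v1), ("Logos v.2", logos_v2)):
    for n in (1, 2):
        matched, total = clipped_precision(text, reference, n)
        print(name, str(n) + "-gram precision:", matched, "/", total)
```
      </preformat>
      <p>Under this tokenization, Logos v.1 matches 4 of 14 unigrams and 2 of 13 bigrams, whereas Logos v.2 matches 2 of 9 unigrams and no bigrams, since “names of” and “of institutions” occur only in the Logos v.1 translation.</p>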
      <p>Let us now focus on the results of SDSS. The scores are lower
than those of the CORDIS database. This is due to the nature of
the SDSS database that uses abbreviated names and letter symbols
in order to describe the content of its tables and attributes. For
instance, “photoobj” instead of “photometric objects”, or the letter “u”
to denote the magnitude of a photometric object in the “u” (ultraviolet)
filter. During the experiments, we realized that by transforming
those names and symbols into meaningful textual phrases, we
increase the size of the explanations compared to the size of
explanations provided by the astrophysicist SQL expert. This shows
another challenge for the automatic generation of NL explanations:
different styles of explanations may be given by domain experts
in different fields. Thus, different explanations are suitable for
different groups of people (e.g., astronomers and data scientists). For
example, for query 14, the BLEU score of Logos v.2 is substantially
lower than that of Logos v.1.</p>
      <p>SDSS query 14 explanations:
– Logos v.1: “Find the u, g, r, i and z of photoobj associated with specobj
whose class is QSO.”
– Logos v.2: “Find the magnitude u, magnitude g, magnitude r,
magnitude i and magnitude z of photometric objects corresponding to
spectroscopic objects whose class is QSO.”
– Ground truth: “Show me the u, g, r, i, z magnitudes of spectroscopic
quasars.”</p>
      <p>We concluded that this kind of notation (abbreviated names, and
letter symbols), for the attributes of the SDSS tables, is sometimes
preferred over full descriptions.</p>
      <p>Lastly, it has been observed that the SDSS database includes
many discrete variables (attributes) that define different types of
objects, e.g., stars. A fine example of that case is that of query 7.</p>
      <p>SDSS query 7 (Table 7) explanations:
– Logos v.1: “Find the specobjids of specobj whose subclass is OB and
class is STAR.”
– Logos v.2: “Find spectroscopic objects whose spectroscopic subclass
is OB and class is STAR.”
– Ground truth: “Find all spectroscopic stars which are massive and
hot.”</p>
      <p>We see that neither version of Logos understands that
“subclass = OB” and “class = STAR” means massive and hot stars. This
explains the low BLEU scores in both versions of the system. To
capture such cases, additional knowledge is required. We are in fact in the
process of integrating an ontology that provides such mappings.</p>
    </sec>
    <sec id="sec-7">
      <title>Human Evaluation</title>
      <p>
        For this experimental setting, an online survey was conducted.
A total of 21 people, all SQL experts, participated in the survey.
The experts were members of the INODE project [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. People who
contributed to the creation of labels (ground truth explanation)
were excluded from the evaluation process. From the pool of 28
SQL queries (see Tables 6 and 7), participants were asked to rate the
textual explanations of 4 randomly chosen queries (2 per database).
The queries were equally distributed to all participants. For each
query, the participant rated 3 explanations (1 per translation): (a) the
explanation produced by Logos v.1, (b) the explanation produced
by Logos v.2, and (c) the ground truth explanation, resulting in a
total of 84 explanations rated by humans.
      </p>
      <p>The participants judged the quality of the translations on
7-point Likert scales. They provided ratings for:
• clarity: how clear and understandable the explanation is.
• fluency: how natural the explanation is.
• precision: how well the information of the provided SQL
query is captured in its textual explanation.</p>
      <p>A score of 7 in any of the aforementioned categories means that the
provided explanation is extremely clear, natural, or precise,
respectively. A score of 4 is neutral.</p>
      <p>Data associated with respondents that completed the survey in
less than 5 minutes (half the approximate time for filling out the
survey) were deleted. Furthermore, we deleted the data of participants
who selected the same response to every question,
regardless of the question. After cleaning the data, we ended up having 2
different scores (per explanation and feature), corresponding to 2
different participants. The final score of an explanation for a given
feature is obtained by taking the average of the 2 different scores.
Therefore, we ended up having 1 single score per explanation and
feature. Indicatively, in Table 5 we show the data collected for the
explanations of the CORDIS query with id 9, and the obtained final
scores. From those final scores, 18 rating sets (2 databases x 3
translation systems x 3 features) consisting of 14 elements each (1 for
every query) were created.</p>
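      <p>The aggregation described above can be sketched in Python as follows. The ratings are hypothetical values on the 7-point scale and are not the data of Table 5; the variable names are likewise assumptions made for illustration.</p>
      <preformat>
```python
from itertools import product
from statistics import mean

# Hypothetical ratings on the 7-point scale; not the values of Table 5.
ratings = {
    ("Logos v.1", "clarity"): [4, 5],
    ("Logos v.1", "fluency"): [3, 4],
    ("Logos v.1", "precision"): [6, 7],
}

# Final score per (system, feature): the average of the two remaining ratings.
final_scores = {key: mean(values) for key, values in ratings.items()}

# One rating set per combination of database, translation system and feature.
databases = ["CORDIS", "SDSS"]
systems = ["Logos v.1", "Logos v.2", "Ground truth"]
features = ["clarity", "fluency", "precision"]
rating_sets = list(product(databases, systems, features))

print(final_scores)
print(len(rating_sets))
```
      </preformat>
      <p>Each of the 18 combinations then collects 14 final scores, one per query of the corresponding database.</p>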
      <p>In Table 4, we show the averages of those sets, accompanied
by their standard deviations in brackets. Logos v.2 leads to
better translations in terms of clarity and fluency for both databases
(average score increases). However, we see that these scores do not
surpass those of the ground truth. An interesting observation is
that as the explanations become clearer and more fluent, precision
decreases. In other words, as the explanations become more natural,
they tend to lose their ability to explicitly explain each part of their
associated SQL query. Lastly, we see that the difference between
the average fluency scores of Logos v.2 and Logos v.1 is larger for
the SDSS database. As mentioned in the previous section, this is
due to the nature of the SDSS database, which has a less explainable
database schema in terms of NL explanation. By adopting labels
for the components of the database schema (tables, attributes, and
joins), we increase the average fluency score (Logos v.2).</p>
    </sec>
    <sec id="sec-8">
      <title>CONCLUDING REMARKS</title>
      <p>
        In this paper, the translation of SQL queries has been discussed as a
potential solution to problems that arise in data exploration. In that
vein, we have extended the graph-based model of Logos [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The
system’s capabilities have been extended in two different directions:
(a) that of translating different types of queries and (b) that of
creating fluent explanations. To this end, we have implemented
changes in the system’s parser, introduced new types of nodes
and edges for the query graph, created new translation rules and
templates, and finally modified the available traversal strategies
(BST, MRP, TMT). In addition, a data set composed of 28 queries
coming from different use cases (astronomy, policy making) has
been created to perform both automated and human evaluation.
      </p>
      <p>The experiments highlighted the following two challenges: (a)
the need for creating automated metrics which do not consider
only the ground truth and (b) the need for generating different
explanations for different groups of people. Regarding the latter, we
observed that: (a) science-related databases are far more difficult
to explain in natural language and (b) sometimes the scientists
themselves ask for less detailed interpretations. Lastly, it has been
observed that as the explanations become more natural, they tend
to lose their ability to explicitly explain each part of the query.
Depending on the application, one may want to have more precise
or more natural explanations.</p>
      <p>Having said all this, SQL to text is a hard problem to solve and
there is no single solution. Different explanations may be suitable
for different applications or even different groups of people, albeit
all correct. As future work, we plan to improve our system by
adding an ontology which provides mappings for the interpretation
of several parts of the query graph.</p>
    </sec>
    <sec id="sec-9">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work was supported by the European Union’s Horizon 2020
research and innovation programme under grant agreement No
863410. We also thank the Intelligent Open Data Exploration
(INODE) team for contributing to the system’s evaluation.</p>
    </sec>
    <sec id="sec-10">
      <title>APPENDIX</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Sihem</given-names>
            <surname>Amer-Yahia</surname>
          </string-name>
          , Georgia Koutrika, Frederic Bastian, Theofilos Belmpas, Martin Braschler, Ursin Brunner, Diego Calvanese, Maximilian Fabricius, Orest Gkini, Catherine Kosten, Davide Lanti, Antonis Litke, Hendrik Lücke-Tieke, Francesco Alessandro Massucci, Tarcisio Mendes de Farias, Alessandro Mosca, Francesco Multari, Nikolaos Papadakis, Dimitris Papadopoulos, Yogendra Patil, Aurélien Personnaz, Guillem Rull, Ana Claudia Sima,
          <string-name>
            <given-names>Ellery</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Dimitrios</given-names>
            <surname>Skoutas</surname>
          </string-name>
          , Srividya Subramanian, Guohui Xiao, and
          <string-name>
            <given-names>Kurt</given-names>
            <surname>Stockinger</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>INODE: Building an End-to-End Data Exploration System in Practice [Extended Vision]</article-title>
          .
          <source>CoRR</source>
          abs/2104.04194 (
          <year>2021</year>
          ). arXiv:2104.04194 https://arxiv.org/abs/2104.04194
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Srinivasan</given-names>
            <surname>Iyer</surname>
          </string-name>
          , Ioannis Konstas, Alvin Cheung, and
          <string-name>
            <given-names>Luke</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Summarizing Source Code using a Neural Attention Model</article-title>
          . In ACL. The Association for Computer Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Kokkalis</surname>
          </string-name>
          , Panagiotis Vagenas, Alexandros Zervakis, Alkis Simitsis, Georgia Koutrika, and
          <string-name>
            <given-names>Yannis</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Logos: a system for translating queries into narratives</article-title>
          .
          <source>In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data</source>
          .
          <fpage>673</fpage>
          -
          <lpage>676</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Georgia</given-names>
            <surname>Koutrika</surname>
          </string-name>
          , Alkis Simitsis, and
          <string-name>
            <given-names>Yannis E</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Explaining structured queries in natural language</article-title>
          .
          <source>In ICDE. IEEE</source>
          ,
          <fpage>333</fpage>
          -
          <lpage>344</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Axel-Cyrille</given-names>
            <surname>Ngonga Ngomo</surname>
          </string-name>
          , Lorenz Bühmann, Christina Unger, Jens Lehmann, and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Gerber</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Sorry, I don't speak SPARQL: translating SPARQL queries into natural language</article-title>
          .
          <source>In Proceedings of the 22nd international conference on World Wide Web</source>
          .
          <fpage>977</fpage>
          -
          <lpage>988</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Kishore</given-names>
            <surname>Papineni</surname>
          </string-name>
          , Salim Roukos, Todd Ward, and
          <string-name>
            <given-names>Wei-Jing</given-names>
            <surname>Zhu</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Bleu: a method for automatic evaluation of machine translation</article-title>
          .
          <source>In Proceedings of the 40th annual meeting of the Association for Computational Linguistics</source>
          .
          <fpage>311</fpage>
          -
          <lpage>318</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Ehud</given-names>
            <surname>Reiter</surname>
          </string-name>
          and
          <string-name>
            <given-names>Robert</given-names>
            <surname>Dale</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>Building applied natural language generation systems</article-title>
          .
          <source>Nat. Lang. Eng. 3</source>
          ,
          <issue>1</issue>
          (
          <year>1997</year>
          ),
          <fpage>57</fpage>
          -
          <lpage>87</lpage>
          . https://doi.org/10.1017/S1351324997001502
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Alkis</given-names>
            <surname>Simitsis</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yannis E.</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>DBMSs Should Talk Back Too</article-title>
          .
          <source>In CIDR.</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Alkis</given-names>
            <surname>Simitsis</surname>
          </string-name>
          , Georgia Koutrika, Yannis Alexandrakis, and
          <string-name>
            <given-names>Yannis</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Synthesizing structured text from logical database subsets</article-title>
          .
          <source>In EDBT</source>
          .
          <fpage>428</fpage>
          -
          <lpage>439</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Chris</given-names>
            <surname>van der Lee</surname>
          </string-name>
          , Emiel Krahmer, and
          <string-name>
            <given-names>Sander</given-names>
            <surname>Wubben</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Automated learning of templates for data-to-text generation: comparing rule-based, statistical and neural methods</article-title>
          .
          <source>In Proceedings of the 11th International Conference on Natural Language Generation</source>
          .
          <fpage>35</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Sam</given-names>
            <surname>Wiseman</surname>
          </string-name>
          , Stuart M Shieber, and
          <string-name>
            <given-names>Alexander M</given-names>
            <surname>Rush</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Learning neural templates for text generation</article-title>
          . arXiv preprint arXiv:1808.10122 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Kun</given-names>
            <surname>Xu</surname>
          </string-name>
          , Lingfei Wu, Zhiguo Wang,
          <string-name>
            <given-names>Yansong</given-names>
            <surname>Feng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Vadim</given-names>
            <surname>Sheinin</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>SQL-to-Text Generation with Graph-to-Sequence Model</article-title>
          .
          <source>In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</source>
          , Ellen Riloff, David Chiang,
          <string-name>
            <given-names>Julia</given-names>
            <surname>Hockenmaier</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jun'ichi</given-names>
            <surname>Tsujii</surname>
          </string-name>
          (Eds.).
          <fpage>931</fpage>
          -
          <lpage>936</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>