Towards Conversational OLAP

Matteo Francia, Enrico Gallinucci, Matteo Golfarelli
DISI - University of Bologna
m.francia@unibo.it, enrico.gallinucci@unibo.it, matteo.golfarelli@unibo.it

ABSTRACT
The democratization of data access and the adoption of OLAP in scenarios requiring hands-free interfaces push towards the creation of smart OLAP interfaces. In this paper, we envisage a conversational framework specifically devised for OLAP applications. The system converts natural language text into GPSJ (Generalized Projection, Selection and Join) queries. The approach relies on an ad-hoc grammar and a knowledge base storing multidimensional metadata and cube values. In case of an ambiguous or incomplete query description, the system is able to obtain the correct query either through automatic inference or through interactions with the user to disambiguate the text. Our tests show very promising results, both in terms of effectiveness and efficiency.

© Copyright 2020 for this paper held by its author(s). Published in the proceedings of DOLAP 2020 (March 30, 2020, Copenhagen, Denmark, co-located with EDBT/ICDT 2020) on CEUR-WS.org. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 INTRODUCTION
Nowadays, one of the most popular research trends in computer science is the democratization of data access, analysis, and visualization, which means opening them to end users lacking the required vertical skills on the services themselves. Smart personal assistants [10] (Alexa, Siri, etc.) and automated machine learning services [20] are examples of such research efforts that are now on corporate agendas [1].

In particular, interfacing natural language processing (either written or spoken) to database systems opens new opportunities for data exploration and querying [13]. Actually, in the area of data warehousing, OLAP (On-Line Analytical Processing) itself is an "ante litteram" smart interface, since it supports users with a "point-and-click" metaphor that avoids writing well-formed SQL queries. Nonetheless, the possibility of having a conversation with a smart assistant to run an OLAP session (i.e., a set of related OLAP queries) opens new scenarios and applications. It is not just a matter of further reducing the complexity of posing a query: a conversational OLAP system must also provide feedback to refine and correct wrong queries, and it must have memory to relate subsequent requests. A reference application scenario for this kind of framework is augmented business intelligence [9], where hands-free interfaces are mandatory.

In this paper, we envisage a conversational OLAP framework able to convert a natural language text into a GPSJ query. GPSJ [11] is the main class of queries used in OLAP, since it enables Generalized Projection, Selection, and Join operations over a set of tables. Although some natural language interfaces to databases have already been proposed, to the best of our knowledge this is the first proposal specific to OLAP systems.

In our vision, the desiderata for an OLAP smart interface are:
#1 it must be automated and portable: it must exploit cube metadata (e.g., hierarchy structures, roles of measures, attributes, and aggregation operators) to increase its understanding capabilities and to simplify the user-machine interaction process;
#2 it must handle OLAP sequences rather than single queries: in an OLAP sequence the first query is fully described by the text, while the following ones are implicitly/partially described by an OLAP operator and require the system to have memory of the previous ones;
#3 it must be robust with respect to user inaccuracies in using syntax, OLAP terms, and attribute values, as well as in the presence of implicit information;
#4 it must be easy to configure on a DW without a heavy manual definition of the lexicon.

More technically, our text-to-SQL approach is based on a grammar parsing natural language descriptions of GPSJ queries. The recognized entities include a set of typical query lexicons (e.g., group by, select, filter) and the domain-specific terms and values automatically extracted from the DW (see desiderata #1 and #4). Robustness (desideratum #3) is one of the main goals of our approach and is pursued in all the translation phases: lexicon identification is based on a string similarity function, multi-word lexicons are handled through n-grams, and alternative query interpretations are scored and ranked. The grammar proposed in Section 4.3 recognizes full queries only, thus desideratum #2 is not covered by the current work. To sum up, the main contributions of this paper are:
(1) a list of features and desiderata for an effective conversational OLAP system;
(2) an architectural view of the whole framework;
(3) an original approach to translate the natural language description of an OLAP query into a well-formed GPSJ query (Full query module in Figure 1);
(4) a set of tests to verify the effectiveness of our approach.

The remainder of the paper is organized as follows: in Section 2 the functional architecture and the modules of a conversational OLAP framework are sketched; Section 3 discusses related works; Section 4 describes in detail the query generation process; in Section 5 a large set of effectiveness and efficiency tests is reported; finally, Section 6 draws the conclusions and discusses the system evolution.

2 SYSTEM OVERVIEW
Figure 1 sketches a functional view of the architecture. Given a DW, that is, a set of multidimensional cubes together with their metadata, the offline phase is aimed at extracting the DW-specific terms used by users to express queries. Such information is stored in the system knowledge base (KB). Noticeably, this phase runs only once for each DW, unless the DW undergoes modifications. More in detail, the Automatic KB feeding process extracts from the cubes the categorical attribute values and the metadata (e.g., attribute and measure names, hierarchy structures, aggregation operators). With reference to the cube represented through a DFM schema in Figure 2, the KB will store the names of the measures (e.g., StoreSales, StoreCost) together with their default aggregation operators (e.g., sum, avg). Additional synonyms can be automatically extracted from open data ontologies (WordNet [16] in our implementation); for example, "client" is a synonym for Customer. The larger the number of synonyms, the wider the language understood by the system. Besides the domain-specific terminology, the KB includes the set of standard OLAP terms that are domain independent and that do not require any feeding (e.g., group by, equal to, select).
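To make the Automatic KB feeding step concrete, the sketch below harvests entity names and categorical values from a JDBC-accessible DW through standard data-dictionary calls. It is a minimal sketch, not the paper's actual implementation: the connection URL, the VARCHAR-based test used to single out categorical attributes, and the console output in place of real KB persistence are our own illustrative assumptions, and the WordNet-based synonym expansion is omitted.

import java.sql.*;
import java.util.*;

public class KbFeeder {
    public static void main(String[] args) throws SQLException {
        // Illustrative connection string; any JDBC-accessible DW works the same way.
        try (Connection dw = DriverManager.getConnection("jdbc:mysql://localhost/foodmart")) {
            DatabaseMetaData meta = dw.getMetaData();
            Map<String, List<String>> categorical = new HashMap<>();
            // Harvest table/column names; keep only textual columns as a crude
            // stand-in for "categorical attribute" (assumption, not the paper's rule).
            try (ResultSet rs = meta.getColumns(null, null, "%", "%")) {
                while (rs.next())
                    if (rs.getInt("DATA_TYPE") == Types.VARCHAR)
                        categorical.computeIfAbsent(rs.getString("TABLE_NAME"),
                                t -> new ArrayList<>()).add(rs.getString("COLUMN_NAME"));
            }
            // Store every distinct value of each categorical attribute as a KB entity
            // (printed here; a real implementation would persist it in the KB).
            try (Statement st = dw.createStatement()) {
                for (Map.Entry<String, List<String>> e : categorical.entrySet())
                    for (String col : e.getValue())
                        try (ResultSet vals = st.executeQuery(
                                "SELECT DISTINCT " + col + " FROM " + e.getKey())) {
                            while (vals.next())
                                System.out.println(e.getKey() + "." + col + " = " + vals.getString(1));
                        }
            }
        }
    }
}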
Further enrichment can optionally be carried out manually (i.e., by the KB enrichment module) when the application domain involves a non-standard vocabulary (i.e., when the physical names of tables and columns do not match the words of a standard vocabulary).

The online phase runs every time a query is issued to the system. The spoken query is initially translated to text by the Speech-to-text module. This task is out of the scope of our research, and we exploited the Google API in our implementation. The uninterpreted text is then analyzed by the Interpretation module, which actually consists of two sub-modules: Full query is in charge of interpreting the texts describing full queries, which typically happens when an OLAP session starts. Conversely, OLAP operator modifies the latest query when the user states an OLAP operator along an OLAP session. On the one hand, understanding a single OLAP operator is simpler, since it involves fewer elements. On the other hand, it requires having memory of previous queries (stored in the Log) and understanding which part of the previous query must be modified.

Due to natural language ambiguity, user errors, and system inaccuracies, part of the text can be misunderstood. The role of the Disambiguation module is to solve ambiguities by asking appropriate questions to the user. The reasons behind the misunderstandings are manifold, including (but not limited to): ambiguities in the aggregation operator to be used; inconsistency between attribute and value in a selection predicate; identification of relevant elements in the text without full comprehension.

The output of the previous steps is a data structure (i.e., a parse tree) that models the query and that can be automatically turned into a SQL query exploiting the DW structure stored in the KB. Finally, the obtained query is run on the DW and the results are reported to the user by the Execution & Visualization module. Such a module could exploit a standard OLAP visualization tool, or it could implement voice-based approaches [21] to create an end-to-end conversational solution.

3 RELATED WORKS
Conversational business intelligence can be classified as a natural language interface (NLI) to business intelligence systems to drive analytic sessions. Despite the plethora of contributions in each area, to the best of our knowledge, no approach lies at their intersection.

NLIs to operational databases enable users to specify complex queries without previous training on formal programming languages (such as SQL) and software; a recent and comprehensive survey is provided in [2]. Overall, NLIs are divided into two categories: question answering (QA) and dialog (D). While the former are designed to operate on single queries, only the latter are capable of supporting sequences of related queries as needed in OLAP analytic sessions. However, to the best of our knowledge, no dialog-based system for OLAP sessions has been provided so far. The only contribution in the dialog-based direction is [14], where the authors provide an architecture for querying relational databases; with respect to this contribution, we rely on the formal foundations of the multidimensional model to drive analytic sessions (e.g., according to the multidimensional model it is not possible to group by a measure, compute aggregations of categorical attributes, aggregate by descriptive attributes, or ensure drill-across validity). Also differently from [14], the results we provide are supported by an extensive effectiveness and efficiency evaluation, which is completely lacking in [14]. Finally, existing dialog systems, such as [17], address the exploration of linked data; hence, they are not suitable for analytics on the multidimensional model.

As to question answering, existing systems are well understood and differ in the knowledge required to formulate the query and in the generative approach. Domain-agnostic approaches solely rely on the database schema. NaLIR [13] translates natural language queries into dependency trees [15] and brute-forcefully transforms promising trees until a valid query can be generated. In our approach we rely on n-grams instead of dependency trees [15], since the latter cannot be directly mapped to entities in the knowledge base (i.e., they require tree manipulation) and are sensitive to the query syntax (e.g., "sum unit sales" and "sum the unit sales" produce two different trees with the same meaning). SQLizer [22] generates templates over the issued query and applies a "repair" loop until it generates queries that can be obtained using at most a given number of changes from the initial template. Domain-specific approaches add semantics to the translation process by means of domain-specific ontologies and ontology-to-database mappings. SODA [5] uses a simple but limited keyword-based approach that generates a reasonable and executable SQL query based on the matches between the input query and the database metadata, enriched with domain-specific ontologies. ATHENA [18] and its recent extension [19] map natural language into an ontology representation and exploit mappings crafted by the relational schema designer to resolve SQL queries. Analyza [6] integrates the domain-specific ontology into a "semantic grammar" (i.e., a grammar with placeholders for typed concepts such as measures, dimensions, etc.) to annotate and finally parse the user query. Additionally, Analyza provides an intuitive interface facilitating user-system interaction in spreadsheets. Unfortunately, by relying on the definition of domain-specific knowledge and mappings, the adoption of these approaches is not plug-and-play, as an ad-hoc ontology is rarely available and is burdensome to create.

In the area of business intelligence, the road to conversation-driven OLAP is not paved yet. The recommendation of OLAP sessions to improve data exploration has been well understood [3], also in unconventional contexts [9] where hands-free interfaces are mandatory. Recommendation systems focus on integrating (previous) user experience with external knowledge to suggest queries or sessions, rather than providing smart interfaces to BI tools. To this end, personal assistants and conversational interfaces can help users unfamiliar with such tools and with the SQL language to perform data exploration. However, end-to-end frameworks are not provided in the domain of analytic sessions over multidimensional data. QUASL [12] introduces a QA approach over the multidimensional model that supports analytical queries, but it lacks both a formalization of the disambiguation process (i.e., how ambiguous results are addressed) and the support of OLAP sessions (with respect to QA, handling OLAP sessions requires managing previous knowledge from the Log and understanding whether the issued sentence refines the previous query or is a new one). Complementarily to our framework, [21] recently formalized the vocalization of OLAP results.

To summarize, the main differences of our approach with respect to previous works are the following.
(1) The envisioning of an end-to-end general-purpose dialog-driven framework supporting OLAP sessions.
(2) The definition of the framework's functional architecture and the formalization of its modules.
(3) The enforcement of multidimensional constraints by means of a formal grammar supporting GPSJ queries.
(4) A plug-and-play implementation of the core modules on top of existing data warehouses: our framework does not impact existing schemata, and the integration with external knowledge is supported but not mandatory.

Figure 1: A functional architecture for a conversational OLAP framework. [diagram: online phase: Speech-to-Text produces the raw text, analyzed by the Interpretation module (Full query / OLAP operator, backed by the Log) into a parse forest; Disambiguation yields the parse tree, SQL generation produces the SQL, and Execution & Visualization returns the results; offline phase: Automatic KB feeding (metadata & values from the DW) and KB enrichment (synonyms from an ontology) populate the KB]

Figure 2: Simplified DFM representation of the Foodmart Sales cube. [the Sales fact has measures StoreSales, StoreCost, and UnitSales; dimensions: Store (S.City, S.Region), Customer (Gender, Name, C.City, C.Region), Product (Type, Category, Family), Date (Month, Quarter, Year)]

4 QUERY GENERATION
In the next subsections, we describe in detail the interpretation process for a single full query. Figure 3 shows the necessary steps of the online phase. We remind that, with reference to Figure 1, the modules Speech-to-Text, OLAP operator, and Execution & Visualization are out of scope here. Conversely, the offline phase and the related modules are implemented but not reported.

Figure 3: Query generation steps for a text describing a complete query ("Sales by Customer and Month"). The KB is involved in all the steps. [pipeline: Tokenization & Mapping produces the mappings, Parsing the parse forest, Checking & Enhancement the annotated parse forest, Disambiguation the parse tree, and SQL generation the SQL]

4.1 The Knowledge Base
The system Knowledge Base (KB in Figure 1) stores all the information extracted from the DW that is necessary to support the translation phases. Information can be classified into DW structure and DW entities. More in detail, the DW structure enables consistency checks on parse trees and, given a parse tree, allows the SQL generation. It includes:
• Hierarchy structure: the roll-up relationships between attributes.
• Aggregation operators: for each measure, the applicable operators and the default one.
• DB tables: the structure of the database implementing the DW, including table and attribute names, primary keys, and foreign key relationships.
The set of DW entities includes:
• DW element names: measure, dimensional attribute, and fact names.
• DW element values: for each categorical attribute, all the values are stored.
Several synonyms can be stored for each entity, enabling the system to cope with slang and different shades of the text. Both DW entities and DW structure are automatically collected by querying the DW and its data dictionary, and are stored in a QB4OLAP [8] compatible repository. Besides DW entities, which are domain specific, the knowledge base includes those keywords and patterns that are typically used to express a query:
• Intention keywords: express the role of the subsequent part of the text. Examples of intention keywords are group by, select, and filter.
• Operators: include logical (e.g., and, or), comparison (e.g., greater, equal), and aggregation (e.g., sum, average) operators.
• Patterns of dates and numbers: used to automatically recognize dates and numbers in the raw text.
Overall, the entities defined in the system form the set $\mathcal{E} = \{E_1, E_2, ..., E_m\}$; note that each entity can be a multi-word, which can be modeled as a sequence of tokens $E = \langle t_1, ..., t_r \rangle$.

4.2 Tokenization and Mapping
A raw text T can be modeled as a sequence of tokens (i.e., single words) $T = \langle t_1, t_2, ..., t_z \rangle$. The goal of this phase is to identify in T the known entities, which are the only elements involved in the Parsing phase.

Turning a text into a sequence of entities means finding a mapping between the tokens in T and $\mathcal{E}$. More formally:

Definition 4.1 (Mapping & Mapping function). A mapping function M(T) is a partial function that associates sub-sequences from T (the term n-gram is used as a synonym of sub-sequence in the area of text mining) to entities in $\mathcal{E}$ such that:
• sub-sequences of T have length n at most;
• the mapping function determines a partitioning of T;
• a sub-sequence $T' = \langle t_i, ..., t_l \rangle$ of T (with $l \le n$) is associated to an entity E if and only if $E \in \mathrm{Top}N_{E \in \mathcal{E}}\, Sim(T', E)$ and $Sim(T', E) > \alpha$.

The output of a mapping function is a sequence $M = \langle E_1, ..., E_l \rangle$ over $\mathcal{E}$ that we call a mapping. Given the mapping M, the similarities between each entity $E_i$ and the corresponding tokens, $Sim(T', E_i)$, are also retained and denoted with $Sim_M(E_i)$. A mapping is said to be valid if the fraction of mapped tokens in T is higher than a given threshold β. We call $\mathcal{M}$ the set of valid mappings.

Several mapping functions (and thus several mappings) may exist between T and $\mathcal{E}$, since Definition 4.1 admits sub-sequences of variable length and retains, for each sub-sequence, the top similar entities. This increases interpretation robustness, but it can lead to an increase in computation time. Generating several different mappings increases robustness, since it allows choosing, in the next steps, the best text interpretation out of a higher number of candidates. Generated mappings differ both in the number of entities involved and in the specific entities mapped to a token. In the simple case where multi-token mappings are not possible (i.e., n = 1 in Definition 4.1), the number of generated mappings for a raw text T, such that |T| = z, is:

$$\sum_{i=\lceil z \cdot \beta \rceil}^{z} \binom{z}{i} \cdot N^i$$

The formula counts the possible configurations of sufficient length (i.e., higher than or equal to $\lceil z \cdot \beta \rceil$) and, for each length, counts the number of mappings determined by the top similar entities. Since the number of candidate mappings is exponential, we consider only the most significant ones through α, β, and N: α imposes sub-sequences of tokens to be very similar to an entity; N further imposes to consider only the N entities with the highest similarity; finally, β imposes that a sufficient portion of the text is mapped.
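The enumeration induced by Definition 4.1 can be sketched as a recursive partitioning of the token sequence, where each n-gram is either left unmapped or mapped to one of its top-N entities. The Java sketch below is our own rendition under stated assumptions: topN() is a stub for the KB lookup (at most N entities with similarity above α), and all class and method names are illustrative, not the paper's.

import java.util.*;

public class MappingEnumerator {
    static final int N = 6, MAX_LEN = 4;          // top-N entities, n-gram length bound
    static final double ALPHA = 0.4, BETA = 0.7;  // similarity and coverage thresholds

    record Match(String entity, double sim) {}

    // KB lookup stub: at most N entities with Sim > ALPHA for the given n-gram.
    static List<Match> topN(List<String> ngram) { return List.of(); }

    // Partition tokens[from..] into n-grams, branching on each n-gram's candidate
    // entities; a mapping is valid only if it covers at least BETA of the tokens.
    static void enumerate(List<String> tokens, int from, List<Match> acc,
                          int covered, List<List<Match>> out) {
        if (from == tokens.size()) {
            if (covered >= BETA * tokens.size()) out.add(List.copyOf(acc));
            return;
        }
        enumerate(tokens, from + 1, acc, covered, out);  // leave this token unmapped
        for (int len = 1; len <= MAX_LEN && from + len <= tokens.size(); len++)
            for (Match m : topN(tokens.subList(from, from + len))) {
                acc.add(m);                              // map the n-gram to entity m
                enumerate(tokens, from + len, acc, covered + len, out);
                acc.remove(acc.size() - 1);              // backtrack
            }
    }

    // Sum of the retained similarities, used to rank mappings (Equation (1) below).
    static double score(List<Match> mapping) {
        return mapping.stream().mapToDouble(Match::sim).sum();
    }
}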
The similarity function Sim() is based on the Levenshtein distance and takes token permutations into account, so that similarity is robust to token reorderings (e.g., the sub-sequences ⟨P., Edgar⟩ and ⟨Edgar, Allan, Poe⟩ must result similar). Given two token sequences T and W with |T| = l and |W| = m, such that l ≤ m, it is:

$$Sim(\langle t_1, ..., t_l \rangle, \langle w_1, ..., w_m \rangle) = \max_{D \in Disp(l,m)} \frac{\sum_{i=1}^{l} sim(t_i, w_{D(i)}) \cdot \max(|t_i|, |w_{D(i)}|)}{\sum_{i=1}^{l} \max(|t_i|, |w_{D(i)}|) + \sum_{i \in \hat{D}} |w_i|}$$

where $D \in Disp(l,m)$ is an l-disposition of $\{1, ..., m\}$ and $\hat{D}$ is the subset of values in $\{1, ..., m\}$ that are not present in D. Function Sim() weights token similarities based on the token lengths (i.e., $\max(|t_i|, |w_{D(i)}|)$) and penalizes similarities between sequences of different lengths, which imply unmatched tokens (i.e., $\sum_{i \in \hat{D}} |w_i|$).

Figure 4: Token dispositions: arcs denote the correspondence of tokens for a specific disposition (the bolder the line, the higher the similarity). The disposition determining the maximum similarity is shown in a dashed rectangle. [diagram: six dispositions between ⟨P., Edgar⟩ and ⟨Edgar, Allan, Poe⟩]

Example 4.2 (Token similarity). Figure 4 shows some of the possible token dispositions for the two token sequences T = ⟨P., Edgar⟩ and W = ⟨Edgar, Allan, Poe⟩. The disposition determining the highest similarity is surrounded by a dashed rectangle; the similarity is 0.46 and it is calculated as

$$Sim(T, W) = \frac{sim(P., Poe) \cdot |Poe| + sim(Edgar, Edgar) \cdot |Edgar|}{|Poe| + |Edgar| + |Allan|}$$ □

We assume the best interpretation of the input text to be the one where (1) all the entities discovered in the text are included in the query (i.e., all the entities are parsed through the grammar) and (2) each entity discovered in the text is perfectly mapped to one sub-sequence of tokens (i.e., $Sim_M(E_i) = 1$). The two previous statements are modeled through the following score function. Given a mapping $M = \langle E_1, ..., E_m \rangle$, we define its score as

$$Score(M) = \sum_{i=1}^{m} Sim_M(E_i) \qquad (1)$$

The score is higher when M includes several entities with high values of $Sim_M$. Although at this stage it is not possible to predict whether a mapping will be fully parsed, it is apparent that the higher the mapping score, the higher its probability to determine an optimal interpretation. As will be explained in Section 4.4, ordering the mappings by descending score also enables pruning strategies to be applied.

Example 4.3 (Tokenization and mapping). Given the set of entities $\mathcal{E}$ and a tokenized text T = ⟨medium, sales, in, 2019, by, the, region⟩, examples of mappings M1 and M2 are:

M1 = ⟨avg, UnitSales, where, 2019, group by, Region⟩
M2 = ⟨avg, UnitSales, where, 2019, group by, Regin⟩

where the token "the" is not mapped, and the token "region" is mapped to the attribute Region in M1 and to the value Regin in M2 (where Regin is a value of attribute Customer that holds a sufficient similarity). □
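To make the disposition-based similarity concrete, the following self-contained sketch reproduces Example 4.2. It is our own rendition: we assume that the token-level sim() is a Levenshtein similarity normalized by the longer string, which indeed yields Sim ≈ 0.46 on the example, although the paper does not commit to this exact normalization.

import java.util.*;

public class SequenceSimilarity {

    // Normalized Levenshtein similarity in [0,1] (assumed normalization).
    static double sim(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++)
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1));
        return 1.0 - (double) d[a.length()][b.length()] / Math.max(a.length(), b.length());
    }

    // Max over the l-dispositions D of {1..m}: weighted token similarity with a
    // penalty for the unmatched tokens of W (those outside D).
    static double Sim(List<String> t, List<String> w) {
        return best(t, w, 0, new boolean[w.size()], 0, 0);
    }

    private static double best(List<String> t, List<String> w, int i,
                               boolean[] used, double num, double den) {
        if (i == t.size()) {
            double unmatched = 0;                    // sum of |w_j| for j not in D
            for (int j = 0; j < w.size(); j++) if (!used[j]) unmatched += w.get(j).length();
            return num / (den + unmatched);
        }
        double max = 0;
        for (int j = 0; j < w.size(); j++) {
            if (used[j]) continue;                   // dispositions are injective
            used[j] = true;
            double weight = Math.max(t.get(i).length(), w.get(j).length());
            max = Math.max(max, best(t, w, i + 1, used,
                    num + sim(t.get(i), w.get(j)) * weight, den + weight));
            used[j] = false;
        }
        return max;
    }

    public static void main(String[] args) {         // Example 4.2: prints ~0.46
        System.out.println(Sim(List.of("P.", "Edgar"), List.of("Edgar", "Allan", "Poe")));
    }
}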
4.3 Query Parsing
Parsing a valid mapping means searching in the text for the complex syntactic structures (i.e., clauses) that build up the query. Given a mapping M, the output of a parser is a parse tree $PT_M$, i.e., an ordered, rooted tree that represents the syntactic structure of a string according to a given grammar. For the purpose of parsing, entities (i.e., terminal elements of the grammar) are grouped into syntactic categories (see Table 1). The whole grammar is described in Figure 5.

Table 1: Generic and domain-specific syntactic categories.
Category | Entity | Synonym samples
Int | select | return, show, get
Whr | where | in, such that
Gby | group by | by, for each, per
Cop | =, <>, >, <, ≥, ≤ | equal to, greater than
Agg | sum, avg | total, medium
Cnt | count, count distinct | number, amount
Fct | Facts | Domain specific
Mea | Measures | Domain specific
Att | Attributes | Domain specific
Val | Categorical values; dates and numbers | Domain specific

Figure 5: Backus-Naur representation of the grammar.
⟨GPSJ⟩ ::= ⟨MC⟩⟨GC⟩⟨SC⟩ | ⟨MC⟩⟨SC⟩⟨GC⟩ | ⟨SC⟩⟨GC⟩⟨MC⟩ | ⟨SC⟩⟨MC⟩⟨GC⟩ | ⟨GC⟩⟨SC⟩⟨MC⟩ | ⟨GC⟩⟨MC⟩⟨SC⟩ | ⟨MC⟩⟨SC⟩ | ⟨MC⟩⟨GC⟩ | ⟨SC⟩⟨MC⟩ | ⟨GC⟩⟨MC⟩ | ⟨MC⟩
⟨MC⟩ ::= (⟨Agg⟩⟨Mea⟩ | ⟨Mea⟩⟨Agg⟩ | ⟨Mea⟩ | ⟨Cnt⟩⟨Fct⟩ | ⟨Cnt⟩⟨Attr⟩)+
⟨GC⟩ ::= ⟨Gby⟩⟨Attr⟩+
⟨SC⟩ ::= ⟨Whr⟩⟨SCO⟩
⟨SCO⟩ ::= ⟨SCA⟩ "or" ⟨SCO⟩ | ⟨SCA⟩
⟨SCA⟩ ::= ⟨SCN⟩ "and" ⟨SCA⟩ | ⟨SCN⟩
⟨SCN⟩ ::= "not" ⟨SSC⟩ | ⟨SSC⟩
⟨SSC⟩ ::= ⟨Attr⟩⟨Cop⟩⟨Val⟩ | ⟨Attr⟩⟨Val⟩ | ⟨Val⟩⟨Cop⟩⟨Attr⟩ | ⟨Val⟩⟨Attr⟩ | ⟨Val⟩

As a GPSJ query consists of three clauses (measure, group by, and selection), in our grammar we identify four types of derivations (a derivation of the form ⟨X⟩ ::= e represents a substitution of the non-terminal symbol ⟨X⟩ with the expression e; symbols that never appear on the left side of ::= are named terminals; non-terminal symbols are enclosed between ⟨⟩):
• Measure clause ⟨MC⟩: this derivation consists of a list of measure/aggregation operator pairs. If the operator is omitted (i.e., ⟨MC⟩ ::= ⟨Mea⟩ applies), the default operator specified in the KB will be applied to the measure during the Checking & Enhancement step.
• Group-by clause ⟨GC⟩: this derivation consists of a sequence of attribute names preceded by an entity of type ⟨Gby⟩.
• Selection clause ⟨SC⟩: this derivation consists of a Boolean expression of simple selection predicates ⟨SSC⟩; it follows the standard SQL operator priority (see the ⟨SCO⟩, ⟨SCA⟩, and ⟨SCN⟩ derivations).
• GPSJ query ⟨GPSJ⟩: this derivation assembles the final query. Only the measure clause is mandatory, since a GPSJ query could aggregate a single measure with no selections. The order of the clauses is irrelevant; this implies the proliferation of derivations due to clause permutations.

The GPSJ grammar is LL(1) [4] (the rules presented in Figure 5 do not satisfy the LL(1) constraints for readability reasons; it is easy to turn them into an LL(1)-compliant version, but the resulting rules are much harder to read and understand), it is not ambiguous (i.e., each mapping admits a single parse tree $PT_M$), and it can be parsed by an LL(1) parser with linear complexity [4]. If the input mapping M can be fully parsed, $PT_M$ will include all the entities as leaves. Conversely, if only a portion of the input belongs to the grammar, an LL(1) parser will produce a partial parsing, meaning that it will return a parse tree including the portion of the input mapping that belongs to the grammar. The remaining entities can be either singletons or complex clauses that could not be connected to the main parse tree. We call parse forest $PF_M$ the union of the parse tree with the residual clauses. Obviously, if all the entities are parsed, then $PF_M = PT_M$. Considering the whole forest rather than the simple parse tree enables disambiguation and error recovery during the Disambiguation phase.

Example 4.4 (Parsing). Figure 6 reports the parsing outcome for the two mappings of Example 4.3. M1 is fully parsed, thus its parsing forest corresponds to the parse tree (i.e., $PT_{M1} = PF_{M1}$). Conversely, in M2 the last token is wrongly mapped to the attribute value Regin rather than to the attribute name Region. This prevents the full parsing, and the parse tree does not include all the entities in M2. □

Figure 6: Parsing forests from Example 4.3. [(a) M1 = ⟨avg, UnitSales, where, 2019, group by, Region⟩: the entire mapping is parsed (i.e., $PF_M = PT_M$); (b) M2 = ⟨avg, UnitSales, where, 2019, group by, Regin⟩: ⟨Gby⟩⟨Val⟩ cannot be parsed, so a residual ⟨SC⟩ clause remains outside the parse tree]
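For illustration, the following recursive-descent sketch covers two rules of Figure 5 (a simplified ⟨MC⟩ and ⟨GC⟩). It is our own sketch, not the system's parser: entity categories follow Table 1, the ⟨Cnt⟩ alternatives are omitted, and partial parsing is reduced to leaving unconsumed entities in the input.

import java.util.*;

public class MiniParser {
    record Entity(String cat, String value) {}   // e.g. ("Agg","avg"), ("Attr","Region")

    private final List<Entity> input;
    private int pos = 0;
    MiniParser(List<Entity> mapping) { this.input = mapping; }

    private boolean peek(String cat) {
        return pos < input.size() && input.get(pos).cat().equals(cat);
    }
    private Entity eat() { return input.get(pos++); }

    // <MC> ::= (<Agg><Mea> | <Mea><Agg> | <Mea>)+   (simplified)
    List<String> measureClause() {
        List<String> out = new ArrayList<>();
        while (peek("Agg") || peek("Mea")) {
            if (peek("Agg")) {
                String agg = eat().value();
                out.add(agg + "(" + expect("Mea") + ")");
            } else {
                String mea = eat().value();
                // trailing <Agg>, or implicit operator later resolved from the KB
                out.add(peek("Agg") ? eat().value() + "(" + mea + ")"
                                    : "default(" + mea + ")");
            }
        }
        return out;
    }

    // <GC> ::= <Gby><Attr>+
    List<String> groupByClause() {
        List<String> out = new ArrayList<>();
        if (!peek("Gby")) return out;   // clause absent: input left for other rules
        eat();
        while (peek("Attr")) out.add(eat().value());
        return out;
    }

    private String expect(String cat) {
        // A real partial parser would emit a residual (dangling) clause here
        // instead of failing, so that the forest PF_M retains the fragment.
        if (!peek(cat)) throw new IllegalStateException("dangling clause at " + pos);
        return eat().value();
    }
}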
4.4 Knowledge-based Parse Forest Checking and Enhancement
The Parsing phase verifies the adherence of the mapping to the GPSJ grammar, but this does not guarantee the executability of the corresponding SQL query, due to implicit elements, ambiguities, and errors. The following list enumerates the cases that require further processing.
• Ambiguous attribute: the ⟨SSC⟩ clause has an implicit attribute and the parsed value belongs to multiple attribute domains.
• Ambiguous aggregation operator: the ⟨MC⟩ clause has an implicit aggregation operator, but the measure is associated with multiple aggregation operators.
• Attribute-value mismatch: the ⟨SSC⟩ clause includes a value that is not valid for the specified attribute, i.e., it is either outside or incompatible with the attribute domain.
• Violation of a multidimensional constraint on a measure: the ⟨MC⟩ clause contains an aggregation operator that is not allowed for the specified measure.
• Violation of a multidimensional constraint on an attribute: the ⟨GC⟩ clause contains a descriptive attribute without the corresponding dimensional attribute (according to the DFM, a descriptive attribute further describes a dimensional level, i.e., it is related one-to-one with the level, but it can be used for aggregation only in combination with the corresponding level).
• Implicit aggregation operator: the ⟨MC⟩ clause has an implicit aggregation operator, and the measure is associated with only one aggregation operator or a default operator is defined.
• Implicit attribute: the ⟨SSC⟩ clause has an implicit attribute and the parsed value belongs to only one attribute domain.
• Unparsed clause: the parser was not able to fully understand the mapping, and the returned parse forest includes one or more dangling clauses.

Each bullet above determines a possible annotation of the nodes in the parse forest. In case of implicit information, the KB entities represented by the implicit nodes are automatically added to the parse tree; thus, no user action is required. All the other cases cannot be automatically solved, and the nodes are annotated with a specific error type that will trigger a user-system interaction during the Disambiguation phase. Table 2 reports annotation examples.

Table 2: Annotation types.
Annotation type | Gen. derivation sample | Example
Ambiguous Attribute | ⟨SSC⟩ ::= ⟨Val⟩ | "sum unit sales for Salem", but Salem is a member of both City and StoreCity
Ambiguous Agg. Operator | ⟨MC⟩ ::= ⟨Mea⟩ | "unit sales by product", but both sum and avg are valid aggregations
Attribute-Value Mismatch | ⟨SSC⟩ ::= ⟨Attr⟩⟨Cop⟩⟨Val⟩ | "sum unit sales for product New York", but New York is not a Product
MD-Meas Violation | ⟨MC⟩ ::= ⟨Agg⟩⟨Mea⟩ | "sum prices per store", but Price is not additive
MD-GBY Violation | ⟨GC⟩ ::= ⟨Gby⟩⟨Attr⟩+ | "average prices by name", but Product is not in ⟨GC⟩
Unparsed clause | – | "average unit sales by Regin", but Regin is not an attribute

A textual query generates several parsing forests, one for each mapping. In our approach, only the most promising one is proposed to the user in the Disambiguation phase. This choice comes from two main motivations:
• Proposing more than one alternative query to the user can be confusing and makes it very difficult to contextualize the disambiguation questions.
• Proposing only the most promising choice may miss the optimal derivation, but it makes it easier to create a baseline query. This query can then be improved by adding or removing clauses through further interactions enabled by the OLAP operator module.

Definition 4.5 (Parsing Forest Score). Given a mapping M and the corresponding parsing forest $PF_M$, we define its score as $Score(PF_M) = Score(M')$, where M′ is the sub-sequence of M belonging to the parse tree $PT_M$.

The parsing forest holding the highest score is the one proposed to the user. This ranking criterion is based on an optimistic-pessimistic forecast of the outcome of the Disambiguation phase. On the one hand, we optimistically assume that the annotations belonging to $PT_M$ will be positively solved in the Disambiguation phase, and that the corresponding clauses and entities will be kept. On the other hand, we pessimistically assume that the non-parsed clauses belonging to $PF_M$ will be dropped. The rationale of our choice is that an annotated clause included in the parse tree is more likely to be a proper interpretation of the text. As shown in Figure 7, a totally pessimistic criterion (i.e., excluding from the score all the annotated clauses and entities) would carry forward a too simple, but non-ambiguous, forest; conversely, a totally optimistic criterion (i.e., considering the score of all the entities in $PF_M$) would favor a large but largely non-parsed forest. Please note that the bare score of the mapping (i.e., the one available before parsing) corresponds to a totally optimistic choice, since it sums up the scores of all the entities in the mapping.

Figure 7: Portion of the parsing forest contributing to its score depending on the adopted scoring criterion. [diagram: for M = ⟨avg, UnitSales, where, Product, =, New York, group by, Regin⟩, with an attribute-value mismatch (AVM) annotation and an unparsed ⟨SC⟩ clause, the spans counted by the pessimistic, optimistic-pessimistic, and optimistic criteria are highlighted]

The ranking criterion defined above enables the pruning of the mappings to be parsed, as shown in Algorithm 1, which reports the pseudo-code of the Full query module. Recalling that mappings are parsed in descending score order, let us assume that, at some step, the best parse forest is $PF_{M'}$ with score $Score(PF_{M'})$. If the next mapping to be parsed, M′′, has score $Score(M'') < Score(PF_{M'})$, we can stop the algorithm and return $PF_{M'}$, since the optimistic score of M′′ is an upper bound to the score of the corresponding parse forest.

Algorithm 1: Parsing forest selection
Require: $\mathcal{M}$: set of valid mappings
Ensure: PF*: best parsing forest
1: $\mathcal{M}$ ← sort($\mathcal{M}$)  ▷ sort mappings by descending score
2: PF* ← ∅
3: while $\mathcal{M}$ ≠ ∅ do  ▷ while the mapping space is not exhausted
4:   M ← head($\mathcal{M}$)  ▷ get the mapping with the highest score
5:   $\mathcal{M}$ ← $\mathcal{M}$ \ {M}  ▷ remove it from $\mathcal{M}$
6:   $PF_M$ ← parse(M)  ▷ parse the mapping
7:   if score($PF_M$) > score(PF*) then  ▷ current score higher than the stored one
8:     PF* ← $PF_M$  ▷ store the new parsing forest
9:     $\mathcal{M}$ ← $\mathcal{M}$ \ {M′ ∈ $\mathcal{M}$ : score(M′) ≤ score(PF*)}  ▷ remove mappings with lower scores
10: return PF*

Algorithm 1 works as follows. At first, the mappings are sorted by their score (Line 1), the best parse forest is initialized, and the iteration begins. While the set of existing mappings is not exhausted (Line 3), the best mapping is picked, removed from the set of candidates, and its parsing forest is generated (Lines 4-6). If the score of the current forest is higher than the score of the stored one (Line 7), then the current forest is stored (Line 8) and all the mappings with a lower score are removed from the search space (Line 9), as the pruned mappings cannot produce parsing forests with a score greater than what has already been parsed.
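Since the mappings are processed by descending Score(M) and the mapping score bounds the forest score from above, the set difference of Line 9 is equivalent to stopping as soon as the head of the queue cannot beat the stored forest. A compact Java rendition of Algorithm 1 under this observation follows; the interfaces stub the structures of Sections 4.2-4.3, and all names are ours.

import java.util.*;

class ForestSelector {
    interface Mapping { double score(); ParseForest parse(); }
    interface ParseForest { double score(); }

    static ParseForest select(Collection<Mapping> mappings) {
        // Descending Score(M): the head is always an upper bound for the rest.
        PriorityQueue<Mapping> queue =
            new PriorityQueue<>(Comparator.comparingDouble(Mapping::score).reversed());
        queue.addAll(mappings);                             // Line 1
        ParseForest best = null;                            // Line 2
        while (!queue.isEmpty()) {                          // Line 3
            Mapping m = queue.poll();                       // Lines 4-5
            if (best != null && m.score() <= best.score())
                break;                                      // pruning (Line 9)
            ParseForest pf = m.parse();                     // Line 6
            if (best == null || pf.score() > best.score())  // Line 7
                best = pf;                                  // Line 8
        }
        return best;                                        // Line 10
    }
}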
4.5 Parse Tree Disambiguation
Each annotation in the parse forest requires a user choice to restore query correctness. User-system interaction takes place through a set of questions, one for each annotation. Obviously, each answer must be parsed following the same step sequence discussed so far. Table 3 reports the question templates for each annotation type. Each question allows the user to either provide the missing information or drop the clause. Templates are standardized and user choices are limited, to keep the interaction easy. This also allows unskilled users to obtain a baseline query. Additional clauses can be added through the OLAP operator module, which implements OLAP navigation.

At the end of this step, the parse forest is reduced to a parse tree, since ambiguous clauses are either dropped or added to the parse tree.

Table 3: Templates and actions required to solve misunderstandings in textual queries.
Annotation type | Description | Action
Ambiguous Attribute | ⟨Val⟩ is a member of these attributes [...] | Pick attribute / drop clause
Ambiguous Agg. Operator | ⟨Mea⟩ allows these operators [...] | Pick operator / drop clause
Attribute-Value Mismatch | ⟨Attr⟩ and ⟨Val⟩ domains mismatch, possible values are [...] | Pick value / drop clause
MD-Meas Violation | ⟨Mea⟩ does not allow ⟨Agg⟩, possible operators are [...] | Pick operator / drop clause
MD-GBY Violation | It is not allowed to group by ⟨Attr⟩ without ⟨Attr⟩ | Add attribute / drop clause
Unparsed ⟨GC⟩ clause | There is a dangling grouping clause ⟨GC⟩ | Add clause to ⟨GPSJ⟩ / drop clause
Unparsed ⟨MC⟩ clause | There is a dangling measure clause ⟨MC⟩ | Add clause to ⟨GPSJ⟩ / drop clause
Unparsed ⟨SC⟩ clause | There is a dangling predicate clause ⟨SC⟩ | Add clause to ⟨GPSJ⟩ / drop clause

4.6 SQL Generation
Given a parse tree $PT_M$, the generation of its corresponding SQL requires filling in the SELECT, FROM, WHERE, and GROUP BY statements. The SQL generation is applicable to both star and snowflake schemata and works as follows:
• SELECT: the measures and aggregation operators from ⟨MC⟩ are added to the SELECT clause, together with the attributes in the group-by clause ⟨GC⟩;
• WHERE: the predicate from the selection clause ⟨SC⟩ (i.e., the values and their respective attributes) is added to the query predicate;
• GROUP BY: the attributes from the group-by clause ⟨GC⟩ are added to the query group-by set;
• FROM: measures and attributes/values identify, respectively, the fact table and the dimension tables involved in the query. Given these tables, the join path is identified by following the referential integrity constraints (i.e., by following the foreign keys from the dimension tables imported in the fact table).

Example 4.6 (SQL generation). Given the GPSJ query "sum the unit sales by type in the month of July", its corresponding SQL is:

SELECT Type, sum(UnitSales)
FROM Sales s JOIN Product p ON (s.pid = p.id)
             JOIN Date d ON (s.did = d.id)
WHERE Month = "July"
GROUP BY Type □
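The assembly of the four statements can be sketched as simple string composition once the clauses and the join path are known. The sketch below is ours and deliberately leaves the join-path discovery (following foreign keys from the fact table, as described above) to the caller; the main method reproduces Example 4.6.

import java.util.*;

class SqlGenerator {
    static String toSql(List<String> selectItems,      // measures + group-by attributes
                        List<String> joinPath,         // fact table first, then dimensions
                        List<String> predicates,       // from the <SC> clause
                        List<String> groupBy) {        // from the <GC> clause
        StringBuilder sql = new StringBuilder("SELECT ")
            .append(String.join(", ", selectItems))
            .append(" FROM ").append(String.join(" ", joinPath)); // pre-built JOIN ... ON ...
        if (!predicates.isEmpty())
            sql.append(" WHERE ").append(String.join(" AND ", predicates));
        if (!groupBy.isEmpty())
            sql.append(" GROUP BY ").append(String.join(", ", groupBy));
        return sql.toString();
    }

    public static void main(String[] args) {           // reproduces Example 4.6
        System.out.println(toSql(
            List.of("Type", "sum(UnitSales)"),
            List.of("Sales s", "JOIN Product p ON (s.pid = p.id)",
                    "JOIN Date d ON (s.did = d.id)"),
            List.of("Month = 'July'"),
            List.of("Type")));
    }
}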
5 EXPERIMENTAL TESTS
In this section, we evaluate our framework in terms of effectiveness and efficiency. Tests are carried out on a real-world benchmark of analytic queries [7]. Since the queries from [7] refer to private datasets, we mapped the natural language queries to the Foodmart schema, a public dataset about food sales between 1997 and 1998 (https://github.com/julianhyde/foodmart-data-mysql). The Automatic KB feeding populated the KB with 1 fact, 39 attributes, 12337 values, and 12449 synonyms. Additionally, only 50 synonyms were manually added in the KB enrichment step (e.g., "for each", "for every", and "per" are synonyms of the group by statement). While mapping the natural language queries to the Foodmart domain, we preserved the structure of the original queries (e.g., word order, typos, etc.). Overall, 75% of the queries in the dataset are valid GPSJ queries, confirming how general and standard GPSJ queries are. The filtered benchmark includes 110 queries. We consider token sub-sequences of maximum length n = 4 (i.e., the [1..4]-grams), as no entity in the KB is longer than 4 words. Each sub-sequence is associated with the top-N similar entities with similarity higher than α, such that at least a percentage β of the tokens in T is covered. The value of β is fixed to 70% based on an empirical evaluation of the benchmark queries. Table 4 summarizes the parameters considered in our approach.

Table 4: Parameter values for testing.
Symbol | Range | Meaning
N | {2, 4, 6} | Num. of top similar entities
α | {0.4, 0.5, 0.6} | Token/entity minimum similarity
β | 70% | Sentence coverage threshold
n | 4 | Maximum sub-sequence length

5.1 Effectiveness
The effectiveness of our framework quantifies how well the translation of natural language queries meets the user desiderata. Effectiveness is primarily evaluated as the parse tree similarity $TSim(PT, PT^*)$ between the parse tree PT produced by our system and the correct one PT*, which was manually written by us for each query in the benchmark. Parse tree similarity is based on the tree edit distance [23], which takes into account both the number of correctly parsed entities (i.e., the parse tree leaves) and the tree structure that codes the query structure (e.g., the selection clauses "(A and B) or C" and "A and (B or C)" refer to the same parsed entities but underlie different structures). In this subsection, we evaluate how our framework behaves with and without disambiguation.

Let us first consider the system behavior without disambiguation. Figure 8 depicts the performance of our approach with respect to variations in the number of retrieved top similar entities (i.e., N ∈ {2, 4, 6}). Values are reported for the top-k trees (i.e., the k trees with the highest score). We remind that only one parse forest is involved in the disambiguation phase; nonetheless, for testing purposes, it is interesting to see whether the best parse tree belongs to the top-k ranked ones. Effectiveness changes only slightly when N varies, ranging in [0.88, 0.91]. Effectiveness is more affected by the entity/token similarity threshold α, ranging in [0.83, 0.91].
Figure 8: Effectiveness varying the number of retrieved top similar entities N and the number of top-k queries returned (with α = 0.4). [line chart: TSim ∈ [0, 1] vs. k = 1..5, one series per N ∈ {2, 4, 6}]

Figure 9: Effectiveness varying the similarity threshold α and the number of top-k queries returned (with N = 6). [line chart: TSim ∈ [0, 1] vs. k = 1..5, one series per α ∈ {0.4, 0.5, 0.6}]

In both cases, the best results are obtained when more similar entities are admitted and more candidate mappings are generated. Independently of the chosen thresholds, the system proves very stable (i.e., the effectiveness variations are limited), and even when only one query is returned its effectiveness is at the state of the art [13, 18, 22]. This confirms that (1) the choice of proposing only one query to the user does not negatively impact performance (while it positively impacts interaction complexity and efficiency), and (2) our scoring function properly ranks parse trees by similarity to the correct interpretation of the query, since the best-ranked tree is in most cases the most similar to the correct solution. As the previous tests do not include disambiguation, only 58 queries out of 110 are non-ambiguous and produce parse trees that can be fed as-is to the generation and execution phases. This means that 52 queries (despite being very similar to the correct tree, as shown by the aforementioned results) are not directly executable without disambiguation (we recall the ambiguity/error types from Table 2). Indeed, of these 52 queries, 38 contain one ambiguity annotation, 12 contain two, and 2 contain three or more. Figure 10 depicts the performance when the best parse tree undergoes iterative disambiguation (i.e., an increasing number of correcting actions is applied). Starting from the best configuration of the previous tests (N = 6 and α = 0.4), by applying an increasing number of correcting actions the effectiveness increases from 0.89 up to 0.94. The unsolved differences between PT and PT* are mainly due to entities missed in the mappings.

Figure 10: Effectiveness as a function of the number of disambiguation steps (i.e., applications of correcting actions; with k = 1, N = 6, and α = 0.4). [line chart: TSim ∈ [0, 1] vs. disambiguation steps 0..3]

Although our approach shows effectiveness comparable to state-of-the-art proposals [13, 18, 22], it was not possible to run a detailed comparison against them, because their implementations are not available and the provided descriptions are far from making them reproducible. Furthermore, despite the availability of some natural language datasets, the latter are hardly compatible with GPSJ queries and are not based on real analytic workloads.
5.2 Efficiency
We ran the tests on a machine equipped with an Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz and 8GB RAM, with the framework implemented in Java.

In Section 4.2 we have shown that the mapping search space increases exponentially with the text length. Figure 11 confirms this result, showing the number of generated mappings as a function of the number of entities |M| included in the optimal parse tree PT* (|M| is strictly related to |T|). Note that, with reference to the parameter values reported in Table 4, the configuration analyzed in Figure 11 is the worst case, since it determines the largest search space due to the high number of admitted similar entities. Noticeably, the pruning rules strongly limit the number of mappings that are actually parsed.

Figure 11: Number of "Generated" and "Parsed" mappings varying the number |M| of entities in the optimal tree (with N = 6 and α = 0.4). [log-scale chart: |M| = 1..7 on the x-axis, number of mappings from 10^1 to 10^5 on the y-axis]

Figure 12 shows the average execution time obtained by varying |M| and the number of allowed top similar entities N. The execution time increases with the number of entities included in the optimal parse tree, and the number of top similar entities also impacts the overall execution time. We recall that effectiveness remains high also for N = 2, which corresponds to an execution time of about 1 second, rising to roughly 10 seconds in the worst case. We emphasize that the execution time corresponds to the time necessary for the translation, not to the time needed to actually execute the queries. Queries are executed against the enterprise data mart, and their performance clearly depends on the underlying multidimensional engine.

Figure 12: Execution time varying the number |M| of entities in the optimal tree and the number of top similar entities N (with α = 0.4). [log-scale chart: |M| = 1..7 on the x-axis, time in seconds from 10^-1 to 10^1 on the y-axis, one series per N ∈ {2, 4, 6}]

6 CONCLUSIONS AND FUTURE WORKS
In this paper, we proposed a conversational OLAP framework and an original approach to translate a text describing an OLAP query into a well-formed GPSJ query. Tests on a real-case dataset have shown state-of-the-art performance. More in detail, we have shown that coupling a grammar-based text recognition approach with a disambiguation phase achieves a 94% accuracy. We are now working in several directions to complete the framework by: (a) supporting an OLAP navigation session rather than a single query, as this requires extending the grammar and keeping memory of previous queries in order to properly modify them; (b) designing a proper visual representation of the interpreted text to improve the system effectiveness, which can be obtained by defining a proper graphical metaphor of the query on top of a conceptual visual formalism such as the DFM; (c) testing the whole framework on real users to verify its effectiveness from points of view that go beyond accuracy, e.g., usability, immediacy, and memorability; (d) exploiting the Log to improve the disambiguation effectiveness and to learn synonyms not available a priori.

REFERENCES
[1] Hype Cycle for Artificial Intelligence, 2018. http://www.gartner.com/en/documents/3883863/hype-cycle-for-artificial-intelligence-2018. Accessed: 2019-06-21.
[2] Katrin Affolter, Kurt Stockinger, and Abraham Bernstein. 2019. A comparative survey of recent natural language interfaces for databases. The VLDB Journal 28, 5 (2019), 793–819.
[3] Julien Aligon, Enrico Gallinucci, Matteo Golfarelli, Patrick Marcel, and Stefano Rizzi. 2015. A collaborative filtering approach for recommending OLAP sessions. Decision Support Systems 69 (2015), 20–30.
[4] John C. Beatty. 1982. On the relationship between LL(1) and LR(1) grammars. J. ACM 29, 4 (1982), 1007–1022.
[5] Lukas Blunschi, Claudio Jossen, Donald Kossmann, Magdalini Mori, and Kurt Stockinger. 2012. SODA: Generating SQL for Business Users. PVLDB 5, 10 (2012), 932–943.
[6] Kedar Dhamdhere, Kevin S. McCurley, Ralfi Nahmias, Mukund Sundararajan, and Qiqi Yan. 2017. Analyza: Exploring Data with Conversation. In IUI. ACM, 493–504.
[7] Krista Drushku, Julien Aligon, Nicolas Labroche, Patrick Marcel, and Verónika Peralta. 2019. Interest-based recommendations for business intelligence users. Inf. Syst. 86 (2019), 79–93.
[8] Lorena Etcheverry and Alejandro A. Vaisman. 2012. QB4OLAP: A Vocabulary for OLAP Cubes on the Semantic Web. In COLD (CEUR Workshop Proceedings), Vol. 905. CEUR-WS.org.
[9] Matteo Francia, Matteo Golfarelli, and Stefano Rizzi. 2019. Augmented Business Intelligence. In DOLAP (CEUR Workshop Proceedings), Vol. 2324. CEUR-WS.org.
[10] Ramanathan V. Guha, Vineet Gupta, Vivek Raghunathan, and Ramakrishnan Srikant. 2015. User Modeling for a Personal Assistant. In WSDM. ACM, 275–284.
[11] Ashish Gupta, Venky Harinarayan, and Dallan Quass. 1995. Aggregate-Query Processing in Data Warehousing Environments. In VLDB. Morgan Kaufmann, 358–369.
[12] Nicolas Kuchmann-Beauger, Falk Brauer, and Marie-Aude Aufaure. 2013. QUASL: A framework for question answering and its application to business intelligence. In RCIS. IEEE, 1–12.
[13] Fei Li and H. V. Jagadish. 2016. Understanding Natural Language Queries over Relational Databases. SIGMOD Record 45, 1 (2016), 6–13.
[14] Gabriel Lyons, Vinh Tran, Carsten Binnig, Ugur Çetintemel, and Tim Kraska. 2016. Making the Case for Query-by-Voice with EchoQuery. In SIGMOD Conference. ACM, 2129–2132.
[15] Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Rose Finkel, Steven Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In ACL (System Demonstrations). The Association for Computational Linguistics, 55–60.
[16] George A. Miller. 1995. WordNet: A Lexical Database for English. Commun. ACM 38, 11 (1995), 39–41.
[17] Amrita Saha, Mitesh M. Khapra, and Karthik Sankaranarayanan. 2018. Towards Building Large Scale Multimodal Domain-Aware Conversation Systems. In AAAI. AAAI Press, 696–704.
[18] Diptikalyan Saha, Avrilia Floratou, Karthik Sankaranarayanan, Umar Farooq Minhas, Ashish R. Mittal, and Fatma Özcan. 2016. ATHENA: An Ontology-Driven System for Natural Language Querying over Relational Data Stores. PVLDB 9, 12 (2016), 1209–1220.
[19] Jaydeep Sen, Fatma Özcan, Abdul Quamar, Greg Stager, Ashish R. Mittal, Manasa Jammi, Chuan Lei, Diptikalyan Saha, and Karthik Sankaranarayanan. 2019. Natural Language Querying of Complex Business Intelligence Queries. In SIGMOD Conference. ACM, 1997–2000.
[20] Radwa El Shawi, Mohamed Maher, and Sherif Sakr. 2019. Automated Machine Learning: State-of-The-Art and Open Challenges. CoRR abs/1906.02287 (2019).
[21] Immanuel Trummer, Yicheng Wang, and Saketh Mahankali. 2019. A Holistic Approach for Query Evaluation and Result Vocalization in Voice-Based OLAP. In SIGMOD Conference. ACM, 936–953.
[22] Navid Yaghmazadeh, Yuepeng Wang, Isil Dillig, and Thomas Dillig. 2017. SQLizer: query synthesis from natural language. PACMPL 1, OOPSLA (2017), 63:1–63:26.
[23] Kaizhong Zhang and Dennis E. Shasha. 1989. Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems. SIAM J. Comput. 18, 6 (1989), 1245–1262.