<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Querying RDF Data Cubes through Natural Language</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maurizio Atzori</string-name>
          <email>atzori@unica.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giuseppe M. Mazzeo</string-name>
          <email>mazzeo@cs.ucla.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carlo Zaniolo</string-name>
          <email>zaniolo@cs.ucla.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Facebook Inc.</institution>
          <addr-line>Menlo Park</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Math/CS Department University of Cagliari</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of California Los Angeles</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>24</fpage>
      <lpage>27</lpage>
      <abstract>
<p>In this discussion paper we present QA3, a question answering (QA) system over RDF cubes. The system first tags chunks of text with elements of the knowledge base (KB), and then leverages the well-defined structure of data cubes to create the SPARQL query from the tags. For each class of questions with the same structure, a SPARQL template is defined. The correct template is chosen by using a set of regex-like patterns, based on both syntactic and semantic features of the tokens extracted from the question. Preliminary results are encouraging and suggest a number of improvements. Over the 50 questions of the QALD-6 challenge, QA3 can process 44 questions, with 0.59 precision and 0.62 recall, remarkably improving the state of the art in natural language question answering over data cubes.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>
Governments of several countries have recently started publishing information about
public expenses using the RDF data model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], in order to improve
transparency. The need to publish statistical data, which concerns not only
governments but also many other organizations, has pushed the definition of a specific
RDF-based model, namely, the RDF data cube model [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], whose current
version was published in January 2014. The availability of data of public interest
from different sources has thus favored the creation of projects, such as
LinkedSpending [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], that collect statistical data from several organizations, making
them available as an RDF KB, according to the Linked Data principles.
However, while RDF data can be efficiently queried using the powerful SPARQL
language, only technical users can benefit from their potential by extracting
human-understandable information. The problem of providing a user-friendly
interface that enables non-technical users to query RDF knowledge bases has
been widely investigated during the last few years. Some of the existing
approaches are the following. Exploratory browsing enables users to navigate the
RDF graph by starting from an entity (node) and then moving to other entities
by following properties (edges). Although users do not need to know beforehand
the names of properties, this approach is effective only for exploring the graph
in the proximity of the initial entity, and it is not suitable for aggregate queries.
Faceted search supports top-down searches: starting with the whole dataset as
potential results of the search, the user can progressively restrict the results by
adding constraints on the properties of the current result set [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. This approach
was recently applied to RDF data cubes [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], following the long tradition of
graphical user interfaces for OLAP analysis, based on charts representing
different kinds of aggregation of the underlying data. Another user-friendly system
for querying RDF data is SWiPE [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which enables users to type the constraints
directly in the fields of the infobox of Wikipedia pages. While this paradigm is
very effective for finding lists of entities with specific properties, its generalization
to RDF data cubes is not trivial. Natural language interfaces let the user
type any question in natural language and translate it into a SPARQL query.
The current state-of-the-art system is Xser [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], which was able to yield F-scores
equal to 0.72 and 0.63 in the 2014 and 2015 QALD challenges [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], respectively.
Although its accuracy is not very high, Xser largely won over the other
participating systems. This attests that translating natural language
questions into SPARQL queries is a really hard task. The difficulty of this task
can be reduced by using a controlled natural language (CNL), i.e., a language
generated by a restricted grammar [
        <xref ref-type="bibr" rid="ref14 ref16 ref9">9,14,16</xref>
        ]. While the accuracy of
these systems is very high, some effort is required to learn the language that
can be recognized. None of the previously proposed systems, however, has been
specialized for question answering over RDF data cubes.
      </p>
      <p>
Question answering over RDF cubes is, indeed, a brand new challenge, which
raises issues different from those of question answering over a "general" KB [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
In fact, questions in this context are very specific, i.e., oriented towards
extracting statistical information. For these questions, a very accurate interpretation of
specific features of the typical OLAP queries, such as different kinds of
aggregation or constraints on dimensions, is crucial, and misinterpretations cannot
be tolerated. In fact, while a partial interpretation of a general question might
yield an acceptable answer, in an aggregation query misinterpreting a constraint
is likely to result in a totally wrong answer.
      </p>
      <p>
        In this discussion paper we present QA3 (pronounced as QA cube), a question
answering system for RDF data cubes [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. An overview of QA3 is presented in
Section 2, and preliminary experimental results are reported in Section 3. We
finally conclude the paper in Section 4.
      </p>
    </sec>
    <sec id="sec-2">
      <title>An overview of QA3</title>
      <p>
        QA3 translates questions posed in natural language into SPARQL queries,
assuming that the answer to the question can be provided using one of the RDF
data cubes "known" by the system. Given a KB containing datasets stored using
the RDF data cube model [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], QA3 works in three steps:
- the question is tagged with elements of the KB, all belonging to the same dataset.
This step also detects which dataset is the most appropriate to answer the
question;
- the question is tokenized, using the Stanford parser [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], and the annotations are
augmented with the tags obtained in the previous step. The sequence of
tokens is then matched against a set of regex-like patterns, each associated with
a SPARQL template;
- the chosen template is filled with the actual clauses (constraints, filters, etc.)
by using the tags and the structure of the dataset.
      </p>
      <p>In the following, each step is described in more detail.</p>
      <sec id="sec-2-1">
        <title>Tagging questions with elements of the KB</title>
        <p>
          Given a string Q representing a question, for each dataset D we try to match the
literals in D against chunks of Q. Before performing the matching, the strings
are normalized, by removing the stop words and performing lemmatization. The
result of the matching between Q and the dataset D can be represented as set of
pairs hC; T i, where C is a chunk of Q and T is a set of triples in D. Each matching
is associated with a quality measure, which is based on the percentage of Q that
is covered by tagged chunks. Some other advanced ranking methods (such as [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ])
could be used for further improvements in case of multiple candidates. We note
that related work [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] extracts connected subgraphs given keywords instead of
triples, although we connect them later by constraining them to a SPARQL template
(see below). The quality measure (percentage of Q covered) enables the system to choose the
dataset which most likely has to be used to provide an answer to the question.
        </p>
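<p>As a rough sketch, the coverage-based quality measure can be computed as the fraction of the normalized question covered by matched chunks. The stop-word list, function names, and data below are our own toy illustration (lemmatization is omitted), not the system's actual implementation:</p>
<p>
```python
# Toy sketch of the coverage-based quality measure: the fraction of the
# normalized question covered by chunks matched against dataset literals.
STOP_WORDS = {"what", "is", "the", "in", "of", "to"}

def normalize(text):
    # Lowercase, strip the question mark, drop stop words.
    return [w for w in text.lower().rstrip("?").split() if w not in STOP_WORDS]

def coverage(question, matched_chunks):
    # matched_chunks: the chunks of the question tagged with dataset triples.
    tokens = normalize(question)
    covered = set()
    for chunk in matched_chunks:
        covered.update(normalize(chunk))
    return len(covered & set(tokens)) / len(tokens)

q = "What is the total aid to the Anti Corruption Commission in 2015?"
score = coverage(q, ["aid", "Anti Corruption Commission", "2015"])  # 5 of 6 tokens covered
```
</p>
<p>The dataset maximizing this score would then be selected to answer the question.</p>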
      </sec>
      <sec id="sec-2-2">
        <title>Finding the SPARQL query template</title>
        <p>
The domain in which QA3 operates is quite structured, especially compared to
that of general question answering (e.g., DBpedia). As a consequence, the
meaningful questions (i.e., questions that can be answered using the available KB) are
likely to have a SPARQL translation which follows a limited set of well-defined
templates. For instance, if the user wants to know how much his city spent on
public works in 2015, the question has to contain all the elements needed to
detect the dataset to be used, the measure to aggregate and the aggregation
function, and the constraints to be applied to restrict the computation to the
observations in which the user is interested. This question, like a wide range of
similar questions, can be answered by using the following SPARQL query
template, based on the RDF cube vocabulary [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]:
SELECT sum(xsd:decimal(?measure)) {
?observation qb:dataSet &lt;dataset&gt; .
?observation &lt;measure&gt; ?measure .
        </p>
        <p>&lt;constraints&gt;
}
where &lt;dataset&gt; and &lt;measure&gt; have to be replaced with the correct URIs, and
&lt;constraints&gt; has to be replaced with a set of triples specifying the constraints
on the variable ?observation, representing the observations.</p>
        <p>
In order to leverage the typical homogeneity of the structure of these
questions, we implemented a system, working on a set of SPARQL templates, that
automatically detects the template to be used to provide an answer. To this
end, each template is associated with one (or possibly more) regular expressions
built on the tokens of the questions. The tokens are obtained using the
Stanford parser [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], which tokenizes the question and annotates each token with its
lemma, its POS (part of speech) tag, and its NER (named entity recognition)
tag. The annotations are augmented with the elements of the knowledge base
(dataset, measure, dimension, attribute, literal) obtained at the previous step,
and with a tag representing a possible aggregation function. For the latter, a
specific (small) dictionary has been defined, using the words that are most
often used to name aggregation functions (e.g., sum, total, average, maximum, etc.).
        </p>
        <p>Thus, for each token we have the following features: (1) the POS tag, (2) the
lemma, (3) the word (i.e., the original text), (4) the NER tag, (5) the KB tag (S:
dataset, M: measure, D: dimension, A: attribute, E: entity, L: literal, O: none),
and (6) the aggregation tag (S: sum, A: average, M: max, N: min, O: none).</p>
        <p>The patterns used to match the tokens are defined as 7-tuples, where the
i-th element represents the possible set of values for the i-th feature of the token
(the features are assumed to follow the order in which they are listed above).
For instance, the pattern WP|WDT,*,*,*,!E matches tokens having WP
or WDT as POS tag, any lemma, word, and NER tag (*), and that must not
be annotated as an entity (E). The features that are not explicitly specified in the
pattern (in this case the aggregation tag) are allowed to take any value (* is
implicitly assumed).</p>
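<p>A single token pattern of this kind can be checked in a few lines. The following sketch uses our own encoding, not the system's actual code: a token is a dict of the six features, "*" is the wildcard, "!" negates a value, and "|" separates alternatives:</p>
<p>
```python
# Sketch of matching one token against a pattern tuple.  Each slot is "*"
# (any value), "A|B" (alternatives), or "!X" (anything but X); slots omitted
# at the end of the pattern behave like "*".  Feature encoding is illustrative.
FEATURES = ["pos", "lemma", "word", "ner", "kb", "agg"]

def slot_matches(slot, value):
    if slot == "*":
        return True
    if slot.startswith("!"):
        return value != slot[1:]
    return value in slot.split("|")

def token_matches(pattern, token):
    slots = pattern.split(",")
    slots += ["*"] * (len(FEATURES) - len(slots))  # pad missing slots with "*"
    return all(slot_matches(s, token[f]) for s, f in zip(slots, FEATURES))

tok = {"pos": "WP", "lemma": "what", "word": "What",
       "ner": "O", "kb": "O", "agg": "O"}
ok = token_matches("WP|WDT,*,*,*,!E", tok)  # True: POS is WP and KB tag is not E
```
</p>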
        <p>Patterns can be followed by a # symbol and a string representing a
label name, which can be used to retrieve the actual token(s) that matched the pattern.
These simple patterns can be used to build more complex regular expressions.
We describe them by means of an example:
{*}* WP|WDT {*,*,*,*,O,O}* *,*,*,*,O,!O#1 {*,*,*,*,M|S#2}* {*}*
Each pattern of the expression above must match subsequent chunks of the
whole question. The interpretation of the patterns appearing in the expression
is the following:
- any sequence of tokens ({*}*);
- a token with the POS tag WP or WDT;
- any sequence of tokens, whatever their POS tag, lemma, word, and NER are
(*), without any specific KB annotation and without any specific aggregation
function (O);
- a token with any POS tag, any lemma, any word, and any NER, without any
specific KB annotation (O) and with a specific aggregation function (!O). This
token is assigned the label 1 (#1);
- any sequence of tokens with any POS tag, any lemma, any word, and any
NER, with a KB annotation that can be a measure (M) or a dataset (S). The
type of tag for the aggregation function is not specified, which means it can
be anything (in practice, it will be none). These tokens are assigned the label
2 (#2);
- any sequence of tokens ({*}*).</p>
        <p>This expression can be matched against several questions, such as: "What
is the total aid to the Anti Corruption Commission in the Maldives in 2015?".
In general, that expression matches questions asking for the computation of an
aggregation function, which is represented by the token with label 1, computed
over a measure, which is represented by the token with label 2. The measure
could also be implicitly denoted by the name of the dataset (e.g., when the
dataset is about a specific measure of a set of observations, such as the expenditures of the
town of Cary). These questions can be translated into a SPARQL query with
the following structure:
select &lt;groupByVar&gt; &lt;aggregationFunction##1&gt;(xsd:decimal(?measure)) {
?observation qb:dataSet &lt;dataset&gt; .
?observation &lt;measure##2&gt; ?measure .</p>
        <p>&lt;constraints&gt;
} &lt;groupByClause&gt;
where &lt;aggregationFunction##1&gt; has to be replaced by the actual
aggregation function, which can be derived using the token annotated with label 1,
&lt;dataset&gt; must be replaced with the URI of the dataset found in the previous
step, and &lt;measure##2&gt; must be replaced with the actual measure, using tokens
labeled with 2. Finally, &lt;constraints&gt;, &lt;groupByVar&gt; and &lt;groupByClause&gt; must
be replaced with the actual variables/clauses (possibly empty), which are derived
using the KB tagging, as described in the following.</p>
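<p>The placeholder substitution itself is plain string formatting. Here is a minimal sketch, where we use Python format markers in place of the bracketed placeholders and the names agg1/measure2 for the ##1/##2 labels; all prefixed names and URIs are toy values of our own:</p>
<p>
```python
# Minimal sketch of SPARQL template instantiation.  Each format marker stands
# for one bracketed slot of the template above; values are illustrative.
TEMPLATE = (
    "select {groupByVar} {agg1}(xsd:decimal(?measure)) {{\n"
    "?observation qb:dataSet {dataset} .\n"
    "?observation {measure2} ?measure .\n"
    "{constraints}\n"
    "}} {groupByClause}"
)

AGG = {"S": "sum", "A": "avg", "M": "max", "N": "min"}

query = TEMPLATE.format(
    agg1=AGG["S"],                 # aggregation tag of the token labeled 1
    dataset="ds:maldives-aid",     # dataset chosen in the tagging step
    measure2="ds:amount",          # measure derived from the tokens labeled 2
    constraints="?observation ds:year ?y . filter(?y = 2015)",
    groupByVar="",                 # empty: no group-by in this question
    groupByClause="",
)
```
</p>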
        <p>We remark that this strategy for deriving the SPARQL template is quite
general and the de nition of new templates is quite simple. Although capturing
all the natural language questions is not possible through a nite set of patterns,
we found that very few expressions are enough to cover most of the questions
posed in a typical form (see Section 3).
2.3</p>
      </sec>
      <sec id="sec-2-3">
<title>Filling out constraints and group-by</title>
        <p>The most interesting part is the construction of the constraints and the
group-by clauses. In order to construct the constraints, we observe that (i) if a literal
is tagged, then it must be connected to the observation through an attribute,
which is also reported in the annotation, and (ii) if an entity is tagged, then
it must be connected to the observation through a dimension. The dimension
could be explicitly tagged in the question, but it can also be derived by
maintaining an index that maps every entity e to the dimensions which can take e
as a value. The substring &lt;constraints&gt; of the template can thus be replaced
with a string representing the triples obtained as described above. Regarding
the group-by variable and clause, we observe that a question requiring their use
has to contain an attribute or dimension which is not bound to a value (literal or
entity, respectively). Therefore, we can try to find those tokens that are tagged
with a dimension/attribute X to which no value is associated. We then replace
&lt;groupByVar&gt; with a variable identifier, say ?gbvar, connect ?observation
to ?gbvar using the URI of X, and replace the &lt;groupByClause&gt; placeholder
with group by ?gbvar. If no such attribute/dimension X can be
found, both &lt;groupByVar&gt; and &lt;groupByClause&gt; are replaced by the empty string.</p>
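<p>A minimal sketch of this step, where the entity-to-dimension index and every prefixed name below are toy examples of ours rather than QA3's actual data:</p>
<p>
```python
# Sketch: constraint triples are built from tagged entities via an index that
# maps each entity to the dimensions that can take it as a value; a dimension
# tagged without a value yields a group-by variable and clause.
ENTITY_DIMENSIONS = {"ds:Maldives": ["ds:recipientCountry"]}

def build_constraints(tagged_entities, unbound_dimension=None):
    triples, group_by_clause = [], ""
    for entity in tagged_entities:
        dim = ENTITY_DIMENSIONS[entity][0]  # dimension linking the entity to ?observation
        triples.append("?observation %s %s ." % (dim, entity))
    if unbound_dimension is not None:       # dimension bound to no value
        triples.append("?observation %s ?gbvar ." % unbound_dimension)
        group_by_clause = "group by ?gbvar"
    return " ".join(triples), group_by_clause

constraints, gb = build_constraints(["ds:Maldives"], "ds:year")
```
</p>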
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experimental results</title>
      <p>
QA3 participated in task 3 of the QALD-6 challenge [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], where it was able to
process 44 questions out of the 50 available. The recall and precision over the 44
processed questions are 0.62 and 0.59 respectively, which correspond to an F-score of
0.60. The global F-score, assuming F-score 0 over the 6 unprocessed questions,
is 0.53. In more detail, over the 44 processed questions, QA3 provides a correct
answer (F-score 1) to 25 questions, a partial answer (F-score strictly between
0 and 1) to 3 questions, and a wrong answer (F-score 0) to 16 questions. The
correct dataset can be found for 42 questions. For 30 of these questions the full
set of correct annotations is found. Finally, for 25 of the questions with correct
annotations QA3 generates the SPARQL query that provides the correct results.
A set of 6 expression/template pairs is currently used to detect
the correct SPARQL template for each question. These
experimental results refer to our first version of QA3, which we later improved and discussed
in depth in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and made freely available.
      </p>
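<p>These figures are mutually consistent: with precision P = 0.59 and recall R = 0.62, F1 = 2PR/(P + R) ≈ 0.60 over the processed questions, and averaging in an F-score of 0 for the 6 unprocessed questions gives 0.60 × 44/50 ≈ 0.53:</p>
<p>
```python
# Quick arithmetic check of the scores reported above.
p, r = 0.59, 0.62
f1 = 2 * p * r / (p + r)   # F-score over the 44 processed questions
f1_global = f1 * 44 / 50   # F-score 0 assumed on the 6 unprocessed questions
```
</p>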
      <p>
A direct comparison against other systems has been independently performed
by the QALD-6 challenge [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], as reported in Fig. 1. The best performer in
this comparison is a special version of the SPARKLIS system [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] tailored to
statistical questions, a system that does not accept natural language questions.
Instead, as explained in the next section, it uses a faceted search approach, and
its performance is, therefore, dependent on the level of expertise of the user
(values for expert and beginner are shown). Fig. 1 compares SPARKLIS (used by
an expert), SPARKLIS (used by a beginner), QA3 (initial tuning), and CubeQA.
QA3 itself consists of two modules: https://github.com/atzori/qa3tagger (the
ad-hoc tagger) and https://github.com/gmmazzeo/qa3 (the template-based SPARQL
generation). To the best of our knowledge, the
only two systems that answer free natural language questions over RDF cubes
are CubeQA [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], described in the next section, and our QA3. Compared to the
state-of-the-art CubeQA, we get a remarkable improvement of 51% in recall
(0.62 vs. 0.41) and 20% in precision. This is in part due to the ability of QA3 to
self-evaluate the confidence of a computed answer (thanks to the score measure of
the tagger), and in part to the good expressivity of the template/pattern system.
      </p>
      <p>We also remark that F1 and F1 Global, i.e., the F1 measure computed over
all questions (not only those for which the system provides an answer), are
respectively 33% and 20% higher than those of CubeQA. These results show that
QA3 provides a significant improvement over the state of the art.</p>
      <p>In terms of running time, the most expensive part is the search
for the correct dataset. Our in-memory tagging algorithm takes about 100 ms to
annotate a question for a given dataset. The current version of QA3 therefore takes
100 ms × 50 ≈ 5 s to check for the best candidate dataset and annotate the question
with the triples necessary to find and fill in the correct SPARQL query template (real
times ranging from 2 s to 7 s). While reasonable, this exhaustive approach would
not scale well to thousands of datasets. We consider the current results very
encouraging, and we plan to improve QA3 by using 1) a heuristic-based approach
for the dataset search, 2) word embeddings for semantic relatedness instead of
lemma-based term matching, and 3) a machine-learning approach for SPARQL
template generation.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>
In this discussion paper we have presented QA3, a system that can answer
natural language questions about statistical data by finding the appropriate
LinkedSpending [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] dataset and computing a correct SPARQL query. It works on
different kinds of typical statistical questions. The system has been implemented,
open-sourced on GitHub, and made freely available online at http://qa3.link/.
      </p>
      <p>
        During the discussion, questions taken from the QALD-6 challenge [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] will
be used live, but attendees will also be welcome to suggest other
questions. Future work will explore the use of machine learning to generate
the regex-like patterns, as well as privacy issues [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] related to the use of
statistical data.
This research was supported in part by a 2015 Google Faculty Research Award,
NIH 1 U54GM 114833-01 (BD2K) and Sardegna Ricerche (project OKgraph,
Capitale Umano Alta Qualificazione, CRP 120).
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. LinkedSpending project. http://linkedspending.aksw.org/.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Linking</given-names>
            <surname>Open</surname>
          </string-name>
          <article-title>Data cloud diagram</article-title>
          . http://lod-cloud.net/.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Question</surname>
          </string-name>
          <article-title>Answering over Linked Data (QALD)</article-title>
          . http://qald.sebastianwalter.org/.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <article-title>4. The RDF Data Cube Vocabulary</article-title>
          . https://www.w3.org/TR/vocab-data-cube/.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>M.</given-names>
            <surname>Atzori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Mazzeo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Zaniolo</surname>
          </string-name>
          .
          <article-title>QA3: a Natural Language Approach to Question Answering over RDF Data Cubes</article-title>
          .
          <source>Semantic Web Journal</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>M.</given-names>
            <surname>Atzori</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Zaniolo</surname>
          </string-name>
          .
          <article-title>Swipe: searching wikipedia by example</article-title>
          .
          <source>In Proc. of the 21st Int. World Wide Web Conf., WWW 2012 (Companion Volume)</source>
          , pages
          <fpage>309</fpage>
          –
          <fpage>312</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>A.</given-names>
            <surname>Dessi</surname>
          </string-name>
          and
          <string-name>
            <surname>M. Atzori.</surname>
          </string-name>
          <article-title>A machine-learning approach to ranking RDF properties</article-title>
          .
          <source>Future Generation Computer Systems</source>
          ,
          <volume>54</volume>
          :
          <fpage>366</fpage>
          –
          <fpage>377</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>S.</given-names>
            <surname>Elbassuoni</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Blanco</surname>
          </string-name>
          .
          <article-title>Keyword search over rdf graphs</article-title>
          .
          <source>In Proceedings of the 20th ACM CIKM '11</source>
          , pages
          <fpage>237</fpage>
          –
          <fpage>242</fpage>
          . ACM,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>S.</given-names>
            <surname>Ferré</surname>
          </string-name>
          . Sparklis:
          <article-title>An expressive query builder for sparql endpoints with guidance in natural language</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>7</volume>
          (
          <issue>1</issue>
          ):
          <volume>95</volume>
          –
          <fpage>104</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>R.</given-names>
            <surname>Hahn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sahnwaldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Herta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Robinson</surname>
          </string-name>
          , M. Burgle, H. Duwiger, and
          <string-name>
            <given-names>U.</given-names>
            <surname>Scheel</surname>
          </string-name>
          .
          <article-title>Faceted wikipedia search</article-title>
          .
          <source>In Proc. of 13th Int. Conf. of Business Information Systems, BIS 2010</source>
          , pages
          <fpage>1</fpage>
          –
          <fpage>11</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>K.</given-names>
            <surname>Höffner</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          .
          <article-title>Towards question answering on statistical linked data</article-title>
          .
          <source>In Proc. of the 10th Int. Conf. on Semantic Systems, SEMANTICS</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. K. Höffner, J. Lehmann, and
          <string-name>
            <given-names>R.</given-names>
            <surname>Usbeck</surname>
          </string-name>
          .
          <article-title>Cubeqa - question answering on RDF data cubes</article-title>
          .
          <source>In The Semantic Web - ISWC 2016 - 15th International Semantic Web Conference</source>
          , Kobe, Japan,
          <source>October 17-21</source>
          ,
          <year>2016</year>
          , Proceedings,
          <string-name>
            <surname>Part</surname>
            <given-names>I</given-names>
          </string-name>
          , pages
          <volume>325</volume>
          –
          <fpage>340</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>D.</given-names>
            <surname>Klein</surname>
          </string-name>
          and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <article-title>Accurate unlexicalized parsing</article-title>
          .
          <source>In 41st Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <volume>423</volume>
          –
          <fpage>430</fpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>A.</given-names>
            <surname>Marginean</surname>
          </string-name>
          . Gfmed:
          <article-title>Question answering over biomedical linked data with grammatical framework</article-title>
          .
          <source>In Working Notes for CLEF 2014 Conference</source>
          , pages
          <volume>1224</volume>
          –
          <fpage>1235</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>M. Martin</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Abicht</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Stadler</surname>
            ,
            <given-names>A. N.</given-names>
          </string-name>
          <string-name>
            <surname>Ngomo</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Soru</surname>
            , and
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Auer</surname>
          </string-name>
          . Cubeviz:
          <article-title>Exploration and visualization of statistical linked data</article-title>
          .
          <source>In Proc. of the 24th Int. World Wide Web Conf., WWW 2015 (Companion Volume)</source>
          , pages
          <fpage>219</fpage>
          –
          <fpage>222</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>G. M. Mazzeo</surname>
            and
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Zaniolo</surname>
          </string-name>
          .
          <article-title>Answering controlled natural language questions on RDF knowledge bases</article-title>
          .
          <source>In Proc. of the 19th International Conference on Extending Database Technology, EDBT 2016</source>
          , pages
          <fpage>608</fpage>
          –
          <fpage>611</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bonchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Turini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Verykios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Atzori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Malin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Moelans</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Saygin</surname>
          </string-name>
          .
          <article-title>Privacy protection: Regulations and technologies, opportunities and threats</article-title>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>K.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Huang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <article-title>Question answering via phrasal semantic parsing</article-title>
          .
          <source>In Proc. of 6th Int. Conf. of the CLEF Association, CLEF</source>
          <year>2015</year>
          , pages
          <fpage>414</fpage>
          –
          <fpage>426</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>