<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Mitigating the impact of out of vocabulary words in a neural-machine-translation-based question answering system</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Manuel Borroto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bernardo Cuteri</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Ri</string-name>
          <email>francesco.riccag@unical.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Calabria</institution>
          ,
          <addr-line>Rende CS 87036, Italy https://informatica.unical.it</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>The diffusion of ontologies has led to the development of rich knowledge bases featuring large volumes of information concerning multiple domains. However, the vast majority of potential users are unfamiliar with the SPARQL query language, and thus can enjoy only limited access to knowledge bases through predefined interfaces. Systems able to translate questions posed in natural language into SPARQL queries have the potential to overcome this problem. In this paper, we approach this problem as a Neural Machine Translation task to implement an automatic translation of natural language questions into SPARQL queries. A distinctive feature of our deep-learning-based approach is its robustness with respect to the presence of terms (referring to individuals) that do not occur in the training set. We demonstrate the potential of our approach by presenting its results on the Monument dataset, a benchmark for Question Answering on the well-known DBpedia ontology.</p>
      </abstract>
      <kwd-group>
        <kwd>Natural Language Processing</kwd>
        <kwd>Question Answering</kwd>
        <kwd>Knowledge base</kwd>
        <kwd>Neural Machine Translation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The diffusion of ontologies as a means for modeling, storing, and sharing information has led to the development of rich knowledge bases featuring large volumes of information concerning multiple domains. As a result, we now have vast and complex knowledge bases that allow gathering large volumes of information through the interconnection of thousands of datasets referring to various domains, in what is known as Linked Data. Thus, people have potential access to an amount of information that was unthinkable before, and the DBpedia [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] project, which is one of the most popular knowledge bases nowadays, is a concrete example of that. However, searching and retrieving the information stored in this way can be a hard task for lay users, because it requires knowing the structure of the knowledge base and the appropriate query languages, such as
      </p>
      <p>
        SPARQL [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. This means that the vast majority of users have only limited access to knowledge bases, as provided by predefined interfaces.
      </p>
      <p>Systems able to translate questions posed in natural language into SPARQL queries have the potential to overcome this problem because they hide all technical complexity from the final users. Thus, natural language Question Answering (QA) is gaining importance in the area of the Semantic Web.</p>
      <p>
        The most recent QA approaches [
        <xref ref-type="bibr" rid="ref14 ref16 ref3">3, 14, 16</xref>
        ] have resulted in systems for the automatic translation of natural language questions into SPARQL queries. These systems are mostly based on deep neural networks and exploit the great progress achieved by Deep Learning in the last few years.
      </p>
      <p>In this paper, we approach this problem as a Neural Machine Translation task to implement an automatic translation of natural language questions into SPARQL queries. Our system was built to be robust with respect to the presence of terms (referring to individuals) that do not occur in the training set, also called out-of-vocabulary words. This feature is particularly useful when dealing with evolving ontologies that are continuously enriched with new individuals.</p>
      <p>
        We achieve this result with a novel architecture that combines a Neural Machine Translation (NMT) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] module and a Named Entity Recognition (NER) module, both based on bidirectional recurrent neural networks [
        <xref ref-type="bibr" rid="ref10 ref18">18, 10</xref>
        ]. The NMT module translates the input NL question into a SPARQL template, whereas the NER module extracts the entities from the question. Combining the results of the two modules yields a SPARQL query ready to be executed. Importantly, we introduce a formal definition of a training set format that reduces the output space; this format is essential for the proper functioning of the system and also allows us to tackle the problem of out-of-vocabulary (OOV) words, a major weakness of the majority of related approaches today. We empirically test the system on the Monument dataset [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which is a benchmark for Question Answering on the well-known DBpedia ontology.
      </p>
      <p>This paper is structured as follows. In Section 2, we go into the details of our approach. Section 3 focuses on the discussion of experiments and results. In Section 4 we discuss related work, and finally we provide some conclusions and directions for future work.</p>
      <p>
        In the following, we assume the reader is familiar with the main concepts and techniques applied in this research work, such as knowledge bases, neural networks, Deep Learning, Natural Language Processing, Neural Machine Translation, and Named Entity Recognition. To go deeper into these topics, please refer to [
        <xref ref-type="bibr" rid="ref1 ref4 ref5 ref7 ref9">7, 5, 1, 9, 4</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>From Natural Language Questions to SPARQL</title>
      <p>Knowledge bases (KB) are a rich source of information related to a great variety of domains, which can be accessed by experts in formal query languages. The potential of exploiting knowledge bases can be greatly increased by allowing any user to query the ontology by posing questions in natural language.</p>
      <p>In this paper, this problem is seen as the following Natural Language Processing task: given an RDF knowledge base O and a question Qnat in natural language (to be answered using O), translate Qnat into a SPARQL query SQnat such that the answer to Qnat can be obtained by running SQnat on the underlying ontology O.</p>
      <p>The starting point is a training set containing a number of pairs ⟨Qnat, GQnat⟩, where Qnat is a natural language question and GQnat is a SPARQL query, called the gold query. The gold query is a SPARQL query that models (i.e., allows to retrieve from O) the answers to Qnat. The training set has to be used to learn how to answer questions posed in natural language using O, so that, given a question in natural language Qnat, the QA system can generate a query SQ'nat that is equivalent to the gold query GQnat for Qnat, i.e., such that answers(SQ'nat) = answers(GQnat); note that we are interested in computing the answers, not in reproducing the gold query syntactically. In particular, we approach this problem as a machine translation task: we compute SQ'nat = Translate(Qnat), where Translate is the translation function implemented by our QA system, called sparql-qa.</p>
      <p>Most of the solutions currently proposed to convert from natural language to SPARQL make use of various techniques, based either on patterns or on deep neural networks. Any machine translation technique requires the definition of input and output vocabularies which, when working with natural language, can become large enough to be a real problem when undertaking the translation task. This large size directly affects systems based on neural networks, because they depend on a training set that allows the networks to generalize over a given domain. Obtaining a good dataset that includes all the words and names in the English language and all DBpedia resources is a task with a high level of difficulty. The datasets currently available comprise only a part of the vocabulary, generating a problem of Words Out Of Vocabulary (WOOV) that affects both the input and the output.</p>
      <p>Systems affected by the WOOV problem have difficulty dealing with words not seen during the training phase, because they do not know how to map those words to the output vocabulary. For example, let us assume we have a training set containing the words "Abraham Lincoln" and a system trained on it. If we want to translate the question When was Abraham Lincoln born?, the system will be able to identify the right KB resource; on the other hand, the system will fail to translate a question using the same pattern but replacing "Abraham Lincoln" with something not present in the vocabulary, say "Barack Obama".</p>
      <p>To reduce the impact of WOOV and to speed up the training of the entire process, we introduce in the next subsection a suitable format, which we call the QQT format, to represent NL-to-SPARQL datasets. In general, NL-to-SPARQL datasets are composed of a set of pairs ⟨Qnat, GQnat⟩. In this common type of representation, the named entities found in the question are typically represented directly by their URIs in the SPARQL query, but this transformation is hard to learn from mere examples, and the trained system would fail whenever the transformation cannot be described by simple rules. This is an issue especially in large ontologies, where there is a huge number of resources.</p>
      <p>A dataset in QQT format is composed of a set of triples of the form ⟨Question, QueryTemplate, Tagging⟩, where Question is a natural language question, Tagging marks which parts of Question are entities, and QueryTemplate is a SPARQL query template with the following modifications: (i) the KB resources are replaced by one or more variables; (ii) a new triple of the form "?var rdfs:label placeholder" is added for each variable. Placeholders are meant to be replaced by substrings of Question depending on Tagging.</p>
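      <p>To make the format concrete, the construction of a QQT triple can be sketched in a few lines of plain Python. This is only an illustration for the running example: the entity span, the helper name build_qqt, and the template passed in are assumptions, not the authors' dataset tooling.</p>

```python
# Illustrative construction of a QQT triple (the inputs and the template
# below are assumptions for the running example, not the authors' tooling).

def build_qqt(question, entity_span, template):
    tokens = question.rstrip("?").split()
    start, end = entity_span  # token indices of the entity, inclusive
    tagging = []
    for i, _ in enumerate(tokens):
        if i == start:
            tagging.append("B")
        elif i in range(start + 1, end + 1):
            tagging.append("I")
        else:
            tagging.append("O")
    tagging.append("O")  # tag for the trailing "?" token
    return question, template, " ".join(tagging)

# The KB resource dbr:Mona_Lisa is abstracted into the variable ?w, which is
# linked to the placeholder $1 through the extra "?w rdfs:label $1" triple.
tpl = "select ?a where { ?w dbo:author ?a. ?w rdfs:label $1 }"
q, t, tags = build_qqt("Who painted the Mona Lisa?", (3, 4), tpl)
# tags == "O O O B I O"
```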
      <p>In Table 1 we show an example of a ⟨Qnat, Qsparql⟩ pair for the question Who painted the Mona Lisa?, while Table 2 shows the corresponding ⟨Question, QueryTemplate, Tagging⟩ triple in the QQT format.</p>
      <p>In Table 2 the term $1 denotes a placeholder, where 1 means that it has to be replaced by the first entity occurring in the question, that is Mona Lisa, as represented by B and I in Tagging. Note that, in the QQT format, the query template does not contain any DBpedia resource; thus the learning model (the neural network, in our case) does not need to learn that Mona Lisa stands for the dbr:Mona_Lisa resource, and the QueryTemplate is exactly the same for all questions asking for the author of a given artwork. Although we can reduce the size of the output vocabulary by creating a QQT dataset, there is still a problem with the input vocabulary, because many words may be absent. As a consequence, the model cannot learn how to translate those OOV words, since they were never seen during the training process.</p>
      <p>Table 2. The QQT triple for the running example. Question: Who painted the Mona Lisa? QueryTemplate: select ?a where { ?w dbo:author ?a. ?w rdfs:label $1 } Tagging: O O O B I O</p>
      <p>
To address this problem, we used the pre-trained word embeddings provided by
the FastText[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] library, which gives us access to embedding vectors learned over millions of words; this is a significant advantage, because obtaining something comparable from scratch would require a lot of time and computational resources. FastText can provide a word embedding for a token even if the token was not part of the vocabulary used to train the vectors, making it possible to handle OOV words.
      </p>
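      <p>The subword mechanism that lets FastText embed OOV words can be illustrated with a minimal sketch: a word is decomposed into character n-grams with word-boundary markers, and the word vector is the sum of the n-gram vectors, so an unseen word still shares many n-grams with in-vocabulary words. The snippet below only shows the n-gram decomposition, not the library's API, and uses ^ and $ as boundary markers (FastText itself uses angle brackets).</p>

```python
# Character n-gram decomposition in the style of FastText (an illustration
# of the idea, not the library's API). A word vector is obtained by summing
# the vectors of its n-grams, so even an unseen word gets a representation
# built from known subwords.

def char_ngrams(word, n=3, start="^", end="$"):
    padded = start + word + end
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

print(char_ngrams("Obama"))
# ['^Ob', 'Oba', 'bam', 'ama', 'ma$']
```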
      <sec id="sec-3-1">
        <title>The Model</title>
        <p>
          Our approach consists of two deep neural networks: the first is specialized in Neural Machine Translation (NMT) and is based on the well-known Seq2Seq [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] model, while the second extracts the entities from the question using the Named Entity Recognition (NER) technique.
        </p>
        <p>
          Neural Machine Translation. The network focused on NMT is used to translate the question into a SPARQL QueryTemplate. The network is based on an Encoder-Decoder model with Luong's attention [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], in which the Encoder extracts semantic content from the question in natural language and encodes it into a fixed-dimensional vector representation V, while the Decoder decodes V into a sequence in the output language (QueryTemplate).
        </p>
        <p>The Encoder is composed of an input layer that receives a question in natural language converted into a sequence of word embeddings obtained by means of FastText, in the form {x1, x2, ..., xt}, where xt is the vector representation of the t-th word in the sentence. Next, we use a Bidirectional LSTM (BiLSTM) to summarize {x1, x2, ..., xt} into V, reading the sequence in forward and reverse order. V is formed by concatenating the last hidden states of the two directions.</p>
        <p>During the training process, the Decoder is responsible for computing the word embeddings of the output language tokens (SPARQL), which are used together with the vector V provided by the Encoder as input to a Luong-Decoder layer. This layer is responsible for decoding the sentence, supported by the attention mechanism. Finally, the values are fed to a Fully Connected layer with a Softmax activation function that predicts the output sequence by computing a conditional probability over the output vocabulary. Figure 1 shows the described network architecture.</p>
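        <p>Luong's attention at a single decoding step can be sketched numerically (a toy, plain-Python illustration of the dot-score variant, not the network code): the decoder state is scored against each encoder state, the scores are softmax-normalized into attention weights, and the context vector is the weighted sum of the encoder states.</p>

```python
import math

# Toy illustration of Luong dot-product attention for one decoder step.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def luong_attention(decoder_state, encoder_states):
    # score(h_t, h_s) = h_t . h_s  (the "dot" scoring variant)
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    weights = softmax(scores)
    # Context vector: weighted sum of the encoder hidden states.
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(len(decoder_state))]
    return context, weights

context, weights = luong_attention([1.0, 0.0],
                                   [[1.0, 0.0], [0.0, 1.0]])
# The first encoder state, most similar to the decoder state,
# receives the larger attention weight.
```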
        <p>
          Named Entity Recognition. To perform entity recognition, we created a BiLSTM-CRF [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] network, which constitutes the state of the art for this type of task. In this case, we again use FastText to obtain the word embeddings and deal with OOV words. The model is composed of an input layer that receives the sequences of embeddings, followed by a BiLSTM connected to a Fully Connected layer. Finally, the information flows through a CRF layer that predicts the final sequence of tags. Figure 2 shows the described network architecture.
        </p>
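        <p>The role of the CRF layer on top of the BiLSTM can be illustrated with a compact Viterbi decoder over BIO tags. This is a self-contained toy with made-up emission and transition scores (in the real layer both are learned): the transition scores let the model penalize invalid sequences, such as an I tag that does not follow a B or I tag.</p>

```python
# Toy Viterbi decoding over BIO tags, as performed by a CRF output layer.
# Emission scores would come from the BiLSTM; transition scores are learned.

TAGS = ["O", "B", "I"]

def viterbi(emissions, transitions):
    # emissions: one {tag: score} dict per token;
    # transitions: {(prev_tag, cur_tag): score}.
    best = [{t: (emissions[0][t], [t]) for t in TAGS}]
    for em in emissions[1:]:
        layer = {}
        for cur in TAGS:
            score, path = max(
                (best[-1][prev][0] + transitions[(prev, cur)] + em[cur],
                 best[-1][prev][1] + [cur])
                for prev in TAGS)
            layer[cur] = (score, path)
        best.append(layer)
    return max(best[-1].values())[1]

# Strongly penalize "I" right after "O", an invalid BIO transition.
trans = {(p, c): 0.0 for p in TAGS for c in TAGS}
trans[("O", "I")] = -100.0
ems = [{"O": 2.0, "B": 0.1, "I": 0.0},   # Who
       {"O": 0.1, "B": 1.5, "I": 1.4},   # Mona
       {"O": 0.1, "B": 0.2, "I": 1.5}]   # Lisa
print(viterbi(ems, trans))
# ['O', 'B', 'I']
```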
        <p>Finally, we merge the results of the two networks to obtain the final query SQ'nat: the placeholders in the QueryTemplate are replaced by the corresponding entities obtained with the NER network.</p>
        <p>We report here an empirical assessment of our approach.</p>
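        <p>A minimal sketch of this merging step (function and variable names are illustrative, not the system's actual code): the entities are read off the BIO tags produced by the NER network and then substituted for the placeholders in the template produced by the NMT network.</p>

```python
# Illustrative merge of the two networks' outputs into the final query.

def merge(template, tokens, tags):
    # Read the entities off the BIO tags produced by the NER network.
    entities, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag == "B":
            current = [tok]
            entities.append(current)
        elif tag == "I" and current is not None:
            current.append(tok)
        else:
            current = None
    # Replace $1, $2, ... with the corresponding entity strings.
    for i, ent in enumerate(entities, start=1):
        template = template.replace("$%d" % i, '"%s"' % " ".join(ent))
    return template

tokens = ["Who", "painted", "the", "Mona", "Lisa"]
tags = ["O", "O", "O", "B", "I"]
tpl = "select ?a where { ?w dbo:author ?a. ?w rdfs:label $1 }"
print(merge(tpl, tokens, tags))
# select ?a where { ?w dbo:author ?a. ?w rdfs:label "Mona Lisa" }
```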
        <p>Experiment Setup. We implemented our models using Keras, a well-known machine learning framework, on top of TensorFlow. We trained the networks using Google Colaboratory, a cloud-hosted virtual machine environment based on Jupyter Notebooks; the environment provides 12GB of RAM and connects to Google Drive. We considered a well-known publicly available dataset for QA over the DBpedia ontology: the Monument dataset. To assess the systems, we adopted the macro precision, recall, and F1-score measures, which are the most widely used ones for this kind of system.</p>
        <p>Table 3. Results on the Monument dataset (P = precision, R = recall, F1 = F1-score). Mon300: NSpM P 0.860, R 0.861, F1 0.852; sparql-qa P 0.78, R 0.78, F1 0.78. Mon600: NSpM P 0.929, R 0.945, F1 0.932; sparql-qa P 0.791, R 0.791, F1 0.791.</p>
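        <p>For clarity, the macro-averaged metrics used in the evaluation can be sketched as follows. This is the standard per-question formulation, under the assumption that each question yields a set of gold answers and a set of returned answers; it is not code from the paper.</p>

```python
# Macro precision / recall / F1 over a set of questions: compute the metric
# per question against the gold answers, then average across questions.

def prf(gold, predicted):
    gold, predicted = set(gold), set(predicted)
    if not predicted and not gold:
        return 1.0, 1.0, 1.0  # an empty gold set answered with an empty result
    tp = len(gold.intersection(predicted))
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
    return p, r, f1

def macro(pairs):
    scores = [prf(g, s) for g, s in pairs]
    n = len(scores)
    return tuple(sum(x[i] for x in scores) / n for i in range(3))

P, R, F1 = macro([({"a", "b"}, {"a"}), ({"c"}, {"c"})])
# P = 1.0, R = 0.75, F1 = (2/3 + 1) / 2
```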
      </sec>
      <sec id="sec-3-2">
        <title>Evaluation on Monument dataset</title>
        <p>
          The Monument dataset was proposed as part of the Neural SPARQL Machines
(NSpM) [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] research. It contains 14,778 question-query pairs about the instances
of type monument present in DBpedia.
        </p>
        <p>
          For the sake of comparison with the state of the art, we trained the Learner Module of NSpM as it was done in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], where the authors proposed two instances of the Monument dataset, which we denote by Monument300 and Monument600, containing 8,544 and 14,788 pairs, respectively. In both cases, the dataset split fixes 100 pairs for the validation set, 100 for the test set, and keeps the rest for the training set. All the data is publicly available in the NSpM GitHub project (https://github.com/LiberAI/NSpM/tree/master/data). To train our system, we first performed hyperparameter tuning focused on three hyperparameters: the embedding size of the target language, the batch size, and the number of LSTM hidden units. The tuning was performed using a grid search. We set the number of epochs to 5, shuffling the dataset at the end of each one. After tuning, we set the hyperparameters of the two networks as follows: the embedding size is set to 300, the LSTM hidden units are set to 96, and the batch size is set to 64. From the results reported in Table 3, we can see that our system performs reasonably well, reaching F1-score values greater than 0.7. On the other hand, NSpM achieves better results.
        </p>
        <p>We have investigated the cases in which our system could not provide an optimal answer, and we discovered that the performance of our approach is mainly affected by problems in the dataset. We found a set of questions that lack the context needed to determine the specific expected URIs. For example, for the question "What is Washington Monument related to?", our system uses "Washington Monument", but the gold query uses the specific URI: Washington Monument (Baltimore). Note that there is no reference to Baltimore in the question text, and there are Washington Monuments also in Milwaukee and Philadelphia, according to DBpedia. Surprisingly, NSpM can often use the specific URI of the gold query. Thus, we devised a tougher experiment to better understand the issue. We used the templates provided by NSpM and a randomly selected set of unseen monument entities extracted from DBpedia to create a new test set of 200 pairs. The results reported in Table 4 show that our approach confirms the same good performance (F1-score greater than 0.78), demonstrates better generalizing power by being basically resilient to the presence of unseen entities, and also performs better than NSpM.</p>
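        <p>The grid search used for hyperparameter tuning can be sketched as follows. The candidate grids and the scoring function are illustrative placeholders (the paper only reports the chosen values); in the real setting, train_and_score would train the networks for 5 epochs and return a validation score.</p>

```python
import itertools

# Illustrative grid search over the three tuned hyperparameters.

def train_and_score(embedding_size, hidden_units, batch_size):
    # Stand-in scoring function; the real one trains the networks and
    # evaluates them on the validation set. This toy peaks at the
    # configuration reported in the paper.
    return (-abs(embedding_size - 300) - abs(hidden_units - 96)
            - abs(batch_size - 64))

grid = {
    "embedding_size": [100, 300],   # hypothetical candidate values
    "hidden_units": [64, 96, 128],
    "batch_size": [32, 64],
}

best = max(
    itertools.product(*grid.values()),
    key=lambda cfg: train_and_score(*cfg),
)
# best == (300, 96, 64), the configuration chosen after tuning
```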
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Related Work</title>
      <p>
        Pattern-based. The idea of employing query patterns for mapping questions to SPARQL queries has already been exploited in the literature [
        <xref ref-type="bibr" rid="ref15 ref17">15, 17</xref>
        ]. The approach presented by Pradel and Ollivier [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] also adopts named entity recognition, but applies a set of predefined rules to obtain all the query elements and their relationships. The approach by Steinmetz et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] has four phases: first, the question is parsed and the main focus is extracted; then, general queries are generated from the natural language phrases according to predefined patterns; finally, a subject-predicate-object mapping of the general question to RDF triples is performed. Although both of the above-mentioned approaches performed well on selected benchmarks, they rely on patterns and rules defined manually for all existing types of questions, a limitation that our proposal does not have.
        Deep Learning-based. In the Seq2SQL approach [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] an LSTM Seq2Seq model is used to translate natural language into SQL queries. An interesting aspect of this approach is the use of Reinforcement Learning to guide the learning. The use of an Encoder-Decoder model based on LSTMs with an attention mechanism to map vocabulary between natural language and SPARQL was also proposed in the literature [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], obtaining good results.
      </p>
      <p>
        The Neural SPARQL Machines (NSpM) [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] approach is based on the idea of modifying the SPARQL queries to treat them as a foreign language. To achieve this, they encoded the brackets, URIs, operators, and other symbols, making the tokenization process easier. The resulting dataset was fed to a Seq2Seq model responsible for performing the question-query mapping. The same authors created the DBNQA dataset [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and their model was tested on a subdomain referring to monuments and evaluated using the purely syntactic BLEU score [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. As a consequence, it performs well in reproducing the syntax of the gold query, but is less able than our approach to generalize to unseen natural language questions and OOV words.
      </p>
      <p>
        The query-building approach by Chen et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] features two stages. The first stage predicts the query structure of the question and leverages the structure to constrain the generation of the candidate queries. The second stage performs candidate query ranking. As in our approach, Chen et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] use BiLSTM networks, but their query representation is based on abstract query graphs.
      </p>
      <p>
        In addition, eight different models based on RNNs and CNNs were compared by Yin and colleagues [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. In this large experiment, the ConvS2S [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] model proved to be the best.
      </p>
      <p>
        For completeness, we also studied a related line of work that aims to translate natural language questions into SQL queries. The work proposed by Yu et al. [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] introduces a large-scale, complex, cross-domain semantic parsing and text-to-SQL dataset. To validate their contribution, they used the proposed dataset to train different models that convert text to SQL queries. Most of the models were based on a Seq2Seq architecture with attention and demonstrated adequate performance. Another interesting case study is the editing-based approach for text-to-SQL generation introduced by Zhang et al. [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. They implement a Seq2Seq model with Luong's attention, using BiLSTMs and BERT embeddings. The approach performs well on the SParC and Spider datasets, outperforming related work in some cases.
      </p>
      <p>Our architecture addresses many of the issues connected with the translation by resorting to specific tools, an aspect that is not present in the mentioned works. Moreover, existing approaches based on NMT do nothing special to deal with OOV words.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>This paper presents an approach based on deep neural networks to interrogate knowledge bases using natural language. We exploit the strengths of several well-known NLP tools and place a special focus on reducing the target vocabulary of the NMT task and attenuating the impact of OOV words, an important issue that is not well considered in existing approaches. Our system showed competitive results on the Monument dataset and demonstrated a more general and robust behavior on unseen questions among the compared systems. In future work, we plan to improve translation performance by integrating other NLP tools, such as Named Entity Linking and BERT contextual word embeddings. We also plan to extend our experiments to other well-known QA benchmarks.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bahdanau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.:</given-names>
          </string-name>
          <article-title>Neural machine translation by jointly learning to align and translate</article-title>
          .
          <source>arXiv preprint arXiv:1409.0473</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bojanowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grave</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joulin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Enriching word vectors with subword information</article-title>
          .
          <source>TACL 5</source>
          ,
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hua</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qi</surname>
          </string-name>
          , G.:
          <article-title>Formal query building with query structure prediction for complex question answering over knowledge base</article-title>
          .
          <source>In: IJCAI</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Merrienboer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gulcehre</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bahdanau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bougares</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwenk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Learning phrase representations using RNN encoder-decoder for statistical machine translation</article-title>
          .
          <source>arXiv:1406.1078</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chollet</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Deep learning with Python</article-title>
          . Manning Publications Company (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Gehring</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grangier</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yarats</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dauphin</surname>
            ,
            <given-names>Y.N.</given-names>
          </string-name>
          :
          <article-title>Convolutional sequence to sequence learning</article-title>
          .
          <source>In: ICML. Proc. of ML Research</source>
          , vol.
          <volume>70</volume>
          , pp.
          <fpage>1243</fpage>
          -
          <lpage>1252</lpage>
          .
          PMLR
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Gruber</surname>
            ,
            <given-names>T.R.</given-names>
          </string-name>
          :
          <article-title>Toward principles for the design of ontologies used for knowledge sharing?</article-title>
          .
          <source>Int. J. Hum.-Comput. Stud.</source>
          <volume>43</volume>
          (
          <issue>5-6</issue>
          ),
          <fpage>907</fpage>
          -
          <lpage>928</lpage>
          (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hartmann</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marx</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soru</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Generating a large dataset for neural question answering over the DBpedia knowledge base</article-title>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hochreiter</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>The vanishing gradient problem during learning recurrent neural nets and problem solutions</article-title>
          .
          <source>Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          ),
          <fpage>107</fpage>
          –
          <lpage>116</lpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Bidirectional LSTM-CRF models for sequence tagging</article-title>
          .
          <source>CoRR abs/1508.01991</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isele</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jakob</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jentzsch</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontokostas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendes</surname>
            ,
            <given-names>P.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morsey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Kleef</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , et al.:
          <article-title>DBpedia – a large-scale, multilingual knowledge base extracted from Wikipedia</article-title>
          .
          <source>Semantic Web</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          ),
          <fpage>167</fpage>
          –
          <lpage>195</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Luong</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pham</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          :
          <article-title>Effective approaches to attention-based neural machine translation</article-title>
          .
          <source>arXiv preprint arXiv:1508.04025</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Luz</surname>
            ,
            <given-names>F.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Finger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Semantic parsing natural language into SPARQL: improving target language representation with neural attention</article-title>
          .
          <source>CoRR abs/1803.04329</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Panchbhai</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soru</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marx</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Exploring sequence-to-sequence models for SPARQL pattern composition</article-title>
          .
          <source>In: Iberoamerican Knowledge Graphs and Semantic Web Conference</source>
          . pp.
          <fpage>158</fpage>
          –
          <lpage>165</lpage>
          . Springer (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Pradel</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haemmerlé</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hernandez</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Natural language query interpretation into SPARQL using patterns</article-title>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Soru</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marx</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moussallem</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Publio</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valdestilhas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Esteves</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neto</surname>
            ,
            <given-names>C.B.</given-names>
          </string-name>
          :
          <article-title>SPARQL as a foreign language</article-title>
          .
          <source>SEMANTiCS 2017 - Posters and Demos</source>
          (
          <year>2017</year>
          ), https://arxiv.org/abs/1708.07624
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Steinmetz</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arning</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>From natural language questions to SPARQL queries: A pattern-based approach</article-title>
          .
          <source>In: BTW. LNI</source>
          , vol.
          <volume>P-289</volume>
          , pp.
          <fpage>289</fpage>
          –
          <lpage>308</lpage>
          . Gesellschaft für Informatik, Bonn
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vinyals</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le</surname>
            ,
            <given-names>Q.V.</given-names>
          </string-name>
          :
          <article-title>Sequence to sequence learning with neural networks</article-title>
          .
          <source>In: NIPS</source>
          . pp.
          <fpage>3104</fpage>
          –
          <lpage>3112</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19. W3C:
          <article-title>Semantic web standards</article-title>
          (
          <year>2014</year>
          ), https://www.w3.org
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Yin</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gromann</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudolph</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Neural machine translating from natural language to SPARQL</article-title>
          .
          <source>CoRR abs/1906.09302</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yasunaga</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , et al.:
          <article-title>Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task</article-title>
          .
          <source>arXiv preprint arXiv:1809.08887</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Er</surname>
            ,
            <given-names>H.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xue</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>X.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiong</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radev</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Editing-based SQL query generation for cross-domain context-dependent questions</article-title>
          .
          <source>arXiv preprint arXiv:1909.00786</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Zhong</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiong</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Seq2SQL: Generating structured queries from natural language using reinforcement learning</article-title>
          .
          <source>CoRR abs/1709.00103</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>