<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Answering Natural Language Questions with Intui3</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Corina Dima</string-name>
          <email>corina.dima@uni-tuebingen.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Seminar fur Sprachwissenschaft, University of Tubingen</institution>
          ,
          <addr-line>Wilhemstr. 19, 72074 Tubingen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <fpage>1201</fpage>
      <lpage>1211</lpage>
      <abstract>
<p>Intui3 is one of the participating systems at the fourth evaluation campaign on multilingual question answering over linked data, QALD4. The system accepts as input a question formulated in natural language (in English), and uses syntactic and semantic information to construct its interpretation with respect to a given database of RDF triples (in this case DBpedia 3.9). The interpretation is mapped to the corresponding SPARQL query, which is then run against a SPARQL endpoint to retrieve the answers to the initial question. Intui3 competed in the challenge called Task 1: Multilingual question answering over linked data, which offered 200 training questions and 50 test questions in 7 different languages. It obtained an F-measure of 0.24 by providing a correct answer to 10 of the test questions and a partial answer to 4 of them.</p>
      </abstract>
      <kwd-group>
        <kwd>information retrieval</kwd>
        <kwd>question answering</kwd>
        <kwd>linked data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
Keyword-based search is the dominant search paradigm today. It permits
computers to sift through the massive amounts of unstructured information available
on the web and provide users with a ranked list of pages where the target
information might be found. However, as remarked in the literature [
        <xref ref-type="bibr" rid="ref5">5</xref>
], keyword-oriented
search has a major drawback: it does not readily provide an answer, but a list of
documents where the answer might be found. The user has to manually inspect
each of the provided pages in order to find the actual answer.
      </p>
<p>The system described in this paper, Intui3, belongs to an alternative search
paradigm, which takes a question formulated in natural language and provides a
precise answer to it. The system accepts as input a syntactically correct natural
language question and constructs its interpretation using syntactic and
semantic cues in the question and a target triple store. The construction of a question
interpretation is guided by Frege's Principle of Compositionality, namely that the
interpretation of a complex expression is determined by the interpretation of its
constituent expressions and the rules used to combine them. The system uses
an ontology, a predicate index and an entity index to construct the
interpretations. In the current version all these components are related to the DBpedia
3.9 knowledge base, but with little effort they can be replaced, thus allowing the
system to construct interpretations with respect to other knowledge bases.</p>
<p>1.1 Previous Work in Question Answering on Structured Data</p>
      <p>
        Natural language question answering systems that serve as an interface to
structured databases have a long history, going back to systems like
BASEBALL [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and LUNAR [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] which were able to correctly process and answer
natural language questions, but only on a very limited domain. More recent
efforts ([
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]) have focused on learning to map a complex question to a logical
form using an existing lexicon for connecting words to corresponding predicates
in a database of facts. These systems focused on the depth of the analysis rather
than on the breadth of the domain. Recently, however, there has been a shift from
question answering on small databases to question answering on large knowledge
bases ([
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]) or to open-domain question answering ([
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]). These approaches
address the problem of automatically mapping a word sequence to a predicate
in the database, as well as that of constructing and combining possible
interpretations. The system Intui3 follows this trend, as it constructs detailed
semantic interpretations for natural language questions by taking into account
syntactic and semantic cues in the question and connecting them to the facts
available in a specified triple store.
      </p>
<p>The current paper continues with Section 2, which gives an overview of the
main third-party resources used by Intui3. Section 3 details all the steps required
to construct a SPARQL interpretation of a natural language question. Section 4
presents the results of Intui3 on the training and test sets provided by the
organizers of the QALD4 challenge (http://www.sc.cit-ec.uni-bielefeld.de/qald/)
in Task 1, Multilingual question answering over
linked data. A discussion of issues that affect the interpretation capabilities of
the system and possible ways to address them follows in Section 5, which also
concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>Resources</title>
<p>Intui3 makes use of a series of third-party resources to construct the
interpretation of natural language questions. This section briefly introduces each of them.
They are further referenced in Section 3 when describing the interpretation
process.</p>
      <p>
        Two natural language processing (NLP) suites are used: SENNA [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and
Stanford CoreNLP [
        <xref ref-type="bibr" rid="ref7">7</xref>
]. SENNA is a deep neural network-based system that outputs
a host of NLP predictions: part-of-speech (PoS) tags, chunking, named entity
recognition (NER), semantic role labeling and syntactic parsing. SENNA's main
attractions are its very high processing speed and its state-of-the-art
performance. Intui3 uses SENNA (v3.0) for PoS tagging, chunking and performing
NER on the input question.
      </p>
<p>The Stanford CoreNLP suite (v.3.2.0) is used in Intui3 for obtaining lemma
information for each token. Stanford CoreNLP is a flexible NLP suite that offers
multiple annotators such as PoS taggers, lemmatizers, named entity annotators,
sentiment and coreference annotators, as well as annotators for constituency and
dependency parsing. Multiple annotators can be combined and used to annotate
the same input text on various levels.</p>
      <p>The system combines the output from the two NLP suites. Initial tests on
the QALD4 training set showed that SENNA (v3.0) provides the correct PoS tag
labeling of a question more often than Stanford CoreNLP (v.3.2.0) does. SENNA
also readily provides the chunking information, which the system uses as a
support for constructing the semantic interpretation of the question. SENNA was
thus chosen as the main NLP processing suite of the system. SENNA, however,
does not offer lemma information, which was then obtained from the Stanford
CoreNLP suite.</p>
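<p>The merging of the two suites' outputs can be sketched as follows. This is a minimal illustration with hand-written stand-in annotations; it does not use SENNA's or Stanford CoreNLP's actual APIs or output formats.</p>

```python
# Merge token-level annotations from two NLP tools into one record per token.
# The annotation dicts below are stand-ins for SENNA and CoreNLP output.
def merge_annotations(tokens, senna, corenlp):
    """senna: per-token dicts with 'pos', 'chunk', 'ner';
       corenlp: per-token dicts with 'lemma'."""
    merged = []
    for i, tok in enumerate(tokens):
        record = {"token": tok}
        record.update(senna[i])                # PoS, chunk, NER from SENNA
        record["lemma"] = corenlp[i]["lemma"]  # lemma from Stanford CoreNLP
        merged.append(record)
    return merged

tokens = ["How", "many", "pages", "does", "War", "and", "Peace", "have", "?"]
senna = [{"pos": p, "chunk": c, "ner": "O"} for p, c in [
    ("WRB", "B-NP"), ("JJ", "I-NP"), ("NNS", "I-NP"), ("VBZ", "B-VP"),
    ("NNP", "B-NP"), ("CC", "I-NP"), ("NN", "I-NP"), ("VB", "B-VP"), (".", "O")]]
corenlp = [{"lemma": l} for l in
           ["how", "many", "page", "do", "War", "and", "Peace", "have", "?"]]
print(merge_annotations(tokens, senna, corenlp)[2])
```

<p>Storing every annotation keyed by token index mirrors the paper's design of using the tokens as the reference entity for all pre-processing information.</p>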
<p>
        Intui3 uses a locally installed version of the DBpedia Lookup service [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] (code obtained from https://github.com/dbpedia/lookup). The
DBpedia Lookup service provides an easy method for finding DBpedia URIs for
a given sequence of words. It is implemented as a Web service and it is based
on a Lucene index that provides a weighted label lookup. It combines string
similarity with relevance ranking in order to find the most likely matches for a
given word or word sequence. To illustrate the advantage of using the DBpedia
Lookup service instead of simple string matching techniques we use Q21 from
the QALD4 Task 1 test set, Where was Bach born? Querying the triple store
directly for the string Bach would return a multitude of pages referring
either to Bach himself, to his works or to various other entities that are named
after Bach. The first match returned by the DBpedia Lookup service is the URI
http://dbpedia.org/resource/Johann_Sebastian_Bach, which is, in the case
of Q21, the correct choice.
      </p>
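<p>The idea of combining string similarity with relevance ranking can be illustrated with a small sketch. The candidate list and relevance weights below are invented for illustration; the real service derives both from its Lucene index.</p>

```python
from difflib import SequenceMatcher

# Rank candidate URIs for a query by combining label similarity with a
# relevance weight (a mock prior here; the real service uses index statistics).
def rank_candidates(query, candidates):
    def score(label, relevance):
        sim = SequenceMatcher(None, query.lower(), label.lower()).ratio()
        return sim * relevance
    return sorted(candidates, key=lambda c: score(c["label"], c["relevance"]),
                  reverse=True)

candidates = [
    {"uri": "http://dbpedia.org/resource/Johann_Sebastian_Bach",
     "label": "Johann Sebastian Bach", "relevance": 0.9},
    {"uri": "http://dbpedia.org/resource/Bach_(surname)",
     "label": "Bach (surname)", "relevance": 0.2},
    {"uri": "http://dbpedia.org/resource/Bach_flower_remedies",
     "label": "Bach flower remedies", "relevance": 0.1},
]
best = rank_candidates("Bach", candidates)[0]
print(best["uri"])  # http://dbpedia.org/resource/Johann_Sebastian_Bach
```

<p>Even though the label Bach (surname) is a closer string match to the query, the relevance weight promotes the composer, matching the Q21 behavior described above.</p>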
<p>We compiled a list of demonyms and their corresponding countries by scraping
the information about demonyms that can be found on Wikipedia
(http://en.wikipedia.org/wiki/Adjectivals_and_demonyms_for_countries_and_nations). The list
contains about 600 pairs of the form (demonym, country). This information is
also available in DBpedia, but the current version (DBpedia 3.9) contains only
440 concepts that are marked as the demonym of a country.</p>
<p>The system uses a predicate index obtained by selecting all the predicates in a
local DBpedia 3.9 installation that was used for obtaining results for the QALD4
challenge. This index contains 49,714 predicates, most of them (48,294) from the
dbpedia.org/property namespace.</p>
<p>
        Intui3 computes similarities between words in the question and predicates in
the triple store using the word similarity measures implemented in the WordNet
Similarity for Java (WS4J) library (code obtained from https://code.google.com/p/ws4j/). All the similarity measures in the package
were tested using a set of pairs of the form (word, dbpedia predicate). The focus
of this test was to choose a similarity measure that: (i) is able to score pairs
of words with different PoS tags; (ii) provides a high score when the word and
the predicate are related and a low score when they are not. The measure that
fulfilled both criteria was the Hirst &amp; St. Onge similarity measure [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. It is based
on the idea that two lexicalized concepts are semantically close if their WordNet
synsets are connected by a path that is not too long and that "does not change
direction too often".
      </p>
    </sec>
    <sec id="sec-3">
      <title>Interpreting a Natural Language Question</title>
<p>This section gives a detailed overview of the steps taken by Intui3 to interpret
a natural language question and to map it to a syntactically correct SPARQL
query. The interpretation process involves the following steps: first, the
question is tokenized, PoS tagged, and the named entities are identified. Then the
question is split into chunks, and the system assigns initial interpretations to
each chunk. The interpretation of the question is constructed by combining the
interpretations assigned to each chunk.</p>
<p>We will use the question with id 12, Q12: How many pages does War and Peace
have?, from the QALD4 Task 1 test set to explain the interpretation
process in more detail.</p>
      <p>Each question received by the system is pre-processed using a series of
standard NLP tools. The SENNA and Stanford CoreNLP suites are used to process
the question. The system stores information related to PoS tagging, NER and
chunking from the output produced by SENNA and lemma information from the
output of Stanford CoreNLP. Next, the system performs a demonym resolution
step by looking up each individual token in the question in a list of demonyms
and their associated countries (see Section 2 for more details). If a demonym
is found, such as the word German in the phrase German lake, the system
retrieves and stores the DBpedia URI identifying the referred country (in this case
http://dbpedia.org/resource/Germany). Table 1 shows the information
obtained in the pre-processing step for Q12. All the information obtained in the
pre-processing phase is stored using the tokens as a reference entity.</p>
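<p>The demonym resolution step can be sketched as a simple table lookup over the question tokens. The two-entry table below stands in for the scraped list of about 600 pairs.</p>

```python
# Demonym resolution: map adjectival demonyms in the question to country
# URIs. The tiny table is a stand-in for the full (demonym, country) list.
DEMONYMS = {
    "german": "http://dbpedia.org/resource/Germany",
    "french": "http://dbpedia.org/resource/France",
}

def resolve_demonyms(tokens):
    """Return {token_index: country_uri} for every demonym found."""
    return {i: DEMONYMS[t.lower()]
            for i, t in enumerate(tokens) if t.lower() in DEMONYMS}

tokens = ["Does", "the", "Isar", "flow", "into", "a", "German", "lake", "?"]
print(resolve_demonyms(tokens))  # {6: 'http://dbpedia.org/resource/Germany'}
```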
      <p>Intui3 uses sentence chunks as a basis for constructing interpretations. That
is why establishing appropriate chunk boundaries is a crucial step towards
constructing the correct interpretation. To this end the system combines the chunk
information it has obtained from the SENNA chunker and from the SENNA
NER system. Depending on the returned chunks and their associated PoS
information, the system can choose to split the provided chunks, or to combine
several provided chunks into a larger chunk.</p>
      <p>An example of splitting an existing chunk is given in the last column of
Table 1, where the chunk [NP How many pages] recognized by the chunker is
further split into [WHADJP How many] and [NP pages].</p>
<p>The converse situation, in which the system combines multiple chunks, can be
illustrated using Q42 in the QALD4 Task 1 test set, What is the official color of the
University of Oxford?. The phrase the University of Oxford is initially chunked
as [NP the University] [PP of ] [NP Oxford]. The NER system identifies in the
same phrase a named entity of type organization: the [ORG University of
Oxford]. Intui3 combines the information from the chunker and the NER system
and merges the three initial chunks into a single NP chunk, [NP the University
of Oxford].</p>
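<p>The merging step just described can be sketched as follows: any chunk that overlaps the token span of a recognized named entity is absorbed into a single NP chunk. The span representation is an assumption made for illustration, not the system's actual data structure.</p>

```python
# Merge chunker output with a NER span: chunks fully or partially covered
# by a named-entity span are collapsed into a single NP chunk.
def merge_chunks(chunks, ner_span):
    """chunks: list of (start, end, label) token spans (end exclusive);
       ner_span: (start, end) of a recognized named entity."""
    ns, ne = ner_span
    merged, absorbed = [], []
    for start, end, label in chunks:
        if end > ns and ne > start:      # chunk overlaps the entity span
            absorbed.append((start, end))
        else:
            merged.append((start, end, label))
    if absorbed:
        merged.append((min(s for s, _ in absorbed),
                       max(e for _, e in absorbed), "NP"))
    return sorted(merged)

# Q42: tokens 6..10 are "the University of Oxford"; NER found span (7, 10).
chunks = [(6, 8, "NP"), (8, 9, "PP"), (9, 10, "NP")]
print(merge_chunks(chunks, (7, 10)))  # [(6, 10, 'NP')]
```

<p>The converse splitting operation (e.g. [NP How many pages] into [WHADJP How many] and [NP pages]) would follow the same span arithmetic in the other direction.</p>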
      <p>The processing continues with the interpretation of the chunks obtained in
the pre-processing step. The system analyses each chunk individually and assigns
one or more interpretations depending on the chunk's type and on the additional
semantic and syntactic information available for that chunk. An overview of the
types of interpretations that can be assigned by the system is presented below. In
all the descriptions, the terms subject, object and predicate refer to the elements
of an RDF triple.</p>
<p>Functional interpretations define a functor such as count, min or max; such
interpretations are triggered by lexical cues (e.g. how many).</p>
      <p>Concept interpretations are used to model class membership via the rdf:type
property (http://www.w3.org/1999/02/22-rdf-syntax-ns#type); they are triggered by noun phrases that contain plural nouns
like languages and are mapped to RDF triples like (?answer, rdf:type,
http://dbpedia.org/ontology/Language). The set of reference classes is
defined by the DBpedia ontology (http://wiki.dbpedia.org/Ontology). The DBpedia ontology defines a
hierarchy of 529 classes that includes both top-level concepts such as Person,
Organization, Event and Place, and more refined concepts such as Athlete,
Company, SportsEvent and Mountain.</p>
      <p>Subject interpretations are used to map a word sequence to a subject triple, as
in (http://dbpedia.org/resource/War_and_Peace, ?p, ?o); object
interpretations are used to map a word sequence to an object triple, as in (?s,
?p, http://dbpedia.org/resource/War_and_Peace). Such interpretations
are triggered by noun phrases such as War and Peace, the official color, etc.
When a chunk that requires a subject or object interpretation is discovered,
both subject and object interpretations are generated; this ensures that the
system has a chance to investigate all the triples that contain a particular
URI in a subject or object position. Subject and object interpretations are
based on the output returned by the DBpedia Lookup service described in
Section 2. If the associated chunk was tagged with the NER label PER, the
system filters the results returned by the DBpedia Lookup service to have the
rdf:type dbo:Person, foaf:Person or yago:Person. Similarly, the rdf:type
of the chunks with the NER label LOC is restricted to dbo:Place. The MISC
and ORG NER labels are too coarse to create such restrictions.</p>
      <p>Empty interpretations are used to signal the fact that the mapped word
sequence does not have a meaning on its own; the triggers for empty
interpretations are function words such as me, I, do, does, etc.</p>
      <p>Predicate interpretations are used to map a word or word sequence in the
question to a predicate that is available in the triple store. Predicate
interpretations are triggered either by verbs (e.g. spoken), common nouns (e.g.
pages, ingredients) or noun phrases (e.g. programming languages, the official
color). As opposed to subject, object or concept interpretations, the
predicate interpretations are not immediately resolved, as their interpretation
is highly dependent on the exact question context; instead, the predicate
interpretation stores a predicate pattern that is taken into account in the
interpretation process.</p>
      <p>Triple interpretations are used to map a sequence of words to an RDF triple;
subject, object and concept interpretations are all instances of triple
interpretations.</p>
      <p>Query interpretations are used to map a sequence of words to a SPARQL
query. Query interpretations are the most complex type of interpretation that
can be constructed by the system, and are obtained through composition
from the other types of interpretations.</p>
<p>Each type of interpretation comes with a set of combination rules. These
rules define how the current interpretation can be combined with each other type
of interpretation, and how the combined interpretation is constructed.</p>
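<p>In spirit, such combination rules can be sketched as a method on interpretation objects. The class, the rule set and the scores below are invented stand-ins for illustration, not the system's actual implementation.</p>

```python
# Toy interpretation objects with a combine() method, illustrating rule-based
# composition: empty interpretations act as identity, a functional
# interpretation wraps a query interpretation, everything else fails.
class Interpretation:
    def __init__(self, kind, payload=None, score=1.0):
        self.kind, self.payload, self.score = kind, payload, score

    def combine(self, other):
        """Combine self (the chunk to the left) with an interpretation
        already built from the chunks to its right."""
        if other.kind == "empty":
            return self                       # empty acts as identity
        if self.kind == "empty":
            return other
        if self.kind == "functional" and other.kind == "query":
            return Interpretation("query", ("count", other.payload),
                                  self.score * other.score)
        return None                           # no rule: combination fails

q = Interpretation("query", "?answer", 0.7)
print(Interpretation("empty").combine(q).payload)               # ?answer
print(Interpretation("functional", "count").combine(q).payload)
```

<p>Returning None on a missing rule lets a driver loop discard impossible pairings while keeping all surviving candidates, matching the candidate-set behavior described in the running example below.</p>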
      <p>Table 2 presents all the interpretations that have been initially assigned to
the chunks in our running example, Q12: How many pages does War and Peace
have? The next step involves traversing all the chunks in a right-to-left order
and combining the interpretations.</p>
<p>I<sub>Q12</sub> = [I<sub>How many</sub>; I<sub>pages</sub>; I<sub>does</sub>; I<sub>War and Peace</sub>; I<sub>have</sub>]   (1)</p>
<p>First, the interpretations of the rightmost two chunks (I<sub>War and Peace</sub> and I<sub>have</sub>)
are combined by considering each possible pair of interpretations of the two
chunks in the order they appear in the sentence. The possible number of
interpretations is the product of the number of interpretations for each chunk. In our
case, |I<sub>War and Peace</sub>| = 10 and |I<sub>have</sub>| = 1, so there are 10 × 1 = 10 possible
combinations. In this particular case, I<sub>have</sub> contains only the empty
interpretation, so the interpretations that result from combining I<sub>War and Peace</sub> and I<sub>have</sub>
are the interpretations in I<sub>War and Peace</sub>.</p>
<p>For each subsequent step we combine the results of the previous combination
with the rightmost interpretation that has not yet been combined. In our case,
we combine I<sub>does</sub> with the result of combining I<sub>War and Peace</sub> and I<sub>have</sub>, which
was I<sub>War and Peace</sub>. I<sub>does</sub> again contains only the empty interpretation, so the
result of this combination is still only I<sub>War and Peace</sub>.</p>
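<p>The right-to-left combination described above amounts to a fold over the chunks' candidate lists, with empty interpretations acting as an identity element. The sketch below uses plain strings as stand-ins for interpretations.</p>

```python
from functools import reduce

# Combine lists of candidate interpretations right to left. Each candidate
# is a plain string here; EMPTY combines as an identity element.
EMPTY = "empty"

def combine_pair(left, right):
    if right == EMPTY:
        return left
    if left == EMPTY:
        return right
    return left + "+" + right

def combine_lists(left_list, right_list):
    # Cartesian product: |left| * |right| candidate combinations.
    return [combine_pair(l, r) for l in left_list for r in right_list]

# Candidate lists for the chunks of Q12 (truncated to two subject candidates).
chunks = [["how_many"], ["pages"], [EMPTY],                      # does
          ["War_and_Peace", "War_and_Peace_(opera)"], [EMPTY]]   # have
result = reduce(lambda acc, left: combine_lists(left, acc),
                reversed(chunks[:-1]), chunks[-1])
print(result)
# ['how_many+pages+War_and_Peace', 'how_many+pages+War_and_Peace_(opera)']
```

<p>Because EMPTY drops out of every pairing, the does and have chunks leave the candidate set unchanged, exactly as in the walkthrough above.</p>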
<p>In the next step we combine I<sub>pages</sub> with the results of the previous
combinations, in our case I<sub>War and Peace</sub>. This time we have |I<sub>pages</sub>| × |I<sub>War and Peace</sub>| =
2 × 10 = 20 combined interpretations. The combination of a subject and a predicate
interpretation involves querying the triple store for all the triples with the given
subject and extracting the set of predicates that co-occur with the subject. All
the extracted predicates in this set are scored with respect to the pattern
provided in the predicate interpretation. In our example, the system looks for all
the triples with the subject http://dbpedia.org/resource/War_and_Peace,
extracts the set of predicates it occurs with (25 different predicates), and scores
all of them with respect to the predicate pattern (page or pages).</p>
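<p>The predicate extraction step can be sketched against a mock triple store. The triples and the exact-match scorer below are simplifications made for illustration; the real system scores candidates with the chained similarity measures described in the following paragraph.</p>

```python
# Given a subject, collect the predicates it occurs with and keep those
# whose local name matches one of the predicate patterns (mock triple store).
TRIPLES = [
    ("dbr:War_and_Peace", "dbp:pages", "1225"),
    ("dbr:War_and_Peace", "dbp:author", "dbr:Leo_Tolstoy"),
    ("dbr:War_and_Peace", "dbp:language", "Russian"),
]

def candidate_predicates(subject, patterns):
    preds = {p for s, p, o in TRIPLES if s == subject}
    # naive scorer: 1.0 on exact local-name match, 0.0 otherwise
    scored = {p: max(1.0 if p.split(":")[-1] == pat else 0.0
                     for pat in patterns)
              for p in preds}
    return {p: sc for p, sc in scored.items() if sc > 0.0}

print(candidate_predicates("dbr:War_and_Peace", ["page", "pages"]))
# {'dbp:pages': 1.0}
```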
<p>The scoring mechanism uses two scorers chained together: first a string
similarity scorer, then a WordNet-based similarity scorer which uses the
Hirst &amp; St. Onge similarity measure (see Section 2). Chaining allows the system to look
first for lexical similarities between the candidate predicates and the provided
pattern, and, if the lexical similarity is too low, to look for similarity at the
semantic level. This allows the system to find both the lexical similarity between
the predicate http://dbpedia.org/property/pages and the pattern pages and
the semantic similarity between http://dbpedia.org/property/spouse and
the pattern husband. The system collects all the candidate predicates that score
above a specified threshold and constructs query interpretations by combining
the predicate interpretations in I<sub>pages</sub> with the subject/object interpretations
in I<sub>War and Peace</sub>. 13 query interpretations are constructed in this particular
case, one of which is displayed in Listing 1.1.</p>
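<p>A chained scorer of this kind can be sketched as follows; difflib's SequenceMatcher stands in for the system's string scorer, and a tiny hand-made relatedness table stands in for the Hirst and St. Onge WordNet measure.</p>

```python
from difflib import SequenceMatcher

# Chain two scorers: a cheap string-similarity scorer first, then a semantic
# fallback when the lexical score is below the threshold.
SEMANTIC = {("spouse", "husband"): 0.8}   # mock WordNet-based relatedness

def chained_score(predicate_name, pattern, threshold=0.6):
    lexical = SequenceMatcher(None, predicate_name, pattern).ratio()
    if lexical > threshold:
        return lexical
    return SEMANTIC.get((predicate_name, pattern),
                        SEMANTIC.get((pattern, predicate_name), 0.0))

print(chained_score("pages", "pages"))     # 1.0 (lexical match)
print(chained_score("spouse", "husband"))  # 0.8 (semantic fallback)
```

<p>Trying the cheap lexical scorer first and consulting the semantic measure only on a miss keeps the common case fast, which matches the motivation for chaining given above.</p>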
<p>Listing 1.1. Final query for Q12: How many pages does War and Peace have?
SELECT DISTINCT ?answer
WHERE {
  &lt;http://dbpedia.org/resource/War_and_Peace&gt;
  &lt;http://dbpedia.org/property/pages&gt;
  ?answer .
}</p>
      <p>All the combined interpretations are scored by multiplying the scores of the
initial interpretations. The last step made by the system in the particular case
of Q12 is combining the interpretations it has obtained until this point with
Ihow many. The functional:count interpretation can only be combined with a
query interpretation and requires a numeric answer. To obtain the combined
interpretation, the system first runs the query interpretation and retrieves the
answers. If there is only one answer that can be cast to a number, then the
combined interpretation will be the existing query interpretation. Otherwise, the
system constructs the SPARQL representation of the combined interpretation
by attaching a count clause to the existing query.</p>
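<p>The count-attachment logic can be sketched as a small query rewrite. The WHERE body is passed as an opaque string with prefixed names; this is a simplification of the system's actual query representation.</p>

```python
# Attach a COUNT clause to an existing query interpretation when the
# functional:count interpretation is not already satisfied by a single
# numeric answer.
def attach_count(select_var, where_body, answers):
    numeric = [a for a in answers if str(a).isdigit()]
    if len(numeric) == 1:
        # the single numeric answer already satisfies "how many"
        return "SELECT DISTINCT " + select_var + " WHERE { " + where_body + " }"
    return ("SELECT (COUNT(DISTINCT " + select_var + ") AS ?c) "
            "WHERE { " + where_body + " }")

body = "dbr:War_and_Peace dbp:pages ?answer ."
print(attach_count("?answer", body, ["1225"]))         # keeps the plain query
print(attach_count("?answer", body, ["a", "b", "c"]))  # wraps with COUNT
```

<p>For Q12 the single numeric answer 1225 means the original query is kept, which is exactly the branch the system takes in the walkthrough above.</p>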
<p>The system outputs the query interpretation with the highest score as the
final interpretation, or OUT OF SCOPE if the final interpretation is an empty
interpretation. If the final interpretation is any of the other types of
interpretation, then the system cannot construct a valid interpretation for that particular
question and thus cannot answer it.</p>
<p>In the case of Q12: How many pages does War and Peace have?, the system
chooses the query in Listing 1.1, which provides the correct answer, 1225 pages.</p>
      <table-wrap id="tab-2">
        <label>Table 2</label>
        <caption>
          <p>Initial interpretations assigned to the chunks of Q12. The question mark does not have an interpretation as it was not assigned to a chunk.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Phrase</th><th>Phrase Type</th><th>Interpretation</th></tr>
          </thead>
          <tbody>
            <tr><td>How many</td><td>WHADJP</td><td>Functional:COUNT</td></tr>
            <tr><td>pages</td><td>NP</td><td>Predicate: pattern=page; Predicate: pattern=pages</td></tr>
            <tr><td>does</td><td>VP</td><td>Empty</td></tr>
            <tr><td>War and Peace</td><td>NP</td><td>Subject [http://dbpedia.org/resource/War_and_Peace: 0.70]; Subject [http://dbpedia.org/resource/War_and_Peace_(opera): 0.33]; Subject [http://dbpedia.org/resource/Paris_Peace_Conference,_1919: 0.29]; Subject [http://dbpedia.org/resource/Peace_and_conflict_studies: 0.23]; Subject [http://dbpedia.org/resource/Peace_movement: 0.20]; Object [http://dbpedia.org/resource/War_and_Peace: 0.70]; Object [http://dbpedia.org/resource/War_and_Peace_(opera): 0.33]; Object [http://dbpedia.org/resource/Paris_Peace_Conference,_1919: 0.29]; Object [http://dbpedia.org/resource/Peace_and_conflict_studies: 0.23]; Object [http://dbpedia.org/resource/Peace_movement: 0.20]</td></tr>
            <tr><td>have</td><td>VP</td><td>Empty</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-4">
      <title>System Results in QALD4: Task1</title>
<p>Intui3 was evaluated on the training and test sets offered by the organizers of
QALD4 in Task 1, Multilingual question answering over linked data. Although
the organizers offer questions in seven languages, Intui3 can only interpret
questions written in English.</p>
<p>The results of the system on the training and the test set are summarized in
the results table. There are several types of issues that prevented the system from achieving better
coverage with respect to the provided datasets. We list them below, together
with possible methods for alleviating them:</p>
      <p>
The system relies heavily on the part-of-speech tagging information when it
chooses a possible interpretation for a chunk. The prediction of a
part-of-speech tagger can, however, be incorrect, especially in the case of questions.
For example, in the case of Q7, Does the Isar flow into a German lake?, the
part-of-speech tagger assigned the label NN to the verb flow, thus preventing
a correct chunking ([the Isar flow] was analyzed as one NP chunk). Such
mistakes might seem unlikely from a system like SENNA, which reports a
per-word accuracy of 97.29%. But, as noted in [
        <xref ref-type="bibr" rid="ref11">11</xref>
], although PoS tagging is
considered a "solved" problem due to systems that report over 97% per-word
accuracy, the same systems only achieve 55 to 57% per-sentence accuracy.
A better performance on PoS tagging questions might be obtained by
retraining the PoS tagger using both question and non-question data. Such
approaches have been shown to significantly improve parsing performance
on question data [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        The purely right-to-left method of combining the interpretations does not
always provide the best method of constructing the question's interpretation.
For example in the case of Q11, Give me all animals that are extinct, the
interpretation should start from the left rather than from the right. The
motivation for combining the interpretations using such heuristics instead
of the output of a parser is given by our previous experience in the QALD3
challenge. There our system [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] used the output of a constituency parser to
construct interpretations. However, as the PoS tags were often incorrect, the
parser would have no possibility to construct the correct parse and the
system would fail to provide any interpretations. The SENNA chunker, on the
other hand, proved to be more robust: in some cases the system generated
an incorrect PoS tag for a word in the chunk, but the chunk itself was
correctly labeled. The heuristic for combining interpretations can be improved
by constructing interpretations in both directions and choosing the overall
best scoring one. Another avenue worth investigating is to use the output of
a dependency parser as a method for choosing the order in which the chunk
interpretations should be combined.
      </p>
<p>In the current system the algorithm for refining chunk boundaries uses
handwritten rules; a better solution would be to develop a dedicated chunker that
can provide the correct chunk boundaries for constructing interpretations.</p>
      <p>The test set included many questions that have noun phrases containing
superlative adjectives (e.g. the tallest player, the most books, the youngest
Darts player, etc.). The current system is not equipped to construct an
interpretation for such cases. Such phrases are correctly interpreted only if a
dedicated predicate for that phrase exists in the knowledge base (e.g. the
predicate largestCity for an entity of type country in Q31).</p>
      <p>The system is not currently equipped to handle questions that require a
Yes/No answer.</p>
      <p>Intui3 is restricted to answering questions in English. Porting the system to
work with other languages is possible, although restricted to those languages
that have all the NLP tools required by Intui3 (PoS tagger, NER system,
chunker, lemmatizer, a wordnet and the corresponding similarity measures). Another
necessary resource is an instance of the DBpedia Lookup service customized for
the target language.</p>
      <p>
        We plan to further improve the system by taking into account all the issues
presented above. Another goal is to test the adaptability of the system by trying
to answer questions with respect to other freely available knowledge bases, using
already existing sets of questions and answers like the ones described in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and
[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>Acknowledgments. The research leading to these results has received
funding from the German Research Foundation (DFG) as part of the Collaborative
Research Center `Emergence of Meaning' (SFB 833). The author would like to
thank Emanuel Dima for his comments on the initial version of the paper, as
well as the anonymous reviewers for their suggestions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Berant</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frostig</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Semantic parsing on freebase from question-answer pairs</article-title>
          .
          <source>In: Proceedings of EMNLP</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobilarov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Becker</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
<article-title>DBpedia - A crystallization point for the Web of Data</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          <volume>7</volume>
          (
          <issue>3</issue>
          ),
<fpage>154</fpage>
          -
          <lpage>165</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Collobert</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weston</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bottou</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karlen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kavukcuoglu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuksa</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Natural language processing (almost) from scratch</article-title>
          .
          <source>The Journal of Machine Learning Research</source>
          <volume>12</volume>
          ,
          <fpage>2493</fpage>
          –
          <lpage>2537</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Dima</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Intui2: A prototype system for question answering over linked data</article-title>
          .
          <source>Proceedings of the Question Answering over Linked Data lab (QALD-3) at CLEF</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Etzioni</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Search needs a shake-up</article-title>
          .
          <source>Nature</source>
          <volume>476</volume>
          (
          <issue>7358</issue>
          ),
          <fpage>25</fpage>
          –
          <lpage>26</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Fader</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zettlemoyer</surname>
            ,
            <given-names>L.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etzioni</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Paraphrase-driven learning for open question answering</article-title>
          .
          <source>In: ACL (1)</source>
          . pp.
          <fpage>1608</fpage>
          –
          <lpage>1618</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Finkel</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grenager</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Incorporating non-local information into information extraction systems by gibbs sampling</article-title>
          .
          <source>In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics</source>
          . pp.
          <fpage>363</fpage>
          –
          <lpage>370</lpage>
          .
          Association for Computational Linguistics (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Green</surname>
            <suffix>Jr.</suffix>
            ,
            <given-names>B.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolf</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chomsky</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laughery</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Baseball: an automatic question-answerer</article-title>
          .
          <source>In: Papers presented at the May 9-11, 1961, western joint IRE-AIEE-ACM computer conference</source>
          . pp.
          <fpage>219</fpage>
          –
          <lpage>224</lpage>
          .
          ACM
          (
          <year>1961</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hirst</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>St-Onge</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Lexical chains as representations of context for the detection and correction of malapropisms</article-title>
          .
          <source>WordNet: An electronic lexical database</source>
          ,
          <fpage>305</fpage>
          –
          <lpage>332</lpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Judge</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cahill</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Genabith</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>QuestionBank: Creating a corpus of parse-annotated questions</article-title>
          .
          <source>In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics</source>
          . pp.
          <fpage>497</fpage>
          –
          <lpage>504</lpage>
          .
          Association for Computational Linguistics (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          :
          <article-title>Part-of-speech tagging from 97% to 100%: is it time for some linguistics</article-title>
          ?
          <source>In: Computational Linguistics and Intelligent Text Processing</source>
          , pp.
          <fpage>171</fpage>
          –
          <lpage>189</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>L.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mooney</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          :
          <article-title>Using multiple clause constructors in inductive logic programming for semantic parsing</article-title>
          .
          <source>In: Machine Learning: ECML</source>
          <year>2001</year>
          , pp.
          <fpage>466</fpage>
          –
          <lpage>477</lpage>
          . Springer (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Unger</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , Buhmann, L.,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ngonga Ngomo</surname>
            ,
            <given-names>A.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gerber</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Template-based question answering over RDF data</article-title>
          .
          <source>In: Proceedings of the 21st international conference on World Wide Web</source>
          . pp.
          <fpage>639</fpage>
          –
          <lpage>648</lpage>
          .
          ACM
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Woods</surname>
            ,
            <given-names>W.A.</given-names>
          </string-name>
          :
          <article-title>Progress in natural language understanding: an application to lunar geology</article-title>
          .
          <source>In: Proceedings of the June 4-8, 1973, National Computer Conference and Exposition</source>
          . pp.
          <fpage>441</fpage>
          –
          <lpage>450</lpage>
          .
          ACM
          (
          <year>1973</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Zelle</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mooney</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          :
          <article-title>Learning to parse database queries using inductive logic programming</article-title>
          .
          <source>In: Proceedings of the National Conference on Artificial Intelligence</source>
          . pp.
          <fpage>1050</fpage>
          –
          <lpage>1055</lpage>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Zettlemoyer</surname>
            ,
            <given-names>L.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Collins</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Online learning of relaxed CCG grammars for parsing to logical form</article-title>
          .
          <source>In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007)</source>
          . Citeseer
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>