=Paper=
{{Paper
|id=Vol-1180/CLEF2014wn-QA-XuEt2014
|storemode=property
|title=Answering Natural Language Questions via Phrasal Semantic Parsing
|pdfUrl=https://ceur-ws.org/Vol-1180/CLEF2014wn-QA-XuEt2014.pdf
|volume=Vol-1180
|dblpUrl=https://dblp.org/rec/conf/clef/XuFZ14
}}
==Answering Natural Language Questions via Phrasal Semantic Parsing==
    Xser@QALD-4: Answering Natural Language
      Questions via Phrasal Semantic Parsing
                   Kun Xu, Yansong Feng, and Dongyan Zhao
            Institute of Computer Science Technology, Peking University.
              Zhongguancun North Road 128#, Beijing, 100871, China.
                  {xukun, fengyansong, zhaodongyan}@pku.edu.cn
       Abstract. We present a question answering system (Xser) over Linked
       Data(DBpedia), converting users’ natural language questions into struc-
       tured queries. There are two challenges involved: recognizing users’ query
       intention and mapping the involved semantic items against a given knowl-
       edge base (KB), which will be in turn assembled into a structured query.
       In this paper, we propose an efficient pipeline framework to model a
       user’s query intention as a phrase level dependency DAG which is then
       instantiated according to a given KB to construct the final structured
       query. We evaluate our approach on the QALD-4 test dataset and achieve
       an F-measure score of 0.72, an average precision of 0.72 and an average
       recall of 0.71 over 50 questions.
       Keywords: Phrasal Semantic Parsing, Question Answering, Knowledge
       Base
1    Introduction
As very large structured knowledge bases have become available, e.g.,YAGO [10],
DBpedia [16] and Freebase[15], answering natural language questions over struc-
tured knowledge facts has attracted increasing research efforts, in both natural
language processing and information retrieval communities. Different from key-
word based retrieval, the structure of query intentions embedde in a user’s ques-
tion can be represented by a set of predicate-argument structures, e.g.,  triples, and effectively retrieved by a structured query engine
from a given knowledge base. Generally, the main challenge of understanding
the query intention in a structural form is to solve two tasks: recognizing the
predicate-argument structures and then instantiating these structures regarding
a given KB.
    Intuitively, the two subtasks would be solved in a joint framework, e.g., [22]
proposed a PCFG-based semantic parser to simultaneously learn the combina-
tion rules among words or phrases and the mappings to specific KB components.
However, given the size of existing KBs (usually thousands of predicates, mil-
lions of entities and billions of knowledge facts), it makes difficult to jointly train
such a PCFG-based parser (the model of [22] takes several days to train with
                                         1260
                                                                  phrase detecting
                        [which]       [country] did         [france]    [colonise]
                        variable      category               entity      relation
                                                                             parsing
                                                   PO
                                         SC                       SP
                       [which]v      [country]c did       [france]E [colonise]R
                                                                           instanting
                                                   PO
                                        SC                       SP
                       [which]v     [country]c did        [france]E [colonise]R
                         ?x       fb:location.country    fb:en.france   fb:colonise
                         select ?x where
                         {
                           ?x fb:object.type fb:location.country
                                fb:en.france fb:colonise ?x
                         }
Fig. 1. An example of converting a natural language question into a structured query
via phrasal semantic parsing.
3,000 sentences), and even more difficult to adapt to other KBs, let alone re-
trieving multiple KBs within one query, e.g., some queries in the QALD task[17]
are mixed with predicates from both DBpedia and Yago. In contrast, we find
that recognizing the query intention structure is usually KB-independent. Take
Figure 1 as an example, without grounding to a knowledge base, we can still
guess that a location called france has some relationship, indicated by the verb
“colonise”, with some countries, (the queried objects), which can be learned
directly without reliance on a specified KB. On the other hand, the task of map-
ping semantic phrases from the intention structures to items in a given KB and
producing the final structured queries is KB-dependent, since one has to solve
these mappings according to the schema of a specified KB.
    Given the observations above, we thus assume that the structure of a ques-
tion’s query intention can be learned independent from a specific knowledge base,
while grounding and converting a query intention into a structured query is de-
pendent on a knowledge base. Our assumption will naturally lead to a pipeline
paradigm to translating a natural language question into a structured query,
which can then be directly retrieved by a structured database query engine, e.g.,
Virtuoso 1 .
    In this paper, we deal with the task of understanding natural language ques-
tions in a pipeline paradigm, involving mainly two steps: recognizing the query
intention structure inherent in the natural language questions, and then instan-
tiating the query intention structures by mapping the involved semantic items
into existing KBs. In the first phase, we build a phrase detector to detect possible
1
    http://www.virtuoso.com
                                                        1261
semantic phrases, e.g., variables, entity phrases, category phrases and relation
phrases. We then develop a semantic parser to predict the predicate-argument
structures among phrases to represent the structure of query intentions. In the
second phase, given the intention structures, we are then able to adopt a struc-
tured perceptron model to jointly solve the mappings between semantic phrases
and KB items. By taking a two-phase format, our proposed model can benefit
from the separation of KB related components and KB independent steps, and
recognize the intention structures more efficiently while making the KB-related
component flexible, e.g., we can only retrain the second phase when adapting
to new KBs, which is similar in sprite with [23], who rely on a CCG parser
to produce an ontological-independent logical representation to express users’
intention. We evaluate our model on the QALD-4, and achieve an F-measure
score of 0.72, an average precision of 0.72 and an average recall of 0.71 over 50
questions.
2     The Task
We define the task of using a KB to answer natural language questions as follows:
given a natural language question qN L and a knowledge base KB, our goal is to
translate qN L into a structured query in certain structured query language, e.g.,
SPARQL, which consists of multiple triples: a conjunction of  search conditions.
3     Recognizing the Structure of Query Intention
Our framework first employ a pipeline of phrase detection and phrasal semantic
parsing to recognize the inherent structure of user’s query intention, and then
instantiates the query intention regarding specific KBs.
3.1   Phrase Detection
Before recognizing the query intentions of questions, we first detect phrases of in-
terest from the questions that potentially correspond to semantic items. where a
detected phrase is also assigned with a semantic label l ∈ {entity, relation, category, variable}.
For entity phrases, we need to recognize phrases that may correspond to entities
of the KB. Similarity, relation phrases may correspond to KB’s predicates, and
category phrases may be mapped in KB’s classes. This problem can be casted
as a sequence labeling problem, where our goal is to build a tagger whose input
is a sentence, for example:
                   W ho has T om Cruise been married to
                  V − B none E − B E − I none R − B R − I
(here, we use B-I scheme for each phrase label: R-B represents the beginning of
a relation phrase, R-I represents the continuation of a relation phrase). We use
                                       1262
structured perceptron[3] to build our phrase tagger. Structured perceptron is an
extension to the standard linear perceptron for structured prediction. Given a
question instance x ∈ X, which in our case is a sentence, the structured percep-
tron involves the following decoding problem which finds the best configuration
z ∈ Y , which in our case is a label sequence, according to the current model w:
                             z = arg 0max w · f (x, y 0 )
                                     y ∈Y (x)
where f (x, y 0 ) represents the feature vector for instance x along with configu-
ration y 0 . We observe that many phrases of a specific semantic label may share
similar POS tag sequence pattern and NER sequence pattern. For example, the
POS tag sequences of relation phrases “the area of ” and “the population in” are
in the same pattern “DT NN IN ”. Therefore, we use three types of features:
lexical features, POS tag features and NER features. Table 1 summarizes the
feature templates we used in the phrase detection.
               Table 1. Set of feature templates for phrase detection
     p = pos tag; n = ner tag; w = word; t = phrase type tag; i = current index
 1 unigram of POS tag                                   pi
 2 bigram of POS tag                                    pi pi+1 ; pi−1 pi
 3 trigram of POS tag                                   pi pi+1 pi+2 ; pi−1 pi pi+1 ; pi−2 pi−1 pi
 4 unigram of NER tag                                   ni
 5 bigram of NER tag                                    ni ni+1 ; ni−1 ni
 6 trigram of NER tag                                   ni ni+1 ni+2 ; ni−1 ni ni+1 ; ni−2 ni−1 ni
 7 unigram of word                                      wi
 8 bigram of word                                       wi wi+1 ; wi−1 wi
 9 trigram of word                                      wi wi+1 wi+2 ; wi−1 wi wi+1 ; wi−2 wi−1 wi
10 previous phrase type                                 ti−1
11 conjunction of previous phrase type and current word ti−1 wi
3.2   Phrase DAG Parsing with Multiple Heads
As shown in Figure 1, query intention can be represented by dependencies be-
tween “country”, “france” and “colonise”, forming a phrase DAG, we thus in-
troduce a transition-based DAG parsing algorithm to perform a structural pre-
diction process and reveal the inherent structures.
Phrase DAG In dependency grammar, structures are determined by the re-
lationship between a head and its dependents, e.g., syntactic or morphological
dependencies. Here, we propose to use predicate-argument dependencies to cap-
ture the query intention, that is, the arguments of a predicate are dependents of
that predicate. Each predicate is either a unary predicate (which characterizes
                                        1263
                 fb:award.award_winning_work.award
                 s_won..award.award_honor.award                                                                SP
                                                                            SP
                 fb:award.award_winning_work.award                                                             PO
                                                                            PO
                 s_won..award.award_honor.year
                                 ...
                                                                                                                                Phrase DAG
              In [what]v [year]C did [harry potter and the goblet of fire]E                 [win]R the [hugo award for best novel]E
                                                   fb:m.02hm249                                                        fb:m.04p4lvx
                 fb:time.month                                                                                         fb:m.04p3_v_
                                                   fb:m.05q1cnk         fb:sports.sports_team.championships
                 fb:time.day_of_year                                                                                   fb:m.0l_v4kd
                                                   fb:m.0dplz5f         fb:award.award_winning_work.award
                 fb:time.event                                                                                         fb:m.0gbwdzs
                                                   fb:m.08vs8qm         s_won..award.award_honor.award
                 fb:time.year                                                                                          fb:m.0gbwgrp
                                                   fb:m.031786                             ...
                            ...                                                                                        fb:m.01yz0x
                                                         ...
                                                                                                                              ...
Fig. 2. An example of phrasal semantic DAG, where the dashed boxes list the mapping
candidates for all phrases and the underlined are the gold-standard mappings.)
                                             Desired output
                                         A                  B       C              D
                                            Initial state
                                         Stack                    Input phrases
                                                                A              B               C           D
                                            Current arcs:       A              B               C           D
     (1) Action:SHIFT                                                          (5) Action:ARCLEFT
      Stack                 Input phrases                                           Stack                      Input phrases
       A                   B         C                  D                             B                    C             D
       Current arcs:        A            B              C           D              Current arcs:       A              B           C          D
     (2) Action:ARCLEFT                                                        (6) Action:SHIFT
      Stack                  Input phrases                                         Stack                    Input phrases
       A                   B            C               D                           C                      D
                                                                                    B
       Current arcs:       A             B             C            D               Current arcs:       A              B           C         D
     (3) Action:REDUCE                                                         (7) Action:ARCRIGHT
      Stack                     Input phrases                                          Stack                    Input phrases
                                                                                       C                    D
                            B           C                   D                          B
        Current arcs:       A            B              C           D               Current arcs:       A              B           C         D
     (4) Action:SHIFT                                                            (8) Action:SHIFT
       Stack                    Input phrases                                        Stack                      Input phrases
                                                                                     D
         B                  C                D                                       C
                                                                                     B
        Current arcs:       A            B              C           D               Current arcs:       A              B           C         D
                           Fig. 3. An example of shift-reduce DAG parsing.
                                                                        1264
its only argument) or a binary predicate (which represents the semantic rela-
tion between its two arguments). Here, category phrases correspond to unary
predicates, and relation phrases correspond to binary predicates.
    We formulate the problem of recognizing the dependency structures in a
phrase level: the detected phrases in a question form the nodes of the phrase
dependency DAG, with four types, variable, entity, relation, and category nodes,
where the former two are referred as content nodes in the following. We draw a
directed edge from the head to its dependent, indicating the predicate-argument
or argument-argument relationship.
           Table 2. The transitions for our arc-eager parsing algorithm
               Transitions
               Left-Arc      (σ|i, j|β, A) ⇒ (σ|i, j|β, A ∪ {(j, l, i)})
               Right-Arc     (σ|i, j|β, A) ⇒ (σ|i, j|β, A ∪ {(i, l, j)})
               Reduce        (σ|i, β, A) ⇒ (σ, β, A)
               Shift         (σ, i|β, A) ⇒ (σ|i, β, A)
               Preconditions
               Left-Arc      ¬[i = 0]
                             ¬∃j∃i[(i, l, j) ∈ A]
               Right-Arc     ¬∃j∃i[(j, l, i) ∈ A]
The Phrase DAG Parsing Note that, in our setup, one phrase can have more
than one head, as in Figure 2, variable node what has two heads in the resulting
dependency DAG. We thus use the framework proposed by [2], i.e., extending
traditional arc-eager shift-reduce parsing with multiple heads to find a DAG
directly. Specifically, given a question with sequence of phrases, our parser uses
a stack of partial DAGs, a queue of incoming phrases, and a series of actions to
build a dependency DAG. We assume that each input phrase has been assigned
a POS-tag (conjunction of each token’s POS-tag which composes of the input
phrase) and a semantic label.
    Our semantic parser uses four actions: SHIFT, REDUCE, ARCRIGHT and
ARCLEFT (Table 2).
    The SHIFT action follow the standard definitions that just pushes the next
incoming phrase onto the stack, as shown in Figure 3 action(1).
    The REDUCE action pops the stack top, as shown in figure 3 action(3).
Note that, the standard REDUCE action which is taken on the condition that the
stack top has at least one head. This precondition ensures the dependency graph
is a connected graph. However, our phrase dependency parser only concerns
the predicate-argument structures, and we add a dependency only between the
predicate and argument of our interest. In our case, the dependency graph can
be an unconnected directed graph.
    The ARCRIGHT action adds a dependency edge from the stack top to the
first phrase of the incoming queue, where the phrase on the stack is the head
                                        1265
Algorithm 1 The decoding algorithm for the phrase DAG parsing; K is the
beam size
Require: sentence x
    agenda: hold the K-best candidate items
Ensure:       candidate output
 1: agenda.clear()
 2: agenda.insert(GetStartItem(x ))
 3: candidate output = NONE
 4: while not agenda.empty() do
 5:   list.clear()
 6:   for all item ∈ agenda do
 7:      for all action ∈ getActions(actions, item) do
                   0
 8:         item = item.apply(action)
                      0
 9:         if item .F == TRUE then
10:             if candidate output == NONE
                        0
                or item .score > candidate output.score then
                                            0
11:                 candidate output = item
12:             end if
13:          else
                                  0
14:             list.append(item )
15:          end if
16:       end for
17:   end for
18:   agenda.clear()
19:   agenda.insert(list.best(K))
20: end while
and the phrase in the queue is the dependent (the stack and queue are left
untouched), as long as a left arc does not already exist between these two phrases,
as shown in Figure 3 action(7).
    The ARCLEFT action adds a dependency edge from the first phrase on the
queue to the stack top, where the phrase in the queue is the head and the phrase
on the stack is the dependent (again, the stack and queue are left untouched),
as long as a right arc does not already exist between the two phrases, as shown
in Figure 3 action(2).
    Figure 3 shows an example of shift-reduce parsing process.
The decoding algorithm for phrase DAG parsing We apply the standard
beam-search along with early-update to perform inexact decoding [9] during
training. To formulate the decoding algorithm, we define a candidate item as a
tuple hS, Q, F i, where S represents the stack with partial derivations that have
been built, Q represents the queue of incoming phrases that have not been pro-
cessed, and F is a boolean value that represents whether the candidate item has
been finished. A candidate item is finished if and only if the queue is empty, and
no more actions can be applied to a candidate item after it reaches the finished
status. Given an input sentence x, we define the start item as the unfinished item
                                        1266
with an empty stack and the whole input sentence as the incoming phrases(line
2). A derivation is built from the start item by repeated applications of actions
(SHIFT, REDUCE, ARCLEFT and ARCRIGHT) until the item is finished.
    To apply beam-search, an agenda is used to hold the K-best partial (unfin-
ished) candidate items at each parsing step. A separate candidate output is used
to record the current best finished item that has been found, since candidate
items can be finished at different steps. Initially the agenda contains only the
start item, and the candidate output is set to none(line 3). At each step during
parsing, each candidate item from the agenda is extended in all possible ways
by applying one action according to the current status(line 7), and a number of
new candidate items are generated(line 8). If a newly generated candidate is fin-
ished, it is compared with the current candidate output. If the candidate output
is none or the score of the newly generated candidate is higher than the score of
the candidate output, the candidate output is replaced with the newly generated
item(line 11); otherwise the newly generated item is discarded (line 14). If the
newly generated candidate is unfinished, it is appended to a list of newly gen-
erated partial candidates. After all candidate items from the agenda have been
processed, the agenda is cleared(line 18) and the K-best items from the list are
put on the agenda(line 19). Then the list is cleared and the parser moves on
to the next step. This process repeats until the agenda is empty (which means
that no new items have been generated in the previous step), and the candidate
output is the final derivation. Pseudocode for the decoding algorithm is shown
in Algorithm 1.
       Table 3. The set of feature templates used in our phrase DAG parser
                     p = phrase; t = POS-tag; s = phrase type
                Category Description           templates
                  lexical stack top            ST pt; ST p; ST t;
                 features current phrase       N0 pt; N0 p; N0 t
                           next phrase         N1 pt; N1 p; N1 t;
                           ST and N0           ST ptN0 pt; ST ptN0 p;
                           POS bigram          N0 tN1 t
                           POS trigrams        N0 N1 tN2 t;
                           N0 phrase           N0 pN1 tN2 t;
                 semantic Conjunction of       N0 s; N0 ts; N0 ps;
                 features phrase label and     N1 s;N1 ts;ST tN0 s;
                           pos tag             ST sN0 t; ST pN0 s;
                                               ST tN0 t; ST sN0 s;
                structural Indicates whether ArcLeft(ST s, N0 s);
                 features exists an arc        ArcRight(ST s, N0 s)
                           between the stack
                           top item and next
                           input item, and if
                           so what type of arc
                                       1267
Features Features play an important role in transition-based parsing. Our
parser takes three types of features: lexical, semantic and structure-related fea-
tures. We summarize our feature templates in Table 3, where ST represents
the top node in the stack, N0 , N1 , N2 represent the three incoming phrases
from the incoming queue, subscript t indicates POS tags, subscript p indi-
cates lexical surface forms and subscript s represent the semantic label of the
phrase(entity,relation, category and variable).
    Lexical features include features used in traditional word level dependency
parsing with some modifications: all co-occurrences are built on phrase nodes
and the POS tag of a phrase is defined as the concatenation of each token’s POS
tag in the phrase.
    Note that ARCLEFT and ARCRIGHT actions are considered only when
the top phrase of the stack and the next phrase are variable, entity or rela-
tion phrases.To guide the ARCLEFT and ARCRIGHT actions, we introduce
semantic features indicating the semantic label of a phrase.
    Recall that, our phrase semantic DAG parser allows one phrase to have mul-
tiple heads. Therefore, we modify the ARCLEFT and ARCRIGHT actions so
that they can create new dependency arcs without removing the dependent from
further consideration for being a dependent of other heads. We thus introduce
new structure-related features to indicate whether an arc already exists between
the top phrase on the stack and the next phrase on the queue.
4     Instantiating Query Intention Regarding Existing KBs
After recognizing the structure of the user’s query intention, our parser can out-
put a KB-independent phrase dependency DAG, as shown in Figure 2. To covert
the the query intention into structured queries, semantic items are grounded to
a specific KB, e.g., DBpedia. In experiment, we utilize different methods to
maintain the candidate sets for different types of phrases.
    For the entity phrase, we first employ the wikipedia miner tool2 to map the
phrase to wikipedia entities, and then generate the candidate set of DBpedia
entities which share the same wikipedia ids with wikipedia entities.
    For the relation and category phrase, we use the PATTY resource to construct
a lexicon that maps phrases to predicates and categories of DBpedia.
    Note that, a nice paradigm will be jointly resolving the mappings of all se-
mantic items involved: labeling phrase nodes and the dependency relations with
corresponding items in the given KB. Therefore, inspired by [21], we formulate
this problem as a structured learning task, where we choose a candidate for each
phrase and dependency relation, and find a globally maximum configuration
among multiple local candidates.
    For the dependency relation candidates especially for that between two con-
tent phrases, we develop a function getDepRel to retrieve the possible depen-
dency relation candidates which are coherent with the domain of the phrases in
the question from the set of all KB predicates.
2
    http://wikipedia-miner.cms.waikato.ac.nz/
                                        1268
Algorithm 2 The decoding algorithm to instantiate the query intention struc-
tures
Require:        Instance x =< (x1 , x2 , . . . , xm ), ξ, o > and the oracle output y if for
    training.
    K:Beam Size.
    Γ ∪ {⊥}: phrase candidates.
    Λ ∪ {⊥, SP, P O, SC}: dependency relation candidates.
Ensure:       1-best prediction z for x
 1: Set beam ← [ε] //empty configuration
 2: for all i ← 1..m do
 3:    buf ← {z 0 l|z 0 ∈ B, l ∈ Γ ∪ {⊥}} //phrase labeling
 4:    B ← K-best(buf )
 5:    if y1:(i−1)∗m+1 ∈ / B then
 6:       return B[0] //for early update
 7:    end if
 8:    for all (xk , ∗) ∈ ξ do
 9:       /* search for dependent relation*/
10:       buf ← ∅
11:       for all z 0 ∈ B do
12:          buf ← buf ∪ {z 0 ⊥}
13:          if dep(i, k, ∗) ∈ o then
14:             //this dependency from phrase i to phrase k
15:             //exists in the DAG structures
16:             /*dependency relation labeling*/
17:             buf ← buf ∪ {z 0 r|r ∈ getDepRel(Λ, z 0 ) ∪ SP ∪ P O ∪ SC∪ ⊥}
18:          end if
19:       end for
20:       B ←K-best(buf )
21:       if y1:i∗k ∈/ B then
22:          return B[0] // for early update
23:       end if
24:    end for
25: end for
4.1   Instantiating Query Intention with Structured Perceptron
As shown in Figure 2, each phrase and dependency relation have multiple can-
didates and the best configuration is consisted of the blue and underlined can-
didates. However, it is intractable to perform exact search in our framework be-
cause: (1) by simultaneously mapping phrase nodes and dependency relations,
the search space becomes much more complex, (2) we propose to make use of
arbitrary global features, which makes it infeasible to perform exact inference
efficiently.
   To address this problem, we apply beam-search along with early-update strat-
egy to perform inexact decoding.
                                           1269
4.2   Label sets
Here we introduce the label sets for phrase node and dependency relation in the
model. We use Γ ∪ {⊥} to denote the phrase candidates, and ⊥ indicates that
the phrase node can not be mapped to an entity or a predicate of the KB. Sim-
ilarly, Λ ∪ {SP, P O, SC, ⊥} denotes the dependency relation candidates, where
Λ is the set of all KB predicates, SP and PO are the two labels representing
the dependency relation between relation phrase and entity phrase, where SP
indicates that the entity phrase is the subject of the relation phrase, and PO
indicates the other way. SC is the label representing the dependency relation
between content phrase and category phrase, where the category phrase charac-
terized the content phrase. and ⊥ means that this dependency relation can not
be instantiated.
4.3   The decoding algorithm for instantiating query intentions
Let x =< (x1 , x2 , . . . , xm ), ξ, o > denote the sentence instance, where xi repre-
sents the i-th phrase in the sentence, ξ = {{(xk , ci )}si=1 }m   k=1 is the set of KB
item candidates for phrases where s denotes the size of candidates for xk , and
o = {{depl (i, j, ci )}di=1 }nl=1 is the set of KB predicate candidates for dependency
relations where i and j denotes the head and dependent index, d denotes the
size of candidates for the dependency relation. We use
                  y = (t1 , a1,1 , . . . , a1,m , . . . , tm , am,1 , . . . , am,m )
to denote the corresponding gold standard structure, where ti represents the KB
item assignment for the phrase xi , and ai,k represents KB item assignment for
the dependency relation between the head xi and dependent xk . Algorithm 2
describes the beam-search procedure with early-update. During each step with
phrase i, there are two sub-steps:
Phrase labeling We enumerate all possible KB item candidates for the current
   phrase(line 3). K-best partial configurations are selected to the beam(line 4),
   assuming the beam size is K.
Dependency relation labeling After the phrase labeling step, we traverse
   all configurations in the beam. Once a phrase candidate for xi is found in
   the beam, the decoder searches through the dependency relation candidates
   o to label each dependency relation where the current phrase as the head
   and xi as the dependent(line11-19). After labeling each dependency relation,
   we again score each partial assignment and select the K-best results to the
   beam(line 20).
    Note that, the score of a configuration is indeed the sum of two scores from
each sub-step. Therefore the resulting best configuration will jointly solve the
phrase and dependency relation labeling. We use two types of features: local
features and global features. Table 4 summarize these features.
                                               1270
               Table 4. Features used to instantiate query intention
               p = phrase; r = dependency relation c = candidate
                             s = the surface name of c
                                    Local features
         Category      Type     Feature Description
          Phrase       Text     p is equal/prefix/suffix of s
                    similarity overlap of p and s
                   Popularity the popularity of s
                     Lexical p is equal/prefix/suffix of the
                                synonym of s
        Dependency     Text     r is equal/prefix/suffix of s
         Relation similarity overlap of p and s
                                   Global features
          Phrase        KB      whether p has a head, and if so,
                      related if p fulfills the expectation type of its head
                                number of phrases in the question that are
                                in the same domain of p
        Dependency      KB      whether r has an argument, and if so, if the
         Relation     related argument fulfills the expectation type of r
                                number of phrases in the question that are
                                in the same domain of r
                                            p
                                   e1             e2         [e1 p e2]
                              sp                 po
                         e1             r               e2    [e1 r e2]
                                            sc
                                                             [e1 type c]
                                   c              e1
       Fig. 4. Rules used to convert query intentions into structured queries.
4.4   Converting Query Intention into Structured Queries
After instantiating the query intention, our system outputs a phrasal semantic
DAG regarding KB. We convert the DAG into structured queries, e.g., SPARQL,
mainly by the rules shown in Figure 4, where e1 and e2 represent entity phrases,
r denotes a relation phrase and c denotes a category phrase. The left is the
graph pattern which may occur in the semantic DAG, and the right is the query
triple converted from the pattern. Note that, multiple graph patterns may occur
in a semantic DAG, as shown in Figure 3. The final structured queries are the
conjunction of query triples converted from DAG.
                                                      1271
5     Experimental
5.1   Data set
We use the Free917 dataset [6] as our training data, which contains 917 questions
annotated with logical forms. Recall that, in our model, we require gold-standard
phrase dependency DAGs of these questions as our training data, which we
prepare as follows. For each question, we find its variable, entity, category and
relation phrases, manually annotate the question with phrase dependency DAG
and map phrases and dependencies to Freebase semantic items.
    The QALD-4 task for question answering over linked data consists of 50 test
questions. We use these 50 questions as our test data and directly use the parser
trained on Free917 to predict the query intention and instantiate the query
intention regarding DBpedia.
5.2   Results on QALD-4 data
The following table gives the evaluation results with average precision, average
recall and F-measure. It shows the number of question our system can answer,
the number of right and partially right answers among them.
                 Table 5. Evaluate results on QALD-4 test dataset
        Total Processed Right Partially right Avg.Precision Avg.Recall F1
         50      40      36          6            0.72        0.71    0.72
5.3   Error Analysis
We notice that in many cases our system can be further improved. For example,
in the question from QALD4 “what is the bridge with the longest span”, our
system incorrectly detects “the longest span” as a category phrase, while this
phrase is certainly a relation phrase. This detection error is propagated to phrasal
semantic parsing. A necessary direction for our future step should be jointly solve
phrase detection and phrasal semantic DAG parsing to avoid such errors.
6     Conclusion and Future Work
In this paper, we propose a novel framework to translate natural language ques-
tions into structural queries, which can be effectively retrieved by structured
query engines and return the answers according a KB. The novelty of our frame-
work lies in modeling the task in a KB-independent and KB-related pipeline
paradigm, where we use phrasal semantic DAG to represent users’ query in-
tention, and develop a KB-independent shift-reduce DAG parser to capture the
                                       1272
structure of the query intentions, which are then grounded to a given KB via
joint mappings. This gives the advantages to analyze the questions independent
from a KB and easily adapt to new KBs without much human involvement.
    Currently, our model requires question-DAG pairs as the training data, though
we do not need to annotate new data for every datasets or KBs, it is still promis-
ing to extend our model to train on question-answer pairs only, further saving
human involvement. Secondly, in the pipeline of phrase detection and phrasal
semantic parsing, error propagated from the upstream to downstream. A joint
model with multiple perceptrons may help to eliminate the error propagation.
References
1. A. Fader, S. Soderland, and O. Etzioni. Identifying relations for open information
   extraction. In EMNLP, pages 1535-1545, 2011.
2. K. Sagae and J. Tsujii. Shift-reduce dependency dag parsing. In COLING, pages
   753-760, 2008.
3. M. Collins. Discriminative training methods for hidden markov models: Theory and
   experiments with perceptron algorithms. In Proceedings of the 2002 Conference on
   Empirical Methods in Natural Language Processing (EMNLP-2002), 2002.
4. K. D. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor. Freebase: a col-
   laboratively created graph database for structuring human knowledge. In SIGMOD
   Conference, pages 1247-1250, 2008.
5. Q. Cai and A. Yates. Semantic Parsing Freebase: Towards Open-domain Semantic
   Parsing. In Proceedings of the Second Joint Conference on Lexical and Computa-
   tional Semantics (*SEM), 2013.
6. M. Collins. Discriminative training methods for hidden markov models: Theory and
   experiments with perceptron algorithms. In Proceedings of the 2002 Conference on
   Empirical Methods in Natural Language Processing (EMNLP-2002), 2002.
7. A. Frank, H.-U. Krieger, F. Xu, H. Uszkoreit, B. Crysmann, B. Jorg, and U. Schafer.
   Question answering from structured knowledge sources. J. Applied Logic, 5(1):20-
   48, 2007.
8. T. Lin, Mausam, and O. Etzioni. Entity linking at web scale. In Proceedings of
   the Joint Workshop on Automatic Knowledge Base Construction and Web-scale
   Knowledge Extraction, AKBC-WEKEX 12, pages 84-88, Stroudsburg, PA, USA,
   2012. Association for Computational Linguistics.
9. M. Collins and B. Roark. Incremental parsing with the perceptron algorithm. In
   ACL, pages 111-118, 2004.
10. F. M. Suchanek, G. Kasneci, and G. Weikum. Yago: a core of semantic knowledge.
   In WWW, pages 697-706, 2007.
11. J. Krishnamurthy and T. Mitchell. Weakly supervised training of semantic parsers.
   In EMNLP-CoNLL, pages 754-765, 2012.
12. M. Yahya, K. Berberich, S. Elbassuoni, M. Ramanath, V. Tresp, and G. Weikum.
   Natural language questions for the web of data. In EMNLP-CoNLL, pages 379-390,
   2012.
13. C. Unger and P. Cimiano. Pythia: Compositional meaning construction for
   ontology-based question answering on the semantic web. In Proceedings of the 16th
   International Conference on Natural Language Processing and Information Systems,
   NLDB11, pages 153-160, Berlin, Heidelberg, 2011. Springer-Verlag.
                                        1273
14. C. Unger, L. Buhmann, J. Lehmann, A.-C. N. Ngomo, D. Gerber, and P. Cimiano.
   Template-based question answering over rdf data. In WWW, pages 639-648, 2012.
15. K. D. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor. Freebase: a col-
   laboratively created graph database for structuring human knowledge. In SIGMOD
   Conference, pages 1247-1250, 2008.
16. S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. G. Ives. Dbpedia:
   A nucleus for a web of open data. In ISWC/ASWC, pages 722-735, 2007.
17. P. Cimiano, V. Lopez, C. Unger, E. Cabrio, A.-C. N. Ngomo, and S. Walter.
   Multilingual question answering over linked data (qald-3): Lab overview. In CLEF,
   pages 321-332, 2013.
18. T. Kwiatkowski, L. S. Zettlemoyer, S. Goldwater, and M. Steedman. Inducing prob-
   abilistic ccg grammars from logical form with higher-order unification. In EMNLP,
   pages 1223-1233, 2010.
19. J. Berant and P. Liang. Semantic parsing via paraphrasing. In Association for
   Computational Linguistics (ACL), 2014.
20. L. Zou, R. Huang, H. Wang, J. X. Yu, W. He, and D. Zhao. Natural lan-
   guage question answering over rdf. In Special Interest Group on Management Of
   Data(SIGMOD), 2014.
21. Q. Li, H. Ji, and L. Huang. Joint event extraction via structured prediction with
   global features. In ACL, pages 73-82, 2013.
22. J. Berant, A. Chou, R. Frostig, and P. Liang. Semantic parsing on freebase from
   question-answer pairs. In EMNLP, pages 1533-1544, 2013.
23. T. Kwiatkowski, E. Choi, Y. Artzi, and L. S. Zettlemoyer. Scaling semantic parsers
   with on-the-fly ontology matching. In EMNLP, pages 1545-1556, 2013.
                                        1274