=Paper=
{{Paper
|id=Vol-2721/paper541
|storemode=property
|title=Improving Knowledge Base Question Answering with Question Understanding Augment
|pdfUrl=https://ceur-ws.org/Vol-2721/paper541.pdf
|volume=Vol-2721
|authors=Peiyun Wu,Xiaowang Zhang
|dblpUrl=https://dblp.org/rec/conf/semweb/WuZ20
}}
==Improving Knowledge Base Question Answering with Question Understanding Augment==
Improving Knowledge Base Question Answering
with Question Understanding Augment
Peiyun Wu and Xiaowang Zhang
College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
{wupeiyun,xiaowangzhang}@tju.edu.cn
Abstract. The foundation of knowledge base question answering (KBQA)
is to understand the given question and extract its meaning. Existing
works largely focus on generating query graphs to represent the semantics
of the question while paying little attention to its real meaning. To
augment question understanding, in this paper we leverage rich external
linguistic knowledge to enhance question semantics. First, we integrate
sememe and gloss information into words, where sememes (the minimum
semantic units of word meanings) and glosses (sense definitions) are used
to disambiguate word senses and enrich question information. Moreover,
we present a co-attention network to build co-dependent representations
of the sememe and gloss. Experiments on two data sets show that our
model outperforms existing approaches.
1 Introduction
Semantic parsing is an important approach to KBQA, which constructs a query
structure (called a query graph) that represents the semantics of a question.
Semantic parsing based approaches effectively transform questions into logical
forms, where the reliability of the logical forms ensures the correctness of
the answers. The success of semantic parsing lies in representing the semantics
of questions so as to better capture users' intention.
However, many recent semantic parsing approaches focus on complex query
graph generation and re-ranking [1, 2, 5], paying little attention to
understanding the meaning of questions accurately. They aim to leverage
a ranking model to score candidates and find the best query graph. As a
result, existing works that do not handle ambiguous questions cannot
consistently rank query graphs well.
In this paper, we propose an augmented question representation method that
leverages sememe and gloss information. Specifically, we integrate the gloss
information from WordNet and the sememe [3] information from HowNet into the
word embeddings of the given question. A word may have multiple senses, and a
sense consists of several sememes and a gloss. To highlight important
information in the sememe and gloss, we present a co-attention network to
generate better representations.
Copyright 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).
Fig. 1: Diagram for Our Model. (The model runs a senses layer and a sememes
layer through sense and gloss selectors, combines them with co-attention, feeds
the question through RGCN layers with pooling, and scores relations by cosine
similarity.)
2 Our Approach
Given a question Q = {w1, . . . , wn}, we generate its candidate query graph
set by the method in [1]. We measure the semantic similarity between the
question and each query graph to find the optimal one. For a word wi in Q, we
denote the set of its senses as S^wi, and E^wi = {e_i1, . . . , e_ik}
represents the unordered set of all sememes contained in wi. We assume that
each word wi has a gloss set G^wi. Our model is shown in Fig. 1.
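The overall ranking step can be sketched as follows. This is a minimal
illustration of scoring candidate query graphs against a question vector; the
function names, toy vectors, and the use of plain cosine similarity as the
scorer are our own assumptions for illustration, not the paper's exact
implementation:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def rank_query_graphs(q_vec, graph_vecs):
    """Return the index of the candidate query graph whose learned
    representation is most similar to the question representation."""
    scores = [cosine(q_vec, g) for g in graph_vecs]
    return int(np.argmax(scores))

# toy vectors standing in for learned representations
q = np.array([1.0, 0.0, 1.0])
candidates = [np.array([0.0, 1.0, 0.0]),   # unrelated query graph
              np.array([1.0, 0.1, 0.9])]   # close to the question
assert rank_query_graphs(q, candidates) == 1
```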
Sense Selector: This selector selects, for each word, the sense that is most
relevant to the context. We first feed Q into a bi-directional long short-term
memory network (Bi-LSTM) to generate the hidden representations
{h1, . . . , hn}. Then we generate the context representation of Q as below:

context = Σ_{i=1}^{n} softmax(tanh(h_i^⊤ · ((1/n) Σ_{i=1}^{n} h_i))) · h_i    (1)
To calculate the correlation between each sememe and the context, we use the
Sigmoid function to obtain a probability value:

p(e_ij | context) = Sigmoid(context · e_ij^⊤), ∀ j ∈ {1, . . . , k}    (2)
For each sense in S^wi, its probability is calculated as the average of the
probabilities of all the sememes it contains. In this way, we select the sense
with the highest probability value under the current context and denote it as
S^wi_max = {e1, . . . , ek}, where {e1, . . . , ek} represents all sememe
vectors it contains.
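Equations (1)-(2) and the sense-selection step can be sketched as below. The
helper names and the toy Bi-LSTM states and sememe vectors are hypothetical;
in the actual model the embeddings would be learned:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def context_vector(H):
    """Eq. (1): attention-weighted sum of Bi-LSTM states H (n x d)."""
    mean_h = H.mean(axis=0)            # (1/n) sum of hidden states
    scores = np.tanh(H @ mean_h)       # one scalar score per word
    return softmax(scores) @ H         # weighted sum -> d-dim context

def select_sense(context, senses):
    """Pick the sense whose average sememe probability (Eq. 2) is highest.
    `senses` maps sense name -> array of its sememe vectors (k x d)."""
    best, best_p = None, -1.0
    for name, sememes in senses.items():
        p = sigmoid(sememes @ context).mean()
        if p > best_p:
            best, best_p = name, p
    return best

H = np.array([[1.0, 0.0], [1.0, 0.0], [0.9, 0.1]])   # toy Bi-LSTM states
c = context_vector(H)
senses = {"head_of_state": np.array([[1.0, 0.0]]),
          "company_executive": np.array([[-1.0, 0.0]])}
assert select_sense(c, senses) == "head_of_state"
```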
Gloss Selector: This selector selects the gloss that is most relevant to the
selected word sense. Analogously, we use the Sigmoid function to select the
gloss with the highest probability of being relevant to the average embedding
of S^wi_max. We denote the selected gloss as G^wi_max = {o1, . . . , om},
where {o1, . . . , om} represents all word embeddings it contains.
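The analogous gloss-selection step can be sketched as follows, scoring each
candidate gloss against the average sememe embedding of the selected sense;
the function name and toy vectors are our illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def select_gloss(sense_sememes, glosses):
    """Keep the gloss most relevant to the selected sense.
    `sense_sememes`: k x d sememe vectors of the chosen sense;
    `glosses`: maps gloss text -> array of its word vectors (m x d)."""
    avg = sense_sememes.mean(axis=0)
    def score(g):                      # gloss score = avg word probability
        return sigmoid(g @ avg).mean()
    return max(glosses, key=lambda name: score(glosses[name]))

sememes = np.array([[1.0, 0.0]])
glosses = {"person who holds the office": np.array([[1.0, 0.0], [0.9, 0.1]]),
           "executive officer of a firm": np.array([[-1.0, 0.0]])}
assert select_gloss(sememes, glosses) == "person who holds the office"
```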
Co-Attention: To model the mutual influence and highlight the important
information in the sememe and gloss, we introduce a co-attention network to
dynamically combine the sememe and gloss representations as:

U^s = tanh(S^{wi⊤}_max · G^wi_max),   U^g = tanh(G^{wi⊤}_max · S^wi_max)    (3)

SG^wi = λ Σ_{j=1}^{k} [U^s · softmax(U^s)]_{:j} + (1 − λ) Σ_{j=1}^{m} [U^g · softmax(U^g)]_{:j}    (4)

where SG^wi is the combined representation of the selected sememe and gloss of
word wi, and λ ∈ [0, 1] is a trade-off parameter. softmax(U^s) and
softmax(U^g) are the attention weight matrices obtained by applying the
softmax function across each column of U^s and U^g, respectively, and [·]_{:j}
denotes the j-th column. Finally, we concatenate SG^wi to the word embedding
of wi to enrich the semantics and reduce ambiguity.
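One plausible reading of Eqs. (3)-(4) can be sketched as follows. Because the
extracted equations leave the matrix shapes ambiguous, we assume sememe and
gloss word vectors are stored as matrix columns and attend over them with
column-wise softmax weights; `co_attention` and these shapes are our
assumptions, not a definitive implementation:

```python
import numpy as np

def col_softmax(M):
    """Softmax across each column of a matrix."""
    e = np.exp(M - M.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def co_attention(S, G, lam=0.5):
    """Sketch of Eqs. (3)-(4).
    S: d x k matrix of sememe vectors (columns), G: d x m matrix of
    gloss word vectors (columns); lam is the trade-off parameter."""
    U_s = np.tanh(S.T @ G)          # k x m affinity, sememe view (Eq. 3)
    U_g = np.tanh(G.T @ S)          # m x k affinity, gloss view (Eq. 3)
    # attend over sememes / gloss words, then sum the attended columns
    sem_part = (S @ col_softmax(U_s)).sum(axis=1)   # d-dim
    gls_part = (G @ col_softmax(U_g)).sum(axis=1)   # d-dim
    return lam * sem_part + (1 - lam) * gls_part    # SG^{w_i}, Eq. (4)

S = np.array([[1.0, 0.0], [0.0, 1.0]])              # d=2, k=2
G = np.array([[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]])    # d=2, m=3
sg = co_attention(S, G, lam=0.5)
assert sg.shape == (2,)
```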
Question Representation: We treat {x1, . . . , xn} as the initial
representations of the given question, which have been integrated with sememe
and gloss information. To further augment the contextual embeddings of the
question, we parse the question into its syntactic dependency graph DG and
adopt a relational graph convolutional network (RGCN) to digest this
structural information:

x_i^{(l+1)} = ReLU( Σ_{r∈R} Σ_{j∈N_i^r} (1/|N_i^r|) W_r^{(l)} x_j^{(l)} + W_0^{(l)} x_i^{(l)} )    (5)
Here R is the set of dependency relations, l indexes the layer, and N_i^r is
the set of all r-neighbors of the i-th node in DG. Note that W_0 and W_r are
weight matrices. Finally, we apply a pooling operation after the last RGCN
layer to get the representation of the question.
Relation Representation: We represent the relations in a query graph at
different granularities. For each relation, we take both its relation-level
and word-level representations into consideration. The word-level
representation is the average of the relation's word embeddings. The
relation-level representation is the vector of the unique token of the
relation name. Each relation is then represented by the sum of the two, and we
perform max pooling over relations to obtain the final relation representation.
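The two-granularity relation encoding can be sketched as follows; the function
names and toy embeddings are hypothetical:

```python
import numpy as np

def relation_repr(word_vecs, rel_vec):
    """One relation = word-level average + relation-level token vector."""
    return word_vecs.mean(axis=0) + rel_vec

def final_relation_repr(relations):
    """Max-pool over all relation representations in a query graph.
    `relations` is a list of (word_vecs, rel_vec) pairs."""
    reps = np.stack([relation_repr(w, r) for w, r in relations])
    return reps.max(axis=0)

relations = [
    (np.array([[1.0, 0.0], [0.0, 1.0]]), np.array([0.5, 0.5])),  # e.g. "head of government"
    (np.array([[2.0, 0.0]]), np.array([0.0, 0.0])),
]
assert np.allclose(final_relation_repr(relations), [2.0, 1.0])
```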
3 Experiments and Evaluations
We use Wikidata as our KB and conduct experiments on two data sets, namely
WebQSP-WD (WSPWD) [1] and QALD-7 (Task 4, English), both of which are based
on Wikidata. We use the F1-score as the metric, where all results are
macro-averaged.
As Table 1 shows, our model outperforms all baselines on both datasets. It
achieves 54.2%, 23.3%, 8.9%, and 11.9% relatively higher F1-scores compared
to STAGG, HR-BiLSTM, GGNN, and Slot-Matching on WSPWD. Analogously, it
achieves 59.3%, 45.7%, 39.1%, and 21.7% higher F1-scores on QALD-7. We
observe that if we only integrate sememe information ("+sememe") or gloss
information ("+gloss"), our model performs worse but remains competitive. We
conclude that our augmented question representation with sememe and gloss
integration is effective.
Table 1: Overall Average Results over Wikidata

Model                     WSPWD   QALD-7
STAGG (2015) [2]          0.1828  0.1861
HR-BiLSTM (2017) [5]      0.2287  0.2035
GGNN (2018) [1]           0.2588  0.2131
Slot-Matching (2019) [4]  0.2519  0.2436
+sememe                   0.2459  0.2546
+gloss                    0.2597  0.2743
Ours                      0.2819  0.2965
Fig. 2: The number of relations needed to find the correct answer.
To measure performance across questions of different complexity, we break down
the results by the number of relations needed to find the correct answer on
WSPWD; the results are shown in Fig. 2. We can see that our model is effective
in dealing with questions of different complexity.
4 Conclusion
In this paper, we augment question understanding to improve KBQA. The sememe
and gloss information benefit from each other and enhance question semantics.
In this way, our approach provides a new way of using external knowledge in
question representation. In future work, we are interested in extending our
model to more complex practical questions.
5 Acknowledgments
This work is supported by the National Key Research and Development Program
of China (2017YFC0908401) and the National Natural Science Foundation of
China (61972455). Xiaowang Zhang is supported by the Peiyang Young Scholars
in Tianjin University (2019XRX-0032).
References
1. Sorokin, D., Gurevych, I.: Modeling semantics with gated graph neural networks
   for knowledge base question answering. In: COLING'2018, pp. 3306–3317.
2. Yih, W., Chang, M., He, X., Gao, J.: Semantic parsing via staged query graph
   generation: question answering with knowledge base. In: ACL'2015, pp. 1321–1331.
3. Bloomfield, L.: A set of postulates for the science of language. Language 2(3),
   153–164 (1926).
4. Maheshwari, G., Trivedi, P., Lukovnikov, D., Chakraborty, N., Fischer, A.,
   Lehmann, J.: Learning to rank query graphs for complex question answering
   over knowledge graphs. In: ISWC'2019, pp. 487–504.
5. Yu, M., Yin, W., Hasan, K.S., Santos, C.N., Xiang, B., Zhou, B.: Improved neural
   relation detection for knowledge base question answering. In: ACL'2017,
   pp. 571–581.