<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Improving Knowledge Base Question Answering with Question Understanding Augment</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Peiyun</forename><surname>Wu</surname></persName>
							<email>wupeiyun@tju.edu.cn</email>
							<affiliation key="aff0">
								<orgName type="department">College of Intelligence and Computing</orgName>
								<orgName type="institution">Tianjin University</orgName>
								<address>
									<postCode>300350</postCode>
									<settlement>Tianjin</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Xiaowang</forename><surname>Zhang</surname></persName>
							<email>xiaowangzhang@tju.edu.cn</email>
							<affiliation key="aff0">
								<orgName type="department">College of Intelligence and Computing</orgName>
								<orgName type="institution">Tianjin University</orgName>
								<address>
									<postCode>300350</postCode>
									<settlement>Tianjin</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Improving Knowledge Base Question Answering with Question Understanding Augment</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">2AB22E6AC2CFEA6FDA652A0998185369</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T08:34+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The foundation of knowledge base question answering (KBQA) is to understand the given question and extract its meaning. Existing works largely focus on generating query graphs to represent the semantics of the question while neglecting its real meaning. To augment question understanding, in this paper we leverage rich external linguistic knowledge to enhance question semantics. First, we integrate sememe and gloss information into word representations, where sememes (the minimum semantic units of word meanings) and glosses (sense definitions) are used to disambiguate word senses and enrich question information. Moreover, we present a co-attention network to build co-dependent representations of the sememe and gloss. Experiments on two datasets show that our model outperforms existing approaches.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Semantic parsing is an important approach to KBQA: it constructs a query structure (called a query graph) that represents the semantics of a question. Semantic parsing based approaches effectively transform questions into logical forms, whose reliability ensures the correctness of the answers. The success of semantic parsing lies in representing the semantics of questions so as to better capture users' intentions.</p><p>However, in recent years many semantic parsing approaches have focused on complex query graph generation and re-ranking <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b4">5]</ref> while paying little attention to understanding the meaning of questions accurately. They leverage a ranking model to score candidate query graphs and find the best one. As a result, existing works that do not handle ambiguous questions cannot always rank query graphs well.</p><p>In this paper, we propose an augmented question representation method that leverages sememe and gloss information. Specifically, we integrate gloss information from WordNet and sememe <ref type="bibr" target="#b2">[3]</ref> information from HowNet into the word embeddings of the given question. A word may have multiple senses, and a sense consists of several sememes and a gloss. ...  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Our Approach</head><p>Given a question Q = {w_1, . . . , w_n}, we generate its candidate query graph set using the method in <ref type="bibr" target="#b0">[1]</ref>. We measure the semantic similarity between the question and each query graph to find the optimal one. For each word w_i in Q, we denote the set of its senses as S_{w_i}, and E_{w_i} = {e_{i1}, . . . , e_{ik}} represents the unordered set of all sememes contained in w_i. We further assume that each word w_i has a gloss set G_{w_i}. Our model is shown in Fig. <ref type="figure" target="#fig_1">1</ref>.</p><p>Sense Selector: This selector selects, for each word, the sense that is most relevant to the context. We first feed Q into a bi-directional long short-term memory network (Bi-LSTM) to generate the hidden representations {h_1, . . . , h_n}. Then we generate the context representation of Q as follows:</p><formula xml:id="formula_0">context = Σ_{i=1}^{n} softmax(tanh(h_i · ((1/n) Σ_{j=1}^{n} h_j))) · h_i<label>(1)</label></formula><p>To calculate the correlation between each sememe and the context, we use the Sigmoid function to obtain a probability value:</p><formula xml:id="formula_1">p(e_{ij} | context) = Sigmoid(context · e_{ij}), ∀ j ∈ {1, . . . , k}<label>(2)</label></formula><p>For each sense in S_{w_i}, its probability is the average of the probabilities of all the sememes it contains. In this way, we select the sense with the highest probability under the current context and denote it as S^{w_i}_{max} = {e_1, . . . , e_k}, where {e_1, . . . , e_k} represents all the sememe vectors it contains.</p><p>Gloss Selector: This selector selects the gloss that is most relevant to the selected word sense. Analogously, we use the Sigmoid function to select the gloss with the highest probability of being relevant to the average embedding of S^{w_i}_{max}. We denote the selected gloss as G^{w_i}_{max} = {o_1, . . . , o_m}, where {o_1, . . . , o_m} represents all the word embeddings it contains.</p><p>Co-Attention: To model the mutual influence between the sememe and gloss and highlight the important information in each, we introduce a co-attention network that dynamically combines the sememe and gloss representations:</p><formula xml:id="formula_2">U_s = tanh(S^{w_i}_{max} · G^{w_i}_{max}), U_g = tanh(G^{w_i}_{max} · S^{w_i}_{max})<label>(3)</label></formula><formula xml:id="formula_3">SG_{w_i} = λ Σ_{j=1}^{k} [U_s · softmax(U_s)]_{:j} + (1 − λ) Σ_{j=1}^{m} [U_g · softmax(U_g)]_{:j}<label>(4)</label></formula><p>where SG_{w_i} is the combined representation of the selected sememe and gloss of word w_i, and λ ∈ [0, 1] is a trade-off parameter. softmax(U_s) and softmax(U_g) are attention weight matrices obtained by applying the softmax function across each column of U_s and U_g, respectively, and [·]_{:j} denotes the j-th column. Finally, we concatenate SG_{w_i} to the word embedding of w_i to enrich the semantics and reduce ambiguity.</p><p>Question Representation: We treat {x_1, . . . , x_n} as the initial representation of the given question, which has been integrated with sememe and gloss information. To further augment the contextual embeddings of the question, we parse the question into its syntactic dependency graph D_G and adopt a relational graph convolutional network (RGCN) to digest this structural information:</p><formula xml:id="formula_4">x_i^{(l+1)} = ReLU(Σ_{r∈R} Σ_{j∈N_i^r} (1/|N_i^r|) W_r^{(l)} x_j^{(l)} + W_0^{(l)} x_i^{(l)})<label>(5)</label></formula><p>Here R is the set of dependency relations, l indexes the layer, and N_i^r is the set of all r-neighbors of the i-th node in D_G. Note that W_0 and W_r are weight matrices. Finally, we apply a pooling operation after the last RGCN layer to obtain the representation of the question.</p></div>
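The sense-selector computation in Eqs. (1)–(2) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation; the toy sense names and sememe vectors are assumptions for demonstration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def context_vector(H):
    """Eq. (1): self-attentive context over Bi-LSTM states H of shape (n, d)."""
    mean_h = H.mean(axis=0)                   # (1/n) * sum_j h_j
    scores = softmax(np.tanh(H @ mean_h))     # one attention weight per word
    return scores @ H                         # weighted sum -> vector of size d

def select_sense(context, senses):
    """Eq. (2) plus averaging: score each sense by the mean sigmoid
    correlation of its sememe vectors with the context.
    `senses` maps sense name -> (k, d) array of sememe vectors."""
    probs = {name: sigmoid(E @ context).mean() for name, E in senses.items()}
    return max(probs, key=probs.get)
```

A sense whose sememes point in the same direction as the question context receives a higher average sigmoid score and is selected.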
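The RGCN propagation rule of Eq. (5) can be sketched as below; this is a minimal dense NumPy version under assumed toy dimensions (production implementations use sparse, batched message passing), where the edge dictionary keyed by dependency-relation names is an illustrative assumption.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def rgcn_layer(X, edges, W_r, W0):
    """One RGCN layer over a dependency graph (Eq. 5).
    X:     (n, d) node features x_i^(l)
    edges: {relation r: list of (i, j) pairs, j being an r-neighbor of i}
    W_r:   {relation r: (d, d) relation-specific weight matrix}
    W0:    (d, d) self-connection weight matrix
    """
    out = X @ W0.T                           # self-connection term W0 * x_i
    for r, pairs in edges.items():
        nbrs = {}
        for i, j in pairs:                   # group r-neighbors per target node
            nbrs.setdefault(i, []).append(j)
        for i, js in nbrs.items():
            # normalized sum over N_i^r: (1/|N_i^r|) * sum_j W_r * x_j
            out[i] += (X[js] @ W_r[r].T).sum(axis=0) / len(js)
    return relu(out)
```

Stacking such layers and pooling the final node states yields the question representation described above.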
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Relation Representation</head><p>We represent the relations of a query graph at different granularities. For each relation, we consider both its relation-level and word-level representations. The word-level representation is the average of the word embeddings of the relation name. The relation-level representation is the vector of the unique token of the relation name. Each relation is then represented by the sum of the two, and we perform max pooling over all relations to obtain the final relation representation.</p></div>
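A minimal NumPy sketch of this two-granularity relation encoding; the underscore-separated relation name and the two embedding tables are hypothetical stand-ins for the pretrained embeddings used in the paper.

```python
import numpy as np

def relation_repr(relations, word_emb, rel_emb):
    """Word-level (mean of word embeddings of the relation name) plus
    relation-level (unique-token embedding), then element-wise max pooling
    over all relations in the query graph."""
    vecs = []
    for r in relations:
        # assumption: relation names are underscore-separated word sequences
        word_level = np.mean([word_emb[w] for w in r.split("_")], axis=0)
        vecs.append(word_level + rel_emb[r])   # sum of the two granularities
    return np.max(vecs, axis=0)                # max pooling over relations
```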
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Experiments and Evaluations</head><p>We use Wikidata as our KB and conduct experiments on two datasets, WebQSP-WD (WSPWD) <ref type="bibr" target="#b0">[1]</ref> and QALD-7 (Task 4, English), both of which are built on Wikidata. We use the F1-score as the metric, where all results are macro-averaged.</p><p>Table <ref type="table" target="#tab_1">1</ref> shows that our model outperforms all baselines on both datasets. Our model achieves 54.2%, 23.3%, 8.9%, and 11.9% relative F1-score improvements over STAGG, HR-BiLSTM, GGNN, and Slot-Matching on WSPWD, respectively. Analogously, we achieve 59.3%, 45.7%, 39.1%, and 21.7% improvements on QALD-7. We observe that if we only integrate sememe information ("+sememe") or gloss information ("+gloss"), our model performs worse but remains competitive. This confirms the effectiveness of our augmented question representation with sememe and gloss integration. To measure performance across questions of different complexity, we break down the results on WSPWD by the number of relations needed to find the correct answer. As shown in Fig. <ref type="figure" target="#fig_2">2</ref>, our model is effective on questions of varying complexity.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusion</head><p>In this paper, we augment question understanding to improve KBQA. Sememe and gloss information benefit from each other and enhance question semantics. In this way, our approach provides a new way of using external knowledge in question representation. In future work, we are interested in extending our model to more complex practical questions.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1 :</head><label>1</label><figDesc>Fig. 1: Diagram for Our Model.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 :</head><label>2</label><figDesc>Fig. 2: The number of relations to find correct answers</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>To highlight important information in the sememe and gloss, we present a co-attention network to generate better representations.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 :</head><label>1</label><figDesc>Overall Average Results over</figDesc><table><row><cell>Wikidata</cell><cell></cell></row><row><cell>Model</cell><cell>WSPWD QALD-7</cell></row><row><cell>STAGG(2015) [2]</cell><cell>0.1828 0.1861</cell></row><row><cell cols="2">HR-BiLSTM(2017) [5] 0.2287 0.2035</cell></row><row><cell>GGNN(2018)[1]</cell><cell>0.2588 0.2131</cell></row><row><cell cols="2">Slot-Matching(2019)[4] 0.2519 0.2436</cell></row><row><cell>+sememe</cell><cell>0.2459 0.2546</cell></row><row><cell>+gloss</cell><cell>0.2597 0.2743</cell></row><row><cell>Our</cell><cell>0.2819 0.2965</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Acknowledgments</head><p>This work is supported by the National Key Research and Development Program of China (2017YFC0908401) and the National Natural Science Foundation of China (61972455). Xiaowang Zhang is supported by the Peiyang Young Scholars in Tianjin University (2019XRX-0032).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Modeling semantics with gated graph neural networks for knowledge base question answering</title>
		<author>
			<persName><forename type="first">D</forename><surname>Sorokin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gurevych</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">COLING&apos;2018</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="3306" to="3317" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Semantic parsing via staged query graph generation: question answering with knowledge base</title>
		<author>
			<persName><forename type="first">W</forename><surname>Yih</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACL&apos;2015</title>
				<imprint>
			<biblScope unit="page" from="1321" to="1331" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A set of postulates for the science of language</title>
		<author>
			<persName><forename type="first">L</forename><surname>Bloomfield</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Language</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="153" to="164" />
			<date type="published" when="1926">1926</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Learning to rank query graphs for complex question answering over knowledge graphs</title>
		<author>
			<persName><forename type="first">G</forename><surname>Maheshwari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Trivedi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lukovnikov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Chakraborty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fischer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lehmann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ISWC&apos;19</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="487" to="504" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Improved neural relation detection for knowledge base question answering</title>
		<author>
			<persName><forename type="first">M</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">S</forename><surname>Hasan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">N</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Xiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACL&apos;2017</title>
				<imprint>
			<biblScope unit="page" from="571" to="581" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
