=Paper=
{{Paper
|id=Vol-2721/paper543
|storemode=property
|title=A Sememe-based Approach for Knowledge Base Question Answering
|pdfUrl=https://ceur-ws.org/Vol-2721/paper543.pdf
|volume=Vol-2721
|authors=Peiyun Wu,Xiaowang Zhang
|dblpUrl=https://dblp.org/rec/conf/semweb/WuZ20a
}}
==A Sememe-based Approach for Knowledge Base Question Answering==
Peiyun Wu and Xiaowang Zhang
College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
{wupeiyun,xiaowangzhang}@tju.edu.cn
Abstract. In this poster, we present a sememe-based approach to semantic parsing in question answering over knowledge bases, leveraging sememe-level semantics to improve the measurement of semantic similarity between questions and relations. Firstly, we propose a double-channel model to extract both sememe-level and word-level semantics. Moreover, we present a context-based representation to encode the sememes of a question, refining sememe incorporation and reducing noise. Finally, we introduce a hierarchical representation to encode the sememes of relations and maximally remove word ambiguity. Experiments on benchmarks show that our model outperforms off-the-shelf models.
1 Introduction
Knowledge base question answering (KBQA) is the task of accurately and concisely answering a natural language question over a knowledge base (KB) by understanding the intention of the question. As a critical branch of KBQA, semantic parsing-based approaches construct semantic parsing trees or equivalent query structures (also called query graphs) to represent the given question, and then rank them by computing their semantic similarity with the question.
Most current works focus on selecting the semantic relations most similar to a question in order to find the optimal query graph. Unfortunately, existing approaches are limited in differentiating two relations with similar word-level semantics due to the following issues: (1) Polysemous and low-frequency words often undermine the overall performance of semantic similarity measurement. (2) The minimal semantics of words shared between the question and relations is ignored: existing models depend heavily on the embeddings closest to the question representation instead of extracting the minimal semantic similarity between them.
To overcome the above limitations, we leverage sememes from external lexical-semantic resources. Sememes are the minimum semantic units of word meanings [3]: a word may have multiple senses, and a sense consists of several sememes. In this poster, we present a sememe-based approach to semantic parsing in KBQA by leveraging sememe-level semantics.
Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Fig. 1: Diagram for our double-channel model.
2 Approach
Our model is shown in Fig. 1. Based on a heuristic algorithm in [2], we generate candidate query graphs considering five kinds of semantic constraints: entity, type, temporal (explicit and implicit time), order, and comparison.
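For concreteness, the sketch below shows one hypothetical way to hold a candidate query graph together with the five constraint types; the class and field names are illustrative assumptions, not the structures used in [2].

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class QueryGraphCandidate:
    """Illustrative container for one candidate query graph (names are assumptions)."""
    relations: List[str]                               # core relation path, e.g. ["contained by"]
    entity_constraints: List[str] = field(default_factory=list)
    type_constraints: List[str] = field(default_factory=list)
    temporal_constraint: Optional[str] = None          # explicit or implicit time
    order_constraint: Optional[str] = None             # e.g. ordering by a numeric property
    compare_constraint: Optional[str] = None           # e.g. a numeric comparison
```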
2.1 Word-Channel Representation
In this part, we generate the word-channel representations of the question and relations. Given a question {w_1, w_2, ..., w_n}, we feed it into a bi-directional long short-term memory network (Bi-LSTM) to generate the hidden representation Q = (h_1, ..., h_n) and obtain h_q after a pooling operation. Then we transform h_q with a fully connected layer and a ReLU function to get the word-channel representation of the question:

q^w = ReLU(W_q · h_q + b_1)    (1)

where W_q denotes the linear transformation matrix.
To encode the word-channel relation representation, we take both the relation-level (e.g. “contained by”) and word-level (e.g. “contained”, “by”) relation names into consideration. Given relations {r_1, r_2, ..., r_n} in a query graph, for relation-level representations we simply take each relation name as a whole unit and translate it into a vector representation, yielding {r_1^{rl}, r_2^{rl}, ..., r_n^{rl}}. For word-level representations, we represent the word sequence of each relation using word averaging, yielding {r_1^{wl}, r_2^{wl}, ..., r_n^{wl}}. The final vector of each word-channel relation representation is r_i = r_i^{rl} + r_i^{wl}. Finally, we apply a pooling operation over all relations and obtain the word-channel representation of the relations, denoted by r^w.
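As a rough illustration of this channel (not the authors' released code), the PyTorch sketch below encodes the question with a Bi-LSTM, pools the hidden states, and applies the fully connected layer with ReLU from Eq. (1); the relation side sums each relation-level embedding with the average of its word embeddings. Dimensions, the pooling choices, and module names are assumptions.

```python
import torch
import torch.nn as nn

class WordChannel(nn.Module):
    """Minimal sketch of the word-channel encoder; sizes are assumed."""
    def __init__(self, emb_dim=300, hidden_dim=150):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden_dim, 2 * hidden_dim)   # W_q, b_1 in Eq. (1)

    def question(self, word_embs):                 # word_embs: (1, n, emb_dim)
        hidden, _ = self.bilstm(word_embs)         # Q = (h_1, ..., h_n)
        h_q = hidden.max(dim=1).values             # pooling over time steps (assumed max)
        return torch.relu(self.fc(h_q))            # q^w, Eq. (1)

    def relations(self, rel_level_embs, rel_word_embs):
        # rel_level_embs: (m, d); rel_word_embs: list of (len_i, d) tensors
        word_avg = torch.stack([w.mean(dim=0) for w in rel_word_embs])
        r = rel_level_embs + word_avg              # r_i = r_i^{rl} + r_i^{wl}
        return r.mean(dim=0)                       # pooled relation representation r^w
```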
2.2 Sememe-Channel Representation
Sememe-Channel Question Representation We denote by S_m the set of all sememes occurring in the question. We then map S_m into the vectors S = (s_1, ..., s_n) and adopt a context attention mechanism to de-emphasize irrelevant sememes and focus on those more correlated with the context. The interactive context matrix is calculated as S^q = tanh(S^T U Q). We then obtain the vector s^q by column-wise max-pooling over S^q and apply the softmax function. Finally, we get the sememe-channel question representation as follows:

q^s = W_{sq} (softmax(s^q) · S) + b_2    (2)

where W_{sq} is the parameter matrix.
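One possible reading of this attention, shown as a hedged sketch below, computes the bilinear interaction between sememe vectors and question hidden states, pools over question positions so that each sememe receives one score, and forms q^s as in Eq. (2). The shapes and the pooling axis are assumptions.

```python
import torch
import torch.nn as nn

class SememeQuestionChannel(nn.Module):
    """Sketch of the sememe-channel question encoder (Eq. (2)); shapes assumed."""
    def __init__(self, sem_dim=300, q_dim=300):
        super().__init__()
        self.U = nn.Parameter(torch.randn(sem_dim, q_dim) * 0.01)  # bilinear matrix U
        self.fc = nn.Linear(sem_dim, sem_dim)                      # W_sq, b_2

    def forward(self, S, Q):
        # S: (k, sem_dim) sememe vectors; Q: (n, q_dim) question hidden states
        S_q = torch.tanh(S @ self.U @ Q.t())       # interactive context matrix, (k, n)
        s_q = S_q.max(dim=1).values                # pool over question positions: one score per sememe
        alpha = torch.softmax(s_q, dim=0)          # attention weights over sememes
        return self.fc(alpha @ S)                  # q^s = W_sq (softmax(s^q) · S) + b_2
```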
Sememe-Channel Relation Representation In this part, we adopt a hierarchical attention method to obtain the sememe-level representation of relations and maximally remove ambiguity. We denote by R^{sense}_{w_{ij}} := {se_{ij1}, ..., se_{ijk}} the set of sense vectors of word w_{ij}, and by R^{sememe}_{s_{ijk}} := {sm_{ijk1}, ..., sm_{ijkm}} the set of sememe vectors of sense s_{ijk}.
To obtain the context information of the given question, we denote its average word embedding by q_avg and construct a context representation C_q as follows:

C_q = Σ_{i=1}^{n} softmax(tanh(w_i^T · q_avg)) · w_i    (3)
Through this, the vector of sense se_{ijk} in R^{sense}_{w_{ij}} is represented as below:

se_{ijk} = Σ_{c=1}^{m} softmax(W_{sm} · tanh(sm_{ijkc}^T · C_q) + b_3) · sm_{ijkc}    (4)
where W_{sm} is a weight matrix. The representation of the j-th word in the i-th relation is then a weighted sum of its senses {se_{ij1}, ..., se_{ijk}}:

r_{ij} = Σ_{y=1}^{k} softmax(W_{se} · tanh(se_{ijy}^T · C_q) + b_4) · se_{ijy}    (5)
Finally, we apply an average operation over all words in all relations and obtain the sememe-channel representation of the relations, denoted by r^s. In this way, we compute the semantic similarity score of the two channels as follows:

Score = cos(q^w, r^w) + cos(q^s, r^s)    (6)
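Reading Eqs. (3)-(6) together, a hypothetical implementation of this hierarchical attention might look like the sketch below: sememes are attended against the question context C_q to form sense vectors, senses are attended to form word vectors, words are averaged into r^s, and the final score sums the two cosine similarities. Treating W_{sm}, b_3, W_{se}, b_4 as scalars and the exact tensor layout are assumptions.

```python
import torch

def attention_pool(vectors, context, w=1.0, b=0.0):
    """Softmax-weighted sum of `vectors` scored against a context vector (Eqs. (4)-(5)).
    vectors: (m, d) candidate vectors (sememes or senses); context: (d,) = C_q.
    w, b stand in for W_sm/b_3 or W_se/b_4 and are assumed scalar here."""
    scores = w * torch.tanh(vectors @ context) + b
    alpha = torch.softmax(scores, dim=0)
    return alpha @ vectors                                 # weighted sum, (d,)

def context_representation(word_embs):
    """Eq. (3): attend each question word embedding against the average embedding q_avg."""
    q_avg = word_embs.mean(dim=0)
    scores = torch.tanh(word_embs @ q_avg)
    return torch.softmax(scores, dim=0) @ word_embs        # C_q

def sememe_relation_repr(relations, word_embs):
    """relations: list (per relation) of lists (per word) of lists (per sense) of
    (m, d) sememe-vector tensors. Returns the averaged representation r^s."""
    C_q = context_representation(word_embs)
    word_vecs = []
    for rel in relations:
        for senses in rel:                                                       # one word of one relation
            sense_vecs = torch.stack([attention_pool(sm, C_q) for sm in senses])  # Eq. (4)
            word_vecs.append(attention_pool(sense_vecs, C_q))                     # Eq. (5)
    return torch.stack(word_vecs).mean(dim=0)              # r^s

def similarity_score(q_w, r_w, q_s, r_s):
    """Eq. (6): Score = cos(q^w, r^w) + cos(q^s, r^s)."""
    cos = torch.nn.functional.cosine_similarity
    return cos(q_w, r_w, dim=0) + cos(q_s, r_s, dim=0)
```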
3 Experiments and Evaluations
Since Freebase is no longer maintained (its APIs and new dumps are unavailable), we use the full Wikidata dump as our KB. We conduct our experiments on two benchmarks, namely WebQuestionsSP (WebQSP) and QALD-7 (Task 4, English). We use the sememe annotations in HowNet for the sememe-channel representation.
Table 1: Overall Average Results over Wikidata

Model                          | WebQSP: Precision / Recall / F1 | QALD-7: Precision / Recall / F1
STAGG (2015) [2]               | 0.1911 / 0.2267 / 0.1828        | 0.1934 / 0.2463 / 0.1861
Yu et al. (2017) [5]           | 0.2094 / 0.2453 / 0.1987        | 0.2173 / 0.2084 / 0.1958
Sorokin et al. (2018) [1]      | 0.2686 / 0.3179 / 0.2588        | 0.2176 / 0.2751 / 0.2131
Maheshwari et al. (2019) [4]   | 0.2678 / 0.3182 / 0.2619        | 0.2493 / 0.2691 / 0.2436
word-channel                   | 0.2686 / 0.3179 / 0.2588        | 0.1948 / 0.2535 / 0.2048
sememe-channel                 | 0.2467 / 0.2991 / 0.2438        | 0.2309 / 0.2919 / 0.2382
Double-channel (ours)          | 0.2721 / 0.3343 / 0.2776        | 0.2678 / 0.3182 / 0.2619
Table 1 shows that our model is superior on both datasets and all metrics. On WebQSP, our model achieves 51.9%, 39.7%, 7.3%, and 6.0% higher F1-scores than STAGG, Yu et al. (2017), Sorokin et al. (2018), and Maheshwari et al. (2019), respectively. Analogously, we achieve 40.6%, 33.8%, 22.9%, and 7.5% higher F1-scores on QALD-7. We conclude that our double-channel representation method performs better than all baselines. We also observe that each single channel (word-only or sememe-only) performs worse than the double-channel setting, demonstrating that the two channel representations are complementary.
4 Conclusion
In this poster, we present a sememe-based approach to differentiating relations with similar semantics in KBQA, where sememes are leveraged as the minimal semantics of words, serving as extra natural knowledge to enrich the semantics used for parsing. In future work, we are interested in maximizing sememe-level semantics to overcome the weaknesses of word-level semantics in KBQA.
5 Acknowledgments
This work is supported by the National Key Research and Development Program
of China (2017YFC0908401) and the National Natural Science Foundation of
China (61972455). Xiaowang Zhang is supported by the Peiyang Young Scholars
in Tianjin University (2019XRX-0032).
References
1. Sorokin, D., Gurevych, I.: Modeling semantics with gated graph neural networks for knowledge base question answering. In: COLING 2018, pp. 3306–3317.
2. Yih, W., Chang, M., He, X., Gao, J.: Semantic parsing via staged query graph generation: Question answering with knowledge base. In: ACL 2015, pp. 1321–1331.
3. Bloomfield, L.: A set of postulates for the science of language. Language 2(3), 153–164 (1926).
4. Maheshwari, G., Trivedi, P., Lukovnikov, D., Chakraborty, N., Fischer, A., Lehmann, J.: Learning to rank query graphs for complex question answering over knowledge graphs. In: ISWC 2019, pp. 487–504.
5. Yu, M., Yin, W., Hasan, K.S., Santos, C.N., Xiang, B., Zhou, B.: Improved neural relation detection for knowledge base question answering. In: ACL 2017, pp. 571–581.