<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Word, Mention and Entity Joint Embedding for Entity Linking</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zhichun Wang</string-name>
          <email>zcwang@bnu.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Danlu Wen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yong Huang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chu Li</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Beijing Normal University</institution>
          ,
          <addr-line>Beijing</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Entity linking is important for connecting text data and knowledge bases. This poster presents a word, mention and entity joint embedding method, which can be used to compute semantic relatedness in entity linking approaches. Recently, several large-scale knowledge bases have been created and successfully applied in many areas, such as DBpedia, YAGO, and Freebase. In many applications of knowledge bases, a basic task is to identify entities in text and link them to a given knowledge base, which is usually called entity linking. The task of entity linking is challenging because of entity name variations and entity ambiguity. On one hand, one entity can be mentioned in text by different names; for example, both "Beijing" and "Peking" can refer to the same entity "Beijing City". On the other hand, the same mention can refer to multiple different entities; for example, "Apple" may refer to "Apple Inc." or to the fruit "Apple". Much work has been done on the problem of entity linking; [5] gives a detailed review of entity linking approaches. In these approaches, computing semantic relatedness between entities and their context is very important for entity disambiguation. In this poster, we propose a new way to compute this relatedness: a joint word, mention and entity embedding learning method. Based on the results of the joint embedding, different kinds of relatedness among words, mentions, and entities can be easily computed. The rest of this paper introduces the proposed embedding method in detail. We propose to use the Skip-gram model [3] to jointly map entities, mentions and words into the same low-dimensional vector space. Using the jointly learned vectors, various kinds of relatedness can be efficiently computed, such as entity-word relatedness, mention-word relatedness and entity-entity relatedness.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>2.1 The Skip-gram Model</p>
      <p>The Skip-gram model is a recently published framework for learning continuous
word vectors from text corpora. Each word in the corpus is mapped to a continuous
embedding space, and the model is trained to find word representations that are good at
predicting the surrounding words in a sentence or a document. Given a sequence of
training words w_1, w_2, ..., w_T, the objective of the model is to maximize the average
log probability</p>
      <p>O = (1/T) sum_{t=1}^{T} sum_{-c &lt;= j &lt;= c, j != 0} log p(w_{t+j} | w_t)    (1)</p>
      <p>where c is the size of the training context, and p(w_O | w_I) is defined as</p>
      <p>p(w_O | w_I) = exp(v'_{w_O}^T v_{w_I}) / sum_{w=1}^{W} exp(v'_w^T v_{w_I})    (2)</p>
      <p>where v_w and v'_w are the input and output vector representations of w, and W is the
number of words in the vocabulary. The learned word vectors capture the semantic
similarity of words: similar words are mapped to nearby points in a low-dimensional
vector space.</p>
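As a concrete illustration, the conditional probability p(w_O | w_I) above is a softmax over dot products of input and output vectors. The sketch below uses random toy vectors and a made-up vocabulary size, purely to show the computation; it is not a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
W, d = 5, 8                        # toy vocabulary size and embedding dimension
V_in = rng.normal(size=(W, d))     # input vectors v_w
V_out = rng.normal(size=(W, d))    # output vectors v'_w

def p_output_given_input(w_input):
    """Softmax over the vocabulary: p(w_O | w_I) for every candidate w_O."""
    scores = V_out @ V_in[w_input]      # v'_{w_O}^T v_{w_I} for each w_O
    e = np.exp(scores - scores.max())   # shift by the max for numerical stability
    return e / e.sum()

probs = p_output_given_input(0)         # a distribution over the 5 toy words
```

The max-shift does not change the softmax result but prevents overflow for large scores.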
      <p>Relatedness between two vectors a and b is computed as the cosine similarity</p>
      <p>r(a, b) = (a · b) / (||a|| ||b||)</p>
      <p>[Figure 1: The Skip-gram model predicting the surrounding tokens of the input word
capital. Example sentence: "Beijing is the capital of People's Republic of China and ...";
annotated form: "MEN(Beijing) ENT(Beijing) is the capital of
MEN(People'sRepublicofChina) ENT(China) and ..."; output tokens: MEN(Beijing),
ENT(Beijing), is, the, of, MEN(People'sRepublicofChina), ENT(China), and.]</p>
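The cosine relatedness can be computed directly from two vectors; this is a minimal sketch with made-up example vectors, and the helper name `relatedness` is introduced here for illustration.

```python
import numpy as np

def relatedness(a, b):
    """Cosine similarity r(a, b) = (a . b) / (||a|| ||b||)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

r = relatedness([1.0, 0.0], [1.0, 1.0])  # → 0.7071 (up to rounding)
```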
      <p>The Skip-gram model was initially designed to learn embeddings of words only. To
extend it to a joint model of words, entities, and mentions, we add mentions and
entities to the training corpus, which previously contained only words. Let the original
corpus be C = {w_1, w_2, ..., w_N}. If a word sequence s = {w_i, ..., w_{i+k}} in
C is a mention of an entity e in the knowledge base, we replace s with the two tokens
{MEN(w_i w_{i+1} ... w_{i+k}), ENT(e)}; the original word sequence containing
s in C then becomes {..., w_{i-1}, MEN(w_i w_{i+1} ... w_{i+k}), ENT(e), w_{i+k+1}, ...}. After annotating
all the mentions and their corresponding entities, C is converted to a hybrid corpus C'
containing words, mentions, and entities. C' is then used to train the Skip-gram
model, which generates representations in the same vector space for words,
mentions, and entities. Figure 1 shows an example of using the Skip-gram model to predict
the surrounding tokens of the word capital in the example sentence.</p>
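The corpus conversion just described can be sketched as follows. The `annotate` helper and its mention dictionary are hypothetical names introduced for illustration; a real implementation would draw the mention-to-entity mapping from the knowledge base.

```python
def annotate(tokens, mentions):
    """Replace each annotated word sequence s with the tokens MEN(...), ENT(...).

    `mentions` maps a tuple of words (a mention) to its entity name.
    """
    out, i = [], 0
    while i != len(tokens):
        # try the longest candidate mention starting at position i first
        for k in range(len(tokens) - i, 0, -1):
            span = tuple(tokens[i:i + k])
            if span in mentions:
                out.append("MEN(" + "".join(span) + ")")
                out.append("ENT(" + mentions[span] + ")")
                i += k
                break
        else:
            out.append(tokens[i])
            i += 1
    return out

sentence = ["Beijing", "is", "the", "capital", "of",
            "People's", "Republic", "of", "China"]
mentions = {("Beijing",): "Beijing",
            ("People's", "Republic", "of", "China"): "China"}
hybrid = annotate(sentence, mentions)
# → ['MEN(Beijing)', 'ENT(Beijing)', 'is', 'the', 'capital', 'of',
#    "MEN(People'sRepublicofChina)", 'ENT(China)']
```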
      <p>2.3 Using Wikipedia as Training Corpus</p>
      <p>Annotating mentions and entities in a corpus is a time-consuming task. Fortunately, if
we use Wikipedia as the knowledge base, it already contains all the annotations we need. Figure 2
shows part of the Wikipedia page for Beijing together with its source text in editing mode. In
Wikipedia, an internal link is annotated as [[entity | mention]]; it can also be written
[[entity]] when the entity is mentioned by its exact name. By processing these internal
links, we can generate a corpus containing words, mentions, and entities together. In
this paper, Wikipedia is therefore used as the target knowledge base that entities link to, and
its articles are processed to train the Skip-gram model.</p>
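Extracting [[entity|mention]] and [[entity]] links from Wikipedia source text can be sketched with a regular expression. The pattern below is an illustrative assumption covering only the two link forms named above, not the full preprocessing pipeline.

```python
import re

# matches [[entity|mention]] and the short form [[entity]]
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

def links(wikitext):
    """Return (entity, mention) pairs; the mention defaults to the entity name."""
    return [(e, m if m else e) for e, m in LINK.findall(wikitext)]

text = ("'''Beijing''' is the capital of the "
        "[[China|People's Republic of China]] and near [[Tianjin]].")
pairs = links(text)
# → [('China', "People's Republic of China"), ('Tianjin', 'Tianjin')]
```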
      <p>[Figure 2: Source text of the Beijing page in Wikipedia: '''Beijing''', formerly '''Peking''',
is the capital of the [[China|People's Republic of China]] and one of the most
[[List of metropolitan areas by population|populous cities in the world]] …]</p>
      <p>
        To evaluate the effectiveness of the proposed embedding model, we use the embedding
results in an entity linking approach. The entity linking approach was first introduced in
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and we replace the relatedness measure in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] with the cosine similarity between the
vectors of entities. Furthermore, we add relatedness between entities and their
contextual words, computed as the cosine similarity between the vectors of entities
and words.
      </p>
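One way the two relatedness signals just described (entity-entity plus entity-word cosine similarity) could be combined for disambiguation is sketched below. The additive scoring, the candidate names, and all vectors are illustrative assumptions, not the exact method of the cited approach.

```python
import numpy as np

def cos(a, b):
    """Cosine similarity between two vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def score(candidate_vec, context_entity_vecs, context_word_vecs):
    """Entity-entity relatedness plus entity-word relatedness for one candidate."""
    s = sum(cos(candidate_vec, e) for e in context_entity_vecs)
    return s + sum(cos(candidate_vec, w) for w in context_word_vecs)

# pick the candidate entity with the highest combined relatedness
cand = {"Apple Inc.": np.array([0.9, 0.1]), "Apple (fruit)": np.array([0.1, 0.9])}
ctx_entities = [np.array([1.0, 0.0])]   # e.g. another entity linked in the context
ctx_words = [np.array([0.8, 0.2])]      # e.g. the vector of a context word
best = max(cand, key=lambda name: score(cand[name], ctx_entities, ctx_words))
# → 'Apple Inc.'
```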
      <p>In the evaluation, English Wikipedia is used as the target knowledge base for entity
linking. The Yahoo Search Query Log To Entities dataset
(http://webscope.sandbox.yahoo.com/catalog.php?datatype=l&amp;did=66) is used for the
evaluation. This dataset contains manually identified links to entities in Wikipedia. In total, there
are 2,635 queries in 980 search sessions, with 4,691 annotated mentions linking to
4,725 entities in Wikipedia.</p>
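Precision and recall over mention-entity links can be computed as in this sketch; the `pred` and `gold` sets below are toy data, not the actual evaluation results.

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted (mention, entity) links against gold links."""
    correct = len(predicted.intersection(gold))
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

pred = {("Beijing", "Beijing"), ("Apple", "Apple Inc.")}
gold = {("Beijing", "Beijing"), ("Apple", "Apple (fruit)"), ("Peking", "Beijing")}
p, r = precision_recall(pred, gold)
# → p = 0.5, r ≈ 0.333
```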
      <p>
        We also compared our approach with two entity linking systems, Illinois Wikifier [
        <xref ref-type="bibr" rid="ref1 ref4">4,
1</xref>
        ] and DBpedia Spotlight [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Illinois Wikifier is an entity linking system
developed by the University of Illinois at Urbana-Champaign. DBpedia Spotlight is a system for
automatically annotating text documents with DBpedia URIs. Because DBpedia is built
from Wikipedia and each DBpedia URI corresponds to a Wikipedia entity, the results
of DBpedia Spotlight can be easily converted to entity links of Wikipedia.
      </p>
      <p>Table 1 shows the evaluation results of the three approaches. The precision and
recall of each approach are evaluated. According to the results, our approach achieves
the best precision and recall, which shows that the joint embedding model is effective for
the entity linking problem.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>X.</given-names>
            <surname>Cheng</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          .
          <article-title>Relational inference for wikification</article-title>
          .
          <source>In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>J.</given-names>
            <surname>Daiber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jakob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hokamp</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. N.</given-names>
            <surname>Mendes</surname>
          </string-name>
          .
          <article-title>Improving efficiency and accuracy in multilingual entity extraction</article-title>
          .
          <source>In Proceedings of the 9th International Conference on Semantic Systems (I-Semantics)</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          . In
          <string-name>
            <given-names>C. J. C.</given-names>
            <surname>Burges</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bottou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Welling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ghahramani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K. Q.</given-names>
            <surname>Weinberger</surname>
          </string-name>
          , editors,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>26</volume>
          , pages
          <fpage>3111</fpage>
          -
          <lpage>3119</lpage>
          . Curran Associates, Inc.,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>L.</given-names>
            <surname>Ratinov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Downey</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Anderson</surname>
          </string-name>
          .
          <article-title>Local and global algorithms for disambiguation to wikipedia</article-title>
          .
          <source>In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, HLT '11</source>
          , pages
          <fpage>1375</fpage>
          -
          <lpage>1384</lpage>
          , Stroudsburg, PA, USA,
          <year>2011</year>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>W.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          .
          <article-title>Entity linking with a knowledge base: Issues, techniques, and solutions</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          ,
          <volume>27</volume>
          (
          <issue>2</issue>
          ):
          <fpage>443</fpage>
          -
          <lpage>460</lpage>
          ,
          <year>Feb 2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          .
          <article-title>Boosting cross-lingual knowledge linking via concept annotation</article-title>
          .
          <source>In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, IJCAI '13</source>
          , pages
          <fpage>2733</fpage>
          -
          <lpage>2739</lpage>
          . AAAI Press,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>