Building Knowledge Base through Deep Learning Relation Extraction and Wikidata

Pero Subasic, Hongfeng Yin and Xiao Lin
AI Agents Group, DOCOMO Innovations Inc, Palo Alto, CA, USA
{psubasic, hyin, xlin}@docomoinnovations.com




Abstract

Many AI agent tasks require a domain-specific knowledge graph (KG) that is compact and complete. We present a methodology to build a domain-specific KG by merging output from deep learning-based relation extraction over free text with an existing knowledge database such as Wikidata. We first form a static KG by traversing the knowledge database constrained by domain keywords. A very large, high-quality training data set is then generated automatically by matching Common Crawl data with relation keywords extracted from the knowledge database. We describe the training data generation process in detail and subsequent experiments with deep learning approaches to relation extraction. The resulting model is used to generate new triples from a free text corpus and create a dynamic KG. The static and dynamic KGs are then merged into a new KB satisfying the requirements of specific knowledge-oriented AI tasks such as question answering, chatting, or intelligent retrieval. The proposed methodology can be easily transferred to other domains or languages.

Introduction

Knowledge graph (KG) plays an important role in closed-domain question-answering (QA) systems. There are many large-scale KGs available (Bollacker 2008; Lehmann et al. 2012; Lenat 1995; Mitchell et al. 2018; Vrandecic and Krotzsh 2014). To answer user queries, a KG should be compact (pertain to a particular topic), or the QA engine may provide wrong answers because the knowledge graph contains too many extraneous facts and relations. The knowledge graph should also be complete, containing as many facts as possible about the topic of interest, or the QA engine may be unable to answer the user's query. The needs for compactness and completeness are plainly at odds with each other, such that existing KG generation techniques fail to satisfy both objectives properly. Accordingly, there is a need for an improved knowledge graph generation technique that satisfies the conflicting needs for completeness and compactness. We also aim to build a methodology that supports easier knowledge base construction in multiple languages and domains.

We thus propose a methodology to build a domain-specific KG. Figure 1 depicts the process of domain-specific KG generation through deep learning-based relation extraction and a knowledge database. We choose Wikidata as the initial knowledge database. After language filtering, the database is transformed and stored in MongoDB so that a hierarchical traversal starting from a set of seed keywords can be performed efficiently. This set of seed keywords can be given for a specific application, so the approach can be applied to an arbitrary domain. It is also possible to extract the set of keywords automatically from given text corpora. The resulting subject-relation-object triples from this step are used to form a so-called static KG and are also used to match sentences from Common Crawl free text to create a large dataset for training our relation extraction model. The trained model is then applied to infer new triples from free text corpora, which form a dynamic KG to satisfy the requirement of completeness. The static and dynamic KGs are then aggregated into a new KG that can be exported into various formats such as RDF or property graph and used by a domain-specific knowledge-based AI agent.

The paper first reviews related work on knowledge graph generation and relation extraction. It then describes our label dataset preparation, relation extraction model and KG generation in detail, followed by experimental results benchmarking relation extraction models and an application of the proposed approach to a soccer domain.

Copyright held by the author(s). In A. Martin, K. Hinkelmann, A. Gerber, D. Lenat, F. van Harmelen, P. Clark (Eds.), Proceedings of the AAAI 2019 Spring Symposium on Combining Machine Learning with Knowledge Engineering (AAAI-MAKE 2019). Stanford University, Palo Alto, California, USA, March 25-27, 2019.
Figure 1. Flow Diagram for Construction of Domain Specific Knowledge Graph.

Related Work

A knowledge graph can be constructed in a collaborative way by collecting entities and links (Clark 2014), or by automatic natural language processing that obtains subject-relation-object triples, for example through transformations of embedding representations (Lin et al. 2015; Socher et al. 2012; Wang et al. 2014), deep neural network extraction approaches (Santos, Xiang and Zhou 2015; Zeng 2014; Zeng et al. 2015; Zhang and Wang 2015; Zhou et al. 2016), and inference over graph paths (Guu, Miller and Liang 2015). In recent years, researchers have also proposed end-to-end systems (Kertkeidkachorn and Ichise 2017; Shang et al. 2019) and deep reinforcement learning methods (Feng 2018; Yang, Yang and Cohen 2017) to obtain better results.

As one of the major approaches to expanding a KG, relation extraction (RE) aims to extract relational facts from plain text between entities mentioned in the text. Supervised learning is effective, but preparation of high-quality labeled data is a major bottleneck in practice. One technique to avoid this difficulty is distant supervision (Mintz et al. 2009), which assumes that if two entities have a relationship in a known knowledge base, then all sentences that mention these two entities express that relationship in some way. All sentences that contain the two entities are selected as training instances. Distant supervision is an effective method of automatically labeling training data. However, it has a major shortcoming: the assumption is too strong and causes the wrong-label problem. A sentence that mentions two entities does not necessarily express their relation in the knowledge base; the two entities may simply co-occur in a sentence without the specific relation. The noisy training data fundamentally limit the performance of any trained model (Luo et al. 2017). Most RE research focuses on incremental improvements over this noisy training data, and the resulting RE performance falls short of the requirements of practical applications. The biggest challenge of RE is to automatically generate massive, high-quality training data. We solve this problem by matching Common Crawl data with a structured knowledge base, Wikidata.

Our approach is thus unique in that it utilizes a structured database to form a static KG through hierarchical traversal of links connected with domain keywords, for compactness. This KG is used to generate triples that train a sequence tagging relation extraction model, which infers new triples from a free text corpus and generates a dynamic KG, for completeness. The major contribution of our study is that we generated a large dataset for relation extraction model training. Furthermore, the approach is easily transferrable to other domains and languages as long as text data is available. Specifically, to transfer to a new domain, we need a new set of keywords or documents representing the domain. To transfer to a new language, we need entity extractors, a static knowledge graph in that language (Wikidata satisfies this requirement), and a large text corpus in the target language (Common Crawl satisfies this requirement, but other sources can be used).

Relation Extraction

Label Data Generation

The datasets used in distant supervision are usually developed by aligning a structural knowledge base like Freebase with free text like Wikipedia or news; one example is (Riedel, Yao, and McCallum 2010), who match Freebase relations with the New York Times (NYT) corpus. Usually, two entities with a relation in a sentence are associated with a keyword in the sentence that represents the relation in the knowledge base. We therefore require matching two entities and a relation keyword in a sentence to generate a positive relation instance. This largely reduces noise when generating positive samples. However, the total number of positive samples is also largely reduced. The problem can be solved by using very large free text corpora: billions of web pages available in Common Crawl web data.

The Common Crawl corpus contains petabytes of data collected over 8 years of web crawling. The corpus contains raw web page data, metadata extracts and text extracts. We use one year of Common Crawl text data. After language filtering, cleaning and deduplication, there are about 6 billion English web pages. The training data generation process is shown in Figure 2, and our in-house entity extraction system is used to label entities in Common Crawl text.
A Wikidata relation category has an id (P-number), a relation name and several mapped relation keywords, for example:
• P-number: P19
• Name: place of birth
• Mapped relation keywords: birth city, birth location, birth place, birthplace, born at, born in, location born, location of birth, POB

The Wikidata dump used in our task consists of:
• 48,756,678 triples
• 783 relation categories
• 2,384 relation keywords

[Figure 2: flowchart in which Wikidata relation triples with relation categories are mapped via a category-to-keyword table into relation triples with relation categories and keywords, while Common Crawl data passes through entity extraction to produce entity texts; entity matching and relation keyword matching produce positive sentences and negative sentence generation produces negative sentences, together forming the labeled sentences.]
Figure 2. Flow Chart of Training Data Generation

First, Wikidata relation category triples are mapped to Wikidata relation keyword triples. Then, the Wikidata keyword triples are matched with Common Crawl entity-labeled sentences. This yields:
• 386 million matched sentences
• 65 million unique sentences
• 688 relation keywords with more than 1,000 matched sentences
• Example:
  • Wikidata keyword triple:
    o [[Martín_Sastre]] born in [[Montevideo]]
  • Matched Common Crawl sentence:
    o [[Martín_Sastre]] was born in [[Montevideo]] in 1976 and lives in [[Madrid]]
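To make this matching step concrete, the following is a minimal sketch of how a Wikidata keyword triple can be matched against entity-labeled sentences; the function names, data layout, and the detail of checking only the ordered entity pair are illustrative assumptions, not the actual implementation.

```python
# Minimal sketch of distant-supervision-style matching (illustrative only).
# Assumes sentences are pre-labeled with [[Entity]] markers by an entity extractor.
import re
from collections import defaultdict

def index_keyword_triples(triples):
    """Index (subject, keyword, object, category) triples by entity pair for fast lookup."""
    index = defaultdict(set)
    for subj, keyword, obj, category in triples:
        index[(subj, obj)].add((keyword, category))
    return index

ENTITY_PATTERN = re.compile(r"\[\[(.+?)\]\]")

def match_sentence(sentence, triple_index):
    """Return positive matches (e1, e2, keyword, category, sentence) found in a sentence."""
    entities = ENTITY_PATTERN.findall(sentence)
    matches = []
    for i, e1 in enumerate(entities):
        for e2 in entities[i + 1:]:
            for keyword, category in triple_index.get((e1, e2), ()):
                if keyword in sentence:  # the relation keyword must appear in the sentence
                    matches.append((e1, e2, keyword, category, sentence))
    return matches

# Example based on the triple and sentence shown above.
triples = [("Martín_Sastre", "born in", "Montevideo", "P19")]
sentence = "[[Martín_Sastre]] was born in [[Montevideo]] in 1976 and lives in [[Madrid]]"
print(match_sentence(sentence, index_keyword_triples(triples)))
```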
The numbers of matched unique sentences for the top relation keywords are:
• state 4,336,046
• city 4,251,983
• capital 2,797,477
• starring 2,032,749
• borders 1,874,461
• town 1,737,493
• wife 1,730,569
• founder 1,337,416
• is located in 1,136,473
• husband 1,016,505
• actor 1,014,708
• capital of 957,203
• son 954,848
• directed by 890,268
• married 843,009
• born in 796,941
• coach 736,866

Therefore, massive high-quality labeled sentences are generated automatically for training supervised machine learning models. With the labeled sentences, we can build RE models for specific domains, for specific relations, or for the open domain.

Relation Extraction Models for Soccer

As a specific domain example, we use the labeled sentences to build RE models for soccer. First, we extract from Wikidata 17,950 soccer entities and 722,528 triples with at least one soccer entity, covering 78 relation categories with 640 relation keywords.

Training data generation (a minimal sketch of these rules follows the list):
• Positive sample generation:
  1. Select two entities (e1, e2) and a relation keyword (r_kw with relation category r_cat) in a matched sentence s.
  2. Check that (e1, r_kw, e2) is in the relation keyword triples.
  3. Set "e1, e2, r_kw, r_cat, s" as a positive sample.
• Negative sample generation:
  1. Select two entities (e1, e2) in a sentence s.
  2. One entity must be a soccer entity.
  3. Both entities must be in the entity list generated from Wikidata relation triples.
  4. Set "e1, e2, NONE, NA, s" as a negative sample. Samples are selected randomly with some probability to obtain a sufficient number of negative samples.
  5. Remove duplicated samples.
• Total generated training data:
  o 2,121,640 samples
  o 335,734 positive relation sentences
  o 1,785,906 negative relation sentences
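The sketch below illustrates the sample-generation rules above under stated assumptions: entity spans are already extracted, the soccer entities and Wikidata relation triples are available as in-memory sets, and the negative sampling probability is a tunable parameter. The names and data structures are hypothetical, not the production pipeline.

```python
# Illustrative sketch of the positive/negative sample generation rules (not the actual pipeline).
import random

def generate_samples(sentences, keyword_triples, soccer_entities, all_entities,
                     neg_prob=0.1, seed=0):
    """sentences: iterable of (sentence_text, [entities]) pairs.
    keyword_triples: set of (e1, keyword, category, e2) tuples from Wikidata.
    Returns a de-duplicated list of (e1, e2, keyword, category, sentence) samples."""
    rng = random.Random(seed)
    by_pair = {}
    for e1, kw, cat, e2 in keyword_triples:
        by_pair.setdefault((e1, e2), []).append((kw, cat))

    samples = set()
    for sentence, entities in sentences:
        for i, e1 in enumerate(entities):
            for e2 in entities[i + 1:]:
                positives = [(kw, cat) for kw, cat in by_pair.get((e1, e2), [])
                             if kw in sentence]
                if positives:
                    kw, cat = positives[0]
                    samples.add((e1, e2, kw, cat, sentence))       # positive sample
                elif (e1 in soccer_entities or e2 in soccer_entities) \
                        and e1 in all_entities and e2 in all_entities \
                        and rng.random() < neg_prob:
                    samples.add((e1, e2, "NONE", "NA", sentence))  # negative sample
    return list(samples)
```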
Building the Models

Figure 3. Flow Chart of Model Training Comparison

The PCNN model (Zeng et al. 2015), the LSTM with Attention model (Zhou et al. 2016) and the LSTM classification model (Zhang and Wang 2015) are trained with 90% of the data for training and 10% for testing. The sequence tagging model (Lample et al. 2016) is trained with 80% of the data for training, 10% for testing during training and 10% for testing after training.

A positive sentence is tagged as follows:

[[ John ]] Entity lives in [[ New York ]] Entity
O O O O B-Re I-Re O O O O O

That is, the tokens of the relation keyword ("lives in") are tagged B-Re and I-Re, while all other tokens, including the bracketed entity markers, are tagged O.
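As an illustration of this tagging scheme, the sketch below converts a bracketed positive sentence and its relation keyword into token/tag pairs; the whitespace tokenization and helper name are assumptions made for illustration.

```python
# Illustrative conversion of a bracketed positive sentence into sequence-tagging labels.
def tag_sentence(sentence, relation_keyword):
    """Tag relation-keyword tokens with B-Re/I-Re and everything else with O."""
    tokens = sentence.split()
    kw_tokens = relation_keyword.split()
    tags = ["O"] * len(tokens)
    for i in range(len(tokens) - len(kw_tokens) + 1):
        if tokens[i:i + len(kw_tokens)] == kw_tokens:
            tags[i] = "B-Re"
            for j in range(1, len(kw_tokens)):
                tags[i + j] = "I-Re"
            break
    return list(zip(tokens, tags))

# Reproduces the example above: "lives" -> B-Re, "in" -> I-Re, all other tokens -> O.
print(tag_sentence("[[ John ]] Entity lives in [[ New York ]] Entity", "lives in"))
```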
Figure 4. Performance Comparison: (a) Precision vs Recall; (b) & (c) Precision vs Epoch

Table 1 shows the performance of each model. Based on F1 score: Sequence Tagging > LSTM + Attention > LSTM Classification > PCNN.

Model                  F1        Precision    Recall
Sequence Tagging       98.25%    97.61%       98.89%
PCNN                   82.89%    86.00%       80.00%
LSTM Classification    91.28%    90.10%       92.50%
LSTM + Attention       95.40%    94.80%       96.00%

Table 1. Performance Comparison of Different Models

In comparison with distant supervision datasets, our dataset can train much higher-quality models (the precision-recall curves in Figure 4(a) can be compared with Figure 3 in Lin et al. 2016).

To validate the winning sequence tagging approach, we create a validation data set from Common Crawl outside the training data by using a different time period. We also validated on data from another source, different from Common Crawl, with a similar outcome. The validation results are:
o F1: 93.15%
o Precision: 90.43%
o Recall: 96.02%

Although the models perform well on the training data, we found many false positives when the models are applied to arbitrary free text. This issue can be mitigated with the improved negative sample generation described below.
• Improved negative sample generation: a sentence s should contain a keyword from the 640 relation keywords.
  o For each pair of entities (e1, e2) in s:
    § e1 or e2 is a soccer entity
    § e1 and e2 are in the entity list generated from Wikidata relation triples
    § If s is a matched sentence and e1 and e2 are not in the relation triple of s:
      • Set "e1, e2, NONE, NA, s" as a negative sample with probability 0.5
    § If s is not a matched sentence:
      • Set "e1, e2, NONE, NA, s" as a negative sample with probability 0.15 (15 out of 100 samples are selected as negative)
  o Remove duplicated samples
• The new training data with improved negative sample generation:
  o 1,702,924 samples
  o 363,458 positive samples
  o 1,339,466 negative samples

With this training data, the performance of the sequence tagging model on unseen data drops only slightly:
o F1: 92.38%
o Precision: 89.42%
o Recall: 95.53%

Figure 5. Soccer RE for Common Crawl data

Apply the model to Common Crawl data

Figure 5 shows the flowchart of soccer RE. For each sentence in the Common Crawl entity texts, if the sentence contains at least one soccer entity and two entities in the entity list generated from Wikidata relation triples, the sentence is a soccer sentence. Duplicated soccer sentences are then removed, and sentences without relation keywords are filtered out. The remaining sentences are tagged with IOB tags. Finally, the RE models are applied to the sentences to extract the relations. The results are:
o Total soccer sentences with two labeled entities: 64,085,913
o Total relations extracted: 600,964
o Aggregate unique relations: 147,486

Construction of Knowledge Graph

As illustrated in Figure 1, the goal of the proposed approach is to build a knowledge graph from a static KG built from a knowledge database and a dynamic KG generated by deep learning relation extraction. To form the static knowledge base, a suitable knowledge database (in this example, Wikidata) is language filtered (English, Japanese, and so on) and the resulting knowledge graph is stored in a suitable database platform such as MongoDB. To build the static knowledge graph, the database is searched for seed keywords to act as seed vertices of the resulting knowledge graph. These seed vertices are then expanded by hierarchical traversal. In particular, the hierarchical traversal proceeds by finding all descendant vertices of the seed vertex; the algorithm then recursively iterates across these descendant (child) vertices. In addition, all ancestor vertices that have links to the seed vertex are identified by adding parent Wikidata items and recursively iterating across the parents of the seed vertex. Since the seed keywords are directed to the topic of interest (e.g., soccer), the hierarchical traversal of the resulting knowledge graph also performs domain filtering to the topic of interest.
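A minimal sketch of this hierarchical traversal is shown below, assuming Wikidata items are stored in a MongoDB collection with id, label, claims (outgoing property-value links) and parents fields; the collection layout, field names and depth limit are assumptions for illustration, not the actual schema.

```python
# Illustrative hierarchical traversal of a MongoDB-stored Wikidata subset (schema is assumed).
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
items = client["wikidata"]["items"]  # e.g. {"id": "Q2736", "label": "association football",
                                     #       "claims": [{"property": "P31", "target": "Q..."}],
                                     #       "parents": ["Q..."]}

def build_static_kg(seed_labels, max_depth=2):
    """Expand seed keywords into a set of (subject, property, object) triples."""
    seeds = [doc["id"] for doc in items.find({"label": {"$in": seed_labels}})]
    visited, triples, frontier = set(seeds), set(), list(seeds)
    for _ in range(max_depth):
        next_frontier = []
        for doc in items.find({"id": {"$in": frontier}}):
            # Descendant direction: follow outgoing claims to child items.
            for claim in doc.get("claims", []):
                triples.add((doc["id"], claim["property"], claim["target"]))
                if claim["target"] not in visited:
                    visited.add(claim["target"])
                    next_frontier.append(claim["target"])
            # Ancestor direction: follow links to parent items.
            for parent in doc.get("parents", []):
                if parent not in visited:
                    visited.add(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return triples

# Example: soccer-domain seeds.
soccer_triples = build_static_kg(["association football", "FIFA World Cup"])
```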
The relation triples from the static knowledge graph may then be extracted and expanded to assist in labeling positive and negative sentences from a training corpus, which are used to train a deep learning relation extraction model. The deep learning model, applied to free text such as news articles, blogs and similar up-to-date sources, generates the dynamic knowledge graph. The static and dynamic knowledge graphs are then merged to form a combined knowledge graph. The two approaches ensure that we have slowly-changing (therefore 'static') knowledge as well as fast-changing (therefore 'dynamic') knowledge in the resulting knowledge graph. When merging the static and dynamic KGs, several subjective rules are enforced: (1) if the relation of a triple in the dynamic KG is not defined in the Wikidata property list, the triple is ignored; (2) if either entity of a triple in the dynamic KG is not defined in the Wikidata item list, a pseudo item is created with a unique Q-number and the triple is added to the knowledge base as a valid link; (3) a relation defined in the static KG has higher precedence: if a relation in the dynamic KG conflicts with a relation in the static KG, the dynamic one is ignored and the relation in the static KG is kept in the merged knowledge base. The merged KG exists only in the final Neo4j database.
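The merge rules above can be summarized in a short sketch; the triple and lookup structures are illustrative assumptions, and "conflict" is interpreted here as the static KG already holding a relation for the same entity pair.

```python
# Illustrative sketch of the three static/dynamic KG merge rules (data structures are assumed).
from itertools import count

def merge_kgs(static_triples, dynamic_triples, wikidata_properties, wikidata_items):
    """static_triples / dynamic_triples: sets of (subject, property, object) triples."""
    merged = set(static_triples)
    static_pairs = {(s, o) for s, _, o in static_triples}
    pseudo_ids = (f"Q-pseudo-{i}" for i in count(1))
    pseudo_map = {}

    for subj, prop, obj in dynamic_triples:
        if prop not in wikidata_properties:      # rule (1): unknown relation -> ignore
            continue
        if (subj, obj) in static_pairs:          # rule (3): static KG takes precedence
            continue
        resolved = []
        for entity in (subj, obj):               # rule (2): unknown entity -> pseudo item
            if entity not in wikidata_items:
                entity = pseudo_map.setdefault(entity, next(pseudo_ids))
            resolved.append(entity)
        merged.add((resolved[0], prop, resolved[1]))
    return merged
```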
In our experiment, we assumed a single-fact knowledge-based question answering system in the soccer domain to demonstrate the proposed approach. To assure that the QA system can answer user queries correctly, we make the KG contain facts represented by triples related to soccer as closely as possible; at the same time, we included as many soccer-related triples as possible. Wikidata is adopted as the structured database to generate the static KG. The relation extraction approach described in the previous section is used to extract soccer-related triples, which are then merged into the dynamic KG. Table 2 lists the top three relations in the new KG. It demonstrates that the triple facts in the aggregated KG are correctly condensed into the specific domain of soccer. P641 (sport, in which the subject participates or belongs to) is not used in sentence labeling because its coverage is too broad. An example of a triple with P641 is: [Lionel Messi] => [P641 (sport)] => [association football].

P-number    Label                      Occurrence
P641        sport                      404,761
P54         member of sports team      128,632
P1344       participant                30,307

Table 2. Top 3 Relations in Static Soccer KG

The statistics of the static KG and the merged KG are listed in Table 3. As it shows, the static KG contains only 0.81% of the entities and 0.29% of the links of the original Wikidata. Queries performed on the static KG will thus be significantly more efficient than on the original database, lowering the requirements for computation power and memory usage. This is especially important for AI agent edge devices where hardware resources are limited. At the same time, the links in the domain KG increased by 15.6%, resulting in a large increase in coverage. This number depends on the size of the corpus used to extract relations: a larger corpus will yield a larger link increase and thus more knowledge coverage. For example, in the Wikidata database there are 67 links starting from Q170645 (2018 FIFA World Cup); in the merged KG, this number increases to 472.

                                     Static KG    Merged KG
Number of Entities                   405,639      425,224
Number of Predicates                 676,500      807,718
% of Wikidata (Entities)             0.81%        N/A
% of Wikidata (Predicates)           0.29%        N/A
Increase Compared to Static KG                    15.6%

Table 3. Triple Statistics of Aggregated Knowledge Graph

Since there is no real question-answering system based on the knowledge graphs created in this study, the improvement in question-answering performance of the merged KG over the static KG or dynamic KG alone cannot be evaluated quantitatively. Neo4j is used in the demonstration to simulate a QA system: instead of a natural language question, a database query is issued to get the response (in a real system this is usually accomplished by an appropriate AIML mapping). Table 4 shows some query examples. As expected, some questions can be answered only when the merged KG is used, because the corresponding facts are added from the relation extraction results.

Q: who is Louis Giskus?
A: [Louis Giskus] => [chairperson] => [Surinamese Football Association]
Q: how is Antonio Conte related with Juventus F.C.?
A: [head coach]
Q: who is the manager of Manchester City F.C.?
A: [Manchester City F.C.] => [represented by] => [Pep Guardiola]

Table 4. QA Examples using Dynamic KG
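As an illustration of how such a database query stands in for a natural-language question, the sketch below issues a Cypher lookup through the official Neo4j Python driver; the node property names, relationship representation and connection details are assumptions about the demonstration setup, not its actual schema.

```python
# Illustrative single-fact lookup against the merged KG in Neo4j (schema and credentials assumed).
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def lookup_fact(entity_label):
    """Return (subject, relation, object) facts attached to an entity, e.g. 'Louis Giskus'."""
    query = (
        "MATCH (s {label: $label})-[r]->(o) "
        "RETURN s.label AS subject, type(r) AS relation, o.label AS object"
    )
    with driver.session() as session:
        return [(rec["subject"], rec["relation"], rec["object"])
                for rec in session.run(query, label=entity_label)]

# Roughly corresponds to the first example in Table 4.
print(lookup_fact("Louis Giskus"))
```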
In Wikidata, items carry labels in different languages. By incorporating the corresponding language labels into the Neo4j database, the resulting KB can easily accommodate visualization and querying in languages other than English. As a demonstration, Figure 6 shows a query of the KB using Japanese.

Figure 6. KB Query and Visualization in Japanese

Summary

This paper presents a methodology to build a knowledge graph for domain-specific AI applications where the KG is required to be compact and complete. The KG is constructed by aggregating a static knowledge database such as Wikidata and a dynamic knowledge database formed from subject-relation-object triples extracted from free text corpora by a deep learning relation extraction model. In this study, a large high-quality dataset for training the relation extraction model is developed by matching Common Crawl data with the knowledge database. This dataset was used to train our own sequence tagging based relation extraction model, which achieved state-of-the-art performance. Another important contribution is the multi-language and multi-domain applicability of the approach.
It is inevitable that some wrong "facts" will be inferred from the test corpora by the relation extraction model. It would be an interesting but challenging direction of future work to evaluate the validity of predicted triples and remove these wrong "facts" so that they are not integrated into the knowledge base and do not become "truth". Inferring new links directly from the knowledge database to further expand the knowledge base could be another interesting topic. It would also be worth studying whether joint named entity recognition and relation extraction could be integrated into our flow (Bekoulis et al. 2018).

Acknowledgments

We thank Yinrui Li for conducting the benchmark study of deep learning algorithms for relation extraction and for contributing to the data of Figure 4. We also thank the anonymous reviewers for their helpful comments.
References

Bekoulis, G. et al. 2018. Joint Entity Recognition and Relation Extraction as a Multi-head Selection Problem. Expert Systems with Applications, vol 114, 34-45.

Bollacker, K. et al. 2008. Freebase: A Collaboratively Created Graph Database for Structuring Human Knowledge. In Proceedings of SIGMOD'08, 1247-1249, ACM.

Clark, P. et al. 2014. Automatic Construction of Inference-Supporting Knowledge Bases. In Proceedings of the 4th Workshop on Automated Knowledge Base Construction (AKBC 2014).

Feng, J. et al. 2018. Reinforcement Learning for Relation Classification from Noisy Data. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, 5779-5786.

Guu, K., Miller, J. and Liang, P. 2015. Traversing Knowledge Graphs in Vector Space. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 318-327, Lisbon, Portugal.

Kertkeidkachorn, N. and Ichise, R. 2017. T2KG: An End-to-End System for Creating Knowledge Graph from Unstructured Text. In Proceedings of the AAAI-17 Workshop on Knowledge-Based Techniques for Problem Solving and Reasoning, 743-749.

Lample, G. et al. 2016. Neural Architectures for Named Entity Recognition. In Proceedings of NAACL-HLT 2016, 260-270, San Diego, California.

Lehmann, J. et al. 2012. DBpedia – a Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web 1(2012):1-5.

Lenat, D. 1995. CYC: A Large-Scale Investment in Knowledge Infrastructure. Communications of the ACM 38(11):33-38.

Lin, Y. et al. 2015. Learning Entity and Relation Embeddings for Knowledge Graph Completion. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, 2181-2187.

Lin, Y. et al. 2016. Neural Relation Extraction with Selective Attention over Instances. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2124-2133, Berlin, Germany.

Luo, B. et al. 2017. Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 430-439, Vancouver, Canada.

Mintz, M. 2009. Distant Supervision for Relation Extraction without Labeled Data. In Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP, 1003-1011, Suntec, Singapore.

Mitchell, T. et al. 2018. Never Ending Learning. Communications of the ACM 61(5):103-115.

Riedel, S., Yao, L. and McCallum, A. 2010. Modeling Relations and Their Mentions without Labeled Text. In: Balcázar J.L., Bonchi F., Gionis A., Sebag M. (eds.) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol 6323. Springer, Berlin, Heidelberg.

Santos, C., Xiang, B. and Zhou, B. 2015. Classifying Relations by Ranking with Convolutional Neural Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 626-634, Beijing, China.

Shang, C. et al. 2019. End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion. arXiv:1811.04441, accepted for Proceedings of AAAI 2019.

Socher, R. et al. 2012. Semantic Compositionality through Recursive Matrix-Vector Spaces. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 1201-1211, Jeju Island, Korea.

Vrandecic, D. and Krotzsh, M. 2014. Wikidata: A Free Collaborative Knowledgebase. Communications of the ACM 57(10):78-85.

Wang, Z. et al. 2014. Knowledge Graph Embedding by Translating on Hyperplanes. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, 1112-1119.

Xie, Q. et al. 2017. An Interpretable Knowledge Transfer Model for Knowledge Base Completion. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 950-962, Vancouver, Canada, ACL.

Yang, F., Yang, Z. and Cohen, W. 2017. Differentiable Learning of Logical Rules for Knowledge Base Reasoning. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.

Zeng, D. 2014. Relation Classification via Convolutional Deep Neural Network. In Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014), 2335-2344, Dublin, Ireland.

Zeng, D., Liu, K., Chen, Y. and Zhao, J. 2015. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 1753-1762, Lisbon, Portugal, ACL.

Zhang, D. and Wang, D. 2015. Relation Classification via Recurrent Neural Network. arXiv:1508.01006.

Zhou, P. et al. 2016. Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 207-212, Berlin, Germany, ACL.