<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>IEKM-MD: An Intelligent Platform for Information Extraction and Knowledge Mining in Multi-Domains</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yu Li</string-name>
          <email>yul@mail.las.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tao Yue</string-name>
          <email>taoyue@mail.las.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wu Zhenxin</string-name>
          <email>wuzx@mail.las.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Science Library, Chinese Academy of Sciences</institution>
          ,
          <addr-line>Beijing</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <fpage>73</fpage>
      <lpage>78</lpage>
      <abstract>
        <p>The terminologies of different disciplines vary greatly and annotated corpora are scarce, which limits the portability of information extraction models; as a result, the content of scientific articles remains underutilized. This paper constructs an intelligent platform for information extraction and knowledge mining, namely IEKM-MD. Two innovative technologies are proposed. First, a phrase-level scientific entity extraction model combining a neural network with active learning is designed, which reduces the model's dependence on a large-scale corpus. Second, a translation-based relation prediction model is provided, which improves the relation embeddings by optimizing the loss function. In addition, the platform integrates an advanced entity recognition model (spaCy.NER) and a keyword extraction model (RAKE). It provides abundant services for fine-grained and multi-dimensional knowledge, including problem discovery, method recognition, relation representation and hotspot detection. We carried out experiments in three different domains: Artificial Intelligence, Nanotechnology and Genetic Engineering. The average accuracies of scientific entity extraction are 0.91, 0.52 and 0.76, respectively.</p>
      </abstract>
      <kwd-group>
        <kwd>Information extraction</kwd>
        <kwd>Relation prediction</kwd>
        <kwd>Active learning</kwd>
        <kwd>Translation embedding</kwd>
        <kwd>Neural network</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Computing methodologies • Artificial intelligence • Natural language processing • Information extraction</p>
    </sec>
    <sec id="sec-1a">
      <title>1 Introduction</title>
      <p>
        With the progress of science and technology, the number of
fields and of scientific articles keeps growing. Information
extraction and knowledge mining in a specific field enable
scholars to quickly grasp the overall outline of its information
and to track the development of fine-grained knowledge. There are
many mature models for extracting information from texts, such as
BiLSTM-CNN [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], CNN-BiLSTM-CRF [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and LM-LSTM-CRF [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which have
achieved high scores on various natural language processing tasks.
However, these supervised learning models inevitably consume
large amounts of high-quality annotated corpus in order to fully
learn the characteristics of natural language representation. In
most cases the annotated corpus in a specific field is constructed
manually by a few experts, which is time-consuming and laborious.
Therefore, it is hard to transfer a well-trained model directly to
other domains.
      </p>
      <p>
        How to extract information without massive annotated corpus
is a big challenge. Active Learning (AL) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] has been proved to be
an effective way to solve the problem of corpus scarcity when
dealing with the classification tasks [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]. However, it has not
been validated on the sequence labelling task, where it is more
difficult to find the optimal result because the complexity increases
exponentially [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In this paper, we introduce multiple active
learning strategies into information extraction for the first time, so
as to explore a cheap and efficient solution for recognizing
fine-grained entities in multiple domains.
      </p>
      <p>
        Relation prediction is another basic technology for
knowledge organization. Translation models see a relation as a
process of translating the head entity to the tail entity, and they have
been widely used to predict relations. Several classic
translation models have been proposed from different perspectives: TransE
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is the first translation embedding model and has few
parameters. TransH [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] was presented to solve the problem of
complex relation representation. TransR [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] distinguishes the
semantic embeddings of different types of relations, which won
a better F-score. TransD [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] simplifies the projection process of
TransR and improves the computing efficiency.
      </p>
      <p>This paper aims to construct an intelligent platform for
information extraction and knowledge mining that can be used
in multiple domains without much human intervention. The main
contributions are as follows: 1) with a limited annotated corpus,
an effective method combining a neural network with active
learning recognizes scientific entities in multiple domains; 2) by
optimizing the loss function, an improved translation model
represents the semantic vectors more accurately and reaches the
convergence state faster, with a smaller loss score, than the
original model.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Intelligent Platform: IEKM-MD</title>
      <p>The technology framework of our platform is shown in Figure 1.</p>
      <p>This platform includes two innovative technologies: 1) a model
combining a neural network with active learning extracts "problem"
and "method" entities; 2) an improved translation model predicts
relations between "problem" and "method" entities. At the same
time, the platform integrates two excellent tools (spaCy.NER,
https://spacy.io/, and RAKE, https://github.com/aneesha/RAKE) to
recognize named entities and keywords. Finally, the platform
provides a variety of knowledge services for researchers,
including problem discovery, method recognition, relation
representation and hotspot detection. Besides, analysts can
perform richer downstream tasks on top of the platform, such as
discipline analysis, trend detection, new technology detection,
and so on.</p>
      <p>Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>[Figure 1: the technology framework of IEKM-MD — a Portal layer; a Services layer (problem discovery, method recognition, relation representation, hotspot detection); a Functions layer (scientific entity extraction, relation prediction, named entity extraction, keyword extraction); a Databases layer (AI, GIS, Bio, …); and an Infrastructure layer: a platform for storing and computing big data.]</p>
      <sec id="sec-2-1">
        <title>2.1 Scientific Entity Extraction</title>
        <sec id="sec-2-1-1">
          <title>Model Framework</title>
          <p>Scientific entity recognition extracts phrases from
scientific articles. These phrases consist of several words that
describe the focus of an article or the method proposed by its
author. In order to reduce the dependence on an annotated corpus,
this paper provides a semi-supervised learning model combining a
neural network with active learning.</p>
          <p>The framework of the information extraction model is shown
in Figure 2. Firstly, the learning engine trains the parameters of
the neural network using a small number of annotated samples
(dozens of abstracts with semantic labels). The trained neural
network then predicts the labels of the unannotated samples and
inputs the predicted scores to the selecting engine. Secondly,
according to the active learning strategies, the selecting engine
decides which samples are valuable and should be annotated
manually. Only the top 10% most valuable samples are labelled
by experts. Thirdly, the manually annotated samples are added
to the training set to re-train the neural network and improve the
performance of label prediction. The whole process runs
repeatedly until the performance of the model shows no significant
improvement. Finally, the trained model predicts the "problems"
and "methods" for all the unlabeled articles. More details about
the parameter settings are discussed in Section 3.1.</p>
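The iterative process above can be sketched as a generic pool-based loop. This is a minimal sketch with hypothetical `train`, `predict`, `select` and `annotate` callables standing in for the learning and selecting engines described in the text:

```python
# Sketch of the iterative annotation loop: train -> predict -> select ->
# annotate, repeated until the F1 score shows no significant improvement.
# The callables are placeholders for the CNN-BiLSTM-CRF learning engine
# and the hybrid selecting engine.
def active_learning_loop(labeled, unlabeled, train, predict, select,
                         annotate, max_rounds=50, min_gain=0.005):
    """Grow `labeled` from `unlabeled` until F1 stops improving."""
    best_f1 = 0.0
    for _ in range(max_rounds):
        model, f1 = train(labeled)            # train on current annotations
        if f1 - best_f1 < min_gain:           # no significant optimization
            break
        best_f1 = f1
        scores = predict(model, unlabeled)    # per-sample predicted scores
        picked = select(scores, ratio=0.10)   # top 10% most valuable samples
        for idx in sorted(picked, reverse=True):
            labeled.append(annotate(unlabeled.pop(idx)))
    return best_f1
```

The loop stops either when the improvement falls below `min_gain` (the "no significant optimization" condition) or when the round budget is exhausted.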
          <p>[Figure 2: framework of the information extraction model. The learning engine (CNN character encoding, Bi-LSTM word encoding, feature jointing, CRF decoding) predicts BIO labels and scores for the unlabeled samples; the selecting engine ranks samples by the value scores of four strategies (Margin, NSE, MNLP, LWP) and sends the selected samples for expert annotation; the loop stops when the loss score falls below a threshold or all samples are labelled.]</p>
          <p>
            Here we choose CNN-BiLSTM-CRF [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ] as the learning
engine. The CNN focuses on morphological features, namely the
prefix and suffix of a word. The BiLSTM learns long-distance
dependency relationships between words by using two groups of
long short-term memory networks in opposite directions. The CRF
decides the optimal labeling sequence under a rational
linguistic logic.
          </p>
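The CRF decoding step can be illustrated with a tiny Viterbi search over BIO labels. The scores below are illustrative hand-set numbers, not the trained model's parameters:

```python
# Viterbi decoding sketch: find the highest-scoring label sequence given
# per-token emission scores and label-transition scores (as a CRF layer does).
def viterbi_decode(emissions, transitions, labels):
    """emissions: one {label: score} dict per token;
    transitions: {(prev_label, cur_label): score}, missing pairs score 0."""
    # best[l] = (score of the best path ending in l, that path)
    best = {l: (emissions[0][l], [l]) for l in labels}
    for em in emissions[1:]:
        new_best = {}
        for cur in labels:
            # pick the predecessor maximizing path score + transition score
            prev = max(labels,
                       key=lambda p: best[p][0] + transitions.get((p, cur), 0.0))
            score = best[prev][0] + transitions.get((prev, cur), 0.0) + em[cur]
            new_best[cur] = (score, best[prev][1] + [cur])
        best = new_best
    return max(best.values(), key=lambda s: s[0])[1]
```

A transition score such as a strong penalty on (O, I) is how the CRF enforces "rational linguistic logic": an I- label cannot follow O even when its emission score is high.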
          <p>In addition, we propose a hybrid approach for the selecting
engine. Firstly, the value score of each unlabeled sample is
computed separately by four different types of active learning
strategies, and their sum is taken as the final value score.
Secondly, the value scores are listed in descending order, and only
the top 10% most valuable samples are selected for manual
annotation in each iteration.</p>
          <p>
            This paper picked out three classical strategies from the
uncertain sampling methods: margin [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ], N-best sequence
entropy [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ] and maximum normalized log-probability [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ].
Additionally, we propose a novel strategy, namely label-weighted
probability, which emphasizes the importance of the number of
labels: the more problem or method labels a sentence contains,
the more valuable the sentence is.
          </p>
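The summed value score can be sketched on toy per-token label probabilities. This is an illustrative reading of the strategies, not the paper's exact formulas; N-best sequence entropy is omitted because it needs an N-best decode list, and the `lwp_score` follows the stated idea of weighting by the number of entity labels:

```python
import math

def margin_score(token_probs):
    # uncertainty sampling: a small gap between the two most probable
    # labels means the model is unsure, so the sentence is valuable
    gaps = [sorted(p.values(), reverse=True) for p in token_probs]
    return sum(1.0 - (g[0] - g[1]) for g in gaps) / len(token_probs)

def mnlp_score(token_probs):
    # maximum normalized log-probability: low best-label confidence,
    # normalized by sentence length, means high value
    return -sum(math.log(max(p.values())) for p in token_probs) / len(token_probs)

def lwp_score(token_probs):
    # label-weighted probability: more predicted entity (non-O) labels
    # make the sentence more valuable
    n_entity = sum(1 for p in token_probs if max(p, key=p.get) != "O")
    return n_entity / len(token_probs)

def value_score(token_probs):
    # the selecting engine sums the individual strategy scores
    return margin_score(token_probs) + mnlp_score(token_probs) + lwp_score(token_probs)

def select_top(samples, ratio=0.10):
    # rank sentences by summed value score and keep the top fraction
    k = max(1, int(len(samples) * ratio))
    order = sorted(range(len(samples)), key=lambda i: -value_score(samples[i]))
    return order[:k]
```

Each sentence is a list of `{label: probability}` dicts produced by the learning engine; `select_top` implements the "top 10% in descending order" rule.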
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2.2 Entity Relation Prediction</title>
      <p>Relation prediction decides whether a "problem" and a "method"
are related or not: if a "problem" is related to a "method", the
method can be used to solve the problem.</p>
      <p>
        A translation model sees the relation in a triple (head entity,
relation, tail entity) as a translation between the two entities.
There is a series of translation models. TransE [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] has few parameters
and is low in complexity, but cannot distinguish two tail entities
with the same relation. TransH [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] uses different vectors to
represent one entity under various relations, which solves the
problem of complex relation representation (1-N, N-1, N-N).
TransR [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] supposes that different relations lie in different
semantic spaces; this model first projects entities into their
relation spaces and then builds the translation process. However,
it greatly increases the time cost because of its many parameters.
TransD [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] creates separate projection matrices for the head
entity and the tail entity. It not only combines the effects of
both entities and relations on the projection, but also improves
the computing efficiency.
      </p>
      <p>After comparing the performance of various translation models,
we choose TransH to predict relations, which keeps a balance
between accuracy and efficiency. To solve the problems of
one-to-many, many-to-one and many-to-many relations, TransH
generates the relation-specific translation vector d<sub>r</sub> in the
relation-specific hyperplane w<sub>r</sub> rather than in the same space as
the entity embeddings.</p>
      <p>As shown in Figure 3, the relation r has a translation vector
d<sub>r</sub> in its hyperplane w<sub>r</sub>; the head embedding h and the tail
embedding t have projection vectors h<sub>⊥</sub> and t<sub>⊥</sub> in w<sub>r</sub>. The
defined score function is ||h<sub>⊥</sub> + d<sub>r</sub> − t<sub>⊥</sub>||<sub>2</sub><sup>2</sup>.</p>
      <p>However, the original TransH model does not exactly match our
goal, so we make three improvements.</p>
      <p>1) TransH constructs negative samples by replacing the head or
tail entity of a positive sample with another entity. The
replacement, however, may still be correct because of synonyms,
which introduces many false-negative labels into training.
Considering that there are only two types of relationships, we
simply construct negative samples by flipping the correct relation
to its antonym. This change also makes it more convenient to
construct a balanced annotated corpus. Moreover, the score
function f<sub>r</sub>(h, t) is redefined as Equation (1), which shifts the
attention from the entities to the relation.</p>
      <p>f<sub>r</sub>(h, t) = ||(h<sub>⊥</sub> − t<sub>⊥</sub>) − d<sub>r</sub>||<sub>2</sub><sup>2</sup>  (1)</p>
      <p>2) Whereas the original model initializes the entities with
random vectors, we use a word2vec model to generate the semantic
representations of all head and tail entities.</p>
      <p>3) To improve the ability of feature learning for unknown
entities, we add one hidden layer of linear transformation for the
head entities and one for the tail entities.</p>
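Under the definitions above, the hyperplane projection, the original TransH score and the redefined Equation (1) can be sketched with plain vectors. Toy 3-d numbers stand in for the learned embeddings, and w is assumed to be a unit normal:

```python
# Sketch of TransH scoring on a hyperplane with unit normal w.
def project(v, w):
    # project v onto the hyperplane: v_perp = v - (w . v) w
    dot = sum(a * b for a, b in zip(w, v))
    return [a - dot * b for a, b in zip(v, w)]

def transh_score(h, t, d, w):
    # original score: ||h_perp + d_r - t_perp||^2
    h_p, t_p = project(h, w), project(t, w)
    return sum((hp + dr - tp) ** 2 for hp, tp, dr in zip(h_p, t_p, d))

def redefined_score(h, t, d, w):
    # Equation (1): ||(h_perp - t_perp) - d_r||^2, attention on the relation
    h_p, t_p = project(h, w), project(t, w)
    return sum((hp - tp - dr) ** 2 for hp, tp, dr in zip(h_p, t_p, d))
```

A low score means the triple fits the relation; note the two functions differ only in the sign of d<sub>r</sub> relative to the projected difference.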
    </sec>
    <sec id="sec-4">
      <title>2.3 Named Entity Recognition and Keyword Extraction</title>
      <p>We use the enterprise-grade open-source toolkit spaCy.NER to
recognize named entities. spaCy.NER implements a very fast and
efficient system based on statistical machine learning algorithms,
which can recognize 18 entity types, such as Person, Organization,
Location and Geopolitical Entity.</p>
      <p>
        Furthermore, keyword extraction is achieved with the open-source
toolkit RAKE (Rapid Automatic Keyword Extraction). RAKE is an
automatic keyword extraction technique based on statistical
methods; it outperformed TextRank and other supervised learning
models, obtaining a high F value [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]
with better efficiency.
      </p>
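The core of RAKE can be sketched in a few lines: split the text into candidate phrases at stopwords and punctuation, then score each phrase by the summed degree/frequency ratio of its words. This is a simplified sketch with a tiny illustrative stopword list; the real toolkit ships a full one:

```python
import re

STOPWORDS = {"a", "an", "the", "of", "for", "and", "is", "in", "we", "to"}

def rake(text):
    """Return candidate keyword phrases, best-scored first (simplified RAKE)."""
    words = re.findall(r"[a-zA-Z]+", text.lower())
    # candidate phrases are maximal runs of non-stopwords
    phrases, cur = [], []
    for w in words:
        if w in STOPWORDS:
            if cur:
                phrases.append(cur)
            cur = []
        else:
            cur.append(w)
    if cur:
        phrases.append(cur)
    # word frequency and co-occurrence degree within candidate phrases
    freq, degree = {}, {}
    for ph in phrases:
        for w in ph:
            freq[w] = freq.get(w, 0) + 1
            degree[w] = degree.get(w, 0) + len(ph)
    # phrase score = sum of member-word degree/frequency ratios
    scores = {" ".join(ph): sum(degree[w] / freq[w] for w in ph)
              for ph in map(tuple, phrases)}
    return sorted(scores, key=scores.get, reverse=True)
```

The degree/frequency ratio favors words that appear inside longer multi-word phrases, which is why RAKE tends to surface technical terms rather than isolated frequent words.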
    </sec>
    <sec id="sec-6">
      <title>3 Platform Evaluation and Display</title>
      <p>We evaluate the information extraction performance of IEKM-MD
in the field of Artificial Intelligence (AI). Two datasets are
used.</p>
      <p>1) The top 100 AI conferences were picked out by domain
experts, and their abstracts were acquired from the NSTL database
(https://www.las.ac.cn), 9,753 sentences in total. Next, we built
the ground-truth datasets. Each sentence was annotated
synchronously by two students in the corresponding subjects (task,
method or other), and the annotation results were checked by one
expert. The annotation format is shown in Figure 4. The AI
annotated corpus contains 260,000 tokens.</p>
      <p>[Figure 4: annotation format — "we/O use/O active/B-method learning/I-method to/O extract/B-task information/I-task".]</p>
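The BIO-labelled tokens of the annotation format can be turned back into typed entity spans, the form consumed by the relation prediction step. A minimal sketch:

```python
# Convert parallel token/BIO-label lists into (type, phrase) entity spans.
def bio_to_spans(tokens, labels):
    spans, cur_type, cur_tokens = [], None, []
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):            # a new entity begins
            if cur_type:
                spans.append((cur_type, " ".join(cur_tokens)))
            cur_type, cur_tokens = lab[2:], [tok]
        elif lab.startswith("I-") and cur_type == lab[2:]:
            cur_tokens.append(tok)          # the current entity continues
        else:                               # O label or inconsistent I- label
            if cur_type:
                spans.append((cur_type, " ".join(cur_tokens)))
            cur_type, cur_tokens = None, []
    if cur_type:                            # flush a span that ends the sentence
        spans.append((cur_type, " ".join(cur_tokens)))
    return spans
```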
      <p>2) The FTD dataset
(https://nlp.stanford.edu/pubs/FTDDataset_v1.txt) shared by
Stanford University in the field of Computational Linguistics. It
comes from the Conference of the Association for Computational
Linguistics, ranges from 1965 to 2009, and contains four types of
labels: focus, technique, domain and other, 2,628 sentences in
total.</p>
      <p>In addition, we show the effect of knowledge mining in three
different domains. We choose three popular keywords (Neural
Networks, Nano Structure and Genetic Engineering), which
respectively represent the subjects of Computer Science, Materials
Science and Medicine, to acquire abstracts from the NSTL database.
200 abstracts of each subject were randomly selected from SCI
journals and used to verify the practical application effect of
IEKM-MD.</p>
    </sec>
    <sec id="sec-7">
      <title>3.1 Scientific Entity Recognition</title>
      <p>We set the baselines using only the CNN-BiLSTM-CRF (CBC)
model trained on all annotated samples. For each dataset (AI or
FTD), the best performance is used as the baseline, so as to detect
whether active learning helps reduce the scale of the annotated
corpus required by supervised learning models. The scales of the
training sets and the best F1 scores of the CBC model are shown in
Table 1.</p>
      <p>In the IEKM-MD model, initially only 0.01% of the annotated
samples are used for the cold-start process; then the most
valuable samples (the top 10%) are added to the training set in
each iteration. The learning process stops only when the F1 score
of IEKM-MD reaches the baseline. The label scales and F1 scores on
the AI and FTD datasets in each iteration are shown in Table 2.</p>
      <p>The results show that Neural Networks achieved the best
performance, with 0.93 accuracy for problem extraction and 0.89
accuracy for method extraction. The average accuracy over the
three fields reveals that problem extraction scores better than
method extraction. The first reason is that problems have fewer
total mentions than methods and are usually described in noun
phrases, which makes their pattern easier for the model to catch.
The second reason is that one article may contain multiple
methods, modified by multiple attributives or adverbials, making
it more challenging to recognize the complete methods.</p>
      <p>However, our platform performed worst in the field of Nano
Structure. This may be because articles on Nano Structure include
many complex and specialized terms from the subjects of biology,
physics, chemistry, electronics and metrology, and our platform
still lacks the professional knowledge to learn these specific
features.</p>
      <p>The top 10 extracted problems of the three fields are shown in
Table 4. They reveal that Neural Networks focuses on the
classification, prediction and recognition problems of data and
images within Computer Science. Nano Structure covers a wide
range, including physics, biology and chemistry, and focuses on
applications in these basic disciplines; therefore, the extracted
problems involve the detection, analysis and prediction of energy,
atoms and medicines. The scope of Genetic Engineering is
relatively narrow and relates to drug development, disease
treatment and biological manufacturing in the biomedical field.</p>
      <p>[Table 4 (fragment): top extracted problems per field (Neural Network, Nano Structure, Genetic Engineering); the recoverable entries include classification, prediction, pattern recognition and optimization.]</p>
    </sec>
    <sec id="sec-8">
      <title>3.2 Entity Relation Prediction</title>
      <p>By predicting the relations between problem and method, we
construct the method-problem networks for different domains. As
shown in Figure 5, the methods and problems which were
separate in the articles of Neural Network are linked by relation
prediction. The red dots refer to methods, and the blue dots refer
to problems.</p>
      <p>Specifically, we can get more details from the above-mentioned
network. Setting the method X-Ray Diffraction (XRD) as the center,
Figure 6 reveals which problems are solved by XRD: Assisted
Synthesis, Biomedical Application, Biosynthesis of Silver
Nanoparticles and so on.</p>
      <p>Hotspots are the most popular research topics. We use the
extracted keywords to pick out the hotspots in multiple domains.
For a hotspot, the total number of occurrences in articles should
increase year by year or keep a steady top rank over the last
three years. According to this rule, Figure 7 shows the hotspots
in the field of Neural Networks. They are distinct from the
scientific entities recognized in Section 3.1: they have no
semantic type but reflect the popularity of terms.</p>
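The hotspot rule above can be sketched directly: a keyword qualifies if its yearly counts strictly increase, or if it stays in the top-k by count for each of the last three years. The counts and the cutoff `k` below are toy values, not the platform's actual configuration:

```python
# Sketch of the hotspot rule: year-over-year growth, or a steady top-k
# rank over the last `last` years.
def is_hotspot(keyword, counts_by_year, k=10, last=3):
    """counts_by_year: {keyword: {year: count}} over a common year range."""
    years = sorted(counts_by_year[keyword])
    series = [counts_by_year[keyword][y] for y in years]
    increasing = all(a < b for a, b in zip(series, series[1:]))

    def rank(year):
        # rank of this keyword's count among all keywords in that year
        totals = sorted((c[year] for c in counts_by_year.values()), reverse=True)
        return totals.index(counts_by_year[keyword][year]) + 1

    steady_top = all(rank(y) <= k for y in years[-last:])
    return increasing or steady_top
```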
    </sec>
    <sec id="sec-9">
      <title>4 Conclusion</title>
      <p>This paper introduced an innovative and intelligent platform,
IEKM-MD, to extract information and mine knowledge from
scientific articles in multiple domains. One contribution is a
hybrid active learning strategy that alleviates the scarcity of
annotated corpora for supervised learning models. Another
contribution is an improved translation embedding approach based
on the TransH model that optimizes the performance of relation
prediction. Three datasets in Neural Networks, Nano Structure and
Genetic Engineering show that our platform is able to deliver
various knowledge services with high accuracy in multiple
domains.</p>
    </sec>
    <sec id="sec-10">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work is supported by the project "Annotation and
evaluation of the semantic relationship between geographical
entities in Chinese web texts" (Grant No. 41801320) of the Youth
Science Foundation of the National Natural Science Foundation of
China.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Chiu</given-names>
            <surname>Jason</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Nichols</given-names>
            <surname>Eric</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Named entity recognition with bidirectional LSTM-CNNs. Transactions of the Association for Computational Linguistics 6 (Nov</article-title>
          .
          <year>2015</year>
          ). DOI: https://doi.org/10.1162/tacl_a_
          <fpage>00104</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Ma</given-names>
            <surname>Xuezhe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Eduard</given-names>
            <surname>Hovy</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>End-to-end sequence labeling via bidirectional LSTM-CNNs-CRF</article-title>
          . arXiv:
          <volume>1603</volume>
          .01354. Retrieved from https://arxiv.org/abs/1603.01354.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Liyuan</given-names>
            <surname>Liu</surname>
          </string-name>
          , Jingbo Shang, Frank F. Xu, Xiang Ren, Huan Gui, Jian Peng, Jiawei Han.
          <year>2017</year>
          .
          <article-title>Empower sequence labeling with task-aware neural language model</article-title>
          .
          <source>arXiv:1709</source>
          .04109. Retrieved from https://arxiv.org/abs/1709.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Kulkarni</surname>
          </string-name>
          , Sanjeev and Mitter,
          <source>Sanjoy and Tsitsiklis, John and Systems</source>
          , Massachusetts.
          <year>1993</year>
          .
          <article-title>Active Learning Using Arbitrary Binary Valued Queries</article-title>
          .
          <source>Machine Learning 11, 1 (Apr</source>
          .
          <year>1993</year>
          ),
          <fpage>23</fpage>
          -
          <lpage>35</lpage>
          . DOI: https://doi.org/11. 10.1023/A:
          <fpage>1022627018023</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Vijayanarasimhan</given-names>
            <surname>Sudheendra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Grauman</given-names>
            <surname>Kristen</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Active frame selection for label propagation in videos</article-title>
          .
          <source>In Proceedings of the 12th. European Conference on Computer Vision (ECCV'12)</source>
          , Florence, Italy. Springer-Verlag. Heidelberg, Berlin,
          <fpage>496</fpage>
          -
          <lpage>509</lpage>
          . https://doi.org/10.1007/978-3-
          <fpage>642</fpage>
          -33715-4_
          <fpage>36</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Deng</given-names>
            <surname>Yue</surname>
          </string-name>
          , Dai Qionghai, Liu Risheng, Zhang Zengke,
          <string-name>
            <given-names>Hu</given-names>
            <surname>Sanqing</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Lowrank structure learning via non-convex heuristic recovery</article-title>
          .
          <source>IEEE Transactions on Neural Networks and Learning Systems</source>
          ,
          <volume>24</volume>
          (
          <issue>3</issue>
          ):
          <fpage>383</fpage>
          -
          <lpage>396</lpage>
          . DOI: https://doi.org/10.1109/TNNLS.
          <year>2012</year>
          .
          <volume>2235082</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Deng</given-names>
            <surname>Yue</surname>
          </string-name>
          , Chen Kawai, Shen Yilin,
          <string-name>
            <given-names>Jin</given-names>
            <surname>Hongxia</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Adversarial active learning for sequences labeling and generation</article-title>
          .
          <source>In Proceedings of the 27th International Joint Conference on Artificial Intelligence, July</source>
          ,
          <year>2018</year>
          , Stockholm, Sweden. IJCAI-18. California,
          <volume>4012</volume>
          -
          <fpage>4018</fpage>
          . https://doi.org/10.24963/ijcai.
          <year>2018</year>
          /558.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Bordes</given-names>
            <surname>Antonie</surname>
          </string-name>
          , Nicolas Usunier, Alberto Garcia-Duran, Jason Weston,
          <string-name>
            <given-names>Oksana</given-names>
            <surname>Yakhnenko</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Translating embeddings for modeling multi-relational data</article-title>
          .
          <source>In Proceedings of NIPS</source>
          . MIT Press. Cambridge, MA,
          <fpage>2787</fpage>
          -
          <lpage>2795</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Zhen</given-names>
            <surname>Wang</surname>
          </string-name>
          , Jianwen Zhang, Jianlin Feng, Zheng Chen.
          <year>2014</year>
          .
          <article-title>Knowledge graph embedding by translating on hyperplanes</article-title>
          .
          <source>In Proceedings of the 28th. AAAI Conference on Artificial Intelligence (AAAI'14)</source>
          , June,
          <year>2014</year>
          . AAAI Press. Menlo Park, CA,
          <fpage>1112</fpage>
          -
          <lpage>1119</lpage>
          . https://doi.org/10.5555/2893873.2894046.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>He</surname>
            <given-names>Shizhu</given-names>
          </string-name>
          , Liu Kang,
          <string-name>
            <surname>Ji</surname>
            <given-names>Guoliang</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Zhao</given-names>
            <surname>Jun</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Learning to represent knowledge graphs with Gaussian embedding</article-title>
          .
          <source>In Proceedings of CIKM. ACM</source>
          . New York,
          <fpage>623</fpage>
          -
          <lpage>632</lpage>
          . https://doi.org/10.1145/2806416.2806502.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Ji</surname>
            <given-names>Guoliang</given-names>
          </string-name>
          , He Shizhu, Xu Liheng, Liu Kang,
          <string-name>
            <given-names>Zhao</given-names>
            <surname>Jun</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Knowledge graph embedding via dynamic mapping matrix</article-title>
          .
          <source>In Proceedings of ACL</source>
          . ACL, Stroudsburg, PA,
          <fpage>687</fpage>
          -
          <lpage>696</lpage>
          . https://doi.org/10.3115/v1/P15-1067.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Ma</surname>
            <given-names>Xuezhe</given-names>
          </string-name>
          , Eduard Hovy.
          <year>2016</year>
          .
          <article-title>End-to-end sequence labeling via bidirectional LSTM-CNNs-CRF</article-title>
          .
          <source>arXiv:1603.01354</source>
          . Retrieved from https://arxiv.org/abs/1603.01354.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Shen</surname>
            <given-names>Yanyao</given-names>
          </string-name>
          , Hyokun Yun, Zachary C. Lipton, Yakov Kronrod,
          <string-name>
            <surname>Anandkumar</surname>
            <given-names>Animashree</given-names>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Deep active learning for named entity recognition</article-title>
          .
          <source>arXiv:1707.05928</source>
          . Retrieved from https://arxiv.org/abs/1707.05928.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Kim</surname>
            <given-names>Seokhwan</given-names>
          </string-name>
          , Yu Song, Kyungduk Kim,
          <string-name>
            <surname>Cha</surname>
            <given-names>Jeong-Won</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>Gary Geunbae</given-names>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>MMR-based active machine learning for bio entities</article-title>
          .
          <source>In Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers</source>
          , June
          <year>2006</year>
          , New York, New York, USA,
          <fpage>69</fpage>
          -
          <lpage>72</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Balcan</surname>
            <given-names>Maria-Florina</given-names>
          </string-name>
          , Broder Andrei, Zhang Tong.
          <year>2007</year>
          .
          <article-title>Margin based active learning</article-title>
          .
          <source>In Proceedings of the 20th Annual Conference on Learning Theory (COLT'07)</source>
          ,
          <year>2007</year>
          , San Diego, CA, USA. Springer-Verlag, Berlin, Heidelberg,
          <fpage>35</fpage>
          -
          <lpage>50</lpage>
          . https://doi.org/10.5555/1768841.1768848.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Rose</surname>
            <given-names>Stuart</given-names>
          </string-name>
          , Dave Engel, Nick Cramer,
          <string-name>
            <given-names>Wendy</given-names>
            <surname>Cowley</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Automatic keyword extraction from individual documents</article-title>
          .
          <source>Text Mining: Applications and Theory</source>
          <volume>20</volume>
          ,
          <issue>1</issue>
          (Mar.
          <year>2010</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          . https://doi.org/10.1002/9780470689646.ch1.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>