<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>What is Special about Patent Information Extraction?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Liang Chen†</string-name>
          <email>25565853@qq.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zheng Wang</string-name>
          <email>wangz@istic.ac.cn</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shuo Xu</string-name>
          <email>xushuo@bjut.edu.cn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chao Wei</string-name>
          <email>weichaolx@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Weijiao Shang</string-name>
          <email>shangwj490@163.com</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Haiyun Xu</string-name>
          <email>xuhy@clas.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chengdu Library and Information Center, Chinese Academy of Sciences</institution>
          ,
          <addr-line>Beijing</addr-line>
          ,
          <country>China P.R.</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>College of Economics and Management</institution>
          ,
          <institution>Beijing University of Technology</institution>
          ,
          <addr-line>Beijing</addr-line>
          ,
          <country>China P.R.</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Institute of Scientific and Technical Information of China</institution>
          ,
          <addr-line>Beijing</addr-line>
          ,
          <country>China P.R.</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Research Institute of Forestry Policy and Information, Chinese Academy of Forestry</institution>
          ,
          <addr-line>Beijing</addr-line>
          ,
          <country>China P.R.</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <fpage>63</fpage>
      <lpage>72</lpage>
      <abstract>
        <p>Information extraction is the fundamental technique for text-based patent analysis in the era of big data. However, the specialty of patent text causes the performance of general information-extraction methods to degrade noticeably. To solve this problem, an in-depth exploration has to be conducted to clarify what is particular about patent information extraction, and thus to point out directions for further research. In this paper, we discuss the particularity of patent information extraction in three aspects: (1) What is special about labeled patent datasets? (2) What is special about word embeddings in patent information extraction? (3) What kind of method is more suitable for patent information extraction?</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>CCSInformation systemsInformation
tasks and goalsInformation extraction
retrievalRetrieval</p>
      <sec id="sec-1-1">
        <title>1 Introduction</title>
        <p>Keywords: patent information extraction, deep learning, word embedding.</p>
        <p>
          As an important source of technical intelligence, patents cover
more than 90% of the world's latest technical information, of which
80% is never published in any other form [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. There are two
traditional ways to obtain technical intelligence, either by
analyzing structured data with bibliometric methods or by experts
reading patent texts. However, with the rapid growth of patent
documents, the second way is facing more and more challenges.
        </p>
        <p>Information extraction is an important technology for
machine understanding of text; it aims to extract structured data
from free text and thereby resolve the ambiguity inherent in free
text. In recent years, with the tremendous advances in machine
learning, especially the rise of deep learning, research in
information extraction has made great progress. However, the
particularity of patent text causes the performance of general
information-extraction tools to degrade greatly. Therefore, it is
necessary to explore patent information extraction in depth and
provide ideas for subsequent research.</p>
        <p>Information extraction is a big topic, but it is far from being
well explored in the IP (Intellectual Property) field. To the best of our
knowledge, there are only three labeled datasets publicly available
in the literature that contain annotations of named entities and
semantic relations. Therefore, we choose NER (Named Entity
Recognition) and RE (Relation Extraction) for discussion in the three
aspects below.</p>
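        <p>To make the two sub-tasks concrete, a minimal sketch follows; the sentence, the BIO tags and the relation label are toy illustrations, not drawn from TFH-2020 or any schema in this paper:</p>
        <preformat>
```python
# Illustrative sketch: NER as BIO sequence labeling, RE as triple extraction.
# The sentence and all labels are toy examples, not taken from any dataset.
tokens = ["The", "magnetic", "head", "is", "mounted", "on", "the", "slider"]

# NER output: one BIO tag per token ("B-COMP"/"I-COMP" are hypothetical tags
# for a "component" entity type).
bio_tags = ["O", "B-COMP", "I-COMP", "O", "O", "O", "O", "B-COMP"]

def decode_entities(tokens, tags):
    """Collect (start, end, text) spans from BIO tags."""
    entities, start = [], None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:
                entities.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag == "O" and start is not None:
            entities.append((start, i, " ".join(tokens[start:i])))
            start = None
    if start is not None:
        entities.append((start, len(tokens), " ".join(tokens[start:])))
    return entities

entities = decode_entities(tokens, bio_tags)
# RE output: a relation label for an entity pair ("spatial-relation" is a
# hypothetical label).
relations = [(entities[0][2], "spatial-relation", entities[1][2])]

print(entities)   # [(1, 3, 'magnetic head'), (7, 8, 'slider')]
print(relations)  # [('magnetic head', 'spatial-relation', 'slider')]
```
        </preformat>
        <p>NER thus produces typed spans over the token sequence, and RE assigns a relation label to pairs of those spans; the rest of the paper examines how patent text affects both steps.</p>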
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2 What is special about labeled patent dataset?</title>
      <p>
        Since supervised learning methods represent the state of the art
in information extraction, it is necessary to clarify the
particularity of labeled patent datasets for further improvement of
information extraction in IP. To this end, a comparative analysis is
conducted on seven labeled datasets of three
categories: (1) news corpora consisting of Conll-2003 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and
NYT-2010 (New York Times corpus) [3], (2) encyclopedia
corpus consisting of Wikigold [
        <xref ref-type="bibr" rid="ref3">4</xref>
        ] and LIC-2019 (the annotated
dataset of the 2019 Language and Intelligence Challenge) [
        <xref ref-type="bibr" rid="ref4">5</xref>
        ], (3)
patent corpus consisting of CPC-2014 (Chemical Patent Corpus)
[
        <xref ref-type="bibr" rid="ref5">6</xref>
        ], CGP-2017 (The CEMP and GPRO Patents Tracks) [
        <xref ref-type="bibr" rid="ref6">7</xref>
        ],
TFH-2020 (Thin Film Head annotated dataset) [
        <xref ref-type="bibr" rid="ref7">8</xref>
        ].
      </p>
      <p>Each labeled dataset contains two parts: (1) an
information schema that defines the label types, and (2) a set of
labeled texts. Taking TFH-2020 as an example, the schemas
of named entities and semantic relations are shown in Tables 1 and
2 of the Appendix, and a labeled text is shown in Fig. 1.</p>
      <p>In order to analyze these datasets, 8 indicators are proposed,
as shown in Table 3 of the Appendix. It is worth noting that (1) in
CGP-2017, Conll-2003 and Wikigold, only entities are annotated,
not semantic relations, and (2) all datasets are in English except
LIC-2019, which is in Chinese. As a consequence, some
indicators are not calculated for certain datasets. The final results are
shown in Table 4 of the Appendix. (Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).)</p>
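      <p>As an illustration, several of these indicators can be computed with a few lines of code; the corpus structure below is an assumed toy format, not the actual format of any of the seven datasets:</p>
      <preformat>
```python
# Sketch: computing some of the proposed dataset indicators on a toy labeled
# corpus. The corpus format (a list of sentences with entity and relation
# annotations) is an assumed structure for illustration only.
corpus = [
    {"tokens": ["The", "magnetic", "head", "reads", "the", "disk"],
     "entities": ["magnetic head", "disk"],
     "relations": [("magnetic head", "in-manner-of", "disk")]},
    {"tokens": ["The", "slider", "carries", "the", "magnetic", "head"],
     "entities": ["slider", "magnetic head"],
     "relations": [("slider", "part-of", "magnetic head")]},
]

n = len(corpus)
avg_sentence_length = sum(len(s["tokens"]) for s in corpus) / n
entities_per_sentence = sum(len(s["entities"]) for s in corpus) / n
all_entities = [e for s in corpus for e in s["entities"]]
words_per_entity = sum(len(e.split()) for e in all_entities) / len(all_entities)
# Entity repetition rate: mentions divided by distinct entities.
entity_repetition_rate = len(all_entities) / len(set(all_entities))
# Percentage of multi-word (ngram) entities.
ngram_pct = 100 * sum(len(e.split()) > 1 for e in all_entities) / len(all_entities)

print(avg_sentence_length, entities_per_sentence, words_per_entity)
print(entity_repetition_rate, ngram_pct)
```
      </preformat>
      <p>The remaining indicators (relation repetition rate, entity association rate) follow the same counting pattern over relation mentions and shared words.</p>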
      <p>From the statistics in Table 4, we can observe the following
facts:
(1) in terms of average sentence length, there is no clear
distinction between patent text and generic text;
(2) in terms of the count of entities per sentence, a single sentence
of patent text contains more entities than one of generic text;
(3) as to the remaining indicators, TFH-2020 shows clear distinctions from
the other patent datasets and all generic datasets.</p>
      <p>In summary, significant distinctions exist not only
between patent text and generic text, but also between patent texts
from different technical domains. In our opinion, the latter
distinctions are two-fold. Firstly, they come from the unique
characteristics of different technical domains: for example, plenty
of sequences and chemical structures are mentioned when describing
innovations in chemistry and biotechnology (Hunt et al., 2012),
while the most frequent entities in the field of hard disk drives
concern components, location and function (Chen et al., 2020);
since they describe inventions with different materials and mechanisms,
patents from different domains follow different writing styles.
Secondly, they come from the concerns of experts in different
domains: e.g., 17 types of entities are designed for the TFH-2020
dataset, whereas in the CGP-2017 dataset only 3 types of entities
(chemical, gene-n, gene-y) are considered and all other entity
types are out of scope.</p>
    </sec>
    <sec id="sec-3">
      <title>3 What is special about patent word embeddings?</title>
      <p>So far deep learning techniques have achieved state-of-the-art
performance in information extraction. As the foundation of deep
learning techniques in NLP, word embedding refers to a class of
techniques where words or phrases from the vocabulary are
mapped to vectors of real numbers.</p>
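      <p>The idea can be made concrete with a count-based sketch: build a word co-occurrence matrix and factorize it with truncated SVD. This is a simpler relative of predictive methods such as Skip-gram, shown only to illustrate how words are mapped to real-valued vectors:</p>
      <preformat>
```python
import numpy as np

# Minimal count-based word embedding: build a word-word co-occurrence matrix
# over a context window, then factorize it with truncated SVD. Skip-gram
# learns comparable dense vectors predictively instead of by counting.
corpus = [
    "the magnetic head reads the disk".split(),
    "the magnetic head writes the disk".split(),
    "the slider carries the head".split(),
]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

window = 2
C = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                C[idx[w], idx[sent[j]]] += 1.0

# Truncated SVD: keep the top-k components as the embedding dimensions.
k = 3
U, S, _ = np.linalg.svd(C, full_matrices=False)
embeddings = U[:, :k] * S[:k]   # each row is a k-dimensional word vector

print(embeddings[idx["head"]].shape)   # (3,)
```
      </preformat>
      <p>Real systems use far larger corpora and dimensions (100 in this paper), but the output object is the same: a dense vector per vocabulary word.</p>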
      <p>
        There are two ways to obtain word embeddings, (1) by
training on a corpus via word embedding algorithm, such as
Skip-gram [
        <xref ref-type="bibr" rid="ref8">9</xref>
        ] and the like; (2) by directly downloading a
pre-trained word embedding file from the Internet, like GloVe
[
        <xref ref-type="bibr" rid="ref9">10</xref>
        ]. Risch and Krestel [
        <xref ref-type="bibr" rid="ref10">11</xref>
        ] suggested obtaining word
embeddings by training specifically on patent documents in all
fields to improve the semantic representation of patent language.
In fact, such a suggestion is based on automatic classification of
patents in all fields, which is quite different from information
extraction from patents in a specific domain. In order to explore
which word embedding is preferable for patent information
extraction, four types of word embedding with the same
dimension of 100 are prepared as follows:
(1) Word embeddings of GloVe provided by Stanford NLP group.
According to the different training corpora, there are four release
versions of GloVe [
        <xref ref-type="bibr" rid="ref9">10</xref>
        ]. We choose the one trained on Wikipedia
2014 and Gigaword 5, as it provides word embeddings of 100
dimensions. The version trained on Twitter also has 100-dimension
word embeddings, but it is excluded because our training corpus
does not follow the patterns of short texts such as tweets;
(2) Word embeddings provided by Risch and Krestel [
        <xref ref-type="bibr" rid="ref10">11</xref>
        ], which
are trained on the full text of 5.4 million patents granted by the
USPTO from 1976 to 2016. Risch and Krestel released three
versions of word embeddings with 100/200/300 dimensions. The
100-dimension version is chosen and referred to as
USPTO-5M;
(3) Word embeddings trained on the full text (abstract, claims
and description) of the 1,010 patents mentioned in this paper;
these word embeddings are referred to as TFH-1010;
(4) Word embeddings trained on the abstracts of 46,302 patents
regarding the magnetic head in hard disk drives; these word
embeddings are referred to as MH-46K.
      </p>
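      <p>Whatever their training corpus, all four embedding sets reduce to a word-to-vector lookup once loaded. The sketch below parses the plain-text format used by GloVe releases (one token per line followed by its vector components) and compares words by cosine similarity; the in-memory toy file and its 3-dimensional vectors are made up for illustration:</p>
      <preformat>
```python
import io
import numpy as np

# Sketch: loading word vectors in the GloVe text format and comparing two
# words by cosine similarity. The embedded toy "file" and its 3-d vectors
# are invented for illustration; a real file would be 100-dimensional.
glove_text = io.StringIO(
    "head 0.1 0.3 0.5\n"
    "slider 0.2 0.2 0.6\n"
    "banana 0.9 -0.4 0.0\n"
)

def load_vectors(fh):
    vectors = {}
    for line in fh:
        parts = line.split()
        vectors[parts[0]] = np.array(parts[1:], dtype=float)
    return vectors

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

vecs = load_vectors(glove_text)
print(cosine(vecs["head"], vecs["slider"]))   # close to 1: similar vectors
print(cosine(vecs["head"], vecs["banana"]))   # near 0: dissimilar vectors
```
      </preformat>
      <p>Swapping one embedding set for another in the experiments amounts to replacing this lookup table while keeping the downstream models unchanged.</p>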
      <p>On the basis of these word embeddings, two deep-learning
models, BiLSTM-CRF and BiGRU-HAN, are used for entity
identification and semantic relation extraction, respectively.
Specifically, BiLSTM-CRF (Fig. 2) takes sentences as
input and represents every word as a word embedding; during
training, these word embeddings pass through the layers of
BiLSTM-CRF, which outputs the predicted named entities in the
sentence. The basic idea of BiGRU-HAN (Fig. 3) is to recognize
the occurrence patterns of different semantic relations with a
recurrent neural network, BiGRU, and then to leverage a
hierarchical attention mechanism, consisting of a word-level
attention layer and a sentence-level attention layer, to further
improve the model's prediction accuracy.</p>
      <p>Fig. 3 The structure of BiGRU-HAN model.</p>
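      <p>The CRF layer of BiLSTM-CRF can be sketched in isolation: given per-token emission scores (as the BiLSTM would produce) and tag-transition scores, Viterbi decoding returns the best-scoring tag sequence. All numbers below are arbitrary toy values, not trained parameters:</p>
      <preformat>
```python
import numpy as np

# Viterbi decoding as used by the CRF layer of BiLSTM-CRF: given emission
# scores per token and transition scores between tags, find the
# highest-scoring tag sequence. All scores here are arbitrary toy numbers.
tags = ["O", "B", "I"]                 # a minimal BIO-style tag set
emissions = np.array([                 # shape: (sequence_len, num_tags)
    [2.0, 1.0, 0.1],
    [0.5, 2.5, 0.2],
    [0.3, 0.4, 2.2],
])
transitions = np.array([               # transitions[i, j]: score of tag i to j
    [0.5, 0.5, -2.0],                  # O to I is discouraged
    [0.2, -0.5, 1.0],
    [0.2, 0.3, 0.8],
])

def viterbi(emissions, transitions):
    n, m = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((n, m), dtype=int)
    for t in range(1, n):
        # Score of reaching each tag j at step t from every tag i at t-1.
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [tags[i] for i in reversed(path)]

print(viterbi(emissions, transitions))   # ['O', 'B', 'I']
```
      </preformat>
      <p>The transition matrix is what lets the CRF rule out invalid tag sequences (such as I following O), which per-token classification alone cannot enforce.</p>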
      <p>
        From Table 5 and Table 6, we can see that the results produced
by the four types of word embedding are almost identical.
However, Risch and Krestel [
        <xref ref-type="bibr" rid="ref10">11</xref>
        ] observed a considerable
improvement when replacing Wikipedia word embeddings with
USPTO-5M word embeddings in a patent classification task. In our
opinion, the main reason lies in the huge difference between
automatic classification of patents in all fields and
information extraction from patents in a specific domain. In other
words, when one confronts a task in a specific domain,
word embeddings trained on a corpus from the same domain should
be preferred.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4 What is special about methods of patent information extraction?</title>
      <p>
        As far as supervised learning methods are concerned, there are
two main ways to perform information extraction, namely the
pipeline way and the joint way, shown in Fig. 4. The former extracts
the entities first and then identifies the relationships between them;
this separated strategy makes the information extraction task easy to
deal with, and each component can be more flexible. In contrast, the
latter uses a single model to extract entities and relations at one time.
Although Zheng et al. [
        <xref ref-type="bibr" rid="ref11">12</xref>
        ] claimed that the joint method is capable of
integrating the information of entities and relations, and can thus
improve NER and RE performance in a mutually reinforcing way,
in our opinion the biggest advantage it brings is the elimination of
entity pair generation, which produces a large number of
entity pairs with no relation type, as shown in Fig. 5.
      </p>
      <p>Fig. 4 Two patterns of information extraction: (a) the pipeline method, in which NER, entity pair generation and RE are applied to the patent dataset in sequence; (b) the joint method, in which a single model predicts the (subject, predicate, object) structure directly.</p>
      <p>The joint method thus seems to be a better solution for extracting
patent information, but what is the actual situation?</p>
      <p>
        To verify this, we prepare a pipeline baseline and a joint
baseline, namely BiLSTM-CRF [
        <xref ref-type="bibr" rid="ref12">13</xref>
        ] &amp; BiGRU-HAN [
        <xref ref-type="bibr" rid="ref13">14</xref>
        ] and
Hybrid Structure of Pointer and Tagging [
        <xref ref-type="bibr" rid="ref14">15</xref>
        ] (Fig. 6) for an
experiment on the TFH-2020 dataset. Since, after entity pair
generation, the proportion of no-relation pairs in TFH-2020 is much
larger than that of generic text, two sets of results are provided for the
pipeline baseline, one including and one excluding the no-relation
pairs; they are shown in the 1st and 2nd rows of Table 7, and the
result of the Hybrid Structure of Pointer and Tagging is shown in the
3rd row. In order to highlight the performance of the two baselines on
different types of relation, the precision, recall, and F1-value for
each type of relation are shown in Fig. 7 and Fig. 8, with each type of
relation denoted by its first 3 letters (cf. Table 2).
      </p>
      <p>Fig. 5 The procedure of information extraction in the pipeline way. The raw sentence "In one embodiment the offset portion of the first magnetic layer is disposed within a recess in the substrate" first passes through NER; entity pair generation then yields six candidate pairs with the following gold-standard relation types: (offset portion, first magnetic layer): part-of; (offset portion, recess): spatial relation; (offset portion, substrate): no relation; (first magnetic layer, recess): no relation; (first magnetic layer, substrate): no relation; (recess, substrate): attribution. The labeled pairs are then used for model training.</p>
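      <p>The entity pair generation step of Fig. 5 can be sketched directly; with the 4 entities of the example sentence, 6 candidate pairs are produced, half of which carry no relation. This is exactly the blow-up that the joint method avoids:</p>
      <preformat>
```python
from itertools import combinations

# Sketch of the entity pair generation step of the pipeline method (Fig. 5):
# every unordered pair of recognized entities becomes a candidate for the RE
# model, so n entities yield n*(n-1)/2 pairs, many of which have no relation.
entities = ["offset portion", "first magnetic layer", "recess", "substrate"]
gold = {  # gold-standard relation types from the Fig. 5 example
    ("offset portion", "first magnetic layer"): "part-of",
    ("offset portion", "recess"): "spatial relation",
    ("recess", "substrate"): "attribution",
}

pairs = list(combinations(entities, 2))
labeled = [(a, b, gold.get((a, b), "no relation")) for a, b in pairs]

print(len(pairs))                                          # 6
print(sum(rel == "no relation" for _, _, rel in labeled))  # 3
```
      </preformat>
      <p>Because the pair count grows quadratically with the entity count, entity-dense patent sentences produce a disproportionate share of no-relation training instances.</p>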
      <p>
        As can be seen, the experimental results in this paper
contradict the observation from information extraction
competition in LIC 2019 [
        <xref ref-type="bibr" rid="ref4">5</xref>
        ], where joint methods outperformed
pipeline methods by a large margin. In our opinion, there are two
reasons behind this: (1) like the pipeline method, the performance
of the joint method is severely affected by the number of entities in
sentences; (2) the joint model requires a much larger training set
than the pipeline model. To verify the 2nd
reason, we take the LIC-2019 dataset as an example to
demonstrate how the size of the training set affects the
performance of the Hybrid Structure of Pointer and Tagging.
      </p>
      <sec id="sec-4-6">
        <title>Effect of training-set size</title>
        <p>Fig. 6 The structure of the Hybrid Structure of Pointer and Tagging.</p>
        <p>Fig. 9 The performance of the joint model with different sizes of
training set.</p>
        <p>As shown in Fig. 9, as the size of the training set increases
from 1000 to 50000, the performance of Hybrid Structure of
Pointer and Tagging increases rapidly, and then it enters a stable
state near 0.78/0.51/0.63 in terms of weighted-average precision
/recall /F1-value.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Conclusions</title>
      <p>In this paper, we discuss the particularity of patent information
extraction in three aspects:
(1) Labeled datasets: through comparative analysis, it is found that
there are differences not only between labeled patent datasets and
labeled generic datasets, but also between labeled patent datasets
from different technical fields. This means patent information
extraction is a domain-specific task, and a series of processing
steps, from feature engineering to model building, should be
customized for better performance.
(2) Word embeddings: word embedding is the foundation of deep
learning methods in information extraction. Although Risch and
Krestel suggested obtaining word embeddings by training specifically
on patent documents in all fields to improve the semantic
representation of patent language, our experiments show that when
one confronts a task in a specific domain, word embeddings trained
on a corpus from the same domain should be preferred.
(3) Organization of sub-tasks in information extraction:
although the joint method achieves state-of-the-art performance in
information extraction, this excellent performance comes at the
expense of a large labeled dataset. When the dataset is limited, one
should take a series of factors into consideration, such as model
characteristics, computing resources and actual performance,
and then choose an optimal method.</p>
      <p>We realize that some conclusions in this paper are drawn from
only a few sample datasets using simple metrics. However, given the
scarcity of labeled patent datasets for information extraction, this
is what the available data can support so far. In the future, we hope
more people will participate in the construction of labeled patent
datasets and in research on patent information extraction, not only
because of the lack of labeled datasets, but also because there are
valuable tasks waiting to be explored, such as: how can large-scale
patent annotation datasets be generated at low cost? And how can the
particularity of patent text be used to improve the performance of
information extraction on patent text?</p>
    </sec>
    <sec id="sec-6">
      <title>ACKNOWLEDGMENTS</title>
      <p>This research was financially supported by the National Natural
Science Foundation of China under grant number 71704169, the
National Key Research and Development Program of China under
grant number 2019YFA0707202, and the Social Science Foundation
of Beijing Municipality under grant number 17GLB074. Our
gratitude also goes to the anonymous reviewers for their valuable
suggestions and comments.</p>
    </sec>
    <sec id="sec-7">
      <title>Appendix:</title>
      <p>Example sentences illustrating the annotations include: "The etchant solution has a suitable solvent additive such as glycerol or methyl cellulose"; "A camera using a film having a magnetic surface for recording magnetic data thereon"; "Conductor is utilized for producing writing flux in magnetic yoke"; "The curing step takes place at the substrate temperature less than 200.degree"; "The legs are thinner near the pole tip than in the back gap region"; "The MR elements are biased to operate in a magnetically unsaturated mode"; "Magnetic disk system permits accurate alignment of magnetic head with spaced tracks"; "A magnetic head having highly efficient write and read functions is thereby obtained"; "Recess is filled with non-magnetic material such as glass"; "A pole face of yoke is adjacent edge of element remote from surface"; "This prevents the slider substrate from electrostatic damage"; "A digital recording system utilizing a magnetoresistive transducer in a magnetic recording head"; "Interlayer may comprise material such as Ta"; "Peak intensity ratio represents an amount hydrophilic radical"; "Pressure distribution across air bearing surface is substantially symmetrical side".</p>
      <p>Table 3: the 8 proposed indicators. (1) Average length of sentence: how many words a sentence contains on average. (2) Number of entities per sentence: how many entities a sentence contains on average. (3) Number of words per entity: how many words an entity contains on average. (4) Number of relations per sentence: how many relation mentions a sentence contains on average. (5) Entity repetition rate: how many times an entity appears in the corpus on average, i.e., the number of entity mentions divided by the number of entities after deduplication. (6) Relation repetition rate: how many times a relation mention appears in the corpus on average, i.e., the number of relation mentions divided by the number of relation mentions after deduplication. (7) Percentage of ngram entities: the proportion of multi-word (ngram) entities among all entities. (8) Entity association rate: the connection between entities measured by the co-word mechanism, e.g., "thin film head" and "ferrite head" are connected as they share the common word "head".</p>
      <p>Table 4: corpus descriptions of the seven datasets. CPC-2014 (EN): patent full text regarding biology and chemistry; CGP-2017 (EN): patent abstracts regarding biomedical science; TFH-2020 (EN): patent abstracts regarding thin film head techniques; Conll-2003 (EN): Reuters news stories; NYTC (EN): New York Times Corpus; LIC-2019 (CN): search results of Baidu Search as well as Baidu Zhidao; Wikigold (EN): Wikipedia.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Zha</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Study on early warning of competitive technical intelligence based on the patent map</article-title>
          .
          <source>Journal of Computers</source>
          ,
          <volume>5</volume>
          (
          <issue>2</issue>
          ).
          <source>doi:10.4304/jcp.5.2</source>
          .
          <fpage>274</fpage>
          -
          <lpage>281</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Sang</surname>
            <given-names>E. F. T. K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Meulder F.</surname>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition</article-title>
          . arXiv preprint arXiv:cs/0306050.
          [3]
          <string-name>
            <surname>Riedel</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yao</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCallum</surname>
            <given-names>A</given-names>
          </string-name>
          . (
          <year>2010</year>
          )
          <article-title>Modeling Relations and Their Mentions without Labeled Text</article-title>
          . In:
          <string-name>
            <surname>Balcázar</surname>
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonchi</surname>
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gionis</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sebag</surname>
            <given-names>M</given-names>
          </string-name>
          .
          <article-title>(eds) Machine Learning and Knowledge Discovery in Databases</article-title>
          .
          <source>ECML PKDD 2010. Lecture Notes in Computer Science</source>
          , vol
          <volume>6323</volume>
          . Springer, Berlin, Heidelberg
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Balasuriya</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ringland</surname>
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nothman</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murphy</surname>
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Curran</surname>
            <given-names>J. R.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Named Entity Recognition in Wikipedia</article-title>
          ,
          <source>Proceedings of the 2009 Workshop on the People's Web Meets NLP, ACL-IJCNLP</source>
          <year>2009</year>
          , pages
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Report of 2019 language &amp; Intelligence technique evaluation</article-title>
          .
          <source>Baidu Corporation</source>
          . http://tcci.ccf.org.cn/summit/2019/dlinfo/1101-wh.pdf
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Akhondi</surname>
            ,
            <given-names>S. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klenner</surname>
            ,
            <given-names>A. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tyrchan</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manchala</surname>
            ,
            <given-names>A. K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boppana</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lowe</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zimmermann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jagarlapudi</surname>
            ,
            <given-names>S. A. R. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sayle</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kors</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Muresan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Annotated Chemical Patent Corpus: A Gold Standard for Text Mining</article-title>
          .
          <source>PLoS ONE</source>
          ,
          <volume>9</volume>
          (
          <issue>9</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Pérez-Pérez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pérez-Rodríguez</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vazquez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fdez-Riverola</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oyarzabal</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valencia</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lourenço</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Krallinger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Evaluation of Chemical and Gene/Protein Entity Recognition Systems at BioCreative V.5: The CEMP and GPRO Patents Tracks</article-title>
          .
          <source>In Proceedings of the BioCreative V.5 Challenge Evaluation Workshop</source>
          ,
          <fpage>11</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lei</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>A deep learning based method for extracting semantic information from patent documents</article-title>
          .
          <source>Scientometrics</source>
          . doi:10.1007/s11192-020-03634-y.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Efficient estimation of word representations in vector Space</article-title>
          .
          <source>arXiv preprint arXiv:1301.3781</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Pennington</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Glove: Global vectors for word representation</article-title>
          .
          <source>In Proceedings of the 2014 conference on empirical methods in natural language processing</source>
          (pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Risch</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Krestel</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Domain-specific word embeddings for patent classification</article-title>
          .
          <source>Data Technologies and Applications</source>
          ,
          <volume>53</volume>
          (
          <issue>1</issue>
          ),
          <fpage>108</fpage>
          -
          <lpage>122</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>S.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bao</surname>
            ,
            <given-names>H.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hao</surname>
            ,
            <given-names>Y.X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme</article-title>
          .
          <source>arXiv preprint arXiv:1706.05075</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Bidirectional LSTM-CRF models for sequence tagging</article-title>
          .
          <source>arXiv preprint arXiv:1508.01991</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Han</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ye</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>OpenNRE: An Open and Extensible Toolkit for Neural Relation Extraction</article-title>
          .
          <source>arXiv preprint arXiv:1909.13078</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Hybrid Structure of Pointer and Tagging for Relation Extraction: A Baseline</article-title>
          . https://github.com/bojone/kg-2019-baseline.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>