<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Research on Named Entity Recognition from Patent Texts with Local Large Language Model</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Chi Yu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Liang Chen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Haiyun Xu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Business School, Shandong University of Technology</institution>
          ,
          <addr-line>Zibo, China, 255000</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Scientific and Technical Information of China</institution>
          ,
          <addr-line>Beijing, China, 100038</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>Named entity recognition (NER) from patent texts is one of the fundamental tasks in technical intelligence analysis. However, state-of-the-art NER performance is achieved at the cost of massive labeled data and intensive labor, which is cumbersome and time-consuming. To address this issue, this paper proposes a new framework that employs a large language model (LLM) to fulfill the NER task. Specifically, three different prompt templates are designed for NER, and an efficient fine-tuning algorithm is utilized to improve performance. To demonstrate the characteristics of our method, extensive experiments are conducted on a patent dataset pertaining to magnetic heads in hard disk drives, namely TFH-2020. Experimental results show that, although there is a considerable gap between our method and SOTA methods from the perspective of supervised learning, from the perspective of few-shot learning our method outperforms similar methods by a large margin.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Named entity recognition (NER) seeks to locate and
classify named entity mentions in unstructured text into
pre-defined categories, thereby resolving ambiguity within
free text. When it comes to patent text, however, challenges
arise not only from the scarcity of labeled patent datasets,
but also from the characteristics of patent texts themselves,
such as long sentences with domain-specific and novel terms
that are difficult to understand, and the complex sentence
structure of patent claims, which is difficult to parse
syntactically. These issues severely hinder the application
of NER technologies to patent texts.</p>
      <p>The emergence of large language models (LLMs)
provides a new way to address these issues: by leveraging
their capabilities of knowledge storage, semantic
understanding, and text generation, the NER task can be
conducted with minimal labeled data and applied across
domains. To demonstrate the validity and feasibility of this
idea, this study proposes a framework for NER based on a
locally deployed LLM, as ChatGPT and GPT-4 are inaccessible
in China. Furthermore, an efficient fine-tuning algorithm is
utilized to explore the capability of the LLM on the NER
task.</p>
      <p>
        The rest of this paper is organized as follows. In
Section 2, an LLM-based method is put forward with three
types of prompt templates for NER in patent texts. Then,
extensive experiments are conducted on the TFH-2020 corpus [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] to
illustrate its performance in Section 3. The last section
concludes this contribution with future research directions.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>
        Since an LLM primarily serves as a chatbot, NER is
treated as a question-answering task in this paper.
Specifically, through prompt templates, the target text is
transformed into a question and submitted to the LLM to
identify the named entities within it. In this process, the
quality of the prompt template determines the performance of
the LLM on the NER task. Following the principles of prompt
template design proposed by Fulford and Ng [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], three prompt templates are proposed: the baseline
prompt, the two-step prompt, and the multi-entity-type
prompt, as shown in Figure 1. Meanwhile, an efficient
fine-tuning algorithm is also utilized to improve the
performance of the LLM.
      </p>
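      <p>To make the question-answering formulation concrete, the
following minimal Python sketch assembles a baseline-style few-shot
prompt and parses the model reply. The wording, the helper names, and
the "entity -> type" answer format are illustrative assumptions, not
the paper's exact templates, which are shown in Figure 1.</p>
      <preformat>
# Illustrative sketch, not the paper's exact template from Figure 1.

def build_baseline_prompt(sentence, examples, entity_types):
    """Turn a target patent sentence into an NER question for the LLM."""
    lines = [
        "Extract all named entities of the following types from the "
        "patent sentence: " + ", ".join(entity_types) + ".",
        "Answer with one 'entity -> type' pair per line.",
    ]
    for ex_sentence, ex_entities in examples:  # in-context examples
        lines.append("Sentence: " + ex_sentence)
        lines.append("Entities: " + "; ".join(
            text + " -> " + etype for text, etype in ex_entities))
    lines.append("Sentence: " + sentence)
    lines.append("Entities:")
    return "\n".join(lines)

def parse_reply(reply):
    """Parse 'entity -> type' lines; anything else counts as a format
    error (cf. the output-format error rate reported in Figure 2)."""
    entities, errors = [], 0
    for line in reply.strip().splitlines():
        if " -> " in line:
            text, etype = line.split(" -> ", 1)
            entities.append((text.strip(), etype.strip()))
        else:
            errors += 1
    return entities, errors
      </preformat>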
    </sec>
    <sec id="sec-3">
      <title>3. Experiment</title>
      <sec id="sec-3-1">
        <title>3.1 Data Preparation and LLM Selection</title>
        <p>
          The TFH-2020 corpus [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] is taken as the experimental dataset in this study. It
contains 1,010 patent abstracts pertaining to thin-film head
technology in hard-disk drives, collected from the USPTO
(United States Patent and Trademark Office) database. To
describe the structure of an invention, Chen et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] defined 17 types of entities, such as system,
component, function, effect, and consequence; we refer
readers to Chen et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] for more details on this corpus. We construct prompts,
instructions, and test data at the sentence level. In detail,
the patent abstracts in TFH-2020 are divided into 3,384
sentences, which are then randomly split into training and
testing sets at a 9:1 ratio. The sentences in the training
set are used as examples to fill in the prompts and to build
instructions, while those in the testing set serve to
evaluate the performance of the LLM on the NER task.
LLaMA-7B-chat released by Meta Inc. is employed as the LLM
in this study.
        </p>
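        <p>A minimal sketch of the sentence-level 9:1 split described
above, assuming the 3,384 sentences have already been extracted from
the abstracts; the function name and the fixed random seed are
illustrative assumptions.</p>
        <preformat>
# Illustrative sketch: randomly split TFH-2020 sentences 9:1 into a
# training set (prompt examples / instruction data) and a testing set.
import random

def split_tfh_sentences(sentences, train_ratio=0.9, seed=42):
    rng = random.Random(seed)          # fixed seed for reproducibility
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

train_set, test_set = split_tfh_sentences(["sentence 1", "sentence 2"])
        </preformat>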
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Experimental Results and Analysis</title>
        <p>To evaluate the performance of the proposed
method, the three prompt templates mentioned above are
utilized for the NER task, and the performance of the LLM is
analyzed from two perspectives: first, the impact of varying
the number of examples in the prompt, and second, the
performance of the efficient fine-tuning algorithm for the
different prompt templates. The weighted-average precision,
recall, F1-value, and the error rate of the output format
are shown in Figure 2. It can be observed that:</p>
        <p>(1) For in-context learning, as the number of
examples increases, the performance of the baseline prompt
first increases and then decreases. When the number of
examples is 3, its F1-value reaches a maximum of 31%, with a
corresponding precision of 32% and recall of 31%. The
performance of the two-step prompt exhibits a similar trend,
while that of the multi-entity-type prompt remains stable as
the number of examples varies.</p>
        <p>
          (2) As for LoRA [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], the three templates exhibit distinct performance.
After fine-tuning, the performance of the multi-entity-type
prompt gains a significant improvement, with its F1-value
rising from 27% to 49%. In contrast, the performances of the
baseline prompt and the two-step prompt are inferior to
their in-context learning counterparts.
        </p>
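        <p>As a rough illustration of the efficient fine-tuning setup,
the sketch below configures LoRA for a LLaMA-style chat model with the
Hugging Face peft library. The checkpoint name, rank, and other
hyperparameters are assumptions for illustration, not the paper's
reported settings.</p>
        <preformat>
# Illustrative LoRA configuration with Hugging Face peft; the
# checkpoint and hyperparameters are assumed, not the paper's settings.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf")       # assumed chat checkpoint

lora_config = LoraConfig(
    r=8,                                   # low-rank dimension
    lora_alpha=16,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()         # only adapters are trainable
        </preformat>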
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>This study proposes a framework for the NER task
on patent texts with a local LLM. It is found that the
prompt template not only affects the performance of the LLM
but also influences the effectiveness of the efficient
fine-tuning method. Simple prompt templates tend to yield
better results, while complicated prompts significantly
increase the difficulty for both the LLM and the fine-tuning
algorithm.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>This article is supported by the project
“Early Recognition Method of Transformative Scientific and
Technological Innovation Topics based on Weak Signal
Temporal Network Evolution Analysis” (No. 72274113) of the
National Natural Science Foundation of China, and by the
Taishan Scholar Foundation of Shandong Province of China
(tsqn202103069).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Chen</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            <given-names>L.</given-names>
          </string-name>
          , et al. (
          <year>2021</year>
          ).
          <article-title>A deep learning-based method for extracting semantic information from patent documents</article-title>
          .
          <source>Scientometrics</source>
          ,
          <volume>125</volume>
          :
          <fpage>289</fpage>
          -
          <lpage>312</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Fulford</surname>
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>ChatGPT prompt engineering for developers</article-title>
          . URL: https://learn.deeplearning.ai/chatgpt-prompteng/
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Hu</surname>
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shen</surname>
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wallis</surname>
            <given-names>P.</given-names>
          </string-name>
          , et al. (
          <year>2021</year>
          ).
          <source>LoRA: Low-rank adaptation of large language models</source>
          . arXiv preprint arXiv:2106.09685.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>