Research on Named Entity Recognition from Patent Texts
with Local Large Language Model

Chi Yu 1, Liang Chen 1,* and Haiyun Xu 2

1 Institute of Scientific and Technical Information of China, Beijing, China, 100038
2 Business School, Shandong University of Technology, Zibo, China, 255000


Abstract
Named entity recognition (NER) from patent texts is one of the fundamental tasks in technical intelligence analysis. However, state-of-the-art performance in NER is achieved at the cost of massive labeled data and intensive labor, which is cumbersome and time-consuming. To address this issue, this paper proposes a new framework which employs a large language model (LLM) to fulfill the NER task. Specifically, three different prompt templates are designed for NER, and an efficient fine-tuning algorithm is utilized to improve performance. To demonstrate the characteristics of our method, extensive experiments are conducted on a patent dataset pertaining to the magnetic head in hard disk drives, namely TFH-2020. Experimental results show that, although from the perspective of supervised learning there is a considerable gap between our method and SOTA methods, from the perspective of few-shot learning our method outperforms similar methods by a large margin.

Keywords
Patent mining, named entity recognition, large language model, efficient fine-tuning algorithm

1. Introduction

Named entity recognition (NER) seeks to locate and classify named entity mentions in unstructured text into pre-defined categories, thereby resolving ambiguity within free text. When it comes to patent text, however, challenges arise not only from the scarcity of labeled patent datasets, but also from the characteristics of patent texts themselves, such as long sentences with domain-specific and novel terms that are difficult to understand, and the complex sentence structure of patent claims, which is difficult for syntactic parsing. These issues severely hinder the application of NER technologies to patent texts.

The emergence of large language models (LLMs) provides a new way to address these issues: by leveraging their capabilities of knowledge storage, semantic understanding, and text generation, the NER task can be conducted with minimal labeled data and applied to any domain. To demonstrate the validity and feasibility of this idea, this study proposes a framework for NER based on a locally deployed LLM, as ChatGPT and GPT-4 are inaccessible in China. Furthermore, an efficient fine-tuning algorithm is utilized in our study to explore the capability of the LLM in the NER task.

The rest of this paper is organized as follows. In Section 2, an LLM-based method is put forward with three types of prompt templates for NER in patent texts. Extensive experiments are then conducted on the TFH-2020 corpus [1] to illustrate its performance in Section 3. The last section concludes this contribution with future study directions.

Joint Workshop of the 5th Extraction and Evaluation of Knowledge Entities from Scientific Documents and the 4th AI + Informetrics (EEKE-AII2024), April 23–24, 2024, Changchun, China and Online.
∗ Corresponding author.
ORCID: 0009-0005-2278-9684 (C. Yu); 0000-0002-3235-9806 (L. Chen); 0000-0002-7453-3331 (H. Xu)
Email: 726932669@qq.com (C. Yu); 25565853@qq.com (L. Chen)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).




2. Methodology

Since the LLM primarily serves as a chatbot, NER is treated as a question-answering task in this paper. Specifically, through prompt templates, the target text is transformed into a question and submitted to the LLM, which identifies the named entities within it. In this process, the quality of the prompt template determines the performance of the LLM in the NER task. According to the principles of prompt template design proposed by Fulford and Ng [2], three prompt templates are proposed, namely the baseline prompt, the two-step prompt, and the multi-entity-type prompt, as shown in Figure 1. Meanwhile, an efficient fine-tuning algorithm is also utilized to improve the performance of the LLM.

Figure 1: The three prompt templates for the NER task.
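As an illustration of this question-answering formulation, the following Python sketch assembles a baseline-style prompt; the template wording, entity types, and few-shot example below are illustrative assumptions, not the exact templates of Figure 1.

    # Minimal sketch of a baseline-style NER prompt builder (illustrative only;
    # the actual template wording follows Figure 1, not this example).
    BASELINE_TEMPLATE = (
        "Identify all named entities in the sentence below and label each "
        "with one of the following types: {types}.\n"
        "{examples}"
        "Sentence: {sentence}\n"
        "Entities:"
    )

    def build_baseline_prompt(sentence, entity_types, few_shot_examples=()):
        """Fill the template, optionally with examples drawn from the training split."""
        examples = "".join(
            f"Sentence: {s}\nEntities: {e}\n" for s, e in few_shot_examples
        )
        return BASELINE_TEMPLATE.format(
            types=", ".join(entity_types), examples=examples, sentence=sentence
        )

    # Usage with a handful of the 17 TFH-2020 entity types:
    prompt = build_baseline_prompt(
        "A thin film magnetic head includes a lower magnetic core layer.",
        ["system", "component", "function", "effect"],
        few_shot_examples=[
            ("The slider supports the read head.",
             "slider (component); read head (component)"),
        ],
    )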
3. Experiment

3.1 Data Preparation and LLM Selection

The TFH-2020 corpus [1] is taken as the experimental dataset in this study. It contains 1,010 patent abstracts pertaining to thin-film head technology in hard-disk drives, collected from the USPTO (United States Patent and Trademark Office) database. To describe the structure of an invention, Chen et al. [1] defined 17 types of entities, such as system, component, function, effect, and consequence; we refer the reader to Chen et al. [1] for more details on this corpus. We construct prompts, instructions, and test data at the sentence level. In detail, the patent abstracts in TFH-2020 are divided into 3,384 sentences, which are then randomly split into training and testing sets at a 9:1 ratio. The sentences in the training set are used as examples to fill in the prompts and to build instructions, while those in the testing set serve to evaluate the performance of the LLM in the NER task. LLaMA-7B-chat, released by Meta Inc., is employed as the LLM in this study.
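A rough sketch of this preparation step in Python follows; the random seed, the placeholder sentences, and the Hugging Face checkpoint name meta-llama/Llama-2-7b-chat-hf are assumptions for illustration, as the paper does not specify the authors' actual tooling.

    import random
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # `sentences` stands in for the 3,384 sentence strings extracted from the
    # TFH-2020 abstracts; only a tiny placeholder list is shown here.
    sentences = [
        "A thin film magnetic head includes a lower magnetic core layer.",
        "The slider supports the read head above the disk surface.",
    ]

    # Randomly split the sentences into training and testing sets at a 9:1 ratio.
    random.seed(42)  # arbitrary seed, chosen for the sketch
    random.shuffle(sentences)
    split = int(len(sentences) * 0.9)
    train_sentences, test_sentences = sentences[:split], sentences[split:]

    # Load a locally deployed Llama chat model (checkpoint name is an assumption).
    model_name = "meta-llama/Llama-2-7b-chat-hf"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")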
3.2 Experimental Results and Analysis

Figure 2: The performance of different prompt templates in the NER task.

To evaluate the proposed method, the three prompt templates mentioned above are used for the NER task, and the performance of the LLM is analyzed from two perspectives: first, the impact of a varying number of examples in the prompt, and second, the performance of the efficient fine-tuning algorithm for the different prompt templates. The weighted-average precision, recall, F1-value, and the error rate of the output format are shown in Figure 2. It can be observed that:

(1) For in-context learning, as the number of examples increases, the performance of the baseline prompt first increases and then decreases. When the number of examples is 3, its F1-value reaches a maximum of 31%, with a corresponding precision of 32% and recall of 31%. The performance of the two-step prompt exhibits a similar trend, while that of the multi-entity-type prompt remains stable as the number of examples varies.

(2) As for LoRA [3], the three templates exhibit distinct behavior. After fine-tuning, the performance of the multi-entity-type prompt improves significantly, with its F1-value rising from 27% to 49%. In contrast, the fine-tuned baseline prompt and two-step prompt perform worse than their in-context-learning counterparts.
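As background on the fine-tuning step, the sketch below shows how LoRA adapters can be attached to a causal language model with the PEFT library; the hyperparameters (rank, alpha, dropout, target modules) are illustrative assumptions, not the values used in the experiments.

    from peft import LoraConfig, TaskType, get_peft_model
    from transformers import AutoModelForCausalLM

    base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

    # LoRA freezes the base weights W and learns a low-rank update BA, so only
    # a small fraction of the parameters is trained. Settings are illustrative.
    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,                  # rank of the low-rank update matrices
        lora_alpha=16,        # scaling factor applied to the update
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    )
    peft_model = get_peft_model(base_model, lora_config)
    peft_model.print_trainable_parameters()  # reports the trainable share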
4. Conclusion

This study proposes a framework for the NER task on patent texts with a local LLM. It is found that the prompt template not only affects the performance of the LLM but also influences the effectiveness of the efficient fine-tuning method. Simple prompt templates tend to yield better results, while complicated prompts significantly increase the difficulty for both the LLM and the fine-tuning algorithm.

Acknowledgements

This article is the outcome of the project "Early Recognition Method of Transformative Scientific and Technological Innovation Topics based on Weak Signal Temporal Network Evolution Analysis" (No. 72274113), supported by the National Natural Science Foundation of China, and the Taishan Scholar Foundation of Shandong Province of China (tsqn202103069).

References

[1] Chen L., Xu S., Zhu L., et al. (2021). A deep learning-based method for extracting semantic information from patent documents. Scientometrics, 125: 289–312.
[2] Fulford I., Ng A. (2023). ChatGPT Prompt Engineering for Developers. URL: https://learn.deeplearning.ai/chatgpt-prompt-eng/
[3] Hu E., Shen Y., Wallis P., et al. (2021). LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685.