=Paper=
{{Paper
|id=Vol-3004/paper13
|storemode=property
|title=Topic Evolution Path and Semantic Relationship Discovery Based on Patent Entity Relationship
|pdfUrl=https://ceur-ws.org/Vol-3004/paper13.pdf
|volume=Vol-3004
|authors=Jinzhu Zhang,Linqi Jiang
|dblpUrl=https://dblp.org/rec/conf/jcdl/ZhangJ21
}}
==Topic Evolution Path and Semantic Relationship Discovery Based on Patent Entity Relationship==
EEKE 2021 - Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents

Jinzhu Zhang, Department of Information Management, School of Economics & Management, Nanjing University of Science and Technology, Nanjing, China, zhangjinzhu@njust.edu.cn

Linqi Jiang, Department of Information Management, School of Economics & Management, Nanjing University of Science and Technology, Nanjing, China, sufi_jiang@163.com

===ABSTRACT===
Topic evolution analysis describes the emergence, transition, and extinction of a topic in a technical field, which can help researchers understand the history and current situation of the research field. Current studies mainly use patent text-based methods, which often use relationships among keywords to construct a co-occurrence network and analyse evolution with topic clustering algorithms. However, they do not consider all the words in a patent or the semantic relationships between them. In addition, the relationships among topics should be more concrete: we should not only find the evolution relationship, but also reveal the semantic relationships among topics. Therefore, this paper uses a representation learning method to obtain the semantic representation of each entity/word, and computes the semantic similarity among them to find pairs of words that are different but have the same meaning in a specific context. Moreover, we define multiple semantic relationships among topics, and design a method that uses patent entity relationships to obtain the semantic relationships among topics. Experiments in the technical field of UAV transportation confirm that the method can effectively identify both the evolutionary relationships and the semantic relationships between topics, which makes the evolutionary relationships between topics richer and more interpretable. The method also provides a reference for further enriching and improving topic evolution analysis.

===KEYWORDS===
Topic evolution path, semantic relationship between topics, patent entity relationship

===1 Introduction===
Topic evolution analysis describes the emergence, transition, and extinction of a topic in a technical field, which can help researchers understand the history and current situation of the research field. The result can quickly identify research hotspots, trends and gaps, which is essential to scientific and technological innovation (Liu et al., 2020).

In the study of topic evolution analysis, topic evolution path and relationship discovery play an important role. Current studies can be classified into two classes: patent citation analysis-based methods and patent text-based methods (Yu et al., 2020). This paper focuses on the latter, which often use relationships among keywords to construct a co-occurrence network and analyse evolution with topic clustering algorithms (No et al., 2015). The evolution path and relationship are then discovered by comparing the common keywords among topics in different time series.

However, these common keywords cannot cover pairs of words that are different but have the same meaning in a specific context. In addition, the relationships among topics should be more concrete: we should not only find the evolution relationship (i.e., emergence, transition, and extinction), but also reveal the semantic relationships among topics (e.g., function-realization or function-area).

Therefore, this paper uses a representation learning method (Birunda & Devi, 2021) to obtain the semantic representation of each entity/word, and computes the semantic similarity among them to find pairs of words that are different but have the same meaning in a specific context.
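
The synonym-detection idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: cosine similarity is assumed as the similarity measure (the paper does not name one), the function names `cosine_similarity` and `equivalent_pairs` and the three-dimensional toy vectors are hypothetical, and the 0.7 threshold is the value reported in Section 3.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def equivalent_pairs(embeddings, threshold=0.7):
    """Return entity pairs whose similarity is strictly above the threshold.

    `embeddings` maps an entity string to its vector. Pairs above the
    threshold are treated as "different words with the same meaning".
    """
    pairs = []
    # Sort by entity name so the output order is deterministic.
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        if cosine_similarity(vec_a, vec_b) > threshold:
            pairs.append((name_a, name_b))
    return pairs


# Hypothetical toy vectors; real vectors would come from a trained
# word representation model.
toy_embeddings = {
    "unmanned aerial vehicle": [0.9, 0.1, 0.2],
    "drone": [0.8, 0.2, 0.1],
    "battery pack": [0.1, 0.9, 0.3],
}

print(equivalent_pairs(toy_embeddings))
# prints [('drone', 'unmanned aerial vehicle')]
```

With these toy vectors, only "drone" and "unmanned aerial vehicle" exceed the threshold, so they would be counted as the same entity when topics from different time periods are compared.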
Moreover, we define multiple semantic relationships among topics, and design a method that uses patent entity relationships to obtain the semantic relationships among topics.

===2 Data and method===
First, a manually labelled dataset is built and a neural network model is trained to extract all patent entities. Then the topics are identified by a clustering method and the evolution path is detected based on the semantic similarity among entities. Finally, a neural network model is trained to extract the relationships between patent entities, and the semantic relationships among topics are discovered through these relationships.

====2.1 Data collection====
This paper uses the Derwent Innovations Index database as the patent data retrieval platform; the data were retrieved in July 2019. The patent search expression is “IP = B64* AND TI = (((unmanned OR automatic OR autonomous OR remotely piloted OR nonhuman) AND (aircraft OR "aerial vehicle" OR airship* OR drone OR plane OR aircraft* OR airplane OR aerobat* OR aerostat*)) OR "UAV")”, and the time interval is from 2008 to 2020. A total of 4507 patents with title, abstract, patent application date and other features were retrieved and processed as the data source. The data are divided into time periods for evolution analysis, taking into account the number of patents in each period.

====2.2 Discovery of Topic Evolution Path Based on Semantic Similarity Among Entities====
The semantic similarity among entities is obtained in the following steps. First, a subset of patents is split into training and testing sets in which the entities are manually labelled. Second, a BiLSTM-CRF model (Lample et al., 2016) is trained on this dataset and evaluated through quantitative indicators. After 100 training iterations, the accuracy of the model exceeded 90% and became stable; similarly, the loss dropped below 5.8 and stabilized. Third, this model is used to detect the entities in all patents. Fourth, K-Means is applied to cluster topics and obtain the entities of each topic. Patent documents contain many long professional terms; compared with the commonly used LDA, the identified patent entities do not lose this professional information. Fifth, a word representation learning method is applied, so that the semantic similarity among the entities of each topic can be calculated.

Based on the semantic similarity among entities, a similarity threshold is determined: two different entities are treated as having the same meaning if their similarity is higher than the threshold. We define five topic evolution patterns: development, division, integration, extinction, and emerging. They are shown in Table 1, in which the size of the circle represents the number of entities under each topic. Moreover, T1(t1) and T2(t2) denote the topics T1 and T2 at times t1 and t2, respectively.

{| class="wikitable"
|+ Table 1: Five patterns of topic evolution path
! Evolution pattern !! Expression and illustration
|-
| development || Topic T1(t2) only comes from T1(t1)
|-
| division || Topic T1(t1) is divided into two or more topics in t2
|-
| integration || Two or more topics integrate into one topic
|-
| extinction || T2(t2) is extinct: it has no evolutionary relationship with the following topics in t3
|-
| emerging || T2(t2) is treated as an emerging topic: it has no evolutionary relationship with previous topics
|}

====2.3 Discovery of Semantic Relationship Among Topics====
First, we predefine five types of semantic relationships among patent entities, which are shown in Table 2. Second, we manually label a small dataset with the predefined relationships for training and testing. Third, we train an OpenNRE model (Han et al., 2019) on this dataset and evaluate it through quantitative indicators. Fourth, the model is used to predict all relationships among entities. Finally, the semantic relationship between two topics is determined from the semantic relationships among all pairs of their entities.

{| class="wikitable"
|+ Table 2: Semantic relationship among topics
! Semantic relationship !! Expression and illustration
|-
| Mechanical relationship (M) || refers to the containment relationship, position relationship, etc. of some mechanical parts.
|-
| Efficacy relationship (E) || refers to the relationship of efficacy enhancement.
|-
| Function-Area relationship (FA) || refers to the application field or function field of certain machines or systems.
|-
| Function-Realization relationship (FR) || refers to some patented devices or systems that realize certain functions.
|-
| Control relationship (C) || refers to a certain control relationship.
|}

===3 Result analysis===
We set the semantic similarity threshold to 0.7: a pair of entities with similarity higher than this value is considered to have the same meaning. For the discovery of the topic evolution path, the results for 2015-2016 are taken as an example. There are six topics in 2015 and eight topics in 2016; the evolution probability among them is shown in Table 3.

{| class="wikitable"
|+ Table 3: Evolution probability among topics (%)
! 2015/2016 !! T0(t2) !! T1(t2) !! T2(t2) !! T3(t2) !! T4(t2) !! T5(t2) !! T6(t2) !! T7(t2)
|-
! T0(t1)
| 5.1 || 14.4 || '''32.4''' || 12.4 || 0 || '''20.5''' || 16.5 || 0
|-
! T1(t1)
| '''40.0''' || '''21.9''' || '''24.5''' || 7.2 || 5.4 || 10.6 || 1.4 || 6.8
|-
! T2(t1)
| 4.8 || '''24.7''' || 18.2 || '''20.5''' || 0 || 14.0 || 19.3 || 0
|-
! T3(t1)
| '''100''' || 11.2 || 12.9 || 0 || '''100''' || 1.6 || 0 || '''100'''
|-
! T4(t1)
| 3.4 || '''23.3''' || 18.4 || 13.5 || 0 || 17.9 || '''24.3''' || 0
|-
! T5(t1)
| 5.0 || 14.0 || 8.6 || '''34.1''' || 0 || '''23.0''' || 16.2 || 0
|}

As shown in Table 3, the pairs of topics with a probability of more than 20% are emphasized in bold, which means that there is a certain evolutionary relationship between them. For example, T1(t1), T2(t1) and T4(t1) are integrated into T1(t2). T1(t1) is divided into T0(t2), T1(t2) and T2(t2). T3(t1) is developed into T7(t2) because these two topics are almost the same.

Then we obtain the semantic relationships between the topics that have an evolution path, as shown in Table 4. The abbreviations of the semantic relationships are explained in Table 2.

{| class="wikitable"
|+ Table 4: Semantic relations of topic evolution
! 2015/2016 !! T0(t2) !! T1(t2) !! T2(t2) !! T3(t2) !! T4(t2) !! T5(t2) !! T6(t2) !! T7(t2)
|-
! T0(t1)
| &nbsp; || &nbsp; || E || &nbsp; || &nbsp; || E || &nbsp; || &nbsp;
|-
! T1(t1)
| M || E || E || &nbsp; || &nbsp; || &nbsp; || &nbsp; || &nbsp;
|-
! T2(t1)
| &nbsp; || E || &nbsp; || FA || &nbsp; || &nbsp; || &nbsp; || &nbsp;
|-
! T3(t1)
| M || &nbsp; || &nbsp; || &nbsp; || M || &nbsp; || &nbsp; || M
|-
! T4(t1)
| &nbsp; || E || &nbsp; || &nbsp; || &nbsp; || &nbsp; || E || &nbsp;
|-
! T5(t1)
| &nbsp; || &nbsp; || &nbsp; || FA || &nbsp; || FA || &nbsp; || &nbsp;
|}

According to the results in Table 4, the semantic relationship between T1(t1) (UAV communication medium and function) and T0(t2) (camera device and image transmission system) is the "Mechanical relationship". This shows that patents on the communication media and functions of UAVs have developed over time, and a large part of the research topics has split off into image information transmission, which is consistent with the development of the field. In addition, the semantic relationship among T2(t1) (UAV functional module), T5(t1) (UAV kinetic energy device) and T3(t2) (UAV application field) is the "Function-Area relationship". A large part of the patent applications in the field of UAV transportation relate to UAV system applications, including geological survey, forest fire prevention, water resources inspection and protection, environmental science and ecology, agriculture, and other key technologies used in industry.

===4 Conclusion===
This paper proposes a method for discovering the topic evolution path and the semantic relationships among topics based on patent entity relationships. The results indicate that the method is effective and can enrich and improve topic evolution analysis. In the next step, we would like to apply other neural network models that may perform better in patent entity relationship extraction, and compare them with baseline methods for deeper analysis.

===ACKNOWLEDGMENTS===
This work is supported by the National Natural Science Foundation of China (No. 71974095).

===REFERENCES===
* Liu, H., Chen, Z., Tang, J., et al. (2020). Mapping the technology evolution path: a novel model for dynamic topic detection and tracking. Scientometrics, 125, 2043–2090. https://doi.org/10.1007/s11192-020-03700-5
* Yu, D., Xu, Z., & Wang, X. (2020). Bibliometric analysis of support vector machines research trend: a case study in China. International Journal of Machine Learning and Cybernetics, 11(3), 715-728.
* No, H. J., An, Y., & Park, Y. (2015). A structured approach to explore knowledge flows through technology-based business methods by integrating patent citation analysis and text mining. Technological Forecasting and Social Change, 97, 181-192.
* Birunda, S. S., & Devi, R. K. (2021). A Review on Word Embedding Techniques for Text Classification. In Innovative Data Communication Technologies and Application (pp. 267-281). Springer, Singapore.
* Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., & Dyer, C. (2016). Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360.
* Han, X., Gao, T., Yao, Y., Ye, D., Liu, Z., & Sun, M. (2019). OpenNRE: An open and extensible toolkit for neural relation extraction. arXiv preprint arXiv:1909.13078.

Copyright 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).