=Paper=
{{Paper
|id=Vol-3745/paper8
|storemode=property
|title=Material Performance Evolution Discovery Based on Entity Extraction and Social Circle Theory
|pdfUrl=https://ceur-ws.org/Vol-3745/paper8.pdf
|volume=Vol-3745
|authors=Jinzhu Zhang,Wenwen Sun
|dblpUrl=https://dblp.org/rec/conf/eeke/ZhangS24
}}
==Material Performance Evolution Discovery Based on Entity Extraction and Social Circle Theory==
Jinzhu Zhang, Wenwen Sun

Department of Information Management, School of Economics and Management, Nanjing University of Science and Technology, Nanjing, China

Joint Workshop of the 5th Extraction and Evaluation of Knowledge Entities from Scientific Documents and the 4th AI + Informetrics (EEKE-AII2024), April 23~24, 2024, Changchun, China and Online. Email: zhangjinzhu@njust.edu.cn (Jinzhu Zhang); Sun000216@163.com (Wenwen Sun). © Copyright 2024 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073.

===Abstract===
Topic evolution analysis describes the emergence, development, and extinction of topics in a field, which can help researchers understand the history and current state of that field. However, material patent text is domain-specific, and general entity extraction models cannot extract its specialized entities effectively. Moreover, the assumption that topics with high similarity have an evolution relationship contradicts the rule of “first the change, then the new topic”, and therefore cannot clearly present the dynamic change and accumulation of topics. We therefore design a method that extracts material performance entities accurately and constructs a dynamic evolution path for material performance topics. Firstly, we propose a material entity extraction model, BERT-BiLSTM-CRF, which integrates syntactic dependency analysis and an attention mechanism to extract material performance entities accurately. Secondly, we design an algorithm for identifying the evolution relationship between performance nodes based on ring boundaries, which mines the evolution relationship between performance nodes and existing topics and thereby realizes the dynamic accumulation and change of topics. Finally, we construct the dynamic evolution path of material performance and explore the complex associations of material performance. Experiments in the field of metal materials confirm that the proposed method can effectively construct the dynamic evolution path of material performance topics, making the evolution relationships between topics richer and more interpretable.

Keywords: Entity extraction, material performance evolution, patent entity relationship

===1. Introduction===
With the emergence of a large number of patents on materials manufacturing and materials innovation, it has become critical to explore the complex associations and evolutionary trends of material performance. Such exploration can help researchers deepen their understanding of material performance and promote the invention of new materials [1].

A review of existing studies shows that material performance mainly depends on microstructure, while the manufacture process directly affects the material's microstructure. Current research on the evolution of material performance therefore falls into three perspectives: "performance evolution - microstructure", "performance evolution - microstructure - manufacture process", and "performance evolution - manufacture process". The specific relationships are shown in Figure 1. Among them, Perspectives I and II [2, 3, 4] involve microstructure and therefore require a high level of expertise and experimental equipment from researchers and readers. Perspective III [5] usually proceeds by controlling variables, such as varying the temperature of a certain process and then exploring its influence on the microstructure of the material and on the evolution of the material performance, which to some extent restricts a comprehensive understanding and description of material performance evolution.

Figure 1: Different Perspectives in the Analysis of the Evolution of Material Performance (diagram labels: manufacture process, material microstructure, material performance; relations: changes, determine, affect; perspectives I, II, III)

In terms of evolution analysis methods, researchers usually use the following approaches: (1) topic identification [6]; (2) topic evolutionary analysis [7, 8]. However, the former has difficulty revealing document semantics effectively, resulting in poor interpretability of the topics [9]. The latter only reflects the relative importance of, or attention to, a topic at a specific time point [10]. Moreover, most existing studies are based on topic similarity and assume that an evolution relationship exists between two topics whose similarity is above a threshold [11, 12]. But a topic is itself dynamically changing and accumulating: there should be “first the change in material performance, then the new material performance topic”, so establishing an evolution relationship from the similarity between topics is biased.

Therefore, this paper takes material performance as the research object. In particular, we propose a method for constructing the topic evolution path by introducing social circle theory, realizing the dynamic accumulation of material performance topics and the construction of the evolution path. Finally, we explore the complex associations in the evolution of material performance on the basis of the dynamic evolution path.

===2. Data and method===
This paper takes material performance as the research object. Firstly, we integrate syntactic dependency analysis and an attention mechanism to construct the material entity extraction model (BERT-BiLSTM-CRF) and obtain the performance nodes of each material. We then divide the performance nodes of all materials into time batches by year. After that, starting from the initial performance topics, we design an algorithm that realizes the dynamic accumulation of material performance topics and the construction of the evolution path.

====2.1. Data sources and preprocessing====
Collecting data at arbitrary intervals may lead to a lack of completeness and accuracy in the final analysis of the evolution results. We therefore take the concept of Germany’s “Industrie 4.0” as the background and select metal material as an example, since it is one of the key foundational materials closely related to this concept. We use the Derwent Innovations Index database as the patent data retrieval platform. The patent search expression is “TS=(‘Metal materials’ OR ‘Metallic materials’ OR ‘Metal alloys’ OR ‘Metal compositions’ OR ‘Metal-based materials’) AND WC=(‘Materials Science’)”, with a time interval from 2011 to 2023, where 2011 is the year when the concept of "Industry 4.0" was first introduced. The top 10,000 relevant patent texts were selected as the dataset. In addition, considering the number of patents in each period, we divide the dataset into yearly batches for material performance evolution.

====2.2. Method for extracting the material entities====
As manufacture processes continue to improve, the processing methods of materials are constantly progressing. At the same time, a change in the manufacture process of a material brings about changes in the material performance [13]. Therefore, we define the performance entity and the manufacture process entity of metal material. In addition, we refer to the relationship shown in Figure 1 and establish the causal relationship between the two for subsequent analysis. (The manufacture process entity and the causal relationship will be used in our next step of exploring the reasons for performance evolution, so they are rarely involved in this paper.)

Then, considering that material patents contain a large number of technical terms and material components, we construct an entity extraction model (BERT-BiLSTM-CRF) by combining syntactic dependency analysis and an attention mechanism. The combined use of these methods provides a more accurate extraction of the material performance entities and manufacture process entities from patent contents, providing a basis for subsequent material analysis and research.

====2.3. Method for constructing dynamic evolution path for material performance topics====
In this part, we first define six evolution types, then design an algorithm for identifying the evolution relationship of performance nodes based on ring boundaries. Finally, we present the detailed process of the method for constructing the dynamic evolution path of material performance topics.

=====2.3.1. Social Circle Theory=====
Social circle theory suggests that the social circle formed around a person reflects the closeness of his or her social relationships. That is, a person's intimate social circle usually consists of relationships with a high degree of relevance, followed by the normal friends circle and the strangers circle. In addition, there may be people who are temporarily outside one's normal friends circle but share certain similarities with that person, and who may become friends or even intimate friends in the future; this paper defines them as potential friends. Therefore, centered on the individual, the affinity rank order is: intimate friends, normal friends, potential friends, strangers. The corresponding positions are: within the intimate friends circle; outside the intimate friends circle but within the normal friends circle; outside the normal friends circle but within the strangers circle; and outside the strangers circle, as shown in Figure 2.

Figure 2: Social Circle Theory (see Appendix A for the detailed picture; diagram labels: intimate friends circle, normal friends circle, strangers circle, Intimate Friends, Normal Friends, Potential Friends, Mutual Friends, Strangers)

We refer to this theory and combine it with existing research to define six evolution types, and we propose an algorithm for identifying the evolution relationship of performance nodes. The details are given in Sections 2.3.2 and 2.3.3 respectively.

=====2.3.2. Definition of evolution types=====
This paper defines six evolution types based on existing studies. Among them, the four types develop, evolve, emerge, and fuse are derived from four different social relationships in social circle theory, and the two types extinct and split follow existing studies to ensure the diversity of evolution types. In addition, this paper improves the fuse and split types by further refining the different contributions of each topic in them, which helps to consider the dynamic interactions between topics in more detail. See Appendix B for details.

=====2.3.3. Identifying the evolution relationship of performance nodes based on ring boundaries=====
We refer to social circle theory and improve the model proposed by Zhang et al. [14], proposing an algorithm for identifying the evolution relationship of performance nodes based on ring boundaries. The algorithm and its correspondence with social circle theory are shown in Figure 3. Firstly, for the existing performance topics, the centroid of each topic is calculated, and the maximum Euclidean distance between each topic’s patents and its centroid is taken as the topic boundary. The topic boundary is extended outward by a ratio less than 1 to obtain the outer ring boundary and shrunk inward by the same ratio to obtain the inner ring boundary. After several comparison tests, we set the ratio in this study to 0.2.

Figure 3: Algorithm for Identifying the Evolution Relationship of Performance Nodes based on Ring Boundaries (see Appendix C for the detailed picture; diagram labels: centroid1, centroid2, inner ring boundary, topic boundary, outer ring boundary, nodes A, B, C, D, develop, evolve, emerge, fuse)

Subsequently, for material performance nodes in subsequent batches, the Euclidean distance between each performance node and the centroids of existing performance topics is calculated separately to identify the evolution relationship between them. The specific rules are as follows:
* a) develop: inside the inner ring boundary
* b) evolve: outside the inner ring boundary and inside the outer ring boundary
* c) emerge: outside the outer ring boundary
* d) fuse: boundary intersection

Then, hierarchical clustering is introduced to obtain the different types of performance topics in the batch, and similar topics whose similarity exceeds a threshold (set to 0.8 in this paper) are merged. For a topic in the previous batch, if two or more topics are obtained from it in this batch, the evolution type is considered split. Furthermore, in the construction of the evolutionary path, if a topic has no evolution relationship with the following topics, we consider it extinct.

=====2.3.4. Construction of the dynamic evolution path for metal material performance topics=====
The construction of the dynamic evolution path of metal material performance mainly includes the following steps, which are shown in Figure 4. Firstly, after extracting the performance entities of each material, we obtain the performance node of each material, and all material performance nodes are divided into time batches according to the year. Secondly, the K-Means algorithm is used to cluster the first batch of data to obtain the initial performance topics.

Subsequently, for performance nodes in subsequent batches, the algorithm for identifying the evolution relationship of performance nodes (see Section 2.3.3 for the specific algorithmic process) is used to identify their evolution relationships with each performance topic. Then, hierarchical clustering is introduced to obtain the performance topics of different evolution types in this batch, and similar topics that exceed a threshold are merged. Finally, incremental iterations are carried out in the above manner to obtain the material performance topics in the different year batches, thereby achieving the dynamic construction of the material performance evolution path (see Appendix D for the entire picture, where Cluster stands for topic).

Figure 4: Process of Constructing the Evolution Path of Metal Material Performance (diagram labels: Metal Material Patent Abstract Text, Data flow, One time slice, Cluster1, Cluster2, Cluster3, develop, evolve, emerge, fuse, split, extinct)

===3. Result===
In the construction of the evolution path of metal material performance, we use the years 2021-2023 as an example. Specifically, the number of performance clusters in 2021, 2022, and 2023 is 66, 55, and 60 respectively. The resulting evolution path is shown in Figure 5, where yellow, red, and green represent the years 2021, 2022, and 2023 respectively (see Appendix E for the entire picture, where Cluster stands for topic).

Figure 5: Evolution Path of Metal Material Performance from 2021 to 2023

From the evolution path above, it can be observed that Cluster1 in 2022 developed from Cluster1 in 2021. The contents of the two Clusters are as follows: [‘high excellent low corrosion’, ‘resistance good powder strength temperature’], [‘high excellent alloy low mechanical’, ‘strength corrosion resistance process good’]. Both Clusters focus on improving the corrosion resistance and mechanical performance of metal materials, which aligns with the practical application requirements of metal materials [15].

Cluster1 in 2022 further developed and split into Cluster1 and Cluster2 in 2023. The contents of the three Clusters are as follows: [‘high excellent low corrosion’, ‘resistance good powder strength temperature’], [‘good surface heat’, ‘resistance layer wear low’], and [‘resistance good base layer wear’, ‘layer wear surface low heat’]. As can be seen, the 2023 Cluster1 maintained the original corrosion resistance and wear resistance performance of Cluster1 in 2022 and further improved the surface heat performance of metals. This may have been achieved through improvements in material technology and alloy additions, thus enhancing the performance in practical applications.

===4. Conclusion===
This paper proposes an algorithm for identifying the evolution relationship of performance nodes based on ring boundaries, which not only realizes the dynamic accumulation and construction of the metal material performance evolution path, but also enriches and improves the topic evolutionary analysis method. Currently, we are combining the manufacture process entities of each material to further analyze the causes of the evolution of material performance in depth, and to better understand the evolution trends and the changing patterns of material performance.

===Acknowledgements===
This work is supported by the National Natural Science Foundation of China (No. 72374103, 71974095) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX23_0632).

===References===
* [1] Kaushal Jha, Suman Neogy, Santosh Kumar, R N Singh, & G K Dey. Correlation between microstructure and mechanical properties in the age-hardenable Cu-Cr-Zr alloy. Journal of Nuclear Materials: Materials Aspects of Fission and Fusion, 2021, 546, 152775. doi: 10.1016/j.jnucmat.2020.152775.
* [2] Enyu Guo, Guohua Fan, & Tongmin Wang. Microstructure Evolution Mechanism of Metallic Materials: Progress on In Situ Studies Using Synchrotron Radiation Source. Failure Analysis and Prevention, 2021, 16(01): 1-14+91. doi: 10.3969/j.issn.1673-6214.2021.01.001.
* [3] ZhiGuo Liu, Ri You, & Jintao Wu. Study on relationship between dam concrete material performance evolution and its pore structure. Water Resources and Hydropower Engineering, 2016, 47(11): 25-28+35.
* [4] Yinzhi Li, Gang Liu, Xinzhe Chen, Zhenhua Su, Lianwu Yan, & Yingbiao Peng. Effects of Sintering Temperature and Holding Time on the Microstructure and Properties of Ti(C,N)-Based Cermets[J]. Packaging Journal, 2023, 15(03): 37-45.
* [5] Yang Liu, Baoliang Liu, Shi Kai, Xiaoyuan Han, Yi Xia, & Jianzhao Shang. Effects of different zirconia-based raw materials on properties of MgO-Al-C materials. Refractories, 2023, 57(01): 59-64. doi: 10.3969/j.issn.1001-1935.2023.01.013.
* [6] Lee J W, & Han D H. Data Analysis of Psychological Approaches to Soccer Research: Using LDA Topic Modeling. Behavioral Sciences, 2023, 13(10), 43-47. doi: 10.3390/bs13100787.
* [7] Parlina A, Ramli K, & Murfi H. Theme Mapping and Bibliometrics Analysis of One Decade of Big Data Research in the Scopus Database. Information, 2020, 11(2): 69. doi: 10.3390/info11020069.
* [8] Zibiao Li, & Li Zhang. Evolution of Patented Technology Topics of Steel Materials Based on LDA Model. Science and Technology Management Research, 2020, 40(24): 175-183. doi: 10.3969/j.issn.1000-7695.2020.24.023.
* [9] Behrouzi S, Sarmoor Z S, Hajsadeghi K, & Kavousi K. Predicting scientific research trends based on link prediction in keyword networks. Journal of Informetrics, 2020, 14(4): 101079. doi: 10.1016/j.joi.2020.101079.
* [10] Wu Z, Xie P, Zhang J, Zhan B, & He Q. Tracing the Trends of General Construction and Demolition Waste Research Using LDA Modeling Combined With Topic Intensity. Frontiers in Public Health, 2022, 10(3), 22-26. doi: 10.3389/fpubh.2022.899705.
* [11] Jinxia Liu, Qianqian Hou, Jing Du, Fuhou Chai, & Li Zhang. Hot Topics Evolution Research of Emerging Field Under the Sub Topics and Vocabulary Related Perspective. Journal of Intelligence, 2023, 42(3): 123-129. doi: 10.3969/j.issn.1002-1965.2023.03.017.
* [12] Reza T H S, Sadegh A, & Soroush T. An embedding approach for analyzing the evolution of research topics with a case study on computer science subdomains. Scientometrics, 2023, 128(3): 11-16. doi: 10.1007/s11192-023-04642-4.
* [13] Qing-Miao Hu, & Rui Yang. The endless search for better alloys. Science, 2022, 378: 26-27. doi: 10.1126/science.ade5503.
* [14] Zhang Y, Zhang G, Zhu D, & Liu J. Scientific evolutionary pathways: Identifying and visualizing relationships for scientific topics. Journal of the Association for Information Science and Technology, 2017, 68(8): 1925-1939. doi: 10.1002/asi.23814.
* [15] Y.B. Lei, Z.B. Wang, B. Zhang, Z.P. Luo, J. Lu, & K. Lu. Enhanced mechanical properties and corrosion resistance of 316L stainless steel by pre-forming a gradient nanostructured surface layer and annealing. Acta Materialia, 2021, 208, 116773. doi: 10.1016/j.actamat.2021.116773.

===A. Appendices===
Figure 2: Social Circle Theory (diagram labels: intimate friends circle, normal friends circle, strangers circle, Intimate Friends, Normal Friends, Potential Friends, Mutual Friends, Strangers)

===B. Appendices===
{| class="wikitable"
|+ Table 1: Six patterns of topic evolution types (where Cluster stands for topic)
! Social relationships !! Evolution types !! Expression and illustration !! Explanation
|-
| intimate friends || develop || Cluster1 → Cluster2 (develop) || Cluster2 only develops from Cluster1.
|-
| normal friends || evolve || Cluster1 → Cluster2 (evolve) || Cluster2 only evolves from Cluster1.
|-
| strangers || emerge || Diagram with Cluster1, Cluster2, Cluster3 joined by evolve and develop links; Cluster4 marked emerge || Cluster4 is treated as an emerge Cluster, which has no evolutionary relationship with previous Clusters.
|-
| mutual friends || fuse || Cluster1 and Cluster2 → Cluster3 || Cluster3 comes from Cluster1 and Cluster2 together.
|-
| / || split || Cluster1 → Cluster2 and Cluster3 || Cluster1 is divided into two or more Clusters.
|-
| / || extinct || Diagram with Cluster1, Cluster2, Cluster4 joined by evolve and develop links; Cluster3 without a successor || Cluster3 is extinct: it has no evolution relationship with the following Clusters.
|}

===C. Appendices===
Figure 3: Algorithm for Identifying the Evolution Relationship of Performance Nodes based on Ring Boundaries (diagram labels: centroid1, centroid2, inner ring boundary, topic boundary, outer ring boundary, nodes A, B, C, D, develop, evolve, emerge, fuse)

===D. Appendices===
Figure 4: Process of Constructing the Evolution Path of Metal Material Performance (diagram labels: Metal Material Patent Abstract Text, Data flow, One time slice, Cluster1, Cluster2, Cluster3, develop, evolve, emerge, fuse, split, extinct)

===E. Appendices===
Figure 5: Evolution Path of Metal Material Performance from 2021 to 2023
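
The ring-boundary rules for labeling a new performance node as develop, evolve, emerge, or fuse can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`centroid`, `ring_boundaries`, `classify`) and the toy 2-D vectors are hypothetical. The paper only says the topic boundary is extended and shrunk "by a ratio" of 0.2; the sketch assumes this means multiplying the boundary radius by 1.2 and 0.8. It also assumes that "boundary intersection" means a node lying inside the outer ring of two or more topics.

```python
import math

RATIO = 0.2  # ring ratio reported in the paper


def centroid(points):
    """Mean vector of a non-empty list of equal-length points."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def ring_boundaries(points, ratio=RATIO):
    """Return (centroid, inner_radius, outer_radius) for one topic.

    The topic boundary is the largest centroid-to-member distance.
    Assumption: the rings are boundary * (1 - ratio) and boundary * (1 + ratio).
    """
    c = centroid(points)
    boundary = max(math.dist(c, p) for p in points)
    return c, boundary * (1 - ratio), boundary * (1 + ratio)


def classify(node, topics, ratio=RATIO):
    """Label a new node against existing topics.

    topics maps a topic name to its list of member vectors.
    Returns (evolution_type, related_topic_names).
    """
    inside_inner, inside_outer = [], []
    for name, points in topics.items():
        c, inner, outer = ring_boundaries(points, ratio)
        d = math.dist(c, node)
        if d <= inner:
            inside_inner.append(name)
        if d <= outer:
            inside_outer.append(name)
    if len(inside_outer) >= 2:  # assumption: overlapping rings mean fuse
        return "fuse", inside_outer
    if inside_inner:
        return "develop", inside_inner
    if inside_outer:
        return "evolve", inside_outer
    return "emerge", []


# Hypothetical toy topics: two unit squares of side 2, centroids (1, 1) and (4, 1).
topics = {
    "Cluster1": [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]],
    "Cluster2": [[3.0, 0.0], [5.0, 0.0], [3.0, 2.0], [5.0, 2.0]],
}

for node in ([1.2, 1.0], [1.0, 2.3], [2.5, 1.0], [10.0, 10.0]):
    print(node, classify(node, topics))
```

In this toy setup each topic has a boundary radius of about 1.41, so the inner and outer rings sit at roughly 1.13 and 1.70. The four sample nodes land in the develop, evolve, fuse, and emerge cases in that order. Split and extinct are not covered here, because the paper derives them later from the clustering result of a whole batch rather than from a single node.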