=Paper= {{Paper |id=Vol-3745/paper8 |storemode=property |title=Material Performance Evolution Discovery Based on Entity Extraction and Social Circle Theory |pdfUrl=https://ceur-ws.org/Vol-3745/paper8.pdf |volume=Vol-3745 |authors=Jinzhu Zhang,Wenwen Sun |dblpUrl=https://dblp.org/rec/conf/eeke/ZhangS24 }} ==Material Performance Evolution Discovery Based on Entity Extraction and Social Circle Theory== https://ceur-ws.org/Vol-3745/paper8.pdf
                                Material performance evolution discovery based on entity
                                extraction and social circle theory⋆
                                Jinzhu Zhang1, Wenwen Sun1

                                1 Department of Information Management, School of Economics and Management, Nanjing University of Science and Technology, Nanjing, China


                                                    Abstract

                                                    Topic evolution analysis describes the emergence, development, and extinction of topics in a field, which can help researchers understand the history and current state of that field. However, material patent text is domain-specific, and general entity extraction models cannot extract its specialized entities effectively. Moreover, the assumption that topics with high similarity have an evolution relationship contradicts the rule of “first the change, then the new topic”, and so cannot clearly present the dynamic changes and accumulation of topics. Therefore, we design a method to extract material performance entities accurately and to construct a dynamic evolution path for material performance topics. Firstly, we propose a material entity extraction model, BERT-BiLSTM-CRF, which integrates syntactic dependency analysis and an attention mechanism to extract material performance entities accurately. Secondly, we design an algorithm for identifying the evolution relationship between performance nodes based on ring boundaries, which mines the evolution relationships between performance nodes and existing topics and thereby realizes the dynamic accumulation and change of topics. Finally, we construct the dynamic evolution path of material performance and explore the complex associations of material performance. Experiments in the field of metal materials confirm that the proposed method can effectively construct the dynamic evolution path of material performance topics, making the evolution relationships between topics richer and more interpretable.

                                                    Keywords
                                                    Entity extraction, material performance evolution, patent entity relationship



1. Introduction

    With the emergence of a large number of patents on materials manufacturing and materials innovation, it has become critical to explore the complex associations and evolutionary trends of material performance. Such exploration can help researchers deepen their understanding of material performance and promote the invention of new materials [1].
    A review of existing studies shows that material performance mainly depends on microstructure, while the manufacture process directly affects the material's microstructure. Current research on the evolution of material performance therefore falls into three perspectives: "performance evolution - microstructure", "performance evolution - microstructure - manufacture process", and "performance evolution - manufacture process". The specific relationships are shown in Figure 1. Among them, Perspectives I and II [2, 3, 4] involve microstructure and require high levels of expertise and experimental equipment from researchers and readers; Perspective III [5] usually proceeds in the form of

                                Joint Workshop of the 5th Extraction and Evaluation of Knowledge
                                Entities from Scientific Documents and the 4th AI + Informetrics
                                (EEKE-AII2024), April 23~24, 2024, Changchun, China and Online
                                EMAIL: zhangjinzhu@njust.edu.cn (Jinzhu Zhang); Sun000216@163.com (Wenwen Sun)
                                              © Copyright 2024 for this paper by its authors. Use permitted
                                              under Creative Commons License Attribution 4.0 International (CC BY 4.0).




CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
controlling variables, such as varying the temperature of a certain process, and then exploring its influence on the microstructure of the material and the evolution of the material performance, which to some extent restricts a comprehensive understanding and description of material performance evolution.

[Figure 1 diagram. Nodes: material performance, material's microstructure, manufacture process. Edge labels: I determine, II change, III affect.]
Figure 1: Different Perspectives in the Analysis of the Evolution of Material Performance

    In terms of evolution analysis methods, researchers usually use the following approaches: (1) topic identification [6]; (2) topic evolutionary analysis [7, 8]. However, the former has difficulty revealing document semantics effectively, resulting in poor interpretability of the topics [9]. The latter only reflects the relative importance or attention of a topic at a specific time point [10]. Moreover, most existing studies are based on topic similarity, assuming that an evolution relationship exists between two topics whose similarity is above a threshold [11, 12]. But the topic itself is dynamically changing and accumulating: there should be “first the change in material performance, then the new material performance topic”, so establishing an evolution relationship based on the similarity between topics is biased.
    Therefore, this paper takes material performance as the research object. In particular, we propose a method for constructing the topic evolution path by introducing social circle theory, realizing the dynamic accumulation of material performance topics and the construction of the evolution path. Finally, we explore the complex associations in the evolution of material performance on the basis of the dynamic evolution path.

2. Data and method

    This paper takes material performance as the research object. Firstly, we integrate syntactic dependency analysis and an attention mechanism to construct the material entity extraction model (BERT-BiLSTM-CRF) and obtain the performance nodes of each material. We then divide the performance nodes of all materials into time batches by year. After that, we design an algorithm, based on the initial performance topics, to realize the dynamic accumulation of material performance topics and the construction of the evolution path.

2.1. Data sources and preprocessing

    For material evolution analysis, collecting data at arbitrary intervals may leave the final evolution results incomplete and inaccurate. We therefore take the concept of Germany’s “Industrie 4.0” as the background and select metal materials as an example, since they are one of the key foundational materials closely related to this concept. We use the Derwent Innovations Index database as the patent data retrieval platform. The patent search expression is “TS=(‘Metal materials’ OR ‘Metallic materials’ OR ‘Metal alloys’ OR ‘Metal compositions’ OR ‘Metal-based materials’) AND WC=(‘Materials Science’)”, with a time interval from 2011 to 2023, where 2011 is the year when the concept of "Industry 4.0" was first introduced. The top 10,000 relevant patent texts were then selected as the dataset. In addition, considering the number of patents in each period, we divide the dataset into yearly batches for material performance evolution.

2.2. Method for extracting the material entities

    As manufacture processes continue to improve, the processing methods of materials are constantly progressing. At the same time, changes in the manufacture process of a material bring about changes in the material performance [13].
    Therefore, we defined the performance entity and the manufacture process entity of metal materials. In addition, referring to the relationship shown in Fig. 1, we established the causal relationship between the two for subsequent analysis. (The manufacture process entity and the causal relationship will be used in our next step of exploring the reasons for performance evolution, so they are rarely involved in this paper.)
    Then, considering that material patents contain a large number of technical terms and material components, we constructed an entity extraction model (BERT-BiLSTM-CRF) by combining syntactic dependency analysis and an attention mechanism. The combined use of these methods provides a more accurate extraction of material performance entities and manufacture process entities from patent contents, providing a basis for subsequent material analysis and research.
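    To make the output of the extraction step concrete, the sketch below shows how per-token tags from a sequence labeler such as BERT-BiLSTM-CRF could be turned into performance and manufacture process entities. The paper does not state its tagging scheme, so the BIO scheme, the label names PERF and PROC, the function name decode_bio, and the example sentence are all assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from BIO-style tags.

    Assumed tag format: "B-<TYPE>" opens an entity, "I-<TYPE>" continues
    an open entity of the same type, anything else closes the open entity.
    """
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Store the entity collected so far, if any.
        if current and current_type is not None:
            entities.append((" ".join(current), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(token)
        else:
            flush()
            current, current_type = [], None
    flush()
    return entities


# Hypothetical tagged sentence; PERF and PROC are illustrative labels.
tokens = ["annealing", "improves", "corrosion", "resistance", "and", "strength"]
tags = ["B-PROC", "O", "B-PERF", "I-PERF", "O", "B-PERF"]
print(decode_bio(tokens, tags))
```

    Under these assumptions, the entities labeled PERF would serve as the performance nodes that are later grouped into yearly batches.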




2.3. Method for constructing dynamic evolution path for material performance topics

    In this part, we first define six evolution types and then design an algorithm for identifying the evolution relationship of performance nodes based on ring boundaries. Finally, we present the detailed process of the method for constructing the dynamic evolution path of material performance topics.

2.3.1. Social Circle Theory

    Social circle theory suggests that the social circle formed around a person reflects the closeness of his or her social relationships. That is, a person's intimate social circle usually consists of relationships with a high degree of relevance, followed by the normal friends circle and the strangers circle. In addition, some people are temporarily outside a person's normal friends circle yet share certain similarities with that person, and they may become friends or even intimate friends in the future; this paper defines them as potential friends. Therefore, centered on the individual, the affinity rank order is: intimate friends, normal friends, potential friends, strangers. Their respective positions are: within the intimate friends circle; outside the intimate friends circle and within the normal friends circle; outside the normal friends circle and within the strangers circle; and outside the strangers circle, as shown in Figure 2.

[Figure 2 diagram. Circles labeled: intimate friends circle, normal friends circle, Strangers circle. People labeled: Intimate Friends, Normal Friends, Mutual Friends, Potential Friends, Strangers.]
Figure 2: Social Circle Theory (See Appendix A for Detailed Picture)

    We refer to this theory and combine it with existing research to define six evolution types and to propose an algorithm for identifying the evolution relationship of performance nodes; the details are given in Sections 2.3.2 and 2.3.3 respectively.

2.3.2. Definition of evolution types

    This paper defines six evolution types based on existing studies. Among them, the four types develop, evolve, emerge, and fuse are derived from four different social relationships in social circle theory, and the two types extinct and split are taken from existing studies to ensure the diversity of evolution types. In addition, this paper refines the fuse and split types by distinguishing the different contributions of each topic involved, which helps to consider the dynamic interactions between topics in more detail. See Appendix B for details.

2.3.3. Identifying the evolution relationship of performance nodes based on ring boundaries

    We refer to social circle theory and improve the model proposed by Zhang et al. [14], proposing an algorithm for identifying the evolution relationship of performance nodes based on ring boundaries. The algorithm and its correspondence with the social circles are shown in Figure 3. Firstly, for the existing performance topics, the centroid of each topic is calculated, and the maximum Euclidean distance between a topic’s patents and its centroid is taken as the topic boundary. The topic boundary is extended outward by a ratio less than 1 to obtain the outer ring boundary and shrunk inward by the same ratio to obtain the inner ring boundary. After several comparison tests, we set the ratio in this study to 0.2.

[Figure 3 diagram. Left: intimate friends circle, normal friends circle, Strangers circle. Right: inner boundary, topic boundary, outer boundary around centroid1; a second topic boundary around centroid2; nodes A, B, C, D; type labels develop, evolve, emerge, fuse.]
Figure 3: Algorithm for Identifying the Evolution Relationship of Performance Nodes based on Ring Boundaries (See Appendix C for Detailed Picture)

    Subsequently, for material performance nodes in subsequent batches, the Euclidean distance between each performance node and the centroids of existing performance topics is calculated to identify the evolution relationship between them. The specific rules are as follows:
    a)     develop: inside the inner ring boundary
    b)     evolve: outside the inner ring boundary and inside the outer ring boundary
    c)     emerge: outside the outer ring boundary
    d)     fuse: boundary intersection
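    The rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes that "extended outward by a ratio" means multiplying the topic boundary by (1 + ratio) and "shrunk inward" means multiplying it by (1 - ratio). It treats the boundaries as inclusive. It also reads "boundary intersection" as a node that lies inside the outer ring boundary of more than one topic. The names Ring, ring_boundaries, and classify, and the toy 2-D coordinates, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


@dataclass
class Ring:
    centroid: List[float]
    inner: float  # inner ring boundary (radius)
    outer: float  # outer ring boundary (radius)


def ring_boundaries(points: List[Point], ratio: float = 0.2) -> Ring:
    """Centroid plus inner/outer ring radii for one existing topic."""
    dims = len(points[0])
    centroid = [sum(p[i] for p in points) / len(points) for i in range(dims)]
    # Topic boundary: maximum Euclidean distance from a member to the centroid.
    boundary = max(math.dist(p, centroid) for p in points)
    # ASSUMPTION: multiplicative extension and shrinkage by the same ratio.
    return Ring(centroid, boundary * (1 - ratio), boundary * (1 + ratio))


def classify(node: Point, rings: Dict[str, Ring]) -> Tuple[str, List[str]]:
    """Return (evolution_type, related_topic_names) for a new node."""
    dist = {name: math.dist(node, ring.centroid) for name, ring in rings.items()}
    reachable = sorted(n for n in rings if dist[n] <= rings[n].outer)
    if not reachable:
        return "emerge", []  # outside every outer ring boundary
    if len(reachable) > 1:
        return "fuse", reachable  # ASSUMPTION: inside several outer rings
    topic = reachable[0]
    if dist[topic] <= rings[topic].inner:
        return "develop", reachable  # inside the inner ring boundary
    return "evolve", reachable  # between inner and outer ring boundaries


# Toy 2-D topics; real performance nodes would be embedding vectors.
topics = {
    "T1": [(0, 0), (10, 0), (-10, 0), (0, 10), (0, -10)],
    "T2": [(20, 0), (30, 0), (10, 0), (20, 10), (20, -10)],
}
rings = {name: ring_boundaries(pts) for name, pts in topics.items()}
for node in [(1, 1), (-9, 0), (10, 0), (0, 50)]:
    print(node, classify(node, rings))
```

    Precomputing the rings once per batch keeps classification to one distance computation per node and topic. Other readings of the fuse rule, such as requiring the ring boundaries themselves to overlap, would change only the second branch of classify.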




    Then, hierarchical clustering is introduced to obtain the different types of performance topics in the batch, and similar topics whose similarity exceeds a threshold (set to 0.8 in this paper) are merged. For a topic in the previous batch, if the number of topics obtained from it in this batch is more than two, the evolution type is considered split. Furthermore, in the construction of the evolution path, if a topic has no evolution relationship with the following topics, we consider it the extinct type.

2.3.4. Construction of the dynamic evolution path for metal material performance topics

    The construction of the dynamic evolution path of metal material performance mainly includes the following steps, which are shown in Figure 4. Firstly, after extracting the performance entities of each material, we obtain the performance node of each material, and all material performance nodes are then divided into time batches by year. Secondly, the K-Means algorithm is used to cluster the first batch of data to obtain the initial performance topics.
    Subsequently, for performance nodes in subsequent batches, the algorithm for identifying the evolution relationship of performance nodes (see Section 2.3.3 for the specific algorithmic process) is used to identify their evolution relationships with each performance topic. Then, hierarchical clustering is introduced to obtain the performance topics of different evolution types in this batch, and similar topics that exceed a threshold are merged. Finally, incremental iterations are carried out in the above manner to obtain the material performance topics for the different year batches, thereby achieving the dynamic construction of the material performance evolution path (see Appendix D for the entire picture, where Cluster stands for topic).

[Figure 4 diagram. Labels: Metal Material Patent Abstract Text; Data flow; One time slice; Cluster1, Cluster2, Cluster3; evolution types emerge, split, evolve, fuse, develop, extinct.]
Figure 4: Process of Constructing the Evolution Path of Metal Material Performance

3. Result

    In the construction of the evolution path of metal material performance, we use the years 2021-2023 as an example. Specifically, the numbers of performance clusters in 2021, 2022, and 2023 are 66, 55 and 60. The resulting evolution path is shown in Figure 5, where yellow, red, and green represent the years 2021, 2022, and 2023 respectively (see Appendix E for the entire picture, where Cluster stands for topic).

Figure 5: Evolution Path of Metal Material Performance from 2021 to 2023

    From the evolution path above, it can be observed that in 2022, Cluster1 developed from Cluster1 in 2021. The contents of the two Clusters are as follows: [‘high excellent low corrosion’, ‘resistance good powder strength temperature’], [‘high excellent alloy low mechanical’, ‘strength corrosion resistance process good’]. Both Clusters focus on improving the corrosion resistance and mechanical performance of metal materials, which aligns with the practical application requirements of metal materials [15].
    Cluster1 in 2022 further developed and split into Cluster1 and Cluster2 in 2023. The contents of the three Clusters are as follows: [‘high excellent low corrosion’, ‘resistance good powder strength temperature’], [‘good surface heat’, ‘resistance layer wear low’], and [‘resistance good base layer wear’, ‘layer wear surface low heat’]. As can be seen, the 2023 Cluster1 maintained the original corrosion resistance and wear resistance performance of Cluster1 in 2022, and further improved the surface heat performance of metals. This may be achieved through improvements in material technology and alloy additions, thus enhancing the performance in practical applications.

4. Conclusion

    This paper proposes an algorithm for identifying the evolution relationship of performance nodes based on ring boundaries, which not only realizes the dynamic accumulation and construction of the metal material performance evolution path, but also enriches and improves the topic evolutionary analysis method. Currently, we are combining the manufacture process entities of each material to further analyze the causes of the evolution of




material performance in depth, and to better understand the evolution trends and the changing patterns of material performance.

Acknowledgements

    This work is supported by the National Natural Science Foundation of China (No. 72374103, 71974095) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX23_0632).

References
[1]   Kaushal Jha, Suman Neogy, Santosh Kumar, R N Singh, & G K Dey. Correlation between microstructure and mechanical properties in the age-hardenable Cu-Cr-Zr alloy. Journal of Nuclear Materials: Materials Aspects of Fission and Fusion, 2021, 546, 152775. doi: 10.1016/j.jnucmat.2020.152775.
[2]   Enyu Guo, Guohua Fan, & Tongmin Wang. Microstructure Evolution Mechanism of Metallic Materials: Progress on In Situ Studies Using Synchrotron Radiation Source. Failure Analysis and Prevention, 2021, 16(01): 1-14+91. doi: 10.3969/j.issn.1673-6214.2021.01.001.
[3]   ZhiGuo Liu, Ri You, & Jintao Wu. Study on relationship between dam concrete material performance evolution and its pore structure. Water Resources and Hydropower Engineering, 2016, 47(11): 25-28+35.
[4]   Yinzhi Li, Gang Liu, Xinzhe Chen, Zhenhua Su, Lianwu Yan, & Yingbiao Peng. Effects of Sintering Temperature and Holding Time on the Microstructure and Properties of Ti(C,N)-Based Cermets. Packaging Journal, 2023, 15(03): 37-45.
[5]   Yang Liu, Baoliang Liu, Shi Kai, Xiaoyuan Han, Yi Xia, & Jianzhao Shang. Effects of different zirconia-based raw materials on properties of MgO-Al-C materials. Refractories, 2023, 57(01): 59-64. doi: 10.3969/j.issn.1001-1935.2023.01.013.
[6]   Lee J W, & Han D H. Data Analysis of Psychological Approaches to Soccer Research: Using LDA Topic Modeling. Behavioral Sciences, 2023, 13(10), 43-47. doi: 10.3390/bs13100787.
[7]   Parlina A, Ramli K, & Murfi H. Theme Mapping and Bibliometrics Analysis of One Decade of Big Data Research in the Scopus Database. Information, 2020, 11(2): 69. doi: 10.3390/info11020069.
[8]   Zibiao Li, & Li Zhang. Evolution of Patented Technology Topics of Steel Materials Based on LDA Model. Science and Technology Management Research, 2020, 40(24): 175-183. doi: 10.3969/j.issn.1000-7695.2020.24.023.
[9]   Behrouzi S, Sarmoor Z S, Hajsadeghi K, & Kavousi K. Predicting scientific research trends based on link prediction in keyword networks. Journal of Informetrics, 2020, 14(4): 101079. doi: 10.1016/j.joi.2020.101079.
[10]  Wu Z, Xie P, Zhang J, Zhan B, & He Q. Tracing the Trends of General Construction and Demolition Waste Research Using LDA Modeling Combined With Topic Intensity. Frontiers in Public Health, 2022, 10(3), 22-26. doi: 10.3389/fpubh.2022.899705.
[11]  Jinxia Liu, Qianqian Hou, Jing Du, Fuhou Chai, & Li Zhang. Hot Topics Evolution Research of Emerging Field Under the Sub Topics and Vocabulary Related Perspective. Journal of Intelligence, 2023, 42(3): 123-129. doi: 10.3969/j.issn.1002-1965.2023.03.017.
[12]  Reza T H S, Sadegh A, & Soroush T. An embedding approach for analyzing the evolution of research topics with a case study on computer science subdomains. Scientometrics, 2023, 128(3): 11-16. doi: 10.1007/s11192-023-04642-4.
[13]  Qing-Miao Hu, & Rui Yang. The endless search for better alloys. Science, 2022, 378: 26-27. doi: 10.1126/science.ade5503.
[14]  Zhang Y, Zhang G, Zhu D, & Liu J. Scientific evolutionary pathways: Identifying and visualizing relationships for scientific topics. Journal of the Association for Information Science and Technology, 2017, 68(8): 1925-1939. doi: 10.1002/asi.23814.
[15]  Y.B. Lei, Z.B. Wang, B. Zhang, Z.P. Luo, J. Lu, & K. Lu. Enhanced mechanical properties and corrosion resistance of 316L stainless steel by pre-forming a gradient nanostructured surface layer and annealing. Acta Materialia, 2021, 208, 116773. doi: 10.1016/j.actamat.2021.116773.




 A. Appendices
 [Figure: nested circles around an individual. Relationship labels: Intimate Friends, Normal Friends, Mutual Friends, Potential Friends, Strangers. Circle labels: intimate friends circle, normal friends circle, strangers circle.]

 Figure 2: Social Circle Theory

 B. Appendices
 Table 1: Six patterns of topic evolution types (where Cluster stands for topic)

 Social relationship   Evolution type   Expression and illustration                                             Explanation
 intimate friends      develop          Cluster1 -develop-> Cluster2                                            Cluster2 only develops from Cluster1.
 normal friends        evolve           Cluster1 -evolve-> Cluster2                                             Cluster2 only evolves from Cluster1.
 strangers             emerge           Cluster1 -evolve-> Cluster2 -develop-> Cluster3; Cluster4 (isolated)    Cluster4 is treated as an emerging Cluster, which has no evolutionary relationship with previous Clusters.
 mutual friends        fuse             Cluster1, Cluster2 -> Cluster3                                          Cluster3 comes from Cluster1 and Cluster2 together.
 /                     split            Cluster1 -> Cluster2, Cluster3                                          Cluster1 is divided into two or more Clusters.
 /                     extinct          Cluster1 -evolve-> Cluster2 -develop-> Cluster4; Cluster3 (isolated)    Cluster3 is extinct, which has no evolutionary relationship with the following Clusters.
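
The four structural patterns in Table 1 (emerge, fuse, split, extinct) can be read off the links between the clusters of two consecutive time slices. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `structural_patterns`, the cluster identifiers, and the representation of links as (source, target) pairs are all hypothetical, and the links are assumed to have been established already.

```python
from collections import defaultdict


def structural_patterns(prev_clusters, next_clusters, links):
    """Label clusters of two consecutive time slices with Table 1 patterns.

    links is an iterable of (source, target) pairs, where source is a
    cluster of the earlier slice and target a cluster of the later slice.
    All names here are illustrative assumptions.
    """
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for source, target in links:
        successors[source].add(target)
        predecessors[target].add(source)

    prev_labels = {}
    for cluster in prev_clusters:
        count = len(successors.get(cluster, ()))
        if count == 0:
            prev_labels[cluster] = "extinct"  # no following cluster
        elif count >= 2:
            prev_labels[cluster] = "split"    # divided into several clusters

    next_labels = {}
    for cluster in next_clusters:
        count = len(predecessors.get(cluster, ()))
        if count == 0:
            next_labels[cluster] = "emerge"   # no previous cluster
        elif count >= 2:
            next_labels[cluster] = "fuse"     # comes from several clusters

    return prev_labels, next_labels


prev_labels, next_labels = structural_patterns(
    ["P1", "P2", "P3"],
    ["N1", "N2", "N3"],
    [("P1", "N1"), ("P2", "N1"), ("P2", "N2")],
)
```

Clusters with exactly one link in the relevant direction receive no structural label here; whether such a one-to-one link counts as develop or evolve depends on the strength of the relationship (intimate or normal friends), which this sketch does not model.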
C. Appendices
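 [Figure: left panel repeats the social circle diagram (Intimate Friends, Normal Friends, Mutual Friends, Potential Friends, Strangers); middle panel maps the intimate friends circle to the inner boundary, the normal friends circle to the topic boundary, and the strangers circle to the outer boundary; right panel shows nodes A, B, C, D placed relative to the topic boundaries of centroid1 and centroid2, labeled with the relations evolve, emerge, develop, and fuse.]

 Figure 3: Algorithm for Identifying the Evolution Relationship of Performance Nodes based on Ring Boundaries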
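
The ring-boundary idea in Figure 3 can be illustrated with a small sketch. It rests on several assumptions that the figure alone does not confirm: nodes and centroids are treated as points, closeness is measured by Euclidean distance (the paper may use a similarity measure instead), a node inside the inner boundary of one centroid is labeled develop, a node between the inner and topic boundaries is labeled evolve, a node inside the topic boundary of two or more centroids is labeled fuse, and a node outside every topic boundary is labeled emerge. The function name and the radius parameters are hypothetical.

```python
import math


def classify_node(node, centroids, inner_radius, topic_radius):
    """Assign an evolution type to a node from its distance to centroids.

    centroids maps a centroid name to its coordinates. The radii are
    assumed thresholds for the inner boundary and the topic boundary.
    """
    within_topic = []
    for name, centroid in centroids.items():
        distance = math.dist(node, centroid)
        if distance <= topic_radius:
            within_topic.append((name, distance))

    if not within_topic:
        return "emerge"   # strangers: outside every topic boundary
    if len(within_topic) >= 2:
        return "fuse"     # mutual friends: shared by several topics
    _, distance = within_topic[0]
    if distance <= inner_radius:
        return "develop"  # intimate friends: inside the inner boundary
    return "evolve"       # normal friends: between inner and topic boundary


centroids = {"centroid1": (0.0, 0.0), "centroid2": (4.0, 0.0)}
for point in [(0.5, 0.0), (0.0, 2.0), (2.0, 0.0), (10.0, 10.0)]:
    print(point, classify_node(point, centroids, inner_radius=1.0, topic_radius=2.5))
```

Using two fixed radii for all centroids keeps the sketch short; per-topic boundaries would be a natural refinement if topics differ in spread.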


D. Appendices
 [Figure: data flow from Metal Material Patent Abstract Text to the clusters of one time slice (Cluster1, Cluster2, Cluster3), followed by links between clusters of consecutive slices labeled emerge, split, evolve, fuse, develop, and extinct.]

 Figure 4: Process of Constructing The Evolution Path of Metal Material Performance




E. Appendices
Figure 5: Evolution Path of Metal Material Performance from 2021 to 2023



