<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Assessment of Chinese Academic Articles in Information Resources Management: A Comparison of Knowledge Entity and Reference-Based Methods</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yanqi Ren</string-name>
          <email>tusthisrenyanqi@njust.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chen Yang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yi Zhao</string-name>
          <email>yizhao93@njust.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heng Zhang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chengzhi Zhang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Nanjing University of Science and Technology</institution>
          ,
          <addr-line>Nanjing 210094</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <fpage>42</fpage>
      <lpage>59</lpage>
      <abstract>
        <p>Novelty is a key factor in assessing research outcomes in a specific field. Measuring the novelty of articles in the field of Information Resources Management (IRM) helps researchers understand the current status of innovation and identify future development potential. Analyzing novelty across various themes within IRM offers insights for promoting innovation and ensuring balanced growth. Fine-grained knowledge entities encapsulate a paper's core knowledge, while references represent the flow of knowledge. Measuring article novelty from these two perspectives and comparing the results reveals thematic similarities and differences, providing a more comprehensive understanding. This study analyzes IRM-related research articles published in CSSCI-indexed journals from 2000 to 2022. After article novelty is calculated using fine-grained research method entities and references, the BERTopic model is used to identify key themes in the field. The results indicate that novelty scores based on fine-grained knowledge entities are generally lower than those based on references, with both perspectives showing skewed distributions. Themes such as University Libraries and Bibliometrics and Evaluation exhibit higher novelty scores from both perspectives.</p>
      </abstract>
      <kwd-group>
        <kwd>Novelty Assessment</kwd>
        <kwd>Information Resources Management</kwd>
        <kwd>Fine-Grained Knowledge Entities</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The term “Information Resources Management” (IRM) was first coined in the United States during
the late 1970s and early 1980s [1], and it gradually spread worldwide. In China, IRM has evolved
from a research field into an independent discipline, exerting a profound influence on the
theoretical construction, discipline development, professional education, and career advancement
of library and information science [2]. Notably, after the primary discipline “Library Information
and Archives Management” was renamed “Information Resources Management” in September 2022, the
future trajectory of the discipline has garnered significant attention. The advent
of the data and intelligence era has endowed IRM research with new
connotations. The big data era has met the field's needs for both data acquisition and
computation, and new disciplines related to IRM are gradually
emerging, providing fresh impetus for the development of the discipline. This undoubtedly
presents new opportunities for the field of IRM. Exploring and analyzing the topics within this
field, along with their novelty characteristics, can help scholars understand
cutting-edge trends and gain a macro-level view of the state of research in the Chinese Information
Resources Management field, thus advancing the discipline [3].</p>
      <p>Methods for measuring the novelty of academic articles are typically divided into two
categories: those based on references and those based on content. References in a paper represent
its knowledge sources and can be regarded as the input of knowledge into the paper. In
reference-based methods, the novelty of a paper is measured by quantifying the number of new combinations
of its knowledge sources. Previous studies have indicated that, for “knowledge input-based”
articles, this method overlooks the complexity of citation motivations [4], focusing instead on the
paper’s exploration and integration across interdisciplinary fields, which highlights the diversity
and innovative combinations of knowledge sources [5]. Content-based methods focus more on
measuring “knowledge output-based” articles. Traditional methods of measuring the novelty of an
article often rely on the frequency of keywords, entities, and other elements within the text.
However, counting only how often vocabulary items are combined, without considering
semantic differences between combinations, may overlook important novelty features [6].
Knowledge entities are fundamental units of a discipline and reflect both its developmental
trajectory and its current state. Assessing academic paper novelty through fine-grained knowledge entities
can not only address the shortcomings of traditional novelty measurement methods but also
capture novelty differences between entities at a finer granularity. As a result, many scholars have
used knowledge entities as entry points for assessing paper novelty. For instance, Liu et al.
quantified the scientific novelty of doctoral theses based on biological entities to explore the
heterogeneity and gender differences in scientific novelty [7].</p>
      <p>Current research mostly measures novelty from a single perspective within single or multiple
disciplines; few studies measure and compare novelty under two perspectives across
different research topics within a discipline. Measuring the novelty of IRM articles from the dual
perspectives of fine-grained knowledge entities and references makes it possible to study how novelty
characteristics differ among topics. This approach helps researchers better
understand the knowledge structure and development trends of the field, thus promoting
interdisciplinary integration and innovation. Therefore, this study uses fine-grained
knowledge entities and references in articles to explore, from two perspectives, the novelty score
characteristics and differences of topics in the Chinese IRM field from 2000 to 2022. It identifies
traces of innovation in academic articles as they explore unknown areas, thereby providing
a reference for researchers in choosing paper topics and designing research proposals that advance
the depth and breadth of academic research. By analyzing the novelty characteristics of academic
topics from two perspectives, this study is intended to contribute to the theory
and practice of academic innovation and to offer the academic community strategies and insights
for maintaining and enhancing novelty within the research landscape.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>This article aims to measure the novelty of articles related to the IRM field from the perspectives of
fine-grained knowledge entities and references, and to analyze the characteristics, similarities, and
differences in novelty scores across different topics within the field. To this end, this section
reviews previous related work.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 The methods for measuring novelty in academic articles</title>
      <p>The measurement of novelty in academic articles primarily relies on two perspectives: references
and the content of the articles themselves. References provide an external perspective for gauging
novelty, with the measurement based on the flow of knowledge from cited articles to citing articles.
By contrast, measuring novelty from an internal perspective, focusing on the content of the paper,
often involves keywords, entities, and sentences, which directly capture the innovation of
knowledge within the paper.</p>
    </sec>
    <sec id="sec-4">
      <title>2.1.1 Novelty measurement based on references</title>
      <p>Uzzi et al. evaluated the novelty of articles by the rarity of pairwise journal combinations in the
references. The study found that scientific innovation does not simply rely on novelty or
conventionality; articles that combine highly novel with highly conventional elements are more
likely to become highly cited [8]. Lee et al. built upon Uzzi et al.'s method of measuring novelty by
addressing the issue of journal commonality and using the tenth percentile instead of the minimum
value to reduce noise. Their research decomposed scientific creativity into two aspects, novelty and
impact, and explored how team size, domain, and task diversity influence these aspects. The results
indicated an inverted U-shaped relationship between team size and novelty, with team size
enhancing novelty by increasing knowledge diversity [9]. Foster used a community detection
algorithm to cluster the journals in the references. Journal pairs within the same community
were considered conventional, and those in different communities were considered novel for
innovation assessment. He found that novel strategies are present in the field of chemistry
and that their importance increases over time [10]. Subsequently, Wang et al.
defined the novelty of a paper based on whether the journals cited in it are being combined for the
first time. They found that articles with high novelty, as defined, contribute significantly to science
and are more likely to become the top 1% most cited articles in the long term, stimulating future
highly useful research. However, such articles exhibit higher variance in citations and are less
likely to become top-cited articles in the short term, indicating high research risk [11]. Veugelers
applied Wang et al.'s method and concluded that scientific articles ranked in the top 1% by novelty in a field
are more likely to have direct technological impact, be cited, and generate innovative patents
than their non-novel counterparts [12].</p>
    </sec>
    <sec id="sec-5">
      <title>2.1.2 Novelty measurement based on paper content</title>
      <p>To measure the novelty of academic articles, some scholars have focused on the content itself,
including keywords, entities, and sentences. Zins described an international expert panel of 57 leading
experts from 16 countries in relation to the “informatics knowledge
graph”, which explores systemic and comprehensive innovations in this field through group
discussions [13]. In subsequent research, Meng categorized knowledge transfer and innovation models
into three types, namely research-oriented innovation, front-led innovation, and
disruptive innovation, based on the similarity of keywords in academic research topics [14].
Mishra et al. also assessed novelty through keywords. They measured the thematic novelty of a single
document based on Medical Subject Headings indexing, proposing a series of
methods that include improved word frequency statistics. They found that the average conceptual
novelty of most authors declines with age, yet an author's most innovative work may be published at
any stage of a career [15]. Regarding entity-based novelty measurement, Liu et al. quantified scientific
novelty in doctoral dissertations using biological entities and found that
novelty declines over time, with gender differences also noted [7]. Liu et al. also
gauged novelty using combinations of biological entities from COVID-19-related articles,
highlighting the importance of international collaboration during pandemics [16]. Chen et al.
measured the novelty of articles using fine-grained knowledge entity combinations and explored
the relationship between the composition of author teams and the novelty of academic articles [17].
At the sentence level, Zhang et al. quantified the novelty of a sentence by
comparing the cosine similarity between the current sentence and historical sentences in the bag-of-words
space, with the novelty score calculated as one minus the maximum similarity. The results were
compared with those of English sentence novelty assessments, revealing that
novelty detection at the Chinese sentence level can perform comparably to that for English [18].</p>
      <p>In summary, the novelty assessment based on references primarily includes evaluating the
rarity of reference pair combinations and considering the commonality issues of journals.
Additionally, the measurement of novelty based on paper content involves constructing evaluation
metrics using factors such as the frequency and temporal aspects of keywords, entities, and
sentences. Methods such as measuring novelty through entities and sentences are also utilized.</p>
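      <p>The sentence-level score attributed to Zhang et al. above (one minus the maximum cosine similarity to earlier sentences in a bag-of-words space) can be illustrated with a minimal sketch. The function names, the whitespace tokenizer, and the treatment of an empty history are assumptions made for illustration only; they are not taken from the cited work, and Chinese text would first need word segmentation.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Term-frequency vector from whitespace tokens (assumed tokenizer)."""
    return Counter(sentence.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sentence_novelty(current, history):
    """One minus the maximum similarity to any earlier sentence."""
    if not history:
        # Assumption: with nothing to compare against, the sentence is fully novel.
        return 1.0
    cur = bag_of_words(current)
    return 1.0 - max(cosine_similarity(cur, bag_of_words(h)) for h in history)
```
]]></preformat>
      <p>Under this sketch, a sentence that repeats an earlier one scores 0, and a sentence sharing no tokens with the history scores 1.</p>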
    </sec>
    <sec id="sec-6">
      <title>2.2 Research on topics related to novelty</title>
      <p>In academic research, topic identification and novelty measurement are two important and closely
related fields. Topic identification aims to determine hidden themes within a text, thereby helping
researchers better understand the content. Novelty measurement, on the other hand, involves
assessing the originality and innovativeness of an article or study. By comprehensively applying
topic identification techniques and novelty measurement methods, the core themes of a paper can
be thoroughly explored, and its innovative contribution within academia can be evaluated.</p>
      <p>He et al. explored the predictive effect of paper innovation on literature growth in a certain
field. They sorted word embeddings by time series to form time-ordered embeddings, calculated
topic word similarity in vector space, and obtained a topic innovation index [19]. The frequency of
topic changes over time was tracked by Mörchen et al., with frequency scores used to represent
topic novelty. The results indicate that emerging trends can be predicted, and a trend-ranking
function was provided to support interactive searches for the latest popular trends related to
diseases [20]. Some studies utilize temporal relationships between topics to assess their novelty, but
methods for constructing these relationships vary. Topic modeling was applied by He et al. to
citation networks to determine pairwise relationships [21], whereas Yan employed similarity
measures to establish these relationships [22]. These methods automatically detect research topics
and assess their novelty based on textual information. Small et al. identified, classified, and
analyzed the top 25 emerging topics in each year from 2007 to 2010 to understand the drivers of their
novelty, including scientific discoveries, technological innovations, and external events. The novelty
and value of these topics were evaluated by searching for the topics themselves or for significant awards
recently received by their key researchers. The findings indicate that this method provides a list of
potentially important and novel topics for review by decision-makers [23]. Additionally, Choi et
al. used STM topic models to identify topics within patent data to mine potential novel topics [24].
Other scholars focused on the relationship between paper topics and time to explore the novelty of
the articles. For instance, Tu et al. proposed a predictive index based on time, the number of topics,
and frequency to identify the novelty of emerging topics in specific fields [25].</p>
      <p>Current research in the IRM field mostly utilizes research topics for frontier analysis or for
measuring novelty based on the topics themselves. However, studies that measure novelty from
both the perspectives of knowledge entities and references, and combine them with paper topics to
analyze characteristics and similarities and differences, are limited. This study starts from the
content of articles, using fine-grained research method entities and references to compute the
novelty of IRM field articles, and explores the distribution of novelty scores and their similarities
and differences across different topics, with the aim to provide a reference for scholars in related
fields to understand the IRM domain and select research topics.</p>
    </sec>
    <sec id="sec-7">
      <title>3. Data and Methodology</title>
      <p>This article aims to evaluate the novelty of IRM articles from 2000 to 2022. It uses fine-grained
knowledge entities as the internal perspective and references as the external perspective. The
article analyzes the novelty scores of different topics within the IRM field. It also examines the
novelty differences across various topics under these perspectives. First, all academic articles
related to the IRM field are collected. Then, the collected corpus is subjected to novelty calculation
in two dimensions, and the BERTopic model is used to identify paper topics. Finally, the
distribution of novelty across different topics in the IRM field and the similarities and differences
under the two perspectives are analyzed. The specific research framework is shown in Figure 1.</p>
      <sec id="sec-7-1">
        <title>3.1 Corpus collection and organization</title>
        <p>This study focuses on Chinese IRM articles, which are defined as those indexed by the Chinese
Social Sciences Citation Index (CSSCI, http://cssci.nju.edu.cn/) and pertain to the field of
information resource management. Currently, CSSCI is widely recognized in the Chinese academic
and publishing communities and has become one of the most influential journal evaluation
standards in social sciences in China [26]. Therefore, the CSSCI journal list (2021-2022) was used as
the source, and the CNKI database (https://www.cnki.net/) was selected as the data source to obtain
a total of 100,142 IRM articles. Due to varying initial publication years of journals, the time range
was set from 2000 to 2022 for consistent analysis. Considering the needs of subsequent research,
articles lacking publication dates, abstracts, full text (approximately 11.3% of the total data),
references (approximately 20.6% of the total data), and those classified as cover articles were
excluded, resulting in a valid dataset of 59,084 articles. The distribution of journals in the dataset is
shown in Table 1. Notably, the journal New Technology of Library and Information Service was
renamed Data Analysis and Knowledge Discovery in 2017. Thus, the paper counts from these
journals have been combined under the latter title. The unequal distribution of source journals
could affect topic identification. However, the development of the IRM field has led to significant
thematic intersection and integration among its sub-disciplines, which
mitigates the potential bias that unequal article counts could introduce into the topic identification results.
The temporal distribution of journal articles was also analyzed, as shown in Figure 2.</p>
      </sec>
      <sec id="sec-7-2">
        <title>3.2 Novelty calculation of IRM articles</title>
        <p>In this study, five types of research method entities were defined and extracted, and these fine-grained
entities, along with the references of the articles, were used to measure the novelty of IRM articles
from 2000 to 2022. The novelty results from the two perspectives were then compared. This section
introduces the methods for measuring novelty from both perspectives.</p>
        <p>3.2.1 Novelty calculation based on fine-grained research method entities</p>
        <p>(1) Annotation of fine-grained research method entities in the IRM field corpus</p>
        <p>For subsequent work on novelty measurement based on fine-grained research method entities,
machine learning was employed to automatically identify research method entities in the full text.
Based on Chu et al.'s classification standards [27] and related studies [28], theory, method, data,
tool, and metric entities used to solve problems were extracted from research articles for novelty
evaluation. The information resources management
field covers a broad range of disciplines, its research methods integrate multiple disciplines, and theory
acts as a cornerstone for guiding practice and driving innovation. A theory entity type was therefore
added to the original four fine-grained knowledge entity types in this study. The specific definitions
are shown in Table 2.</p>
        <p>To minimize subjective bias in manual annotation, the process was divided into three parts:
pre-annotation, consistency calculation, and formal annotation. Specific annotation guidelines are
provided in the appendix. The pre-annotation phase was used to develop annotation rules. During
the consistency calculation phase, two sets of annotation consistency results were obtained
and measured using Kappa coefficients [29], which were 0.69 and 0.73, meeting consistency
requirements. Finally, all data underwent formal annotation. Ultimately, 2,716 sentences containing
method entities were obtained from the 249 sampled articles. The numbers of entities and
sentences corresponding to the five types of method entities are shown in Table 3.</p>
        <p>(2) Fine-grained research method entity extraction in the IRM domain</p>
        <p>Chinese-BERT-WWM-Ext [30] is a Chinese pre-trained model based on the BERT (Bidirectional
Encoder Representations from Transformers) architecture. It has been trained using Chinese
vocabulary and language characteristics, enhancing its ability to understand Chinese semantics.
“WWM” stands for “Whole Word Masking,” meaning that entire words are masked during training
rather than individual characters as in the original BERT model, which helps improve the
model’s comprehension of complete words. “Ext” stands for “Extended,” indicating that the
training data size and the number of training steps of this Chinese BERT model were adjusted to
improve its performance. To ensure a balanced number of entities across all categories, all annotated
sentences were randomly shuffled and then divided into training, validation, and test sets in an 8:1:1 ratio.
The Chinese-BERT-WWM-Ext model was fine-tuned on the training set, with optimal model parameters
determined using the validation set. Its performance on the test set was measured using precision,
recall, and F1 score, which reached 0.79, 0.75, and 0.77,
respectively. Finally, the trained model was used to extract method entities from unannotated
articles, with the extraction results for each entity type presented in Table 4.</p>
        <p>(3) Novelty computation based on fine-grained entities</p>
        <p>This study employs the method developed by Liu et al. [16], which measures the novelty of articles
based on entity combinations and distance calculations, to calculate the novelty of IRM articles
based on fine-grained knowledge entities. Liu et al. extracted entities from the titles and abstracts
of COVID-19 articles. These entities were then paired, and the distance between the entities in each pair was
computed. A distribution of distances between entity pairs was obtained, and those pairs whose
distances fell within the top 10% of this distribution were considered novel entity combinations.
Equation (1) shows the specific calculation of the distance between entities. The novelty score of
each article was measured by the ratio of novel entity pairs to the total number of possible entity
pairs in the paper, as shown in Equation (2). After all five types of fine-grained
knowledge entities were obtained for each article, the novelty scores for all IRM domain articles were calculated
using the aforementioned method and equations.</p>
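        <p>The entity-pair procedure described above can be illustrated with a minimal sketch. It assumes that each entity is already represented by an embedding vector, that the distance is one minus cosine similarity, and that ties at the top-10% threshold count as novel; the function names and these choices are illustrative assumptions rather than the implementation used in this study.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """One minus cosine similarity (assumed form of the entity distance)."""
    dot = sum(x * y for x, y in zip(u, v))
    norms = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norms


def distance_threshold(all_distances, top_share=0.10):
    """Smallest distance that still falls in the top `top_share` of the corpus."""
    ranked = sorted(all_distances, reverse=True)
    k = max(1, round(len(ranked) * top_share))
    return ranked[k - 1]


def article_novelty(entity_vectors, threshold):
    """Share of an article's entity pairs whose distance reaches the threshold."""
    pairs = list(combinations(entity_vectors, 2))
    if not pairs:
        # Assumption: an article with fewer than two entities gets a score of 0.
        return 0.0
    novel = sum(1 for u, v in pairs if cosine_distance(u, v) >= threshold)
    return novel / len(pairs)
```
]]></preformat>
        <p>In use, <monospace>distance_threshold</monospace> would be computed once over the distances of all entity pairs in the corpus, and <monospace>article_novelty</monospace> would then be applied to each article's entity vectors.</p>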
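        <p>The distance between the two entities in a pair is calculated as follows:
          <disp-formula><label>(1)</label><tex-math>\mathrm{Dist}(e_i, e_j) = 1 - \frac{e_i \cdot e_j}{\lVert e_i \rVert \, \lVert e_j \rVert}</tex-math></disp-formula>
where <inline-formula><tex-math>e_i</tex-math></inline-formula> and <inline-formula><tex-math>e_j</tex-math></inline-formula> represent the two entities in an entity pair, specifically, all five types of
entities extracted from the aforementioned articles, <inline-formula><tex-math>e_i \cdot e_j</tex-math></inline-formula> is the dot product of
<inline-formula><tex-math>e_i</tex-math></inline-formula> and <inline-formula><tex-math>e_j</tex-math></inline-formula>, and <inline-formula><tex-math>\lVert e_i \rVert \, \lVert e_j \rVert</tex-math></inline-formula>
represents the product of the Euclidean norms of <inline-formula><tex-math>e_i</tex-math></inline-formula> and <inline-formula><tex-math>e_j</tex-math></inline-formula>.</p>
        <p>The novelty score of an article is then calculated as follows:
          <disp-formula><label>(2)</label><tex-math>\mathrm{Novelty}(p) = \frac{N_{\mathrm{novel}}(p)}{C_n^2}</tex-math></disp-formula>
where <inline-formula><tex-math>p</tex-math></inline-formula> represents the academic article, <inline-formula><tex-math>n</tex-math></inline-formula> is the number of entities extracted from <inline-formula><tex-math>p</tex-math></inline-formula>, <inline-formula><tex-math>C_n^2</tex-math></inline-formula> is the
total number of combinations of two entities that can be formed from the set of <inline-formula><tex-math>n</tex-math></inline-formula> entities
extracted from <inline-formula><tex-math>p</tex-math></inline-formula> (i.e., the number of entity pairs generated by <inline-formula><tex-math>n</tex-math></inline-formula> entities), and <inline-formula><tex-math>N_{\mathrm{novel}}(p)</tex-math></inline-formula> denotes the count
of entity pairs in <inline-formula><tex-math>p</tex-math></inline-formula> whose distance falls within the top 10% of the distance
distribution of all entity pairs generated in IRM domain articles.</p>
        <p>3.2.2 Novelty computation based on references</p>
        <p>This study calculates reference-based novelty by adopting the concept of
Lee et al. [9], which uses the rarity of journal combinations in references as a measure of novelty,
implemented by ranking the commonality values of journal pairs and recording a percentile. This method of measuring
novelty was inspired by the research of Uzzi et al. [8]. Specifically, the
number of co-cited journal pairs in the database was first calculated, and the cited journal pairs for each
article were recorded. Then, for articles published in the same year, these journal pairs were aggregated into an
annual set of journal pairs. Subsequently, for articles published in year <inline-formula><tex-math>t</tex-math></inline-formula>, the commonality of each
cited journal pair was recorded. These commonality values were ranked, and the 10th percentile
was recorded as an indicator of commonality at the paper level, as shown in Equation (3). A lower
commonality value indicates a rarer journal combination and hence higher novelty. In this
manner, the novelty of articles can be objectively assessed without relying on other factors such as
impact and citation counts.
          <disp-formula><label>(3)</label><tex-math>\mathrm{Commonality}_{ij} = \frac{P(i, j)}{P(i)\, P(j)}, \qquad P(i, j) = \frac{N_{ij}}{N}</tex-math></disp-formula>
where <inline-formula><tex-math>D</tex-math></inline-formula> represents the entire dataset, <inline-formula><tex-math>N_{ij}</tex-math></inline-formula> indicates the number of journal pairs containing
journals <inline-formula><tex-math>i</tex-math></inline-formula> and <inline-formula><tex-math>j</tex-math></inline-formula> in <inline-formula><tex-math>D</tex-math></inline-formula>, <inline-formula><tex-math>N</tex-math></inline-formula> represents the total number of journal pairs in
<inline-formula><tex-math>D</tex-math></inline-formula>, <inline-formula><tex-math>P(i)</tex-math></inline-formula> is the probability of journal <inline-formula><tex-math>i</tex-math></inline-formula> appearing in <inline-formula><tex-math>D</tex-math></inline-formula>, and <inline-formula><tex-math>P(i, j)</tex-math></inline-formula> is the joint probability of journals <inline-formula><tex-math>i</tex-math></inline-formula> and <inline-formula><tex-math>j</tex-math></inline-formula> appearing together.</p>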
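        <p>The reference-based calculation can likewise be sketched in a few lines. The sketch treats commonality as the ratio of the observed joint probability of a journal pair to the product of the two journals' individual probabilities, estimates each probability from pair counts, uses only distinct journals per article, and takes a nearest-rank 10th percentile. These are simplifying assumptions for illustration, and the function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter
from itertools import combinations


def journal_pairs(cited_journals):
    """Unordered pairs of distinct journals cited by one article (assumed simplification)."""
    return list(combinations(sorted(set(cited_journals)), 2))


def commonality_table(articles):
    """Map each journal pair to P(i, j) / (P(i) * P(j)) for one year's articles."""
    pair_counts = Counter()
    journal_counts = Counter()  # number of pairs in which each journal takes part
    for cited in articles:
        for i, j in journal_pairs(cited):
            pair_counts[(i, j)] += 1
            journal_counts[i] += 1
            journal_counts[j] += 1
    total = sum(pair_counts.values())
    table = {}
    for (i, j), n_ij in pair_counts.items():
        p_ij = n_ij / total
        p_i = journal_counts[i] / total
        p_j = journal_counts[j] / total
        table[(i, j)] = p_ij / (p_i * p_j)
    return table


def paper_commonality(cited_journals, table, percentile=10):
    """Nearest-rank percentile of the pair commonalities of one article."""
    values = sorted(table[p] for p in journal_pairs(cited_journals))
    if not values:
        return None  # fewer than two distinct journals cited
    rank = max(1, (percentile * len(values) + 99) // 100)  # integer ceiling
    return values[rank - 1]
```
]]></preformat>
        <p>Taking a low percentile rather than the minimum keeps the paper-level value sensitive to rare journal pairs while damping the effect of a single unusual pair.</p>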
      </sec>
      <sec id="sec-7-3">
        <title>3.3 Identify research topics</title>
        <p>After evaluating the novelty of IRM field articles from both internal and external perspectives,
BERTopic is employed for topic modeling to further explore the thematic characteristics of novelty
from different perspectives, facilitating the subsequent analysis of thematic features of paper
novelty scores.</p>
        <p>3.3.1 Topic modeling based on BERTopic</p>
        <p>This study primarily utilizes the BERTopic model for topic extraction from abstracts and titles of
articles within the IRM field. The BERTopic model combines BERT (Bidirectional Encoder
Representations from Transformers) with topic modeling and is applied to generate and interpret
topics within documents. Compared to traditional topic models, BERTopic can
handle multilingual text data and demonstrates superior performance. It more accurately captures
the semantic and thematic information of text data, ensuring higher levels of topic coherence and
diversity while retaining keywords within topics. Its dynamic topic modeling results can provide a
clearer explanation of trend analysis [31]. Considering the characteristics of the data and research
objectives, BERTopic is utilized for topic identification. The model employs BERT to generate
document embeddings, uses UMAP for dimensionality reduction while preserving positional
information, and clusters using the HDBSCAN algorithm. Finally, c-TF-IDF and maximal marginal
relevance are used to optimize topic generation and obtain topic representation [32].</p>
        <p>The detailed process of topic modeling in this study is as follows. First, since the study
involves processing Chinese text, “paraphrase-multilingual-mpnet-base-v2” was selected as the
embedding model. Subsequently, after the initialization of the UMAP model, “cosine” was
used to measure the distance between points, informed by relevant literature [24] and multiple
experimental results. To ensure tighter embeddings, the parameter “min_dist” was set to 0.01. Next,
HDBSCAN was initialized. Considering the data volume and the handling of outliers in the study,
the parameter “min_cluster_size” was set to 700 and “min_samples” was set to 1 to ensure the
identification of topics while minimizing outliers as much as possible. Finally, the parameter
“nr_topics” was set to “auto,” allowing BERTopic to iteratively generate topics. This approach
avoids subjectively setting too many or too few topics.</p>
        <p>3.3.2 Topic identification results in the IRM domain</p>
        <p>Topic modeling was conducted on all IRM field data, and each paper was assigned to its
corresponding topic. The number of articles in each topic was counted, resulting in a total of 17
topics. Additionally, 12,908 articles were considered as noise due to their unclear topics, and they
were labeled as -1. Combining the thematic names from previous studies with the characteristic
words identified in this study, the research topics in the field of information resources management
include but are not limited to University Library (topic0), Bibliometrics and Evaluation (topic1),
Enterprise Knowledge Management and Organization (topic2), Online Public Sentiment (topic3),
Resource Service Construction of Digital Libraries (topic4), National Security Intelligence Analysis
(topic5), Electronic Data and Information Management in Government (topic6), Text Semantic
Analysis (topic7), Information Literacy Education (topic8), Book Preservation and Classification
(topic9), Copyright Protection (topic10), Reading Promotion (topic11), Patent Technology
Protection (topic12), Virtual Consulting Services (topic13), Network Information Retrieval
(topic14), Enterprise Competitive Intelligence (topic15), and User Information Behavior (topic16),
aligning well with existing research findings.</p>
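        <p>The parameter settings reported in Section 3.3.1 can be collected into a configuration sketch such as the following. The embedding model name and the values of “metric”, “min_dist”, “min_cluster_size”, “min_samples”, and “nr_topics” come from the text; the UMAP values for “n_neighbors” and “n_components” are not reported and are assumed here, and the helper name <monospace>build_topic_model</monospace> is hypothetical. The maximal marginal relevance step is not included.</p>
        <preformat preformat-type="code"><![CDATA[
```python
EMBEDDING_MODEL_NAME = "paraphrase-multilingual-mpnet-base-v2"

# "metric" and "min_dist" are reported in the text. "n_neighbors" and
# "n_components" are NOT reported; the values below are assumptions that
# mirror BERTopic's built-in UMAP defaults.
UMAP_PARAMS = {
    "n_neighbors": 15,
    "n_components": 5,
    "min_dist": 0.01,
    "metric": "cosine",
}

# Both values are reported in the text.
HDBSCAN_PARAMS = {"min_cluster_size": 700, "min_samples": 1}


def build_topic_model():
    """Assemble a BERTopic instance from the settings above."""
    # Imported lazily so the settings can be inspected without the heavy dependencies.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from sentence_transformers import SentenceTransformer
    from umap import UMAP

    return BERTopic(
        embedding_model=SentenceTransformer(EMBEDDING_MODEL_NAME),
        umap_model=UMAP(**UMAP_PARAMS),
        hdbscan_model=HDBSCAN(**HDBSCAN_PARAMS),
        nr_topics="auto",
    )
```
]]></preformat>
        <p>With a list of title-plus-abstract strings named <monospace>docs</monospace>, calling <monospace>build_topic_model().fit_transform(docs)</monospace> would return one topic label per document, with -1 marking outliers.</p>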
      </sec>
    </sec>
    <sec id="sec-8">
      <title>4. Results Analysis</title>
      <p>This section primarily investigates the distribution of novelty scores calculated based on references
and fine-grained entities, comparing the results obtained by both methods to analyze the thematic
characteristics of highly novel articles.</p>
      <sec id="sec-8-1">
        <title>4.1 Distribution analysis of novelty calculation results of academic articles in the field of IRM</title>
        <p>This section explores the distribution of novelty obtained by each of the two methods, in
order to reveal the state of novelty in current academic research and its characteristics from
different perspectives.</p>
        <p>4.1.1 Novelty distribution analysis based on fine-grained research method entities</p>
        <p>Based on the idea of entity combination, this study calculated the novelty of IRM articles from 2000
to 2022, with an overall novelty score range of 0-1. To more intuitively observe the general
characteristics of the novelty score distribution derived from fine-grained entities, a histogram of
the score distribution was also created, as shown in Figure 3.</p>
        <p>Overall, the novelty scores of most articles fall in the lower range and show a pronounced
right-skewed distribution, indicating that novelty in academic research still requires attention
and enhancement. Specifically, 90.6% of the novelty scores are concentrated in the 0.0-0.2 range,
with 55.8% and 34.8% of scores falling into the 0.0-0.1 and 0.1-0.2 intervals, respectively. This
implies that most articles primarily extend or slightly develop existing research, with relatively
limited novelty. Such a distribution may raise concerns about the conservative or repetitive nature
of academic research, suggesting that academic institutions and researchers need to promote highly
novel research more actively. Articles with novelty scores exceeding 0.2 are far fewer: those in
the 0.2-0.25 range account for 5.3%, those in the 0.25-0.3 range for 2.4%, and those in the
0.3-0.35 range for only 1.1%. These articles may propose new theories or significant breakthroughs.
Although low in proportion, such high-novelty articles may have important implications for the
development of the discipline.</p>
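        <p>The interval shares reported above can be reproduced from a list of scores with a simple binning step. The following is a minimal sketch, not the authors' code: the function name interval_shares and the toy scores are hypothetical, and only the interval edges mirror those discussed in the text.</p>

```python
def interval_shares(scores, edges):
    """Return the percentage of scores falling in each interval [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for s in scores:
        for i in range(len(edges) - 1):
            # The last interval is closed on the right so the top edge is counted.
            is_last = i == len(edges) - 2
            if s >= edges[i] and (edges[i + 1] > s or (is_last and s == edges[i + 1])):
                counts[i] += 1
                break
    total = len(scores)
    return [100.0 * c / total for c in counts]


# Hypothetical toy scores for illustration only, not the study's data.
toy_scores = [0.02, 0.05, 0.07, 0.08, 0.12, 0.15, 0.18, 0.22, 0.27, 0.33]
edges = [0.0, 0.1, 0.2, 0.25, 0.3, 0.35]
shares = interval_shares(toy_scores, edges)
print(shares)
```

        <p>Summing the first two entries of the result gives the share of scores below 0.2, which is how the 55.8% and 34.8% figures combine to 90.6%.</p>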
        <p>In summary, the histogram indicates that although some articles exhibit
high novelty, the majority demonstrate low novelty. To enhance overall research novelty in the
IRM field, various measures may need to be adopted, such as strengthening interdisciplinary
collaboration [33], encouraging high-risk, high-reward research projects [11], and increasing
support and funding for original research. Additionally, academic review mechanisms could be
oriented towards high-novelty research, encouraging more researchers to challenge conventions and
explore new areas.
4.1.2 Novelty distribution analysis based on references
This study also used Lee’s improved method for calculating novelty, which relies on the
combinatorial rarity of reference journals, to assess the novelty of IRM articles. This
approach provides a basis for the reliability of subsequent novelty calculations. The distribution of
novelty scores obtained from references is shown in Figure 4, where the horizontal axis represents
the novelty scores, ranging from -8 to 3, and the vertical axis indicates the frequency of articles per
score interval. The red line depicts the smoothed Kernel Density Estimation (KDE) curve [34]. The
histogram and its KDE curve show that the distribution of novelty scores is notably
left-skewed. Specifically, most research articles have novelty scores between 0 and 1, an interval
that accounts for 63.25% of all articles, indicating that only a small number of articles in the
dataset exhibit high novelty. Furthermore, the distribution graph shows that most scores obtained
using references are concentrated, with no significant dispersion, a finding consistent with
previous research conclusions and expectations [3].</p>
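        <p>The direction of skew described for the two score distributions can be checked numerically with a moment-based skewness coefficient, where a positive value indicates a long right tail and a negative value a long left tail. The sketch below is illustrative only: sample_skewness is a hypothetical helper, the samples are toy values rather than the study's scores, and the source does not state which skewness statistic, if any, was computed.</p>

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, using population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / (m2 ** 1.5)


# Hypothetical toy samples, not the study's data.
long_right_tail = [0.05, 0.06, 0.08, 0.10, 0.12, 0.15, 0.60]  # mass at low scores
long_left_tail = [-7.0, -1.0, 0.2, 0.4, 0.5, 0.7, 0.9]        # mass near the top

print(sample_skewness(long_right_tail) > 0)  # right-skewed sample
print(sample_skewness(long_left_tail) > 0)   # left-skewed sample gives a negative value
```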
      </sec>
      <sec id="sec-8-2">
        <title>4.2 Comparative analysis of novelty calculation results</title>
        <p>
          This section compares the novelty results of academic articles obtained
from the two perspectives. The comparison helps reveal how citation-based and content
entity-based novelty measurement methods differ across research topics, and it facilitates
a deeper understanding of their characteristics and
strengths in capturing academic innovation. Through this comparison, the
applicable scenarios for each method are expected to be identified, along with their specific
contributions to novelty assessment.
4.2.1 Comparative analysis of novelty calculation results based on fine-grained entities
and references: global perspective
This section adopts a global perspective to analyze the novelty characteristics based on
fine-grained entities and references. Building on the existing classification system of research methods,
this study introduces “theory” entities and uses five types of extracted fine-grained research method
entities to calculate the novelty of IRM articles. Previous related studies have calculated novelty using
references from small datasets of academic articles. In contrast, this study collected IRM
academic articles from 2000 to 2022 and calculated their novelty using references, yielding a
distribution of scores consistent with existing research. This study thus uses both the internal and
external texts of academic articles. After normalizing and aligning the novelty scores obtained
from fine-grained entities and references, the results are plotted on a quadrant chart, as shown in
Figure 5. The normalization method is given in formula (4). The analysis combines the
consistency and differences of novelty scores under the two methods to explore the novelty
characteristics of IRM articles in depth.
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (4)$$
        </p>
        <sec id="sec-8-2-1">
          <p>Here, $x_{\max}$ and $x_{\min}$ are the maximum and minimum values in the original data,
respectively, $x$ represents each feature value or observation in the original data, and $x'$
represents the normalized value.</p>
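          <p>Normalization by the maximum and minimum of the original data, as described for formula (4), corresponds to min-max scaling. A minimal sketch follows; the function name min_max_normalize and the example values are assumptions for illustration, chosen only to span the reported reference-score range of -8 to 3.</p>

```python
def min_max_normalize(values):
    """Rescale values linearly to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # The formula divides by (max - min), so constant input has no defined scaling.
        raise ValueError("min-max normalization is undefined when all values are equal")
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative values only, not scores from the study.
normalized = min_max_normalize([-8.0, 0.0, 3.0])
print(normalized)
```

          <p>After this step both score types share the range 0 to 1, which is what allows them to be placed on the same quadrant chart.</p>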
          <p>Figure 5 shows a scatter plot of the scores from the two novelty calculation methods,
Score_reference and Score_entity. The horizontal axis represents Score_reference, while the
vertical axis denotes Score_entity, with each point representing an individual article. The
novelty score distribution is divided into four quadrants. The red and green lines mark
the boundaries between high and low novelty scores based on fine-grained entities and
references, respectively, and correspond to the average scores from the two perspectives. Overall, more
points are distributed in the lower quadrants of the chart. Relative to the boundary lines,
the novelty scores of articles based on references are generally higher than those based on
fine-grained method entities. Articles located in the Q1 quadrant have novelty scores that exceed the
average in both perspectives, indicating that these articles demonstrate significant innovation in
content entities as well as in their reference citations compared to other articles. These articles may
propose new theories, methods, or applications and cite cutting-edge and
diverse literature, exhibiting high overall novelty. Such articles are therefore more likely to exert a
positive impact on their respective fields and advance academic research.</p>
          <p>
            Articles in the Q3 quadrant score below average in both novelty measurements, indicating
limited innovation in both content entities and reference citations. These articles predominantly
follow established research paths, lacking novel insights or cutting-edge literature citations. If
fine-grained research method entities are regarded as knowledge output and reference citations as
knowledge input, then articles located in the Q2 and Q4 quadrants either exhibit
outstanding innovation in knowledge output or demonstrate a certain level of
novelty through the integration of pioneering, cross-disciplinary literature as input. Furthermore,
an examination of the few articles with extremely high novelty (Score_entity&gt;0.5 and
Score_reference&gt;0.5) shows that they were mostly published between 2002
and 2013. This suggests that there is no direct correlation between novelty and publication time,
and newly published articles do not necessarily exhibit high novelty.
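The quadrant assignment described above, with the mean of each normalized score as the dividing line, can be sketched as follows. This is an illustrative sketch rather than the authors' procedure: assign_quadrants and the toy scores are hypothetical, and because the text defines only Q1 and Q3 explicitly, the mapping of Q2 (high entity, low reference) and Q4 (high reference, low entity) follows the usual Cartesian convention as an assumption.

```python
def mean(values):
    return sum(values) / len(values)


def assign_quadrants(score_reference, score_entity):
    """Label each article Q1-Q4 relative to the mean of each score."""
    ref_mean, ent_mean = mean(score_reference), mean(score_entity)
    labels = []
    for ref, ent in zip(score_reference, score_entity):
        high_ref, high_ent = ref > ref_mean, ent > ent_mean
        if high_ref and high_ent:
            labels.append("Q1")  # above average in both measures
        elif high_ent:
            labels.append("Q2")  # assumed: high entity score, low reference score
        elif high_ref:
            labels.append("Q4")  # assumed: high reference score, low entity score
        else:
            labels.append("Q3")  # at or below average in both measures
    return labels


# Hypothetical normalized scores, not the study's data.
toy_ref = [0.9, 0.2, 0.8, 0.1]
toy_ent = [0.7, 0.6, 0.1, 0.2]
labels = assign_quadrants(toy_ref, toy_ent)
print(labels)
```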
4.2.2 Comparative analysis of novelty calculation results based on fine-grained entities
and references: thematic perspective
This section analyzes the novelty calculation results based on fine-grained entities and references
from a thematic perspective, further exploring the thematic characteristics of article novelty under
both methods. The study selects the 1000 articles with the highest novelty based on
fine-grained entities and the 1000 with the highest novelty based on references for thematic
analysis. This reveals the similarities and differences in how the two methods identify novel themes,
providing a more comprehensive perspective on paper novelty.
          </p>
          <p>(1) Topic distribution characteristics of the top 1000 articles by novelty score from the two
perspectives
The thematic overlap of the top 1000 articles by novelty, based on fine-grained research
method entities and on references, was assessed using the Jaccard similarity coefficient, yielding a
score of 0.5661. This indicates that the thematic overlap between the two top 1000 lists,
under the fine-grained entity dimension and the reference dimension, is approximately 56.61%.
The thematic distributions of the two dimensions are thus similar in more than half of the cases, yet
not identical. This suggests that the internal and external dimensions of an
article may each emphasize different aspects of novelty: entity innovation reflects the
contribution of content, while reference innovation reflects the contribution of citation
networks.</p>
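          <p>The Jaccard similarity coefficient divides the size of the intersection of two collections by the size of their union. The source does not state whether it was computed over sets of topics or over per-topic article counts, so the sketch below shows both variants as assumptions; the function names and toy topic labels are hypothetical and do not reproduce the reported value of 0.5661.</p>

```python
from collections import Counter


def jaccard_sets(a, b):
    """Classic Jaccard coefficient: size of intersection over size of union."""
    a, b = set(a), set(b)
    return len(a.intersection(b)) / len(a.union(b))


def jaccard_counts(a, b):
    """Generalized (multiset) Jaccard: sum of minimum counts over sum of maximum counts."""
    ca, cb = Counter(a), Counter(b)
    keys = set(ca).union(cb)
    # A Counter returns 0 for missing keys, so topics absent from one list count as 0.
    return sum(min(ca[k], cb[k]) for k in keys) / sum(max(ca[k], cb[k]) for k in keys)


# Hypothetical topic labels for two small top-k lists, not the study's data.
entity_topics = ["topic0", "topic0", "topic1", "topic2"]
reference_topics = ["topic0", "topic1", "topic1", "topic4"]

print(jaccard_sets(entity_topics, reference_topics))
print(jaccard_counts(entity_topics, reference_topics))
```

          <p>The count-based variant is sensitive to how many articles each topic contributes, whereas the set-based variant only registers whether a topic appears at all.</p>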
          <p>The thematic distributions of the top 1000 articles by novelty in the two dimensions are
given in Table 5, where topics labeled -1 are considered noise. Specifically, the primary
themes of high novelty articles based on fine-grained entities include University Library (topic0),
Bibliometrics and Evaluation (topic1), and Enterprise Knowledge Management and Organization
(topic2), while those based on references include University Library (topic0), Bibliometrics and
Evaluation (topic1), and Resource Service Construction of Digital Libraries (topic4).</p>
          <p>Table 5 (Topic column, recovered entries): -1, topic0, topic1</p>
          <p>
            Among these, University Library (topic0) accounts for a high proportion of the top 1000
articles in both the internal and external dimensions. On one hand, university libraries are a
core area of library science with relatively rich research accumulation. On the other hand, aspects
such as information resource development, digital transformation, and information literacy
education in university libraries easily intersect with other topics, resulting in high relevance in
both the entity and reference dimensions and, consequently, high novelty scores for this
topic in both dimensions. Articles under the theme of Bibliometrics and Evaluation
(topic1) also account for a high proportion of the top 1000 articles by novelty score in both
dimensions. As a significant research direction in the IRM field, bibliometrics and evaluation
addresses issues such as how to assess the quality, impact, and academic productivity of articles.
Scholars in this area are quite active, often proposing novel research methods and perspectives.
Research in bibliometrics spans multiple disciplines and is widely applied in various subfields of
the IRM domain. Additionally, bibliometrics is inherently a field that requires continuous
innovation: researchers often draw on methods from other areas to develop new
techniques, producing novel findings that raise the novelty score of this theme. Moreover, the
integration of vast bibliographic and citation data, along with the trend of “cross-disciplinary
integration” in the information resource management field, may contribute to the high proportion
of this theme among articles with high novelty scores.
(2) Characteristics of topic distribution in high novelty articles from two perspectives
This study treats the top 1000 articles by novelty score in each dimension as the high
novelty articles of that dimension. The articles that appear in both the top 1000
based on fine-grained entities and the top 1000 based on references were extracted,
totaling 61 articles. A PaperID was assigned to each paper along with
its associated theme, and a lollipop chart was drawn, as shown in Figure 6. Red dots represent
novelty scores based on references, while blue dots represent novelty scores based on entities.
          </p>
          <p>Table 5 (percentage columns, recovered entries): Percentage (entity): 25.10%, 27.60%, 6.20%; Percentage (reference): not recoverable</p>
          <p>The
horizontal axis represents the novelty scores, and the line connecting the two dots indicates the
score difference. The vertical axis lists the 61 high-novelty articles by PaperID and
corresponding topic. For instance, ‘60_topic 13’ on the vertical axis denotes the paper
with ID 60 among the 61 jointly high-novelty articles, which belongs to Virtual Consulting Services
(topic13). The chart shows that the novelty scores based on fine-grained research method entities
are generally lower than those based on references among these 61 high novelty articles, consistent
with the distribution shown in Figure 5. The main themes involved include University Library
(topic0), Bibliometrics and Evaluation (topic1), Enterprise Knowledge Management and
Organization (topic2), Online Public Sentiment (topic3), Resource Service Construction of Digital
Libraries (topic4), National Security Intelligence Analysis (topic5), Electronic Data and Information
Management in Government (topic6), Information Literacy Education (topic8), Book Preservation
and Classification (topic9), Reading Promotion (topic11), Virtual Consulting Services (topic13),
Network Information Retrieval (topic14), and Enterprise Competitive Intelligence (topic15).
Notably, the themes of the shared high novelty articles do not include Text Semantic Analysis (topic7),
Copyright Protection (topic10), Patent Technology Protection (topic12), or User Information
Behavior (topic16). First, themes like Copyright Protection (topic10) and Patent Technology
Protection (topic12) have established research foundations with fixed citation networks and
research methods, leading to weaker performance in external novelty calculations. Moreover,
research on these topics is strictly constrained by national laws and regulations, which makes
innovation within this framework challenging. Second, themes like Text Semantic Analysis (topic7) and User
Information Behavior (topic16) often involve interdisciplinary approaches, with complex research
methods and application scenarios, resulting in novelty scores that vary across dimensions and thus
less prominent performance when both dimensions are combined. Finally, in practical academic
dissemination, the impact of these themes might not be promptly reflected in novelty calculations
due to delays in dissemination and citation.</p>
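          <p>The selection of jointly high-novelty articles described above, namely those ranked in the top k under both measures, amounts to intersecting two top-k sets. The following minimal sketch assumes scores are held in dictionaries keyed by paper ID; the function names, the tie-breaking rule, and the toy scores are hypothetical, and the study itself uses k = 1000.</p>

```python
def top_k_ids(scores_by_id, k):
    """Return the IDs of the k highest-scoring articles (ties broken by ID order)."""
    ranked = sorted(scores_by_id.items(), key=lambda item: (-item[1], item[0]))
    return {paper_id for paper_id, _ in ranked[:k]}


def jointly_high_novelty(entity_scores, reference_scores, k):
    """IDs that rank in the top k under both novelty measures."""
    return top_k_ids(entity_scores, k).intersection(top_k_ids(reference_scores, k))


# Hypothetical toy scores keyed by paper ID, not the study's data.
entity_scores = {"p1": 0.9, "p2": 0.1, "p3": 0.6, "p4": 0.3}
reference_scores = {"p1": 0.8, "p2": 0.7, "p3": 0.2, "p4": 0.1}
shared = jointly_high_novelty(entity_scores, reference_scores, k=2)
print(shared)
```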
          <p>For the theme of University Library (topic0), the difference between the novelty scores
based on fine-grained research method entities and those based on references is large, indicating that the
novelty of articles in this theme is more readily reflected through references, while it remains
relatively conservative in terms of method entities. Several articles in the theme of Book
Preservation and Classification (topic9) show relatively consistent novelty across the two
measures. This is closely related to technological advancements in the digital age, including
artificial intelligence and big data analytics. Research on book preservation and classification
has benefited from the application of emerging technologies, continuously yielding new methods
and tools. For instance, the emergence of large language models has spurred the intelligent
transformation and automation of librarianship [35], and the use of electronic books in libraries
has been extensively promoted. These technological advancements have to some extent increased
the interdisciplinarity of this theme, facilitating new integrations and enhancements in both
research methods and reference utilization.
Note: Red dots indicate reference-based novelty scores, blue dots indicate entity-based novelty scores, with the
horizontal axis showing novelty scores and lines between dots showing score differences. The vertical axis represents 61
high-novelty articles with PaperIDs and their topics.</p>
          <p>In summary, this study analyzes the novelty scores and thematic characteristics calculated
from the two dimensions, moving from macro to micro perspectives, and provides a reference for
scholars assessing the novelty of articles across various themes. The two perspectives emphasize
different aspects when evaluating the novelty of academic articles: the fine-grained entity-based
approach reveals the uniqueness and novelty of the content itself, whereas the reference-based
approach highlights its position and role in academic dissemination and citation networks.
Combining the two approaches allows for a more accurate assessment of the overall novelty of
an article, thereby providing a more comprehensive and objective perspective for academic
research and evaluation.
5. Conclusion and Future Work
This study employs the BERTopic model to identify research themes in Chinese IRM articles from
2000 to 2022, defines and extracts five types of fine-grained knowledge entities, and calculates
novelty based on these entities. The novelty scores calculated from references are combined with
those derived from fine-grained entities to analyze the differences and characteristics of paper
novelty scores and themes under the two perspectives. The results indicate that high-novelty
articles identified from both the entity and reference perspectives are concentrated in topics such
as University Library and Bibliometrics and Evaluation. The
novelty score based on fine-grained method entities shows a right-skewed distribution, with a
range of 0 to 1. The novelty score based on references shows a left-skewed distribution, with a
range of -8 to 3. The two dimensions show a high degree of consistency for the Book
Preservation and Classification theme. This study also normalized the novelty scores from both
perspectives and analyzed them from global and thematic perspectives. From a global perspective, the
reference-based novelty score of academic papers in the Chinese IRM field is generally
higher than that based on fine-grained research method entities. From a thematic perspective,
the topic overlap between the top 1000 articles by novelty score in the fine-grained entity
dimension and those in the reference dimension is approximately 56.61%, indicating that the
internal and external dimensions of a paper may each have their own focus in novelty assessment.
Meanwhile, there is no direct correlation between a paper's novelty and its publication time:
newly published papers are not necessarily highly novel. By analyzing the novelty
characteristics of IRM themes from 2000 to 2022 under both perspectives, the study provides
guidance for researchers in theme selection and reveals the importance of interdisciplinary
integration in the digital age. The comparative analysis is expected to help identify the applicable
scenarios for each method and their specific contributions to novelty evaluation,
providing insights for curriculum development and interdisciplinary collaboration.
In addition, multiple factors should be considered when evaluating novelty across different
themes, in order to promote the development of the IRM field.</p>
          <p>Future research can focus on enhancing entity extraction performance and optimizing paper
novelty calculations by exploring more efficient algorithms. Integrating the BERTopic model with
other models or technologies may improve theme recognition accuracy. Additionally, examining
relationships between research method entities can uncover patterns in novel articles, offering a
comprehensive understanding. Meanwhile, a deeper exploration of the proportion of different
types of entities in papers with varying degrees of novelty may provide better assistance in
understanding novelty. Finally, considering the document structure of fine-grained entities and
references in novelty evaluation can lead to a more thorough assessment.</p>
          <p>Acknowledgements
The paper is presented at the second Workshop on “Innovation Measurement for Scientific
Communication (IMSC) in the Era of Big Data” at 2024 ACM/IEEE Joint Conference on Digital
Libraries (JCDL). This work was partially supported by the National Natural Science Foundation of
China (No. 72074113).</p>
          <p>Declaration on Generative AI
During the preparation of this work, the authors used GPT-4 to correct grammatical
errors, typos, and other writing mistakes. After using this tool, the authors reviewed and edited the
content as needed and take full responsibility for the publication’s content.
          </p>
          <p>[8] Lee Y-N, Walsh J P, Wang J., Creativity in scientific teams: Unpacking novelty and impact, Research Policy 44 (2015) 684–697.
[9] Foster J G, Rzhetsky A, Evans J A., Tradition and innovation in scientists’ research strategies, American Sociological Review 80 (2015) 875–908.
[10] Wang J, Veugelers R, Stephan P., Bias against novelty in science: A cautionary tale for users of bibliometric indicators, Research Policy 46 (2017) 1416–1436.
[11] Veugelers R, Wang J., Scientific novelty and technological impact, Research Policy 48 (2019) 1362–1372.
[12] Zins C., Knowledge map of information science, Journal of the American Society for Information Science and Technology 58 (2007) 526–535.
[13] Meng J, Chen X., Transnational Knowledge Transfer and Innovation Based on Academic Subjects: The Patterns and Characteristics of Knowledge Transfer and Innovation by Chinese Scholars Returned from the United States, 2017 International Conference on Innovations in Economic Management and Social Science, Zhejiang Hangzhou, 2017, pp. 47–53.
[14] Mishra S, Torvik V I., Quantifying conceptual novelty in the biomedical literature, D-Lib Magazine 22 (2016) 9-10.
[15] Liu M, Bu Y, Chen C, et al., Pandemics are catalysts of scientific novelty: Evidence from COVID-19, Journal of the Association for Information Science and Technology 73 (2022) 1065–1078.
[16] Chen Z, Zhang C, et al., Exploring the Relationship Between Team Institutional Composition and Novelty in Academic Articles Based on Fine-Grained Knowledge Entities, The Electronic Library 42(6) (2024) 905–930.
[17] Zhang Y, Tsai F S., Chinese novelty mining, in: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 2009, pp. 1561–1570.
[18] He J, Chen C., Predictive effects of novelty measured by temporal embeddings on the growth of scientific literature, Frontiers in Research Metrics and Analytics 3 (2018) 9–24.
[19] Mörchen F, Dejori M, Fradkin D, et al., Anticipating annotations and emerging trends in biomedical literature, in: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, 2008, pp. 954–962.
[20] He Q, Chen B, Pei J, et al., Detecting topic evolution in scientific literature: how can citations help?, in: Proceedings of the 18th ACM Conference on Information and Knowledge Management, Hong Kong, 2009, pp. 957–966.
[21] Yan E., Research dynamics: Measuring the continuity and popularity of research topics, Journal of Informetrics 8(1) (2014) 98–110.
[22] Small H, Boyack K W, Klavans R., Identifying emerging topics in science and technology, Research Policy 43(8) (2014) 1450–1467.
[23] Choi H, Woo J., Investigating emerging hydrogen technology topics and comparing national level technological focus: Patent analysis using a structural topic model, Applied Energy 313 (2022) 118898.
[24] Tu Y-N, Seng J-L., Indices of novelty for emerging topic detection, Information Processing &amp; Management 48(2) (2012) 303–325.
[25] Wang B., A bibliometrical analysis of interpreting studies in China: Based on a database of articles published in the CSSCI/CORE journals in recent years, Babel: International Journal of Translation 61(1) (2015) 62–77.
[26] Chu H, Ke Q., Research methods: What’s in the name?, Library &amp; Information Science Research 39(4) (2017) 284–294.
[27] Wang Y, Zhang C., Using the full-text content of academic articles to identify and evaluate algorithm entities in the domain of natural language processing, Journal of Informetrics 14(4) (2020) 101091.
[28] McHugh M L., Interrater reliability: the kappa statistic, Biochemia Medica 22(3) (2012) 276–282.
[29] Zhou X, Huang H, Chi Z, et al., RS-BERT: Pre-training radical enhanced sense embedding for Chinese word sense disambiguation, Information Processing &amp; Management 61(4) (2024) 103740.
[30] Contreras K, Verbel G, Sanchez J, et al., Using topic modelling for analyzing Panamanian parliamentary proceedings with neural and statistical methods, in: 2022 IEEE 40th Central America and Panama Convention, Panama, IEEE, 2022, pp. 1–6.
[31] Grootendorst M., BERTopic: Neural topic modeling with a class-based TF-IDF procedure, arXiv preprint arXiv:2203.05794 (2022).
[32] Fontana M, Iori M, Montobbio F, Sinatra R., New and atypical combinations: An assessment of novelty and interdisciplinarity, Research Policy 49(7) (2020) 104063.
[33] Terrell G R, Scott D W., Variable Kernel Density Estimation, The Annals of Statistics 20(3) (1992) 1236–1265.
[34] Zhao R, Huang Y, Ma W, et al., Insights and reflections of the impact of ChatGPT on intelligent knowledge services in libraries, Journal of Library and Information Science in Agriculture 35(1) (2023) 29–38.
          </p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Horton</surname>
            <given-names>F W.</given-names>
          </string-name>
          <article-title>Information resources management: Concept and cases</article-title>
          , Cleveland: Association for Systems Management,
          <year>1979</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [1]
          <string-name>
            <surname>M. A. F. C.</surname>
          </string-name>
          ,
          <article-title>Building consensus and promoting the first-level discipline construction of information resource management</article-title>
          ,
          <source>Journal of Information Resources Management</source>
          <volume>13</volume>
          (
          <year>2023</year>
          )
          <fpage>4</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Yang</surname>
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            <given-names>Y.</given-names>
          </string-name>
          , et al.
          <article-title>Unveiling novelty evolution in the field of library and information science in China</article-title>
          ,
          <source>The Electronic Library</source>
          ,
          <volume>42</volume>
          (
          <issue>6</issue>
          )(
          <year>2024</year>
          )
          <fpage>854</fpage>
          -
          <lpage>878</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Wang</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            <given-names>Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mao</surname>
            <given-names>J</given-names>
          </string-name>
          , et al.,
          <article-title>Quantifying scientific breakthroughs by a novel disruption indicator based on knowledge entities</article-title>
          ,
          <source>Journal of the Association for Information Science and Technology</source>
          ,
          <volume>74</volume>
          (
          <year>2023</year>
          )
          <fpage>150</fpage>
          -
          <lpage>167</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Fontana</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iori</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montobbio</surname>
            <given-names>F</given-names>
          </string-name>
          , et al.,
          <article-title>New and atypical combinations: An assessment of novelty and interdisciplinarity</article-title>
          ,
          <source>Research Policy</source>
          <volume>49</volume>
          (
          <year>2020</year>
          )
          <fpage>104063</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Azoulay</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Graff Zivin</surname>
            <given-names>J S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manso</surname>
            <given-names>G.</given-names>
          </string-name>
          ,
          <article-title>Incentives and creativity: evidence from the academic life sciences</article-title>
          ,
          <source>The RAND Journal of Economics</source>
          <volume>42</volume>
          (
          <year>2011</year>
          )
          <fpage>527</fpage>
          -
          <lpage>554</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Liu</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xie</surname>
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            <given-names>A J</given-names>
          </string-name>
          ,
          et al.,
          <article-title>The prominent and heterogeneous gender disparities in scientific novelty: Evidence from biomedical doctoral theses</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>61</volume>
          (
          <year>2024</year>
          )
          <fpage>103743</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Uzzi</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mukherjee</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stringer</surname>
            <given-names>M</given-names>
          </string-name>
          , et al.,
          <article-title>Atypical combinations and scientific impact</article-title>
          ,
          <source>Science</source>
          <volume>342</volume>
          (
          <year>2013</year>
          )
          <fpage>468</fpage>
          -
          <lpage>472</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>