<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Freshness and Informativity Weighted Cognitive Extent (FICE) and Its Correlation with Cumulative Citation Count</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zihe Wang</string-name>
          <email>wang.14629@osu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jian Wu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ohio State University</institution>
          ,
          <addr-line>Columbus, OH</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Old Dominion University</institution>
          ,
          <addr-line>Norfolk, VA</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>20</volume>
      <issue>2024</issue>
      <fpage>60</fpage>
      <lpage>68</lpage>
      <abstract>
        <p>In this paper, we revisit cognitive extent, originally defined as the number of unique phrases in a quota. We introduce Freshness and Informativity Weighted Cognitive Extent (FICE), calculated based on two novel weighting factors: the lifetime ratio and the informativity of scientific entities. We model the lifetime of each scientific entity as the time-dependent document frequency, which is fit by a composition of multiple Gaussian profiles. The lifetime ratio is then calculated as the cumulative document frequency at the publication time t_0 divided by the cumulative document frequency over its entire lifetime. The informativity is calculated by normalizing the document frequency across all scientific entities recognized in a title. Using the ACL Anthology, we verified the trend formerly observed in several other domains that the number of unique scientific entities per quota increased gradually, at a slower rate than the number of papers. We found that FICE exhibits a strong correlation with the average cumulative citation count within a quota. Our code is available at https://github.com/ZiheHerzWang/Freshness-and-InformativityWeighted-Cognitive-Extent</p>
      </abstract>
      <kwd-group>
        <kwd>cognitive extent</kwd>
        <kwd>citation impact</kwd>
        <kwd>entity recognition</kwd>
        <kwd>document frequency</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Cognitive extent is an approach to quantify the extent of cognitive domains of scientific fields based on
the concept of lexical diversity [1]. The metric was originally calculated by counting the number of
unique concepts (phrases) appearing in the titles of statistically large unit quotas of scientific literature,
which reflects the extent of the cognitive territory covered in that literature. Cognitive extent has been
used as a representation of knowledge gained by scientists. It has been shown that the cognitive extent
calculated in multiple academic fields (Physics, Astronomy, and Biomedicine) grew at a slower rate than
the number of papers published.</p>
      <p>However, this definition of cognitive extent has two limitations. First, it only accounts for the
occurrence of a phrase as a dichotomous value within a quota and ignores when the phrase occurs.
Specifically, a phrase is novel when it appears for the first time in the title of Paper A, but when it appears
again in the title of Paper B, it still maintains a level of freshness, and thus still reflects the cognitive
knowledge of the authors of Paper B, except that the knowledge is no longer new. Second, the definition
treats all phrases with the same weight. However, certain phrases may be more informative than others.
For example, at a particular time the phrase “entity recognition” occurred in many titles but the phrase
“nuclearity rhetorical relation” occurred in only a small number of titles. From a reader’s perspective,
the latter phrase is more informative because the former phrase has been seen in many papers.</p>
      <p>To overcome these limitations, we propose Freshness and Informativity Weighted Cognitive Extent
(FICE), which is calculated as a weighted occurrence of disambiguated unique scientific entities extracted
from paper titles in a quota. Different from the original cognitive extent, FICE accounts for the contributions
of the freshness and informativity of scientific entities extracted from documents within a corpus. The
freshness is based on the lifetime ratio, and the informativity is based on the time-dependent document
frequency across scientific entities in a document. Here the document can be any form of scholarly text.
Throughout this paper, we focus our study on paper titles.</p>
      <p>We verified a previous finding using ACL Anthology papers that although the number of papers
increased exponentially, the unique number of scientific entities per quota gradually increased with
time at a slower rate. One property of FICE is its relationship with citation impact factors. We found
that FICE exhibits a strong correlation with the logarithm of 5-year average cumulative citation counts
for papers in the ACL Anthology.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>The definition of cognitive extent is closely relevant to lexical diversity, defined as the extent of
vocabulary disparity within a given language sample [2]. Early metrics determined lexical diversity
using only single words [3, 4]. Berube et al. proposed the type-token ratio as a metric of lexical diversity
[2]. Both methods were based on word-level tokens and ignored their connections; therefore, they do
not necessarily reflect the knowledge, which is better captured by phrases and entities.</p>
      <p>Milojević was the first to propose using concepts (phrases) to quantify the extent of cognitive domains
of scientific fields [1]. This method was later applied to study the properties of paper titles in various
domains [5]. Recently, a method was proposed to use neural embedding of paper titles to represent
the cognitive content of a cluster of papers [6]. Although neural embedding has been widely used to
capture the semantics of text and compare the semantic similarities between two pieces of text, the
embedding itself cannot be directly converted to a numeral that represents the cognitive extent of a
single or a corpus of documents.</p>
      <p>Bibliographic impact factors have been extensively studied and several citation impact factors have
been proposed [7]. The total number of citations (raw citations) is often criticized as an indicator
because of domain discrepancies, data completeness, and other random factors. The average number of
citations per publication of a research unit is frequently used but also criticized, because the average
value can be biased due to the skewness of the citation distribution within the research unit [8]. However,
it was argued that short-term citations can be considered as currency on the research front, while
long-term citations contribute to the codification of knowledge claims into bodies of knowledge [9].
Determining the exact boundary between short-term and long-term is non-trivial. Here we adopt c_5(t_0),
the 5-year average cumulative citation count of papers within a quota, as an estimate of the
quota's average impact over the short and long term.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-0">
        <title>3.1. Scientific Entity Recognition and Disambiguation</title>
        <p>A scientific entity is defined as a noun phrase that delivers domain knowledge of interest [10, 11].
Scientific entities can be extracted using sequence tagging models and used to construct knowledge
graphs, e.g., [12, 11]. Recently, large language models (LLMs) have shown superior performance on
named entity recognition, e.g., [13, 14].</p>
        <p>Two scientific entities may have similar semantics. To investigate the impact of entity disambiguation
on the results, we conflate entities with similar semantic meanings. This was treated as a classification
problem by thresholding the similarity scores calculated using the Cross Encoder [15], a
model that takes two entity names and outputs a similarity score. The threshold was calibrated based
on the classification performance evaluated on a set of manually labeled entity pairs.</p>
      </sec>
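As a rough illustration of the thresholded conflation step, the sketch below groups entities whose pairwise similarity clears a threshold; the token-overlap `similarity` function is a hypothetical stand-in for the Cross Encoder scores used in the paper.

```python
# Threshold-based entity conflation via union-find. `similarity` is a
# hypothetical stand-in (Jaccard token overlap) for Cross Encoder scores.
from itertools import combinations

def similarity(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def conflate(entities, threshold=0.5):
    """Group entities whose pairwise similarity reaches `threshold`."""
    parent = {e: e for e in entities}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(entities, 2):
        if similarity(a, b) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for e in entities:
        groups.setdefault(find(e), []).append(e)
    return list(groups.values())

clusters = conflate(["named entity recognition", "entity recognition",
                     "machine translation"])
```

With these toy names, the two recognition entities share two of three tokens and are merged, while "machine translation" remains its own cluster.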
      <sec id="sec-3-1">
        <title>3.2. Lifetime Ratio</title>
        <p>The lifetime of a scientific entity is defined as the period during which it appears in at least one document
(in our case, a paper title). Here, we assume that all scientific entities have a finite lifetime, meaning
there is a time point t_s when a scientific entity first appears and another time point t_e after which it no
longer appears in any documents. We borrowed the concept of document frequency from information
retrieval to indicate whether a scientific entity e is still within its lifetime (df(e, t) &gt; 0) or its lifetime
has ended (df(e, t) ≤ 0). Here, we define the time-dependent document frequency df(e, t) as the
number of documents that contain the scientific entity e and are published at time t (e.g., a certain year).
The lifetime ratio of a scientific entity e at t_0 is then defined as the number of accumulated documents
up to t_0 divided by the total number of documents that contain e over its entire lifetime [t_s, t_e]:</p>
        <p>r(e, t_0) = ( Σ_{t=t_s}^{t_0} df(e, t) ) / ( Σ_{t=t_s}^{t_e} df(e, t) ). (1)</p>
        <p>The freshness of e is then calculated as 1 − r(e, t_0). Because any corpus covers a limited period, the
lifetime ratio can only be calculated based on the observable period covered by the corpus. We then
model df(e, t) for each e as a composite of analytical profiles. By definition, the lifetime ratio provides
an estimation of the relative freshness of a scientific entity. Specifically, a low lifetime ratio indicates that a
scientific entity is relatively new (so 1 − r(e, t_0) is large), and a high lifetime ratio indicates that a scientific
entity is no longer new. The model fitting provides a prediction of df(e, t) beyond the observable
period based on data in the observable period.</p>
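The lifetime ratio of Eq. (1) reduces to a ratio of partial to total sums of yearly document frequencies; a minimal sketch, with illustrative yearly counts:

```python
def lifetime_ratio(df_by_year: dict[int, int], t0: int) -> float:
    """r(e, t0): cumulative document frequency up to t0 over the whole lifetime."""
    total = sum(df_by_year.values())
    upto_t0 = sum(c for year, c in df_by_year.items() if year <= t0)
    return upto_t0 / total

# Illustrative yearly document frequencies for one entity.
df = {2010: 2, 2011: 5, 2012: 8, 2013: 4, 2014: 1}
r = lifetime_ratio(df, 2012)   # (2 + 5 + 8) / 20 = 0.75
freshness = 1 - r              # 0.25
```

In practice the denominator uses the fitted df(e, t) extrapolated beyond the observable period, not raw counts as here.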
      </sec>
      <sec id="sec-3-2">
        <title>3.3. Informativity Weight</title>
        <p>
          In linguistics, informativity concerns the extent to which the contents of a text are already known or
expected as compared to unknown or unexpected [16]. The informativity of paper titles has been
commonly calculated by counting the number of “substantive” words. A diachronic analysis of informativity
was conducted on chemical paper titles [
          <xref ref-type="bibr" rid="ref1">17</xref>
          ], in which non-substantive words were defined as words
that convey little or no information, such as articles, prepositions, conjunctions, pronouns, and auxiliary
verbs. This objective approach was then extensively used by scholars to study title informativity. We
argue that whether a text (word, phrase, entity name) is informative or not depends not only on its
semantics but also on the relative frequency with which it appears in existing papers. At a certain time point t, a
scientific entity that appears in a large number of documents conveys relatively less new information
than an entity that appears only in a few documents. We calculate the informativity as the cumulative
time-dependent document frequency df_cum normalized by its range [df_min, df_max] across all scientific
entities in a document:
        </p>
        <p>w(e, t_0) = 1 − ( df_cum(e, t_0) − df_min ) / ( df_max − df_min ), (2)</p>
        <p>where df_cum(e, t_0) = Σ_{t=t_s}^{t_0} df(e, t), df_min = min { df_cum(e′, t_0) : e′ ∈ E }, and
df_max = max { df_cum(e′, t_0) : e′ ∈ E }, in which E is the set of scientific entities extracted from a document.
The FICE of documents in a quota Q is calculated as</p>
        <p>FICE_Q = Σ_{d ∈ Q} Σ_{e ∈ d} w(e, t_d) (1 − r(e, t_d)), (3)</p>
        <p>in which t_d is the time when document d was published.</p>
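Putting Eqs. (1)-(3) together, FICE for a quota can be sketched as follows. The toy document frequencies are illustrative, and setting w = 1 when all entities in a document tie is an assumption for the degenerate case the paper leaves unspecified.

```python
# Sketch of FICE for one quota, following Eqs. (1)-(3).

def df_cum(df_by_year, t0):
    """Cumulative document frequency of an entity up to year t0."""
    return sum(c for year, c in df_by_year.items() if year <= t0)

def lifetime_ratio(df_by_year, t0):
    return df_cum(df_by_year, t0) / sum(df_by_year.values())

def fice(quota, corpus_df):
    """quota: list of (publication_year, [entity, ...]) documents."""
    total = 0.0
    for t_d, entities in quota:
        cums = {e: df_cum(corpus_df[e], t_d) for e in entities}
        lo, hi = min(cums.values()), max(cums.values())
        for e in entities:
            # Informativity weight, Eq. (2); w = 1 assumed when all tie.
            w = 1.0 if hi == lo else 1 - (cums[e] - lo) / (hi - lo)
            total += w * (1 - lifetime_ratio(corpus_df[e], t_d))  # Eq. (3)
    return total

# Illustrative yearly document frequencies for two entities.
corpus_df = {
    "entity recognition": {2000: 1, 2005: 10, 2010: 20},
    "nuclearity rhetorical relation": {2005: 1, 2010: 2, 2015: 1},
}
score = fice([(2010, ["entity recognition",
                      "nuclearity rhetorical relation"])], corpus_df)
```

In this toy quota the frequent, saturated entity contributes nothing, while the rarer, fresher entity dominates the score, which matches the intent of the weighting.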
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Data Processing</title>
      <sec id="sec-4-1">
        <title>4.1. Data Collection</title>
        <p>
          We downloaded the metadata of all papers in the ACL Anthology [
          <xref ref-type="bibr" rid="ref2">18</xref>
          ] in BibTeX format. The publication
year and title of each paper were extracted from the BibTeX file. The number of papers published each
year from 1952 to 2020 is shown in Figure 1.
        </p>
        <p>[Figure 1: per-year counts of papers and of scientific entities (un-disambiguated, disambiguated, and
first-time-appearance disambiguated), with a 3-degree polynomial fit.]</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Scientific Entity Recognition</title>
        <p>
          We compared three off-the-shelf models for scientific entity recognition. (1) GPT-4 [
          <xref ref-type="bibr" rid="ref3">19</xref>
          ]. We construct
a zero-shot prompting template to extract scientific entities using GPT-4. The temperature is set to zero
to ensure consistent outputs. (2) SciBERT [
          <xref ref-type="bibr" rid="ref4">20</xref>
          ]. We used the named entity recognition implementation
from the Hugging Face library developed by AllenAI (https://huggingface.co/allenai/scibert_scivocab_cased). (3) SpaCy [
          <xref ref-type="bibr" rid="ref5">21</xref>
          ]. Entity recognition is performed
by invoking the entity recognition module from the Hugging Face library (https://huggingface.co/spacy/en_core_web_sm).
        </p>
        <p>
          To compare the performance of these models, we built a small benchmark dataset by manually
annotating 200 titles randomly selected from all ACL Anthology papers, following the annotation
guidelines in Wu et al. [
          <xref ref-type="bibr" rid="ref6">22</xref>
          ]. The F1-scores achieved by GPT-4, SciBERT, and SpaCy are 0.66, 0.05,
and 0.07, respectively, indicating that GPT-4 outperforms the other two models, so we adopt GPT-4
for recognizing scientific entities in all paper titles. The numbers of papers and scientific entities
recognized are shown in Figure 1. The average number of scientific entities per title is about 3.
        </p>
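The F1-scores above can be computed by treating recognition as exact set matching between predicted and manually annotated entities per title; a minimal sketch with illustrative entity sets:

```python
def f1_over_sets(gold: list[set[str]], pred: list[set[str]]) -> float:
    """Micro-averaged F1 over per-title entity sets (exact string match)."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # true positives
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # false positives
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# One illustrative title: one entity matched, one missed, one spurious.
gold = [{"named entity recognition", "biomedical papers"}]
pred = [{"named entity recognition", "sequence tagging"}]
score = f1_over_sets(gold, pred)  # precision = recall = 0.5, so F1 = 0.5
```

The actual annotation guidelines [22] may allow partial or normalized matches; exact matching is the simplest variant.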
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Scientific Entity Disambiguation</title>
        <p>Using the method described in Section 3.1, the threshold was calibrated based on the classification
performance evaluated against 180 manually labeled entity pairs. Entity pairs extracted from titles
within a quota were automatically labeled as “similar” or “not similar” based on the calculated similarity
scores and a threshold of 0.5.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Document Frequency Curve Fitting</title>
        <p>For each scientific entity  , we count the number of paper titles that contain  at Year  and plot the
document frequency chart. An example is shown in Figure 2.</p>
        <p>The document frequency chart for each scientific entity is fit using a composite of Gaussian profiles,
each having three parameters: peak amplitude, mean, and dispersion. Our fitting program employs a dynamic
tuning approach, where the number of peaks is inferred from the data using an algorithm based on
a comparison of neighboring values (scipy.signal.find_peaks). The center of each Gaussian is initialized at the detected peak
position; the amplitude is initialized randomly within a defined range; and the width of each Gaussian
is initialized based on the year range divided by the number of peaks. The fitting process iteratively
updates the parameters using gradient descent. We used Adam as the optimizer [<xref ref-type="bibr" rid="ref7">23</xref>] and the mean
squared error (MSE) as the loss function. To prevent overfitting, a regularization term is added to the
loss function, which penalizes excessively large or narrow peaks. The number of epochs and parameters
was automatically adjusted by the fitting algorithm.</p>
        <p>[Figure: number of unique entities per bin vs. year (1970–2020), for disambiguated entities at bin
sizes 125, 250, and 500, compared with un-disambiguated entities.]</p>
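A minimal sketch of the two ingredients described above: peak detection by comparison of neighboring values, and the sum-of-Gaussians model with the stated initialization. The yearly counts are illustrative, and the gradient-descent fitting itself is omitted.

```python
import math

def find_peaks_simple(series):
    """Peak = a value strictly greater than both neighbors."""
    return [i for i in range(1, len(series) - 1)
            if series[i] > series[i - 1] and series[i] > series[i + 1]]

def gaussian_mixture(x, params):
    """Composite of Gaussian profiles; params = [(amplitude, mean, sigma), ...]."""
    return sum(a * math.exp(-((x - m) ** 2) / (2 * s ** 2)) for a, m, s in params)

# Illustrative yearly document frequencies for one entity.
years = list(range(2000, 2011))
df = [0, 1, 4, 9, 4, 1, 2, 6, 12, 5, 1]

peaks = find_peaks_simple(df)                 # detects indices 3 and 8
# One profile per detected peak: mean at the peak year, initial width from
# the year range divided by the number of peaks; amplitude would be random
# in the actual fitting (here seeded with the observed peak height).
width = (years[-1] - years[0]) / len(peaks)
init = [(df[i], years[i], width) for i in peaks]
```

The real pipeline then refines `init` with Adam against an MSE loss plus a regularizer on peak size and width.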
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Calculating FICE</title>
        <p>After fitting the document frequency chart for a scientific entity e, the starting point t_s is determined
as the year when e first appeared, and t_e is determined as the year when the predicted document frequency falls
below 1, which may be beyond the time span of the observable period. We calculated the lifetime ratio
for each e in a title using Eq. (1) and the informativity weight using Eq. (2). The FICE for a given
quota Q is calculated using Eq. (3).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4.6. Cumulative Citation Count c_5</title>
      <p>
        We obtain the citations each paper received each year from the Semantic Scholar Academic Graph API [
        <xref ref-type="bibr" rid="ref8">24</xref>
          ]. The average 5-year cumulative citation count in 2015 is therefore
c_5(2015) = Σ_{y=2015}^{2019} c(y), averaged over the papers in a quota, in which c(y) is the number of
citations received by a paper in Year y.
        </p>
        <p>[Table 1: slopes of the linear fits of entity-based cognitive extent over the year ranges 1980–2000
and 2000–2020, for quota sizes |Q| = 125, 250, and 500.]</p>
    </sec>
    <sec id="sec-6">
      <title>5. Results</title>
      <sec id="sec-6-1">
        <title>5.1. Growth of Scientific Entity Diversity</title>
        <p>To distinguish from the original cognitive extent, we define entity-based cognitive extent as the unique
number of scientific entities extracted from paper titles within a quota. Similar to lexical diversity,
entity-based cognitive extent can be seen as a measure of the scientific entity diversity. By plotting
entity-based cognitive extent over time, we found that it gradually increases with time, which is
consistent with the trend of the original cognitive extent based on paper titles in Astronomy, Physics,
and Biomedicine [1]. We compared the trends calculated using disambiguated and undisambiguated
scientific entities (Figure 4) and found that undisambiguated entities linearly shifted the curve up but
did not significantly change the growth rate.</p>
        <p>To investigate how the growth rate of entity-based cognitive extent increases with time, we fit the
disambiguated entity-based data points using a linear function from 1980 to 2000 and another linear
function from 2000 to 2020, respectively. The slopes obtained for various quotas are tabulated in Table 1.
The results indicate that the entity-based cognitive extent increases at a decreasing rate.</p>
        <p>The entity-based cognitive extent vs. year relations for three quota sizes are illustrated in Figure 4.
Similar to the original cognitive extent, calculating the entity-based cognitive extent within a quota is
important to generate a consistent measure of the metric. Our quota is different from the ones used in
Milojević in two aspects. First, instead of a fixed number of phrases, we use a fixed number of titles.
This is because the informativity weight is normalized within scientific entities of a particular title
before weights of multiple titles are aggregated. Second, the quota sizes used in our study are smaller
than the quota used in the original cognitive extent, which is 3000–10000. For example, |Q| = 500
corresponds to about 1500 scientific entities. Data points at low quota sizes may suffer from saturation, meaning that the
cognitive extent value increases with quota size for a given time point, which is seen in Figure 4. To fix
the problem, a linear correction factor can be applied to “lift” data points in low quota to be aligned with
unsaturated data points. The exact correction factor may be domain-dependent and will be investigated
with a larger corpus in future studies.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>5.2. Correlation Between FICE and c_5</title>
      <p>We found that FICE exhibits a strong positive correlation with log c_5, as shown in Figure 4. The plots
were made by first ranking papers by c_5(2015) in ascending order and binning the sequence by a given
quota. For each data point, the x-coordinate is calculated as the logarithm of the average
c_5(2015) of papers in a quota, and the y-coordinate is the arithmetic average FICE of papers in the same
quota. The error bars are calculated as the standard deviation assuming a Gaussian distribution. The
Spearman correlation coefficients between FICE and log(c_5) for three quota sizes are shown in Table 2. To test
the influence of disambiguation on the correlation, we calculated FICE without entity disambiguation
and obtained similar Spearman correlation coefficients.</p>
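The correlation analysis above relies on Spearman's rank correlation; a self-contained sketch in pure Python (no SciPy), with tie-aware average ranks:

```python
import math

def ranks(values):
    """Average ranks (1-based), handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                      # extend the run of tied values
        avg = (i + j) / 2 + 1           # average rank for the tied run
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

For example, `spearman([1, 2, 3, 4], [10, 20, 25, 40])` is 1.0 because the two sequences are perfectly concordant in rank.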
      <p>Note that Figure 4 reveals a collective instead of an individual correlation because FICE represents the
weighted cognitive extent in a quota. This correlation does not apply to individual papers because of
the small number of scientific entities in a title. Therefore, the correlation does not imply that one could
increase the citation impact by simply using never-existing entity names. If the paper lacks true novelty
and significant contributions, newly introduced entity names are unlikely to be adopted in subsequent
research, rendering them “transient” with a short lifetime and a minimal contribution to FICE.</p>
      <p>[Figure: FICE vs. log c_5 at bin size 250; correlation coefficient = 0.748 (p = 0.008).]</p>
      <p>We compare FICE with three simplified versions to demonstrate the contributions of the lifetime
ratio and the informativity weight to the correlation above. These simplified versions are: (1) Dichotomous
Entity-based Cognitive Extent, calculated by counting the disambiguated unique scientific
entities in a quota; (2) Weight Only, calculated by simply summing the normalized weights (Eq. (2))
of all scientific entities; and (3) Lifetime Ratio Only, calculated by summing the unweighted lifetime
ratios 1 − r(e, t) of all scientific entities. Table 2 indicates that FICE exhibits the strongest correlation
among all variants. Both the lifetime ratio and the informativity weight contribute to this
strong correlation. The quota size influences the correlation coefficients. In particular, all simplified
versions exhibit a strong correlation at |Q| = 500. The lifetime ratio consistently exhibits a strong
correlation with log c_5.</p>
    </sec>
    <sec id="sec-8">
      <title>6. Conclusion</title>
      <p>We proposed FICE, which extends the original cognitive extent. FICE is calculated based on the lifetime
ratio and informativity of scientific entities extracted from paper titles within a quota. Using the ACL
Anthology, we found that the number of unique scientific entities per quota increased with time, consistent
with previous observations in other disciplines. FICE exhibits a strong positive correlation with the
average 5-year cumulative citation count, which may be used for predicting collective citations for
trending topics.</p>
    </sec>
    <sec id="sec-9">
      <title>7. Acknowledgments</title>
      <p>This paper was presented at the second Workshop on “Innovation Measurement for Scientific
Communication (IMSC) in the Era of Big Data” at 2024 ACM/IEEE Joint Conference on Digital Libraries
(JCDL).</p>
    </sec>
    <sec id="sec-10">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Tocatlian</surname>
          </string-name>
          ,
          <article-title>Are titles of chemical papers becoming more informative?</article-title>
          ,
          <source>Journal of the American Society for Information Science</source>
          <volume>21</volume>
          (
          <year>1970</year>
          )
          <fpage>345</fpage>
          -
          <lpage>350</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Radev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Muthukrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Qazvinian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abu-Jbara</surname>
          </string-name>
          ,
          <article-title>The acl anthology network corpus</article-title>
          ,
          <source>Language Resources and Evaluation</source>
          <volume>47</volume>
          (
          <year>2013</year>
          )
          <fpage>919</fpage>
          -
          <lpage>944</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <article-title>Language models are few-shot learners</article-title>
          , arXiv preprint arXiv:2005.14165 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>I.</given-names>
            <surname>Beltagy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cohan</surname>
          </string-name>
          ,
          <article-title>SciBERT: A pretrained language model for scientific text</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>
          , Association for Computational Linguistics, Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>3615</fpage>
          -
          <lpage>3620</lpage>
          . URL: https://www.aclweb.org/anthology/D19-1371. doi:10.18653/v1/D19-1371.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>M.</given-names>
            <surname>Honnibal</surname>
          </string-name>
          ,
          <source>spaCy's NER Model</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Ul Hoque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. W.</given-names>
            <surname>Reiske</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Weigle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. T.</given-names>
            <surname>Bradshaw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. D.</given-names>
            <surname>Gaff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kwan</surname>
          </string-name>
          ,
          <article-title>A comparative study of sequence tagging methods for domain knowledge entity recognition in biomedical papers</article-title>
          ,
          <source>in: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in</source>
          <year>2020</year>
          , JCDL '20, Association for Computing Machinery, New York, NY, USA,
          <year>2020</year>
          , p.
          <fpage>397</fpage>
          -
          <lpage>400</lpage>
          . URL: https://doi.org/10.1145/3383583.3398602. doi:10.1145/3383583.3398602.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ba</surname>
          </string-name>
          ,
          <article-title>Adam: A method for stochastic optimization</article-title>
          , in: Y. Bengio, Y. LeCun (Eds.),
          <source>3rd International Conference on Learning Representations, ICLR</source>
          <year>2015</year>
          , San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings
          . URL: http://arxiv.org/abs/1412.6980.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [24]
          AllenAI, Semantic Scholar API, https://www.semanticscholar.org/product/api,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>