<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>George Tsatsaronis</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Metrics and Trends in Assessing the Scientific Impact</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>George Tsatsaronis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Elsevier BV</institution>
          ,
          <addr-line>Radarweg 29, 1043NX Amsterdam</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>0000</year>
      </pub-date>
      <volume>0003</volume>
      <fpage>5</fpage>
      <lpage>15</lpage>
      <abstract>
        <p>The economy of science has traditionally been shaped around the design of metrics that attempt to capture several different facets of the impact of scientific works. Analytics and mining around (co-)citation and co-authorship graphs, also taking into account parameters such as time, scientific output per field, and active years, often provide the fundamental pieces of information considered in most of the widely adopted metrics. There are, however, many other aspects that can contribute further to the assessment of scientific impact, as well as to the evaluation of the performance of individuals and organisations, e.g., university departments and research centers. Such facets may cover, for example, the measurement of research funding raised, the impact of scientific works on patented ideas, or even the extent to which a scientific work constituted the basis for the birth of a new discipline or a new scientific (sub)area. In this work we present an overview of the most recent trends in novel metrics for assessing scientific impact and performance, the technical challenges of integrating a plethora of heterogeneous data sources in order to shape the necessary views for these metrics, and the novel information extraction techniques employed to facilitate the process.</p>
      </abstract>
      <kwd-group>
        <kwd>Scientific Impact</kwd>
        <kwd>Trends</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Metrics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Measuring the impact of science has traditionally been approached by
measuring the impact of scientific publications. Though the notion
of a scientific publication as the primary vessel for communicating science
has its roots in the 17th century, scientometrics originates from the
field of bibliometrics, which appeared several centuries later; in
fact, many attribute the origin of the field to Paul Otlet, one of the founders of
information science [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        However, it is the way we interpret the word "impact" that has driven
scientometrics research almost since its birth. In this paper we will not
attempt to add more interpretations to the existing ones; there is already a very
comprehensive set of such interpretations, which has resulted in a number of
academic and alternative metrics [
        <xref ref-type="bibr" rid="ref19 ref21">19, 21</xref>
        ]. The aim of this paper is to summarize
the information needs of the different stakeholders served by scientometrics and
to point to some recent research directions on how we can serve some of the
unaddressed needs. Below we discuss the most important users served by the
outcomes of scientometrics, as well as some of their most representative
information needs. Ultimately, serving all of these information needs entails combining
state-of-the-art data analytics, data visualization, natural language processing,
machine learning and information retrieval [12–14].
      </p>
      <p>Researchers: The primary users of such metrics; their major need is
awareness of their standing in their scientific fields. They also want to know the
most important journals in their field of research, the most prominent researchers
for collaborations, as well as the top universities and scientific (sub-)fields in
their areas. Furthermore, they would like to know the trends, as well as the
top funded areas and the respective funders in their field, in order to look for
funding opportunities.</p>
      <p>Universities: Their overall and field-specific standing in the academic
landscape is their primary need, which is used in turn for national assessments
and rankings. For their undergraduate and graduate study programs they need
to be constantly aware of how the different scientific fields and trends evolve
over time. Also, being aware of who the most prominent scientists are in each
research field is important for shaping hiring plans. Monitoring the funding
landscape, funding trends and opportunities is also important, as it affects the
shaping of their research strategy.</p>
      <p>Funders: Their most important information need is the ability to trace back
the research outcomes of their grants, as well as the overall impact these brought
to society. They are also interested in the top funded areas, the emerging scientific
fields and trends, as well as in knowing the overall standing of universities and
individual researchers, per field, which can be used, among other criteria, in the
assessment of research grant proposals.</p>
      <p>Journal Editors: They would like to know who the most prominent
researchers are within the scientific scope of their journals, as well as how that scope
evolves over time. This helps editors manage the editorial board and
potentially include the top experts in the respective fields. Analysis of the trends
might also lead to the creation of special issues, in order to give emphasis to the
most recent impactful works. They also need to be aware of the overall standing
of the journal in its fields of research.</p>
      <p>Reviewers: They need to be aware of the most important and impactful
articles in their scientific field. An analysis of the standing/ranking of the different
journals per field also helps assess and compare potentially relevant references
or material with impact that is at the core of the research described in the
reviewed article.</p>
      <p>Publishers: The ability to monitor the trends across all fields of science, as well
as an overview of the journals' rankings, top researchers, and universities, are the
primary information needs that publishers have from scientometrics.</p>
      <p>Science Journalists: Bridging the gap between the research community and
the rest of society, science journalists have as their primary information need the
impact of individual scientific articles. Trends, as well as journals' and
universities' rankings, are also very important.</p>
      <p>Tax Payers: Tax payers often need to understand the scientific and societal
impact of the research that was funded by state/public resources.</p>
      <p>Global Community/General Public: For instance, patients interested in
understanding novel research on diseases, or in understanding context and authority
(top institutions, journals, experts), and being able to distinguish high-quality
research work among all the noisy information available.</p>
      <p>It is evident that many of the aforementioned information needs require the
linking of multiple sources. For instance, providing an analysis of the
top funded research (sub)fields entails the ability to annotate scientific articles
with domain labels at different granularities, and the capacity to automatically
extract funding information from articles, as well as from the funders' reports of
research outcomes, and to combine these pieces of information. Further to
that, if besides the volume of funded articles the information need pertains to actual
amounts in different currencies, then, in addition to the aforementioned sources,
one would need to scrape grant information from funders' sites and
link the grants' metadata with the rest, to draw sums per field.</p>
      <p>As complicated as it appears to be, the communities of natural language
processing, machine learning, analytics and visualization combined already offer the
advanced techniques required to answer such complex
information needs. In the remainder of the paper we first provide an overview of the
current best practices in measuring scientific impact (Section 2), followed by
examples of novel, experimental technologies developed by Elsevier<sup>1</sup>, in collaboration
with research institutions and universities across the globe, to address the
complex landscape of scientometrics, serving all of the aforementioned stakeholders
(Section 3).</p>
    </sec>
    <sec id="sec-2">
      <title>Approaches</title>
      <p>
        Scientific impact in academia is primarily measured using citation-based
metrics. The principle behind all of these metrics is to model how knowledge
disseminates among scientists and their communities. There are also metrics
that capture the impact of scientific works by looking outside academia, e.g.,
alternative metrics that examine social media, news articles and the attention
that scientific works draw from the non-scientific public. In the following we give
a high-level overview of the most common such metrics, and we conclude
this section with some interesting experimental research works that utilize
alternative views of this data. For a more thorough overview of existing metrics,
the reader may wish to consult survey articles in the field, e.g., [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <sec id="sec-2-1">
        <title>1 https://www.elsevier.com/</title>
        <sec id="sec-2-1-1">
          <title>Author-level Metrics</title>
          <p>
            Some of the most common author-level metrics include the number of citations,
the author's h-index, the i10-index, and a very large number of variations
of increasing complexity (a comprehensive survey can be found in [
            <xref ref-type="bibr" rid="ref22">22</xref>
            ]),
most often weighted with regard to the scientific field or portfolio of the author.
          </p>
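          <p>As a minimal illustrative sketch of the two indices named above (not code from this work), the h-index is the largest h such that h of an author's papers have at least h citations each, and the i10-index counts the papers with at least 10 citations. The function names and the sample citation counts are hypothetical.</p>
          <preformat><![CDATA[
```python
from typing import Iterable


def h_index(citations: Iterable[int]) -> int:
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations: Iterable[int]) -> int:
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical per-paper citation counts for one author (illustrative only).
example_counts = [25, 12, 10, 8, 5, 3, 0]
author_h = h_index(example_counts)
author_i10 = i10_index(example_counts)
```
]]></preformat>
          <p>Field- or portfolio-weighted variants would rescale the citation counts before such a computation; how they do so differs per metric and is outside the scope of this sketch.</p>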
        </sec>
        <sec id="sec-2-1-2">
          <title>Article-level Metrics</title>
          <p>
            Article-level metrics (ALMs) quantify the reach and impact of published research
articles. Well-established citation databases, such as Scopus<sup>2</sup> [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ], integrate data
from various sources. For example, Scopus integrates the PlumX Metrics<sup>3</sup>, which
are a wide family of article-level metrics, along with traditional measures (such
as citations), to present a richer and more comprehensive picture of an
individual article's impact. Examples include citations, not only from other scientific
articles but also from clinical studies, patents and policies; usage (e.g., article
downloads, views, video plays); captures (e.g., bookmarks, code forks); mentions
(e.g., wiki mentions, news mentions); and social media (e.g., tweets).
          </p>
        </sec>
        <sec id="sec-2-1-3">
          <title>Journal-level Metrics</title>
          <p>At the journal level, one can compute some of the traditional metrics, e.g., the
h-index for the whole journal, or any of its variations. However, some additional,
time-bounded metrics have been more widely adopted for the assessment of a
journal. CiteScore metrics, for example, are a suite of indicators calculated from data
in Scopus. At its basis, CiteScore is the number of citations received in a
given year to publications from the previous three years, divided by the number of
publications in those same three years. The rest of the CiteScore metrics
are calculated based on this indicator. The SCImago Journal Rank (SJR) is
based on the concept of a transfer of prestige between journals via their citation
links. Drawing on an approach similar to Google's PageRank, SJR weights each
incoming citation to a journal by the SJR of the citing journal, with a citation
from a high-SJR source counting for more than a citation from a low-SJR source.
Like CiteScore, SJR accounts for journal size by averaging across recent
publications and is calculated annually. Source Normalized Impact per Paper (SNIP)
is a metric that intrinsically accounts for field-specific differences
in citation practices. It does so by comparing each journal's citations per
publication with the citation potential of its field, defined as the set of publications</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2 http://scopus.com/</title>
      </sec>
      <sec id="sec-2-3">
        <title>3 https://plumanalytics.com/learn/about-metrics/</title>
        <p>citing that journal. SNIP therefore measures contextual citation impact and
enables direct comparison of journals in different subject fields, since the value of
a single citation is greater for journals in fields where citations are less likely,
and vice versa. Last but not least, the Journal Impact Factor (JIF) is calculated
by Clarivate Analytics as the number of citations received in a
given year to a journal's publications from the previous two years, divided by the number
of citable publications in those two years.</p>
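        <p>CiteScore and JIF, as described above, share the same arithmetic shape and differ mainly in the length of the publication window and in the underlying data source. The following sketch illustrates that shape only; the function name, its parameters and the sample figures are hypothetical, and the caller is assumed to supply counts that already follow each provider's rules (for JIF, for instance, only citable items in the denominator).</p>
        <preformat><![CDATA[
```python
from typing import Mapping


def windowed_citation_ratio(
    citations_by_pub_year: Mapping[int, int],
    publications_by_year: Mapping[int, int],
    census_year: int,
    window: int,
) -> float:
    """Citations received in `census_year` to items published in the `window`
    preceding years, divided by the number of items published in those years."""
    previous_years = range(census_year - window, census_year)
    citations = sum(citations_by_pub_year.get(y, 0) for y in previous_years)
    publications = sum(publications_by_year.get(y, 0) for y in previous_years)
    if publications == 0:
        raise ValueError("no publications in the window")
    return citations / publications


# Hypothetical journal: citations received during 2019, keyed by the
# publication year of the cited item, and items published per year.
citations_2019 = {2016: 180, 2017: 150, 2018: 90}
published = {2016: 100, 2017: 110, 2018: 90}

# Three-year window, as in the CiteScore description above.
three_year = windowed_citation_ratio(citations_2019, published, 2019, window=3)
# Two-year window, as in the JIF description above.
two_year = windowed_citation_ratio(citations_2019, published, 2019, window=2)
```
]]></preformat>
        <p>SJR and SNIP do not reduce to such a simple ratio, since they additionally weight citations by the prestige of the citing journal or normalize by the citation potential of the field.</p>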
        <sec id="sec-2-3-1">
          <title>Experimental Methods</title>
          <p>
            The potential of working with the citation, co-citation, and co-authorship graphs
in the field of scientometrics has given birth to a number of novel ideas, primarily
by repurposing successful graph mining techniques. In many such research
works, e.g., [
            <xref ref-type="bibr" rid="ref18 ref3">3, 18</xref>
            ], the authors attempt to predict trends in the respective graphs,
e.g., citations, collaborations, and in general how these graphs are going to evolve
over time. Such methods enable earlier detection of impactful articles, as well as of
authors whose collaboration network and citations are growing fast (also known
in the literature as rising stars). Lately, there has also been interest in
modelling the performance of universities and research institutions and making
predictions about their future state regarding funding, ranking and other factors,
e.g., [
            <xref ref-type="bibr" rid="ref20">20</xref>
            ].
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Filling the Information Needs Gap</title>
      <p>In this section we present three novel research directions that add
granularity to some of the aforementioned metrics and help
address some of the information needs mentioned earlier that the current
metrics cannot serve.</p>
      <sec id="sec-3-1">
        <title>Funding</title>
        <p>Within the economy of the research market, funding bodies need to ensure that
they are awarding funds to the right research teams and topics so that they can
maximize the impact of the available funds. At the same time,
funding organisations require public access to funded research, following, for instance,
the US Government's policy that all federal funding agencies must ensure public
access to all articles and data resulting from federally funded research. As
a result, institutions and researchers are required to report on funded research
outcomes and to acknowledge the funding source and grants. In parallel, funding
bodies should be in a position to trace back these acknowledgements and justify
the impact and results of the funds they allocate to research to their stakeholders
and the tax payers alike. Researchers should also have access to such
information, which can help them make better-informed decisions during their
careers and discover appropriate funding opportunities for their
scientific interests, experience and profile.</p>
        <p>
          This situation creates unique opportunities for publishers, and more widely
the affiliated industry, to coordinate and develop novel solutions that can serve
funding agencies and researchers. A fundamental problem that needs to be
addressed, however, is the ability to automatically extract the funding information
from scientific articles, so that it can in turn become searchable in bibliographic
databases. We have addressed this problem by developing a novel technology
to automatically extract funding information from scientific articles [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], using
natural language processing and machine learning techniques. The pipeline is
engineered to accept a scientific article as input in raw text format
and to provide the detected funding bodies and associated grants as output
annotations. For the engineering of the final solution we have exhaustively tested a
number of state-of-the-art approaches for named entity recognition and
information extraction. The advantage of the developed technology lies in its ability
to learn how to combine a number of base classifiers, many of which are
open source and publicly available, in order to create an ensemble mechanism
that selects the best annotations from each approach.
        </p>
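        <p>The problem can be formulated as follows: given a scientific article as raw text
input, denoted as <inline-formula><tex-math>T</tex-math></inline-formula>, the automated extraction of funding information from text
translates into two separate tasks. First, identify all text segments <inline-formula><tex-math>t \in T</tex-math></inline-formula> which
contain funding information, and, second, process all the funding text segments
<inline-formula><tex-math>t</tex-math></inline-formula> in order to detect the set of funding bodies, denoted as <inline-formula><tex-math>\mathit{FB}</tex-math></inline-formula>, and the set of
grants, denoted as <inline-formula><tex-math>\mathit{GR}</tex-math></inline-formula>, that appear in the text. Provided that training
data is available, the former problem can be seen as a binary text classification
problem, where, given <inline-formula><tex-math>T</tex-math></inline-formula> and the set of all non-overlapping text segments <inline-formula><tex-math>t_i</tex-math></inline-formula> such
that <inline-formula><tex-math>\bigcup_i t_i = T</tex-math></inline-formula> (where <inline-formula><tex-math>t_i \in T</tex-math></inline-formula>), a trained binary classifier can decide the
class label of <inline-formula><tex-math>t_i</tex-math></inline-formula>, i.e., <inline-formula><tex-math>C_{t_i} = 1</tex-math></inline-formula> if <inline-formula><tex-math>t_i</tex-math></inline-formula> contains funding information, or <inline-formula><tex-math>C_{t_i} = 0</tex-math></inline-formula> if not.
The latter task can be mapped to a named entity recognition (NER) problem,
where, given all <inline-formula><tex-math>t_i</tex-math></inline-formula> for which <inline-formula><tex-math>C_{t_i} = 1</tex-math></inline-formula>, the objective is to recognize within them all
strings <inline-formula><tex-math>s</tex-math></inline-formula> such that either <inline-formula><tex-math>s \in \mathit{FB}</tex-math></inline-formula>, i.e., it represents a funding body, or <inline-formula><tex-math>s \in \mathit{GR}</tex-math></inline-formula>,
i.e., it represents a grant. There are a number of additional dimensions that one
may consider in the formulation of this problem, such as additional entities like
Programs or Projects, or detecting and labelling the funding relation between the
funding bodies and the authors, e.g., Monetary or In-kind. We argue that such
a technology can be used in combination with existing metrics to sufficiently
address a significant portion of the funders' and researchers' information needs
around funded articles and funding, respectively.</p>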
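        <p>The two tasks formulated above can be sketched as a small two-stage pipeline. This is a toy illustration under loudly stated assumptions, not the trained ensemble described in this section: the segment classifier is replaced by a cue-word heuristic, the NER step by two regular expressions, and all names, patterns and the sample text (including the grant identifier) are hypothetical.</p>
        <preformat><![CDATA[
```python
import re
from typing import List, NamedTuple, Set

# Stand-in for the trained binary classifier: cue words (an assumption).
FUNDING_CUES = ("funded by", "supported by", "grant")
# Stand-in for NER of funding bodies: capitalized phrase after a cue (an assumption).
BODY_PATTERN = re.compile(
    r"(?:[Ff]unded|[Ss]upported) by (?:the )?"
    r"((?:[A-Z][A-Za-z]*)(?: (?:of|for|and|[A-Z][A-Za-z]*))*)"
)
# Stand-in for NER of grants: identifier ending in a digit after "grant" (an assumption).
GRANT_PATTERN = re.compile(r"grants? (?:no\. |number )?([A-Z0-9][A-Za-z0-9/-]*\d)")


class FundingAnnotations(NamedTuple):
    funding_bodies: Set[str]  # the set FB
    grants: Set[str]          # the set GR


def split_into_segments(text: str) -> List[str]:
    """Split T into non-overlapping segments t_i (here: naive sentences)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def contains_funding_info(segment: str) -> int:
    """Task 1: return the class label C_{t_i}, 1 for funding text, else 0."""
    lowered = segment.lower()
    return int(any(cue in lowered for cue in FUNDING_CUES))


def extract_funding(text: str) -> FundingAnnotations:
    """Task 2: recognize FB and GR strings inside the segments labelled 1."""
    bodies: Set[str] = set()
    grants: Set[str] = set()
    for segment in split_into_segments(text):
        if contains_funding_info(segment) == 1:
            bodies.update(BODY_PATTERN.findall(segment))
            grants.update(GRANT_PATTERN.findall(segment))
    return FundingAnnotations(bodies, grants)


# Made-up acknowledgement text; the grant identifier is illustrative only.
sample = (
    "We thank the reviewers. This work was supported by the National Science "
    "Foundation under grant IIS-1234567. The authors declare no conflict."
)
annotations = extract_funding(sample)
```
]]></preformat>
        <p>In a realistic setting each stand-in would be replaced by a trained model, and the outputs of several base annotators would be combined, as the ensemble described above does.</p>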
      </sec>
      <sec id="sec-3-2">
        <title>Colouring of Citations</title>
        <p>
          As discussed earlier in the overview of the most common current scientometrics,
impact is primarily quantified, and not necessarily qualified, e.g., by counting
the number of citations. These metrics have drawn criticism because
they do not account for different qualitative aspects of citations. Negative citations or
self-citations [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] should be weighted differently from, for example,
affirmative or methodological citations. The question of qualitative bibliometrics
is, therefore, gaining more interest in the literature, and researchers are suggesting
different approaches to the problem, e.g., [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>
          The qualitative analysis of citation functions is not only important for
bibliometric purposes; it can also help researchers in their daily work. Browsing
references and lists of cited works is a time-consuming activity which can be
made easier by automatically highlighting the aspects a scholar is looking for.
This might be the case of a PhD student who is interested only in those works
cited because they use the same methods as the experiment she is studying, or in
those works cited because they agree on a specific theory. Having those specific
papers highlighted with a simple click would save precious time in the daily
routine of researchers. One of the first steps in this direction is the delineation
of a citation function schema that serves as a basis for an automatic citation
characterisation tool. This is not an easy task, considering the different features
and aspects that one has to take into account. Despite the indisputable value
of authors' motivations for citation, these might not be the only
characterizations a user is looking for while surveying references and lists of citations. For
this purpose, in collaboration with the University of Bologna<sup>4</sup>, we have conducted
a study to assess which of these functions are deemed important by scholars
[
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], and we have further developed a deep machine learning approach that can
automatically classify the type of each citation made in an article. The approach
is based on the fusion of sentence embeddings, section type semantic encodings,
main verb embeddings, and SciCite's predictions [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] into a transformer-based
model. As a result, citations can actually be qualified with this approach, and
the respective retrieval filters can be applied in production-facing platforms to filter
on papers cited for specific reasons/intents.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Novelty and Trends</title>
        <p>
          Elsevier SciVal's Topics of Prominence<sup>5</sup> provide a very comprehensive view
of how science can be organized into topics, by creating a topic model which
is primarily based on citations (e.g., [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]). Motivated by the interest that such
mining and analysis attracts, we are also exploring novel ways of addressing
the very important need of measuring trends and capturing new terminology
appearing in the various scientific fields.
        </p>
        <p>
          For this purpose, we have developed a deep learning approach [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and a topic
analysis-based approach [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] as research prototypes. Combined, they can provide
a thorough scan of the latest, novel and influential terminology across all,
or selected, scientific fields. The former approach learns feature representations
from a target document (whose terminological novelty is to be inferred) with
respect to the source document(s) using a Convolutional Neural Network (CNN),
and is based on a recent sentence embedding paradigm [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. We leverage this
idea and create a representation of the relevant target document relative to the
designated source document(s), which we call the Relative Document Vector (RDV).
We can then train a CNN with the RDVs of the target documents and, finally,
classify a document as terminologically novel or non-novel with respect to its
source documents.
        </p>
        <sec id="sec-3-3-1">
          <title>4 https://www.unibo.it/en</title>
        </sec>
        <sec id="sec-3-3-2">
          <title>5 https://www.elsevier.com/solutions/scival/releases/topic-prominence-in-science</title>
          <p>
            Next, we can apply the topic attentionality approach [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ] to these documents
to extract specific novel terminology per area. The motivation behind this
approach is to understand the velocity of the changes in the Inverse Document
Frequency (IDF) of terms, as shown in Figure 1. At some point in time, the
topic appears for the first time in the literature. Since it has not been discussed
before, at that point in time its IDF score will be high. After that point,
there might be a period where the topic acquires attention. During this period,
its IDF score will be dropping, as the topic is discussed more over time.
During this period one can also observe a negative velocity in the IDF curve,
since the score is becoming gradually smaller. The area below such a negative
velocity curve is in fact a positive area for the topic, as it describes the volume of
the attention the topic is receiving, an attention which is gradually increasing.
Further in time, the topic might be saturated by the research community, and
then the reverse phenomenon might be observed in the topic's IDF curve: a
positive velocity, since the topic is being discussed less over time from
that point on, meaning that it no longer receives as much attention.
The area under this positive velocity IDF curve is in fact a negative area for the
topic, as it quantifies the volume of attention the topic lost over time. In
principle, these two patterns, namely negative IDF velocity (topic attracts
attention) and positive IDF velocity (topic loses attention), might alternate for the
same topic over time, and are the two main motifs of the IDF values of the topic
measured over time. The ability to compute such metrics across all candidate
novel terms, and across fields, can sufficiently address the problem of detecting
(sub)field trends, and one could also trace back the origin/main contributors of
the shaping of new areas. One can also notice the relation of this idea to the
notion of delayed recognition in science [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ].
          </p>
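          <p>The IDF velocity idea described above can be illustrated with a small discrete sketch. It rests on assumptions that are not taken from the cited approach: IDF is computed per time step with a smoothed logarithmic formula, velocity is the first difference between consecutive time steps, and the areas are approximated by sums of those differences. The function names and the sample counts are hypothetical.</p>
          <preformat><![CDATA[
```python
import math
from typing import List, Sequence, Tuple


def idf_series(doc_freq: Sequence[int], corpus_size: Sequence[int]) -> List[float]:
    """Smoothed IDF of one term per time step: log((1 + N_t) / (1 + df_t))."""
    return [math.log((1 + n) / (1 + df)) for df, n in zip(doc_freq, corpus_size)]


def idf_velocity(idf: Sequence[float]) -> List[float]:
    """First difference of the IDF curve between consecutive time steps."""
    return [later - earlier for earlier, later in zip(idf, idf[1:])]


def attention_areas(velocity: Sequence[float]) -> Tuple[float, float]:
    """Return (attention gained, attention lost).

    Gained sums the magnitude of negative velocities (IDF dropping);
    lost sums the positive velocities (IDF rising again).
    """
    gained = sum(-v for v in velocity if v < 0)
    lost = sum(v for v in velocity if v > 0)
    return gained, lost


# Hypothetical term: documents mentioning it per year, in a corpus of 999
# documents per year. The term first rises in use, then fades.
doc_freq = [1, 9, 99, 49, 9]
corpus_size = [999] * 5

velocity = idf_velocity(idf_series(doc_freq, corpus_size))
gained, lost = attention_areas(velocity)
```
]]></preformat>
          <p>In this toy series the first two velocities are negative (the term attracts attention) and the last two are positive (the term loses attention), mirroring the two motifs discussed above.</p>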
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Summary</title>
      <p>In this paper we have provided an overview of the major users and recipients of
scientometrics output and analyses, along with their most representative
information needs. We have noted that there are still significant gaps in addressing
these needs, and we have discussed a few directions that can add more clarity
and granularity to existing metrics. The three directions, namely mining and
linking funding information, qualifying citations by classifying citation intent,
and detecting novelty and trends in scientific terminology, can enable the
development of novel scientometrics and can help close the gap by addressing the
remaining information needs.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abu-Jbara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ezra</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radev</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          :
          <article-title>Purpose and polarity of citation: Towards nlp-based bibliometrics</article-title>
          . In:
          <string-name>
            <surname>Vanderwende</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daumé III</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirchhoff</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (eds.)
          <article-title>Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics</article-title>
          , Proceedings, June 9-14,
          <year>2013</year>
          , Westin Peachtree Plaza Hotel, Atlanta, Georgia, USA. pp.
          <fpage>596</fpage>
          –
          <lpage>606</lpage>
          .
          <publisher-name>The Association for Computational Linguistics</publisher-name> (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Baas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schotten</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plume</surname>
            ,
            <given-names>A.M.</given-names>
          </string-name>
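          ,
          <string-name>
            <surname>Côté</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karimi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :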
          <article-title>Scopus as a curated, high-quality bibliometric data source for academic research in quantitative science studies</article-title>
          .
          <source>Quant. Sci. Stud</source>
          .
          <volume>1</volume>
          (
          <issue>1</issue>
          ),
          <fpage>377</fpage>
          –
          <lpage>386</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bai</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Predicting the citations of scholarly paper</article-title>
          .
          <source>J. Informetrics</source>
          <volume>13</volume>
          (
          <issue>1</issue>
          ),
          <fpage>407</fpage>
          –
          <lpage>418</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cohan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ammar</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Zuylen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cady</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Structural scaffolds for citation intent classification in scientific publications</article-title>
          . In:
          <string-name>
            <surname>Burstein</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Doran</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Solorio</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (eds.)
          <article-title>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis</article-title>
          , MN, USA, June 2-7,
          <year>2019</year>
          , Volume
          <volume>1</volume>
          (Long and Short Papers). pp.
          <fpage>3586</fpage>
          –
          <lpage>3596</lpage>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Conneau</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiela</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwenk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barrault</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bordes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Supervised learning of universal sentence representations from natural language inference data</article-title>
          . In:
          <string-name>
            <surname>Palmer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hwa</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riedel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP</source>
          <year>2017</year>
          , Copenhagen, Denmark, September 9-11,
          <year>2017</year>
          . pp.
          <fpage>670</fpage>
          –
          <lpage>680</lpage>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ghosal</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Edithal</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ekbal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhattacharyya</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsatsaronis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chivukula</surname>
            ,
            <given-names>S.S.S.K.</given-names>
          </string-name>
          :
          <article-title>Novelty goes deep. A deep neural solution to document level novelty detection</article-title>
          . In:
          <string-name>
            <surname>Bender</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Derczynski</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isabelle</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the 27th International Conference on Computational Linguistics, COLING 2018</source>
          , Santa Fe, New Mexico, USA, August 20-26,
          <year>2018</year>
          . pp.
          <fpage>2802</fpage>
          –
          <lpage>2813</lpage>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Iorio</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Limpens</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peroni</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rotondi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsatsaronis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Achtsivassilis</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Investigating facets to characterise citations for scholars</article-title>
          . In:
          <string-name>
            <surname>Beltran</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Osborne</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peroni</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vahdati</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (eds.)
          <source>Semantics, Analytics, Visualization - 3rd International Workshop, SAVE-SD 2017, Perth, Australia, April 3, 2017, and 4th International Workshop, SAVE-SD 2018, Lyon, France, April 24, 2018, Revised Selected Papers</source>
          .
          <series>Lecture Notes in Computer Science</series>
          , vol.
          <volume>10959</volume>
          , pp.
          <fpage>150</fpage>
          –
          <lpage>160</lpage>
          .
          <publisher-name>Springer</publisher-name>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kacem</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Flatt</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mayr</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Tracking self-citations in academic publishing</article-title>
          .
          <source>Scientometrics</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Kayal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Afzal</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsatsaronis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Doornenbal</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Katrenko</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gregory</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>A framework to automatically extract funding information from text</article-title>
          . In:
          <string-name>
            <surname>Nicosia</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pardalos</surname>
            ,
            <given-names>P.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giuffrida</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Umeton</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sciacca</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (eds.)
          <source>Machine Learning, Optimization, and Data Science - 4th International Conference, LOD 2018, Volterra, Italy, September 13-16, 2018, Revised Selected Papers</source>
          .
          <series>Lecture Notes in Computer Science</series>
          , vol.
          <volume>11331</volume>
          , pp.
          <fpage>317</fpage>
          –
          <lpage>328</lpage>
          .
          <publisher-name>Springer</publisher-name>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kayal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Groth</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsatsaronis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gregory</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Scientific topic attentionality: Influential and trending topics in science</article-title>
          . In:
          <source>Machine Learning, Optimization, and Data Science - 4th International Conference, LOD 2018, Volterra, Italy, September 13-16, 2018, Revised Selected Papers</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Klavans</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boyack</surname>
            ,
            <given-names>K.W.</given-names>
          </string-name>
          :
          <article-title>Research portfolio analysis and topic prominence</article-title>
          .
          <source>J. Informetrics</source>
          <volume>11</volume>
          (
          <issue>4</issue>
          ),
          <fpage>1158</fpage>
          –
          <lpage>1174</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Mayr</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frommholz</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cabanac</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Report on the 7th international workshop on bibliometric-enhanced information retrieval (BIR 2018)</article-title>
          .
          <source>SIGIR Forum</source>
          <volume>52</volume>
          (
          <issue>1</issue>
          ),
          <fpage>135</fpage>
          –
          <lpage>139</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Mayr</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frommholz</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cabanac</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chandrasekaran</surname>
            ,
            <given-names>M.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaidka</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolfram</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Introduction to the special issue on bibliometric-enhanced information retrieval and natural language processing for digital libraries (BIRNDL)</article-title>
          .
          <source>Int. J. on Digital Libraries</source>
          <volume>19</volume>
          (
          <issue>2-3</issue>
          ),
          <fpage>107</fpage>
          –
          <lpage>111</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Mayr</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scharnhorst</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Scientometrics and information retrieval: weak-links revitalized</article-title>
          .
          <source>Scientometrics</source>
          <volume>102</volume>
          (
          <issue>3</issue>
          ),
          <fpage>2193</fpage>
          –
          <lpage>2199</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Min</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pei</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Measuring delayed recognition for papers: Uneven weighted summation and total citations</article-title>
          .
          <source>J. Informetrics</source>
          <volume>10</volume>
          (
          <issue>4</issue>
          ),
          <fpage>1153</fpage>
          –
          <lpage>1165</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Mingers</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leydesdorff</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>A review of theory and practice in scientometrics</article-title>
          .
          <source>Eur. J. Oper. Res</source>
          .
          <volume>246</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>19</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Otlet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <source>Traité de documentation: le livre sur le livre, théorie et pratique</source>
          / par Paul Otlet; préf. de Robert Estivals, av.-pr. de André Canonne. Centre de lecture publique de la Communauté française de Belgique, Ed. Mundaneum-Palais mondial (
          <year>1934</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Panagopoulos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsatsaronis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varlamis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Detecting rising stars in dynamic collaborative networks</article-title>
          .
          <source>J. Informetrics</source>
          <volume>11</volume>
          (
          <issue>1</issue>
          ),
          <fpage>198</fpage>
          –
          <lpage>222</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Ravenscroft</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liakata</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clare</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duma</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Measuring scientific impact beyond academia: An assessment of existing impact metrics and proposed improvements</article-title>
          .
          <source>PLoS One</source>
          <volume>12</volume>
          (
          <issue>3</issue>
          ),
          <elocation-id>e0173152</elocation-id>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Rouse</surname>
            ,
            <given-names>W.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lombardi</surname>
            ,
            <given-names>J.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Craig</surname>
            ,
            <given-names>D.D.</given-names>
          </string-name>
          :
          <article-title>Modeling research universities: Predicting probable futures of public vs. private and large vs. small research universities</article-title>
          .
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>115</volume>
          (
          <issue>50</issue>
          ),
          <fpage>12582</fpage>
          –
          <lpage>12589</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Todeschini</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baccini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <source>Handbook of Bibliometric Indicators: Quantitative Tools for Studying and Evaluating Research</source>
          .
          <publisher-name>Wiley-VCH</publisher-name>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Wildgaard</surname>
            ,
            <given-names>L.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schneider</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Larsen</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>A review of the characteristics of 108 author-level bibliometric indicators</article-title>
          .
          <source>Scientometrics</source>
          <volume>101</volume>
          (
          <issue>1</issue>
          ),
          <fpage>125</fpage>
          –
          <lpage>158</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>