<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Preprint abstracts in times of crisis: a comparative study with the pre-pandemic period</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Frederique Bordignon</string-name>
          <email>frederique.bordignon@enpc.fr</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Liana Ermakova</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marianne Noel</string-name>
        </contrib>
        <aff>Ecole des Ponts ParisTech, Marne-La-Vallée, France</aff>
        <aff>Univ Brest, Brest, France</aff>
        <aff>LISIS, INRAE, Université Gustave Eiffel, Marne-la-Vallée, France</aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <fpage>37</fpage>
      <lpage>44</lpage>
      <abstract>
        <p>The urgency to respond to the COVID-19 outbreak has driven an unprecedented surge in preprints that aim to speed up knowledge dissemination as they are available much sooner than peer-reviewed publications. In this study we consider abstracts of research articles and preprints as main entry points that draw attention to the most important information of the document and that try to entice us to read the whole article. In this paper, we try to capture and examine shifts in scientific abstract writing produced at the very beginning of the pandemic. We made a comparative study of abstracts in terms of their informativeness associated with preprints issued in response to the COVID-19 pandemic and those produced in 2019, the closest pre-pandemic period. Our results clearly differ from one preprint server to another and show that there are community-centered habits as regards writing and reporting results. The preprints issued from the arXiv, ChemRxiv and Research Square servers tend to have more informative (generous) abstracts than the ones submitted to the other servers. In four servers, the ratio of structured abstracts decreases with the pandemic.</p>
      </abstract>
      <kwd-group>
        <kwd>Scientific abstract</kwd>
        <kwd>preprints</kwd>
        <kwd>academic writing</kwd>
        <kwd>informativeness</kwd>
        <kwd>COVID-19</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The urgency to respond to the COVID-19 pandemic (declared on March 11, 2020 by the WHO)
has driven an unprecedented surge in preprints [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] that aim to speed up knowledge dissemination as
they are available much sooner than peer-reviewed publications. The International Committee of
Medical Journal Editors stated that pre-publication dissemination of information critical to public health
would not prejudice journal publication in the context of health emergencies declared by the WHO [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Although researchers respond quickly to these emergencies, as Zhang et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] show in their
comparative study of the response patterns of academia to the outbreaks of four viruses (Ebola, H1N1,
Zika and SARS), most articles are published after an epidemic is over. This has been highlighted by a
number of studies about the academic response to different epidemic outbreaks: Xing et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] on the
2003 SARS epidemic, Rabaan et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] on the MERS-CoV disease in Saudi Arabia from 2013 to 2015,
and Kobres et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] on the Zika outbreak. For Xing et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], possible reasons for this publication delay
include “the time taken by authors to prepare and undertake their studies, to write and submit their
papers, and, possibly, their tendency to first submit their results to high profile journals”.
      </p>
      <p>A preprint is the version of an academic article that precedes submission for peer review and
acceptance for publication. Preprints related to the COVID-19 crisis are characterized by the
urgency of the expected response, unlike papers published several months after the end of a crisis.
We therefore study preprints rather than published papers.</p>
      <p>2021 Copyright for this paper by its authors.</p>
    </sec>
    <sec id="sec-2">
      <title>2. State-of-the-art</title>
    </sec>
    <sec id="sec-3">
      <title>2.1. Preprints as a response to the crisis</title>
      <p>
        Although the problem of overfill is acute during the COVID-19 pandemic, it had already arisen
in the same terms during the previous health crises mentioned above. In an editorial published in 2010
entitled “Journals, Academics, and Pandemics”, the PLoS Medicine Editors highlighted an “inherent
limitation in the journal publication system with regard to rapid dissemination of results in a time of
crisis” [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Johansson et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] suggested that preprints could provide a solution: they showed that
preprints posted online during the Ebola and Zika outbreaks proposed novel analyses and new data, and
sped up knowledge dissemination, as most of those that were matched to later peer-reviewed
publications were available more than 100 days before the publication. Fewer than 5% of Ebola and Zika
journal articles were posted as preprints prior to publication in journals, and thus without being
peer-reviewed. Although many have warned against considering such papers as more than just a “work in
progress” [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ] or an “interim research product” [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], preprints are the solution researchers
have preferred for contributing to research on COVID-19. They also provide us with an opportunity to
access a unique written output, produced during the crisis itself, and to seek to identify differences in
writing practices. Researchers’ guidance evolved with the data and science, the pace of which is rapid
during a pandemic of a novel disease. This led to an interdisciplinary response with preprints deposited
on various servers. At present, a range of platform types exists, run by either for-profit or non-profit entities.
These include discipline-specific platforms (e.g., arXiv<sup>3</sup>, bioRxiv<sup>4</sup>, ChemRxiv<sup>5</sup>, medRxiv<sup>6</sup>) and generic
platforms (e.g., Preprints.org), the latter hosting articles from across a range of disciplines.
      </p>
    </sec>
    <sec id="sec-4">
      <title>2.2. Abstract informativeness metrics</title>
      <p>
        To answer the question of whether it is worth the effort to read full texts or whether the abstract
(along with the title and keywords, all of which are freely available) could be sufficient to gain a clear
idea of a scientific paper, researchers compare the content of the abstract with the content of the
associated full-text. They have established two types of metrics to estimate the quantity of the
information given in a summary: (1) questionnaire-based metrics and (2) overlap-based metrics [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        In the case of questionnaire-based metrics, to compute the level of the retained information, a set of
questions issued from the input texts is built and assessors answer these questions reading only the
summaries [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
      <sec id="sec-4-1">
        <title>Footnotes: 3 https://arxiv.org/; 4 https://www.biorxiv.org/; 5 https://chemrxiv.org/; 6 https://www.medrxiv.org/</title>
        <p>
          Alternatively, an assessor may be asked to evaluate the importance of each
sentence/passage. One example of questionnaire-based measures is the Responsiveness metric
introduced at the Document Understanding Conference (DUC) [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. Because metrics of this type rely on expert judgment,
they cannot be re-used on new summaries.
        </p>
        <p>
          The Pyramid score lies between the questionnaire-based and overlap-based metrics: its
idea is to count the repetitions of information units of variable length within a sentence, as
labeled by experts in their own words [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. However, since the Pyramid score relies on manual
assessment not only of the reference summaries but also of the candidate ones, it cannot be re-used to
measure the information quantity in new summaries.
        </p>
        <p>
          The main idea of overlap-based measures is to estimate the proportion of shared words between the
gold-standard (i.e. reference) summary and the summary under consideration. One of the most widely
used measures is the family of ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics
proposed at the DUC Conference [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. While ROUGE is recall oriented, BLEU (Bilingual Evaluation
Understudy) is a modified form of precision of a candidate against multiple references [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. The
METEOR (Metric for Evaluation of Translation with Explicit ORdering) metric is similar to BLEU but
it is able to treat spelling variants and synonyms [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ].
        </p>
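        <p>As a minimal sketch of the overlap idea (not the official ROUGE or BLEU implementations), the unigram case can be written in Python as follows. The tokenizer, the function names and the example sentences are illustrative assumptions; the actual metrics add higher-order n-grams, multiple references and, for BLEU, a brevity penalty.</p>
        <preformat>
```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase a text and split it into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_overlap(candidate, reference):
    """Count the unigrams shared by candidate and reference, clipped by frequency."""
    cand_counts = Counter(tokenize(candidate))
    ref_counts = Counter(tokenize(reference))
    return sum(min(count, ref_counts[token]) for token, count in cand_counts.items())


def rouge1_recall(candidate, reference):
    """Recall-oriented score: shared unigrams divided by the reference length."""
    ref_len = len(tokenize(reference))
    return unigram_overlap(candidate, reference) / ref_len if ref_len else 0.0


def unigram_precision(candidate, reference):
    """Precision-oriented score (the unigram core of BLEU, without brevity penalty)."""
    cand_len = len(tokenize(candidate))
    return unigram_overlap(candidate, reference) / cand_len if cand_len else 0.0


# Hypothetical sentences, for illustration only.
reference = "preprints speed up knowledge dissemination during a crisis"
candidate = "preprints speed up dissemination"
recall = rouge1_recall(candidate, reference)        # 4 shared tokens out of 8
precision = unigram_precision(candidate, reference)  # 4 shared tokens out of 4
```
        </preformat>
        <p>In this toy case the recall is 0.5 and the precision is 1.0: the short candidate contains only reference words but covers half of the reference, which illustrates why a recall-oriented and a precision-oriented metric can rank the same summary differently.</p>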
        <p>
          As argued in some papers [
          <xref ref-type="bibr" rid="ref17 ref24">17, 24</xref>
          ], metrics based on vocabulary overlap are not suitable for
measuring the quantity of information retained in a summary with regard to its corresponding full
text, since they do not measure the importance of the information presented in different article sections.
These metrics are designed to rank candidate summaries (i.e. to answer the question: Which summary is
better?), but they cannot compare an isolated summary with the full text, nor can their scores be
compared across summaries of different documents. Thus, they are not able to answer
the question: Is this summary a true representation of the content of the full text? The second reason
why overlap-based metrics fail to answer this question is that their output values are hard to
interpret: in practice they tend to be small, e.g. a ROUGE score is usually below 0.2 (see [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]
for more details).
        </p>
        <p>
          Thus, to overcome these issues we introduced a metric called GEM (GEnerosity Measure) [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]. The
GEM metric considers the importance of the different sections of a scientific paper based on the
comparison of sections from the full paper and sentences from the abstract. The metric GEM is similar
to the classification of sentences from medical publication abstracts proposed by Dernoncourt and Lee
[
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] who released a PubMed dataset for sequential sentence classification where "each sentence of each
abstract is labeled with their role in the abstract using one of the following classes: background,
objective, method, result, or conclusion". The automatic classification of sentences in medical scientific
abstracts was also addressed in Jin and Szolovits' work [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]. In contrast to these two works, GEM is
not limited to biomedical texts. Thus, we decided to use GEM for our study as (1) it is designed for the
analysis of scientific abstracts and is not limited to medical research, (2) it provides interpretable results,
and (3) it is publicly available [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>3. Data and Methodology</title>
      <p>
        We used a corpus of 23,957 preprints available online [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. This corpus of preprints is
designed to allow the comparison of abstracts before and during the crisis. It is based on data indexed
by Dimensions<sup>7</sup> and Lens<sup>8</sup>. These preprints come from the following seven servers: SSRN<sup>9</sup>, arXiv,
medRxiv, bioRxiv, Research Square<sup>10</sup>, Preprints.org and ChemRxiv. Similar to Fraser et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], to extract
the subset of preprints related to COVID-19 (COVID-19 corpus) the following query was used:
"coronavirus" OR "COVID-19" OR "sars-cov" OR "ncov-2019" OR "2019-ncov". The COVID-19 corpus contains
3,341 preprints (and their metadata) deposited since January 1, 2020 and retrieved on April 12, 2020,
from the seven preprint servers mentioned above. The control corpus (Pre-pandemic
Corpus) contains preprints posted on the same preprint servers in 2019, on similar
subjects, which we could be almost sure were written by the same communities.
      </p>
      <sec id="sec-5-1">
        <title>Footnotes: 7 https://www.dimensions.ai/dimensions-apis/; 8 https://www.lens.org/; 9 https://www.ssrn.com/index.cfm/en; 10 https://www.researchsquare.com/</title>
        <p>
          The control corpus was built so as to be comparable, i.e. to deal with topics comparable to those of the COVID-19 corpus
(e.g. virology, immunology, health policies). This makes it possible to exclude preprints about new or
different topics raised by the pandemic and to ensure that we are assessing similar written content. All
queries are available online in the dataset [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ].
        </p>
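        <p>The OR query quoted above can be illustrated with a small Python filter. This is only a sketch under stated assumptions: the record layout (dictionaries with 'title' and 'abstract' keys) and the example records are hypothetical, and the fields actually searched by Dimensions and Lens are not specified in this paper.</p>
        <preformat>
```python
# Query terms quoted in the text; OR semantics means any single term is enough.
COVID_TERMS = ("coronavirus", "covid-19", "sars-cov", "ncov-2019", "2019-ncov")


def matches_covid_query(record):
    """Return True if any query term occurs in the title or abstract.

    Assumption: records are dicts with optional 'title' and 'abstract' keys.
    Matching is a case-insensitive substring search.
    """
    text = " ".join(record.get(field) or "" for field in ("title", "abstract")).lower()
    return any(term in text for term in COVID_TERMS)


# Hypothetical records, for illustration only.
records = [
    {"title": "Modelling the spread of 2019-nCoV", "abstract": "We fit a SEIR model."},
    {"title": "Influenza vaccination uptake", "abstract": "A survey of health policies."},
    {"title": "Drug repurposing", "abstract": "Candidates against SARS-CoV-2 proteases."},
]
covid_corpus = [record for record in records if matches_covid_query(record)]
```
        </preformat>
        <p>With these three hypothetical records, the first and third are kept. Substring matching means that "sars-cov" also retrieves "SARS-CoV-2", which is presumably the intended behaviour of such a query.</p>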
        <p>We tried to use the Unpaywall API<sup>11</sup> to find the URLs of the preprints' full texts, but too many files were
not available via this tool, especially preprints on ChemRxiv, Research Square and SSRN. As a
consequence, we semi-manually retrieved as many full texts as possible in HTML or PDF format.</p>
        <p>
          In order to assess whether the COVID-19 crisis changed writing habits, we evaluated the
informativeness of the abstracts of the retrieved preprints using the
GEM score. The methodology and the rules are fully described in [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]; in short,
the GEM score is calculated as the sum of the weights of the section classes retrieved both from a
summary (the abstract in our case) and from the full text, normalized by the total sum of weights of
the section classes of the full text. Thus, a higher GEM score (i.e. close to 1) corresponds to a higher level of
abstract generosity (informativeness), while 0 corresponds to an ungenerous abstract and -1 is assigned
when the GEM calculation is unreliable. As in [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ], we consider that the GEM score is reliable if at
least four out of the seven section classes (Introduction, Methods, Results, Conclusion, Objectives,
Limits, and Perspectives) are automatically identified in the full text using the GROBID tool [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] for
section splitting and their classification algorithm. The section weights were obtained from an online
survey conducted among the scientific community. Each sentence of an abstract is assigned
the class of the full-text sentence with which it has the maximal cosine similarity.
        </p>
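        <p>The computation described above can be sketched in Python as follows. This is an illustration of the description given here, not the released GEM implementation: the weights are hypothetical (the published ones come from the survey), the full-text sentences are assumed to be already labelled with a section class, and a plain bag-of-words cosine similarity is assumed because the vector representation is not detailed in this section.</p>
        <preformat>
```python
from collections import Counter
import math
import re

# Hypothetical weights; the published weights come from the authors' survey.
WEIGHTS = {"introduction": 1.0, "methods": 2.0, "results": 3.0, "conclusion": 2.0,
           "objectives": 2.0, "limits": 1.0, "perspectives": 1.0}
MIN_CLASSES = 4  # reliability threshold stated in the text (four of seven classes)


def bag_of_words(sentence):
    """Term-frequency vector of a sentence (simple lowercase tokenization)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def classify_abstract(abstract_sentences, labelled_full_text):
    """Give each abstract sentence the class of its most similar full-text sentence."""
    classes = set()
    for sentence in abstract_sentences:
        vec = bag_of_words(sentence)
        best = max(labelled_full_text, key=lambda item: cosine(vec, bag_of_words(item[0])))
        classes.add(best[1])
    return classes


def gem_score(abstract_sentences, labelled_full_text):
    """Return the GEM-like score, or -1 when too few section classes are identified."""
    full_text_classes = {label for _, label in labelled_full_text}
    if len(full_text_classes) >= MIN_CLASSES:
        shared = classify_abstract(abstract_sentences, labelled_full_text)
        covered = sum(WEIGHTS[c] for c in shared.intersection(full_text_classes))
        return covered / sum(WEIGHTS[c] for c in full_text_classes)
    return -1.0


# Hypothetical labelled full text (sentence, section class) and abstract.
full_text = [
    ("Preprints spread quickly during the pandemic.", "introduction"),
    ("We collected abstracts from seven servers.", "methods"),
    ("Scores increased on three servers.", "results"),
    ("Writing habits depend on the community.", "conclusion"),
]
abstract = ["We collected abstracts from seven preprint servers.",
            "Scores increased on three of the servers."]
score = gem_score(abstract, full_text)
```
        </preformat>
        <p>In this toy example the abstract covers the Methods and Results classes (weights 2 and 3) out of a full text whose four classes weigh 8 in total, so the score is 0.625. Dropping one labelled section from the full text leaves only three classes and the function returns -1.</p>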
        <p>
          We obtained reliable GEM scores (i.e. GEM ≥ 0) for 74% of the preprints in the whole corpus, with
proportions that differ among the servers (see Table 1 and the dataset available for reuse [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]).
        </p>
        <p>
          The comparison of abstracts with papers’ sections is related to the analysis of their structure.
Structured abstracts are an emerging trend since they tend to be informative [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]. A structured abstract
is an abstract with distinct, labeled sections for rapid comprehension (Medline/PubMed 2018). The
IMRAD format (Introduction, Methods, Results, and Discussion) or the CONSORT guidelines for
reporting randomized controlled trials (RCTs) are commonly used. Journal guidelines describe how to
prepare contributions for submission. Some journals have precise guidelines for what an abstract must
include and how it should be structured. Most journals ask for between 150 and 200 words for traditional
abstracts (i.e., those without subheadings). Structured abstracts, which are divided into a number of
named sections, can be longer than traditional ones [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ].
        </p>
        <p>For each abstract, we determine whether it is structured by checking for the presence of one of
the following words within the first 50 characters of the abstract: "background, purpose, objective, aim,
introduction, rationale, importance". We assume that the presence of one of these words is a reliable
indication that the abstract is structured, but this method needs further evaluation.</p>
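        <p>A minimal Python sketch of this heuristic is given below. The case-insensitive matching and the word-boundary requirement at the start of each cue word are assumptions made for the illustration; the exact matching rule used in the study is not specified here.</p>
        <preformat>
```python
import re

# Cue words listed in the text.
CUE_WORDS = ("background", "purpose", "objective", "aim",
             "introduction", "rationale", "importance")
# Assumption: a cue word must start at a word boundary, so "claim" does not match "aim",
# while plural or inflected forms such as "Objectives" still match.
CUE_PATTERN = re.compile(r"\b(" + "|".join(CUE_WORDS) + r")", re.IGNORECASE)
WINDOW = 50  # number of leading characters inspected


def is_structured(abstract):
    """Return True if a cue word occurs within the first 50 characters of the abstract."""
    return CUE_PATTERN.search(abstract[:WINDOW]) is not None


# Hypothetical abstracts, for illustration only.
structured = is_structured("Background: Preprints have surged. Methods: We compare corpora.")
unstructured = is_structured("The urgency to respond to the outbreak has driven a surge.")
```
        </preformat>
        <p>Such a rule is cheap but imperfect: an unstructured abstract that opens with "We aimed to" would also be flagged, which is one reason why the method needs further evaluation.</p>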
      </sec>
    </sec>
    <sec id="sec-6">
      <title>4. Results</title>
      <p>11 https://unpaywall.org/products/api</p>
      <p>As far as structure is concerned, our first overview shows that abstracts differ markedly
from one server to another, and therefore probably from one community to another (Fig 1).
Fewer than 5% of abstracts are structured on bioRxiv, arXiv and ChemRxiv, and there is no big
difference between the two corpora. For instance, most chemistry journals that feed from
ChemRxiv require graphical abstracts rather than structured ones. In contrast, abstracts of preprints
deposited on Research Square are usually structured, which was the case for up to 97% of pre-pandemic
abstracts. Since the crisis, however, fewer abstracts are structured (69%). On medRxiv, SSRN and
Preprints.org, there is also a decline in the proportion of structured abstracts. This decrease
could be explained by the fact that authors tried to share their results as soon as
possible and, as such, may have favored preprint servers. In contrast to journals, those venues do not
request structured abstracts.</p>
      <p>
        Shah et al. [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ] showed that even though abstracts display many keywords in a small space, there is
much more relevant information in the rest of the article. Thus, we decided to calculate the GEM score
alongside the abstract structure analysis. Structured abstracts can be viewed as an attempt to
summarize each section of the document. In contrast, the GEM score shows which sentence of the full text is the
closest to each abstract sentence. Fig 2 gives a comprehensive synthesis of our results, showing differences among
the servers that reveal writing habits specific to scientific communities. This trend is evident for four of
the seven preprint servers. The GEM score varies depending on the server, with the greatest increase for the
abstracts of arXiv preprints. In contrast to the results of the abstract structure analysis, the GEM score
is higher for the COVID-19 preprints. One possible explanation is that the authors of the preprints tried
to share more information in order to attract potential readers to their full text. This contrasts with
studies that consider structured abstracts to be generally more informative [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ]. This contradiction
needs further study to be explained.
      </p>
      <p>The GEM score of abstracts varies a great deal between servers, as shown in Table 2. On
Preprints.org, abstracts have been less generous since the pandemic started (-12.20%). On medRxiv,
SSRN and bioRxiv, the GEM score barely decreases. In contrast, on ChemRxiv and
Research Square, the abstracts are clearly more generous (+7.07% and +12.28%) than they used to be,
and on arXiv, we note a marked increase of +17.49%.</p>
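      <p>Assuming that the percentages above are relative changes in the mean GEM score of a server between the two corpora (an interpretation, since the formula is not given here), they can be computed as in the following Python sketch. The two input values are hypothetical and are not taken from Table 2.</p>
      <preformat>
```python
def relative_change_percent(before, after):
    """Relative change from the pre-pandemic value to the COVID-19 value, in percent."""
    if before == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (after - before) / before * 100.0


# Hypothetical mean GEM scores, not values from Table 2.
change = relative_change_percent(0.50, 0.55)
label = f"{change:+.2f}%"
```
      </preformat>
      <p>With these hypothetical means the label reads "+10.00%"; a negative value would indicate less generous abstracts in the COVID-19 corpus.</p>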
      <p>With the exception of SSRN, for which the GEM score remained stable, it can be seen that the
servers with the lowest scores in 2019 tended to increase and those with the highest scores tended to
decrease: the range of the scores narrowed, going from [0.466; 0.679] to [0.526; 0.637]. This
indicates a shift towards the homogenization of abstracts so as to make them sufficiently informative for
a larger number of readers, including people beyond the scientific community.
      </p>
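      <p>The narrowing of the range can be made explicit with a few lines of Python, using only the interval bounds quoted in the text. The variable names are illustrative.</p>
      <preformat>
```python
# Bounds of the per-server GEM scores quoted in the text.
range_pre_pandemic = (0.466, 0.679)
range_covid = (0.526, 0.637)


def width(bounds):
    """Width of a closed interval given as (lower, upper)."""
    lower, upper = bounds
    return upper - lower


width_pre_pandemic = round(width(range_pre_pandemic), 3)
width_covid = round(width(range_covid), 3)
# Share of the pre-pandemic width that disappeared in the COVID-19 corpus.
narrowing = round(1 - width_covid / width_pre_pandemic, 3)
```
      </preformat>
      <p>The width of the interval goes from 0.213 to 0.111, i.e. it shrinks by roughly 48%.</p>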
    </sec>
    <sec id="sec-7">
      <title>5. Conclusion</title>
      <p>We found that the general trend for preprint servers is a decrease in the ratio of structured abstracts
during the pandemic. We suppose that authors favored preprint servers to speed up knowledge
dissemination. As a consequence, structured abstracts are less frequent, since these servers do not
request structured abstracts in their submission guidelines.</p>
      <p>Our study shows that the rate of abstract generosity varied widely depending on the server. The
highest increase was found on the arXiv server, whose readers had certainly been limited to a community
of scientists used to posting there. However, the COVID-19 crisis has attracted a range of
new readers to arXiv (researchers developing modelling and predictive works, journalists tracking the
news, etc.). The authors probably became aware of this at a very early stage and, as a precaution, made
their abstracts more informative, considering that the abstract was probably the only piece of text that would be
read by most readers. Our work also shows that the results are clearly different from one server
to another. This is important for two reasons: 1) it shows that there are community-centered habits of
writing and reporting results, and 2) it presents preprints and, more precisely, preprint servers, as
possible bases for other types of analyses that would examine communities or even disciplinary
distinctions.</p>
      <p>The strength of our study is that we considered the peak in production in the very early months of
2020. This is also a limitation: the work needs to continue over a longer period of time, until
normality returns once the crisis is over.</p>
      <p>We obtained somewhat contradictory results from the abstract structure analysis and the GEM scores:
the GEM score and the share of structured abstracts follow different trends on various preprint
servers. This requires further study.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Fraser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Polka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Palfy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Coates</surname>
          </string-name>
          , '
          <article-title>Preprinting a pandemic: the role of preprints in the COVID-19 pandemic'</article-title>
          ,
          <source>BioRxiv Sci. Commun. Educ.</source>
          ,
          <year>2020</year>
          , doi: 10.1101/2020.05.22.111294.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>V.</given-names>
            <surname>Moorthy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Henao Restrepo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-P.</given-names>
            <surname>Preziosi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Swaminathan</surname>
          </string-name>
          , '
          <article-title>Data sharing for novel coronavirus (COVID-19)'</article-title>
          , Bull. World Health Organ., vol.
          <volume>98</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>150</fpage>
          -
          <lpage>150</lpage>
          , Mar.
          <year>2020</year>
          , doi: 10.2471/BLT.20.251561.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Huang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Glänzel</surname>
          </string-name>
          , '
          <article-title>How scientific research reacts to international public health emergencies: a global analysis of response patterns'</article-title>
          ,
          <source>Scientometrics</source>
          , vol.
          <volume>124</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>747</fpage>
          -
          <lpage>773</lpage>
          , Jul.
          <year>2020</year>
          , doi: 10.1007/s11192-020-03531-4.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>W.</given-names>
            <surname>Xing</surname>
          </string-name>
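          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hejblum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Leung</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.-J.</given-names>
            <surname>Valleron</surname>
          </string-name>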
          , '
          <article-title>Anatomy of the Epidemiological Literature on the 2003 SARS Outbreaks in Hong Kong and Toronto: A Time-Stratified Review'</article-title>
          ,
          <source>PLoS Med</source>
          ., vol.
          <volume>7</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>e1000272</fpage>
          -
          <lpage>e1000272</lpage>
          , May
          <year>2010</year>
          , doi: 10.1371/journal.pmed.1000272.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Rabaan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. H.</given-names>
            <surname>Al-Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Bazzi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Al-Tawfiq</surname>
          </string-name>
          , '
          <article-title>Dynamics of scientific publications on the MERS-CoV outbreaks in Saudi Arabia'</article-title>
          ,
          <source>J. Infect. Public Health</source>
          , vol.
          <volume>10</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>702</fpage>
          -
          <lpage>710</lpage>
          , Nov.
          <year>2017</year>
          , doi: 10.1016/j.jiph.2017.05.005.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P. Y.</given-names>
            <surname>Kobres</surname>
          </string-name>
          et al.,
          <article-title>'A systematic review and evaluation of Zika virus forecasting and prediction research during a public health emergency of international concern'</article-title>
          ,
          <source>PLoS Negl. Trop. Dis.</source>
          , vol.
          <volume>13</volume>
          , no.
          <issue>10</issue>
          , pp.
          <fpage>e0007451</fpage>
          -
          <lpage>e0007451</lpage>
          ,
          <year>2019</year>
          , doi: 10.1371/journal.pntd.0007451.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Orasan</surname>
          </string-name>
          , '
          <article-title>Patterns in scientific abstracts'</article-title>
          ,
          <source>in Proceedings of Corpus Linguistics 2001 Conference</source>
          , Lancaster,
          <year>2001</year>
          , pp.
          <fpage>433</fpage>
          -
          <lpage>443</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F.</given-names>
            <surname>Johnson</surname>
          </string-name>
          , 'Automatic abstracting research',
          <source>Libr. Rev.</source>
          , vol.
          <volume>44</volume>
          , no.
          <issue>8</issue>
          , pp.
          <fpage>28</fpage>
          -
          <lpage>36</lpage>
          , Dec.
          <year>1995</year>
          , doi: 10.1108/00242539510102574.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Berkenkotter</surname>
          </string-name>
          and
          <string-name>
            <given-names>T. N.</given-names>
            <surname>Huckin</surname>
          </string-name>
          ,
          <source>'Genre Knowledge in Disciplinary Communication: Cognition/Culture/Power'</source>
          ,
          <year>1995</year>
          , [Online]. Available: https://experts.umn.edu/en/publications/genreknowledge-in-disciplinary-communication-cognitionculturepow.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Hyland</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Tse</surname>
          </string-name>
          , '
          <article-title>Hooking the reader: a corpus study of evaluative that in abstracts'</article-title>
          ,
          <source>Engl. Specif. Purp.</source>
          , vol.
          <volume>24</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>139</lpage>
          , Jan.
          <year>2005</year>
          , doi: 10.1016/j.esp.2004.02.002.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <collab>The PLoS Medicine Editors</collab>
          , 'Journals, Academics, and Pandemics',
          <source>PLoS Med</source>
          ., vol.
          <volume>7</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>e1000282</fpage>
          -
          <lpage>e1000282</lpage>
          , May
          <year>2010</year>
          , doi: 10.1371/journal.pmed.1000282.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Johansson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. G.</given-names>
            <surname>Reich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Meyers</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Lipsitch</surname>
          </string-name>
          , '
          <article-title>Preprints: An underutilized mechanism to accelerate outbreak science'</article-title>
          ,
          <source>PLOS Med</source>
          ., vol.
          <volume>15</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>e1002549</fpage>
          -
          <lpage>e1002549</lpage>
          , Apr.
          <year>2018</year>
          , doi: 10.1371/journal.pmed.1002549.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Desjardins-Proulx</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. P.</given-names>
            <surname>White</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Adamson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Poisot</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Gravel</surname>
          </string-name>
          , '
          <article-title>The Case for Open Preprints in Biology'</article-title>
          ,
          <source>PLoS Biol</source>
          ., vol.
          <volume>11</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>e1001563</fpage>
          -
          <lpage>e1001563</lpage>
          , May
          <year>2013</year>
          , doi: 10.1371/journal.pbio.1001563.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Teixeira da Silva</surname>
          </string-name>
          ,
          <article-title>'The preprint debate: What are the issues?'</article-title>
          ,
          <source>Med. J. Armed Forces India</source>
          , vol.
          <volume>74</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>162</fpage>
          -
          <lpage>164</lpage>
          , Apr.
          <year>2018</year>
          , doi: 10.1016/j.mjafi.2017.08.002.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>D.</given-names>
            <surname>Poremski</surname>
          </string-name>
          et al.,
          <article-title>'Moving from “personal communication” to “available online at”: Preprint servers enhance the timeliness of scientific exchange'</article-title>
          ,
          <source>Child Adolesc. Psychiatry Ment. Health</source>
          , vol.
          <volume>13</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>42</fpage>
          -
          <lpage>42</lpage>
          , Oct.
          <year>2019</year>
          , doi: 10.1186/s13034-019-0301-4.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <article-title>'GEM: measure of the generosity of the abstract comparing to the full text'</article-title>
          ,
          <year>2018</year>
          , doi: 10.5281/zenodo.1162951.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Cossu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Mothe</surname>
          </string-name>
          , '
          <article-title>A survey on evaluation of summarization methods'</article-title>
          ,
          <source>Inf. Process. Manag.</source>
          , vol.
          <volume>56</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>1794</fpage>
          -
          <lpage>1814</lpage>
          , Sep.
          <year>2019</year>
          , doi: 10.1016/j.ipm.2019.04.001.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Seki</surname>
          </string-name>
          , '
          <article-title>Automatic Summarization Focusing on Document Genre and Text Structure'</article-title>
          ,
          <source>ACM SIGIR Forum</source>
          , vol.
          <volume>39</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>65</fpage>
          -
          <lpage>67</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>K.</given-names>
            <surname>Owczarzak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Conroy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. T.</given-names>
            <surname>Dang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Nenkova</surname>
          </string-name>
          , '
          <article-title>An Assessment of the Accuracy of Automatic Evaluation in Summarization'</article-title>
          ,
          <source>in Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization</source>
          , Stroudsburg, PA, USA,
          <year>2012</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          , [Online]. Available: http://dl.acm.org/citation.cfm?id=2391258.2391259.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nenkova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Passonneau</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>McKeown</surname>
          </string-name>
          , '
          <article-title>The Pyramid Method: Incorporating human content selection variation in summarization evaluation'</article-title>
          ,
          <source>ACM Trans Speech Lang Process</source>
          , vol.
          <volume>4</volume>
          , no.
          <issue>2</issue>
          , May
          <year>2007</year>
          , doi: 10.1145/1233912.1233913.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>C.-Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          , '
          <article-title>ROUGE: A Package for Automatic Evaluation of Summaries'</article-title>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>K.</given-names>
            <surname>Papineni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Roukos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ward</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.-J.</given-names>
            <surname>Zhu</surname>
          </string-name>
          , '
          <article-title>BLEU: a method for automatic evaluation of machine translation'</article-title>
          ,
          <source>in Proceedings of the 40th annual meeting on association for computational linguistics</source>
          ,
          <year>2002</year>
          , pp.
          <fpage>311</fpage>
          -
          <lpage>318</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Denkowski</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Lavie</surname>
          </string-name>
          , '
          <article-title>Meteor 1.3: Automatic Metric for Reliable Optimization and Evaluation of Machine Translation Systems'</article-title>
          ,
          <source>Proceedings of the EMNLP 2011 Workshop on Statistical Machine Translation</source>
          , pp.
          <fpage>85</fpage>
          -
          <lpage>91</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.</given-names>
            <surname>Louis</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Nenkova</surname>
          </string-name>
          , '
          <article-title>What Makes Writing Great? First Experiments on Article Quality Prediction in the Science Journalism Domain'</article-title>
          ,
          <source>Trans. Assoc. Comput. Linguist.</source>
          , vol.
          <volume>1</volume>
          , pp.
          <fpage>341</fpage>
          -
          <lpage>352</lpage>
          , Dec.
          <year>2013</year>
          , doi: 10.1162/tacl_a_00232.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bordignon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Turenne</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Noel</surname>
          </string-name>
          , '
          <article-title>Is the Abstract a Mere Teaser? Evaluating Generosity of Article Abstracts in the Environmental Sciences'</article-title>
          ,
          <source>Front. Res. Metr. Anal.</source>
          , vol.
          <volume>3</volume>
          , May
          <year>2018</year>
          , doi: 10.3389/frma.2018.00016.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>F.</given-names>
            <surname>Dernoncourt</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. Y.</given-names>
            <surname>Lee</surname>
          </string-name>
          , '
          <article-title>PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts'</article-title>
          ,
          <source>in Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)</source>
          , Taipei, Taiwan, Nov.
          <year>2017</year>
          , pp.
          <fpage>308</fpage>
          -
          <lpage>313</lpage>
          , Accessed: Dec. 04, 2020. [Online]. Available: https://www.aclweb.org/anthology/I17-2052.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>D.</given-names>
            <surname>Jin</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Szolovits</surname>
          </string-name>
          , '
          <article-title>Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts'</article-title>
          ,
          <source>in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</source>
          , Brussels, Belgium, Oct.
          <year>2018</year>
          , pp.
          <fpage>3100</fpage>
          -
          <lpage>3109</lpage>
          , doi: 10.18653/v1/D18-1349.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bordignon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Noel</surname>
          </string-name>
          , '
          <article-title>A corpus designed to study preprints produced during the Covid-19 crisis and to make comparative studies with the pre-pandemic period'</article-title>
          .
          <source>Mendeley Data, V1</source>
          ,
          <year>2021</year>
          , doi: 10.17632/rn9b93x5d4.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lopez</surname>
          </string-name>
          , '
          <article-title>GROBID: Combining Automatic Bibliographic Data Recognition and Term Extraction for Scholarship Publications'</article-title>
          ,
          <year>2009</year>
          , pp.
          <fpage>473</fpage>
          -
          <lpage>474</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bordignon</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Noel</surname>
          </string-name>
          , '
          <article-title>Data for “Preprint abstracts in times of crisis: a comparative study with the pre-pandemic period”'</article-title>
          .
          <source>Mendeley Data, V1</source>
          ,
          <year>2021</year>
          , doi: 10.17632/nsr333t977.1.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>P.</given-names>
            <surname>Fontelo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gavino</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Sarmiento</surname>
          </string-name>
          , '
          <article-title>Comparing data accuracy between structured abstracts and full-text journal articles: implications in their use for informing clinical decisions'</article-title>
          ,
          <source>Evid. Based Med</source>
          ., vol.
          <volume>18</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>207</fpage>
          -
          <lpage>211</lpage>
          , Dec.
          <year>2013</year>
          , doi: 10.1136/eb-2013-101272.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hartley</surname>
          </string-name>
          , '
          <article-title>Current findings from research on structured abstracts'</article-title>
          ,
          <source>J. Med. Libr. Assoc.</source>
          , vol.
          <volume>92</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>368</fpage>
          -
          <lpage>371</lpage>
          ,
          <year>2004</year>
          , doi: 10.3163/1536-5050.102.3.002.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Perez-Iratxeta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bork</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Andrade</surname>
          </string-name>
          , '
          <article-title>Information extraction from full text scientific articles: Where are the keywords?'</article-title>
          ,
          <source>BMC Bioinformatics</source>
          , vol.
          <volume>4</volume>
          , p.
          <fpage>20</fpage>
          , May
          <year>2003</year>
          , doi: 10.1186/1471-2105-4-20.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>