<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Finding Temporal Trends of Scientific Concepts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michael Farber</string-name>
          <email>michael.faerber@cs.uni-freiburg.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adam Jatowt</string-name>
          <email>adam@kuis.db.kyoto-u.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Freiburg</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Social Informatics, Kyoto University</institution>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>132</fpage>
      <lpage>139</lpage>
      <abstract>
        <p>Science evolves very rapidly, and researchers have studied the evolution of coarse-grained research topics. However, to our knowledge, no analysis of the temporal trends of fine-grained scientific concepts has been performed based on papers' full texts. For this paper, we extract noun phrases as concepts from all computer science papers of arXiv.org. We then identify positive and negative trends by means of simple linear regression, the Mann-Kendall test, and the Theil-Sen estimate. In our experiments, we obtain noteworthy findings about trends using the Mann-Kendall test, while the Theil-Sen estimate and simple linear regression lead to many non-scientific concepts. Our findings are potentially relevant for both ordinary researchers and researchers working in bibliometrics and scientometrics.</p>
      </abstract>
      <kwd-group>
        <kwd>trend detection</kwd>
        <kwd>scholarly data</kwd>
        <kwd>bibliometrics</kwd>
        <kwd>time series</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Motivation</title>
      <p>
        Science evolves very rapidly, and the increasing number of researchers and
scientific publications worldwide in various disciplines reinforces this effect [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1,2,3</xref>
        ].
We argue that this phenomenon of scientific evolution is worth investigating
in more detail [
        <xref ref-type="bibr" rid="ref10 ref4 ref5 ref6 ref7 ref8 ref9">4,5,6,7,8,9,10</xref>
        ]. Specifically, ordinary researchers, as well as
researchers working in bibliometrics and scientometrics, might be interested in
the answers to the following questions:
      </p>
      <p>
        In this work, we target those questions by extracting noun phrases from
a corpus of scientific papers, namely the contents of all computer science
papers of arXiv.org [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Then, positive and negative trends over time in the
set of extracted noun phrases are identified, contributing to answering Q1.
Furthermore, concepts that have replaced other concepts over time (reflected
in the usage statistics) are identified, contributing to answering Q2.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Trend Detection</title>
      <p>
        To find positive and negative trends in time series data, a variety of algorithms
are available [
        <xref ref-type="bibr" rid="ref12 ref13">12,13</xref>
        ]. In the following, we focus on those we used in our analysis
(see Sec. 3).
      </p>
      <p>We consider years as intervals of the time series. Furthermore, we use the
normalized relative frequency of the concepts as the basis for our calculations.
Formally, let $D = \{d_1, \ldots, d_{|D|}\}$ be our document corpus. Let $c_i$ be a concept
in the concept set $C$ occurring $n_{c_i}$ times in the corpus $D$. Let $D_{c_i}$ be the set
of documents in which $c_i$ occurs at least once. Then, the normalized relative
frequency of $c_i$ on the document level is defined as the ratio of documents
containing $c_i$ with respect to all documents: $rf_{c_i} = |D_{c_i}| / |D|$.</p>
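      <p>As a minimal sketch of this definition, the following Python snippet computes the document-level relative frequency of each concept per year. The input layout (pairs of a year and a set of concepts) and the function name relative_frequencies are assumptions made for illustration; computing the ratio per year rather than over the whole corpus is likewise an assumption, motivated by the use of years as time series intervals.</p>
      <preformat>
```python
from collections import defaultdict


def relative_frequencies(docs):
    """Return {concept: {year: rf}} for an iterable of (year, concepts) pairs.

    rf is the share of that year's documents that contain the concept
    at least once (document-level relative frequency).
    """
    docs_per_year = defaultdict(int)
    docs_with_concept = defaultdict(lambda: defaultdict(int))
    for year, concepts in docs:
        docs_per_year[year] += 1
        # set() ensures a concept is counted once per document
        for concept in set(concepts):
            docs_with_concept[concept][year] += 1
    return {
        concept: {
            year: counts.get(year, 0) / total
            for year, total in docs_per_year.items()
        }
        for concept, counts in docs_with_concept.items()
    }


# Hypothetical toy corpus: two documents per year
toy_corpus = [
    (2016, {"turing machine", "dataset"}),
    (2016, {"dataset"}),
    (2017, {"dataset", "deep neural networks"}),
    (2017, {"deep neural networks"}),
]
rf = relative_frequencies(toy_corpus)
print(rf["dataset"])  # {2016: 1.0, 2017: 0.5}
```
      </preformat>
      <p>Years in which a concept does not occur receive a frequency of 0, so every concept obtains a value for each interval of the time series.</p>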
      <p>Simple Linear Regression. For this basic trend detection method, we
calculate for a given concept $c_i$ the difference between the relative frequencies
$rf_{c_i,k}$ and $rf_{c_i,l}$ of two time periods $k$ and $l$ (e.g., the years 2007 and 2017): $d = rf_{c_i,k} - rf_{c_i,l}$.</p>
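      <p>The difference between two periods can be sketched in a few lines of Python. The function name frequency_difference and the sample frequencies are hypothetical; note that the sign of the result depends on which of the two years is passed as k.</p>
      <preformat>
```python
def frequency_difference(rf_by_year, k, l):
    """Difference d between the relative frequencies of periods k and l.

    rf_by_year maps a year to the relative frequency of one concept;
    years without an entry are treated as frequency 0.
    """
    return rf_by_year.get(k, 0.0) - rf_by_year.get(l, 0.0)


# Hypothetical frequencies of one concept in two years
rf_dataset = {2007: 0.10, 2017: 0.35}
d = frequency_difference(rf_dataset, 2017, 2007)
print(round(d, 2))  # 0.25
```
      </preformat>
      <p>With the later year passed as k, a positive d indicates increased usage and a negative d indicates decreased usage.</p>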
      <p>
        Mann-Kendall. To obtain statistically significant trends in time series
data, the Mann-Kendall test [
        <xref ref-type="bibr" rid="ref13 ref14">14,13</xref>
        ] is commonly used. This test can be applied
as a non-parametric test for monotonic trends. The Mann-Kendall statistic can
be used as an indication of whether a trend exists statistically and whether it is
positive or negative. More formally, the null hypothesis of Kendall's $\tau$ is that
there is no trend ($H_0: \tau = 0$). The alternative hypothesis is that there is a trend
($H_1: \tau \neq 0$). Given that we have the concepts in a temporal order, let $G_i$ be the
number of data points after $y_i$ that are greater than $y_i$. Let $L_i$ be the number of
data points after $y_i$ that are smaller than $y_i$. Then, Kendall's $\tau$ coefficient is
calculated as
$$\tau = \frac{2S}{n(n-1)},$$
where $S$ is defined as
$$S = \sum_{i=1}^{n-1} (G_i - L_i)$$
and corresponds to the sum of the differences between $G_i$ and $L_i$ along the
time series. Since we are dealing with a sufficiently large number of time slots $n$,
we can assume a normal distribution for the test statistic $z$ [
        <xref ref-type="bibr" rid="ref13 ref15">13,15</xref>
        ] and write
$$z = \frac{\tau}{\sqrt{2(2n+5) / (9n(n-1))}}.$$
      </p>
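      <p>The following Python sketch illustrates how S, Kendall's tau, and the normal approximation z could be computed for one concept's yearly frequencies. It is an illustration under stated assumptions rather than the implementation used for the experiments: the function name mann_kendall and the sample series are hypothetical, tied values simply contribute nothing to S, and no tie correction is applied to the variance.</p>
      <preformat>
```python
import math


def mann_kendall(y):
    """Return (S, tau, z) for a time-ordered sequence of values y."""
    n = len(y)
    if 2 > n:
        raise ValueError("at least two data points are required")
    s = 0
    for i in range(n - 1):
        later_values = y[i + 1:]
        greater = sum(1 for v in later_values if v > y[i])  # G_i
        smaller = sum(1 for v in later_values if y[i] > v)  # L_i
        s += greater - smaller
    tau = 2 * s / (n * (n - 1))
    # Normal approximation: tau divided by its standard deviation under H0
    z = tau / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
    return s, tau, z


# Hypothetical relative frequencies of one concept for eleven years
series = [0.01, 0.02, 0.02, 0.04, 0.05, 0.07, 0.08, 0.10, 0.12, 0.15, 0.19]
s, tau, z = mann_kendall(series)
is_trending = abs(z) > 3
print(s, round(tau, 3), round(z, 2), is_trending)
```
      </preformat>
      <p>For this series, S is 54 out of 55 possible pairs (one pair is tied), and z is about 4.2, so the concept would count as positively trending under a threshold of |z| > 3. Reversing the series flips the sign of S, tau, and z.</p>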
      <p>
        Theil-Sen Trend Line. The Theil-Sen estimate [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] can be used to estimate
the slope of a trend. It can be considered a non-parametric alternative to the
parametric ordinary least squares regression line. A Theil-Sen line models how
the median value changes linearly with time [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Formally, let
$$B_n = \left\{ \frac{y_j - y_i}{x_j - x_i} : x_i \neq x_j,\ 1 \le i &lt; j \le n \right\}.$$
The Theil-Sen estimator $\hat{\beta}_n$ is then defined as the median of all slopes in $B_n$,
i.e., $\hat{\beta}_n = \mathrm{med}(B_n)$, with $\mathrm{med}$ standing for the median.
      </p>
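      <p>A small Python sketch of the Theil-Sen slope, using only the standard library, may make the definition more concrete. The function name theil_sen_slope and the sample values are hypothetical; the snippet forms all pairwise slopes and takes their median.</p>
      <preformat>
```python
from itertools import combinations
from statistics import median


def theil_sen_slope(xs, ys):
    """Median of the slopes between all pairs of points with distinct x."""
    slopes = [
        (yj - yi) / (xj - xi)
        for (xi, yi), (xj, yj) in combinations(zip(xs, ys), 2)
        if xi != xj
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    return median(slopes)


# Hypothetical yearly frequencies with one outlier in 2016
years = [2013, 2014, 2015, 2016, 2017]
freqs = [0.10, 0.12, 0.14, 0.50, 0.18]
slope = theil_sen_slope(years, freqs)
print(round(slope, 3))
```
      </preformat>
      <p>Here the estimated slope is about 0.02 per year: the single outlier in 2016 affects only four of the ten pairwise slopes and therefore does not move the median, which illustrates the robustness of the estimator compared to a least squares fit.</p>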
    </sec>
    <sec id="sec-3">
      <title>Trend Detection of Scientific Concepts</title>
      <p>We now describe our approach for extracting concepts from scientific papers and
identifying positive and negative trends.<sup>1</sup></p>
      <p>
        Data Set and Extracting Concepts.
We use the arXiv CS data set [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] as our database. This data set contains the
plaintexts of all papers hosted at arXiv.org in the field of computer science from
the early beginnings of arXiv.org until December 2017. As corpora covering the
contents of research papers are rare, and the usage of arXiv.org has become
increasingly common in recent years, we believe that this data set is a valid
basis for concept evolution analyses. In total, the data set covers about 90,000
papers, resulting in about 16 million plaintext sentences. Note that in this data
set, formulas have been replaced by placeholders for easier text processing.
      </p>
      <p>
        Given the papers' full texts, we are interested in the concepts mentioned in
these papers. For this paper, we use case-insensitive noun phrases as concept
representations. Thus, we extract noun phrases from the papers' full texts.
Our approach uses an extended rule set of [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] (in total, eight rules) on the
part-of-speech tags assigned by the Stanford parser.
      </p>
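      <p>To illustrate rule-based noun phrase extraction on part-of-speech tags, the following Python sketch applies one simple pattern: a run of adjectives and nouns that ends in a noun. This single pattern is an assumption chosen for illustration and does not reproduce the eight rules mentioned above; the Penn Treebank tags in the example are assigned by hand rather than produced by a parser, and the function name extract_noun_phrases is hypothetical.</p>
      <preformat>
```python
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}
PHRASE_TAGS = {"JJ"} | NOUN_TAGS


def extract_noun_phrases(tagged_tokens):
    """Extract lowercased noun phrases from (token, POS tag) pairs.

    A phrase is a maximal run of adjectives and nouns; trailing
    adjectives are dropped so that every phrase ends in a noun.
    """
    phrases, current = [], []
    # The sentinel entry flushes a phrase that ends the sentence
    for token, tag in list(tagged_tokens) + [("", "END")]:
        if tag in PHRASE_TAGS:
            current.append((token.lower(), tag))
            continue
        while current and current[-1][1] not in NOUN_TAGS:
            current.pop()
        if current:
            phrases.append(" ".join(tok for tok, _ in current))
        current = []
    return phrases


# Hand-tagged example sentence
tagged = [
    ("We", "PRP"), ("train", "VBP"), ("deep", "JJ"), ("neural", "JJ"),
    ("networks", "NNS"), ("on", "IN"), ("GPUs", "NNS"), (".", "."),
]
phrases = extract_noun_phrases(tagged)
print(phrases)  # ['deep neural networks', 'gpus']
```
      </preformat>
      <p>Lowercasing the tokens mirrors the case-insensitive concept representation, so that "GPUs" and "gpus" are counted as the same concept.</p>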
      <p>Given the 15.5M sentences from the initial data set, we obtained 10.67M
unique noun phrases (76.7M non-unique noun phrases). When ordering
the extracted noun phrases by absolute frequency, we can observe that
domain-specific concepts, which are the focus of this paper, occur particularly
in the mid range, while functional words and phrases common in paper writing
(e.g., "number," "section," "figure") appear at the top.<sup>2</sup> We use this observation
to filter out irrelevant concepts during trend detection.</p>
      <p>
        Sparsity and Thresholds.
The set of extracted noun phrases still contains many irrelevant, non-scientific
noun phrases. Processing all of them would result in large databases, unnecessary
trend calculations, and degraded query performance of the indices. Thus, we
analyzed the effectiveness of several filtering methods (following the similar
procedure of [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]): (1) each concept needs to appear in at least 100 documents
within the whole corpus; (2) each concept needs to appear in at least three
different years; (3) the combination of methods 1 and 2. Table 1 shows the
results. We can observe that using threshold 1 (i.e., each concept must appear in
at least 100 documents) allows a considerable decrease in the number of concepts.
However, threshold 2 (i.e., the number of years in which each concept needs to
appear) also seems to be very effective. Ultimately, we followed [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and used
the combination of (1) and (2).
      </p>
      <p>
        <sup>1</sup> See https://github.com/michaelfaerber/scholarly-trends for our source code
and [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] for a demonstration system based on our trend detection approach.
      </p>
      <p><sup>2</sup> The data set of extracted noun phrases is available at https://github.com/michaelfaerber/scholarly-trends.</p>
      <p>Given the set of filtered noun phrase series, we apply the trend detection
algorithms outlined in Sec. 2, namely the simple linear regression, the
Mann-Kendall test, and the Theil-Sen estimate. We thereby obtained the
following findings:</p>
      <p>Simple Linear Regression. We list the top 15 positively and negatively
trending noun phrases using the simple linear regression in Tables 2 and 3.<sup>3</sup> Very
general concepts (e.g., "experiments," "dataset," and "training") show a strong
increase in usage over time in our data set. This might be surprising, but can
be partially explained by the fact that the frequencies we compare are from 2007
and 2017; rather general concepts persist over such a long time span. Given
the negatively trending concepts, it becomes apparent that concepts concerning
formalism and theories were much more important in 2007 than in 2017. Overall,
we can state that the simple linear regression already leads to some relevant
findings about trends of concepts, although many abstract concepts are also
found to be trending.</p>
      <p><sup>3</sup> The full list is available online at https://github.com/michaelfaerber/scholarly-trends.</p>
      <p>
        Mann-Kendall test. Table 4 and Table 5 list the top 15 positively and
negatively trending noun phrases using the Mann-Kendall test (see Sec. 2).
Out of all 47,759 indexed noun phrases, 19,525 are found to have
a statistically significant (positive or negative) trend over the years (using
Kendall's $|z| &gt; 3$ as in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]). This value might appear high. However, note that
we have applied a strong filter for obviously irrelevant concepts (see Sec. 3.2).
      </p>
      <p>Considering all trending noun phrases, we can recognize that the
Mann-Kendall test appears to be a reasonable trend detection method for our
case. We obtained noteworthy findings concerning the trending concepts:
        <list list-type="bullet">
          <list-item><p>Among the positively trending concepts are many machine
learning-associated concepts, such as "gradient," "deep neural networks,"
"convolutional neural networks," "convolutional layer," and "gpus." The
metrics "ROC" and "AUC" (capitalized for better readability) are also
trending.</p></list-item>
          <list-item><p>"One-shot learning" and "data science" are identified as positively trending
and reflect the general orientation of computer science research in recent
years.</p></list-item>
          <list-item><p>Negatively trending noun phrases stem particularly from the area of formal
(i.e., theoretical) computer science, such as information theory.
Representative negatively trending concepts are "block length," "bits,"
"shannon," and "message," but also "decision problem" and "Turing
machine." Evidently, arXiv was formerly dominated by theoretical
computer science, while nowadays machine learning is the predominant field.
Ultimately, this means that our database is, to some extent, unbalanced.
However, we believe that this is acceptable, as it reflects the general orientation
of computer science research over the years.</p></list-item>
          <list-item><p>Also, the concepts "DBScan" and "LDA" have been used with increasing
frequency (statistically significant) and have remained on a stable level in recent
years. This may appear surprising, as those concepts are believed to have
been established for a long time and therefore might be expected to decrease.</p></list-item>
          <list-item><p>"Quantum computing" and "PageRank" have not been identified as trending
but show a strong increase and then a plateau when visualized over
time. These concepts have a very volatile time series.</p></list-item>
          <list-item><p>The programming language "Scala" was on the rise and then became stable,
while "Python" has continued to increase in recent years.</p></list-item>
        </list>
      </p>
      <p>Theil-Sen Estimate. Table 6 and Table 7, respectively, list the top positively
and negatively trending noun phrases according to the Theil-Sen estimate (see
Sec. 2 for its definition). We can observe that using Theil-Sen leads to many
very generic concepts in the lists of trending concepts, such as "experiments"
and "dataset." Thus, this trend detection method can be used to generate an
upper ontology rather than to show trends of specific scientific concepts.</p>
    </sec>
    <sec id="sec-4">
      <title>Related Work</title>
      <p>
        Trend Detection Based on Scientific Papers. Various papers presenting
approaches and demonstration systems deal with the evolution of research topics
over time [
        <xref ref-type="bibr" rid="ref10 ref4 ref5 ref6 ref7 ref8 ref9">4,5,6,7,8,9,10</xref>
        ]. Apart from the visualization frameworks for paper
collections (e.g., via maps or hierarchical views) [
        <xref ref-type="bibr" rid="ref4 ref5">4,5</xref>
        ], the approach-describing
papers differ from our paper as follows: (1) the authors cluster topics and, thus,
rather consider high-level concepts [
        <xref ref-type="bibr" rid="ref5 ref6 ref9">5,6,9</xref>
        ]; (2) they do not apply content-based
methods, but methods based on graphs and networks, such as the citation
information [
        <xref ref-type="bibr" rid="ref10 ref8 ref9">10,9,8</xref>
        ] and the author information [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]; (3) they use purely the
papers' titles or abstracts but not the papers' full texts [
        <xref ref-type="bibr" rid="ref5 ref7 ref8">5,7,8</xref>
        ], which makes it hard
to also cover long-tail concepts.
      </p>
      <p>
        Information Extraction from Scientific Papers. In the past, several
kinds of information extraction techniques have been applied to scientific
papers, ranging from noun phrase extraction over entity linking to relation
extraction. Noteworthy in this context are also the SemEval tasks based on
scientific papers as data sets (see SemEval 2010 Task 5 "Automatic Keyphrase
Extraction from Scientific Articles" [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] and SemEval 2017 Task 10 "ScienceIE -
Extracting Keyphrases and Relations from Scientific Publications" [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]). While
the extraction of words and bigrams has already been applied to papers [
        <xref ref-type="bibr" rid="ref7 ref8">7,8</xref>
        ],
no paper dedicated to the analysis of scientific phrases in the papers' full texts
has been presented to our knowledge.
      </p>
      <p>
        Time-Series Analysis and Trend Detection. Among the most frequently
used methods for trend detection are the Mann-Kendall test and Sen's slope [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
Related to our work is the analysis of Daniel et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] concerning trending
multi-word expressions in the Google Books data set. However, the domain
of books differs from our domain-specific use case. Furthermore, multi-word
expressions cover not only noun phrases, but also proverbs, greetings, etc.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this paper, we have presented an analysis concerning positively and
negatively trending scientific concepts. We identified statistically trending
concepts included in all computer science papers of arXiv.org based on several
trend detection methods. We thereby found that the Mann-Kendall test performs
well for this task, while the simple regression and the Theil-Sen estimate have
deficits, such as detecting rather general, non-scientific concepts. Based on the
trending concepts, we not only found that arXiv.org has a strong orientation
towards machine learning and deep learning, but we also identified rather
surprising usage patterns.</p>
      <p>
        For the future, we plan to consider various scientific disciplines based on
the new arXiv data set presented in [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Moreover, we plan to perform a deeper
linguistic analysis of the arXiv papers' content. For instance, extracting, storing,
and testing scientific hypotheses [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] might be a worthy task.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bornmann</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mutz</surname>
          </string-name>
          , R.:
          <article-title>Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references</article-title>
          .
          <source>Journal of the Association for Information Science and Technology</source>
          <volume>66</volume>
          (
          <issue>11</issue>
          ) (
          <year>2015</year>
          )
          <fpage>2215</fpage>–<lpage>2222</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Fortunato</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergstrom</surname>
            ,
            <given-names>C.T.</given-names>
          </string-name>
          , Borner,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Evans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.A.</given-names>
            ,
            <surname>Helbing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Milojević</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Petersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.M.</given-names>
            ,
            <surname>Radicchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Sinatra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Uzzi</surname>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
          , et al.:
          <article-title>Science of science</article-title>. <source>Science</source>
          <volume>359</volume>
          (
          <issue>6379</issue>
          ) (
          <year>2018</year>
          ) eaao0185
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ware</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mabe</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The STM Report: An overview of scientific and scholarly journal publishing</article-title>
          . (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , Zhang, J.:
          <article-title>A survey on visualization for scientific literature topics</article-title>
          .
          <source>J. Visualization</source>
          <volume>21</volume>
          (
          <issue>2</issue>
          ) (
          <year>2018</year>
          )
          <fpage>321</fpage>–<lpage>335</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Wang</surname>
            , X., Cheng,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Analyzing evolution of research topics with NEViewer: a new method based on dynamic co-word networks</article-title>
          .
          <source>Scientometrics</source>
          <volume>101</volume>
          (
          <issue>2</issue>
          ) (
          <year>2014</year>
          )
          <fpage>1253</fpage>–<lpage>1271</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Salatino</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Osborne</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motta</surname>
          </string-name>
          , E.:
          <article-title>AUGUR: Forecasting the Emergence of New Research Topics</article-title>
          .
          <source>In: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries</source>
          . JCDL'
          <volume>18</volume>
          (
          <year>2018</year>
          )
          <fpage>303</fpage>–<lpage>312</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Bolelli</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ertekin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giles</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          :
          <article-title>Topic and Trend Detection in Text Collections Using Latent Dirichlet Allocation</article-title>
          .
          <source>In: Proceedings of the 31st European Conference on Information Retrieval</source>
          . (
          <year>2009</year>
          )
          <fpage>776</fpage>–<lpage>780</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Jo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lagoze</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giles</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          :
          <article-title>Detecting Research Topics via the Correlation between Graphs and Texts</article-title>
          .
          <source>In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          . KDD'
          <volume>07</volume>
          (
          <year>2007</year>
          )
          <fpage>370</fpage>–<lpage>379</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Small</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boyack</surname>
            ,
            <given-names>K.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klavans</surname>
          </string-name>
          , R.:
          <article-title>Identifying emerging topics in science and technology</article-title>
          .
          <source>Research Policy</source>
          <volume>43</volume>
          (
          <issue>8</issue>
          ) (
          <year>2014</year>
          )
          <fpage>1450</fpage>–<lpage>1467</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Popescul</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Flake</surname>
            ,
            <given-names>G.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lawrence</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ungar</surname>
            ,
            <given-names>L.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giles</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          :
          <article-title>Clustering and Identifying Temporal Trends in Document Databases</article-title>
          .
          <source>In: Proceedings of IEEE Advances in Digital Libraries</source>
          <year>2000</year>
          . ADL'
          <volume>00</volume>
          (
          <year>2000</year>
          )
          <fpage>173</fpage>–<lpage>182</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. Farber,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Thiemann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Jatowt</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>A High-Quality Gold Standard for Citation-based Tasks</article-title>
          .
          <source>In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation</source>
          . LREC'
          <volume>18</volume>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>K.L.</given-names>
          </string-name>
          :
          <article-title>Comparison of Trend Detection Methods</article-title>
          .
          <source>PhD thesis</source>, University of Montana, Department of Mathematical Sciences, Missoula, MT, USA (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <collab>Interstate Technology and Regulatory Council</collab>:
          <article-title>Groundwater Statistics and Monitoring Compliance. Statistical Tools for the Project Life Cycle</article-title>
          . (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Gilbert</surname>
            ,
            <given-names>R.O.</given-names>
          </string-name>
          :
          <article-title>Statistical Methods for Environmental Pollution Monitoring</article-title>
          . John Wiley &amp; Sons (
          <year>1987</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Daniel</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Last</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Exploring Long-Term Temporal Trends in the Use of Multiword Expressions</article-title>
          .
          <source>In: Proceedings of the 12th Workshop on Multiword Expressions. MWE@ACL'16</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Sen</surname>
            ,
            <given-names>P.K.</given-names>
          </string-name>
          :
          <article-title>Estimates of the regression coefficient based on Kendall's tau</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          <volume>63</volume>
          (
          <issue>324</issue>
          ) (
          <year>1968</year>
          )
          <fpage>1379</fpage>–<lpage>1389</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17. Farber,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Nishioka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Jatowt</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>ScholarSight: Visualizing Temporal Trends of Scientific Concepts</article-title>
          .
          <source>In: Proceedings of the 19th ACM/IEEE on Joint Conference on Digital Libraries. JCDL'19</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Combining POS Tagging, Lucene Search and Similarity Metrics for Entity Linking</article-title>
          .
          <source>In: Proceedings of the 14th International Conference on Web Information Systems Engineering. WISE'13</source>
          (
          <year>2013</year>
          )
          <fpage>503</fpage>
          –
          <lpage>509</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medelyan</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baldwin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>SemEval-2010 Task 5: Automatic Keyphrase Extraction from Scientific Articles</article-title>
          .
          <source>In: Proceedings of the 5th International Workshop on Semantic Evaluation. SemEval@ACL'10</source>
          (
          <year>2010</year>
          )
          <fpage>21</fpage>
          –
          <lpage>26</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Augenstein</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riedel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vikraman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCallum</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>SemEval 2017 Task 10: ScienceIE – Extracting Keyphrases and Relations from Scientific Publications</article-title>
          .
          <source>In: Proceedings of the 11th International Workshop on Semantic Evaluation. SemEval@ACL'17</source>
          (
          <year>2017</year>
          )
          <fpage>546</fpage>
          –
          <lpage>555</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Saier</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , Farber, M.:
          <article-title>Bibliometric-Enhanced arXiv: A Data Set for Paper-Based and Citation-Based Tasks</article-title>
          .
          <source>In: Proceedings of the 8th International Workshop on Bibliometric-enhanced Information Retrieval. BIR'19</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>N.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hemminger</surname>
            ,
            <given-names>B.M.</given-names>
          </string-name>
          :
          <article-title>Mining connections between chemicals, proteins, and diseases extracted from Medline annotations</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>43</volume>
          (
          <issue>4</issue>
          ) (
          <year>2010</year>
          )
          <fpage>510</fpage>
          –
          <lpage>519</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>